Message ID | 20231210164009.1551147-3-Jiqian.Chen@amd.com (mailing list archive) |
---|---|
State | Superseded |
Series | Support device passthrough when dom0 is PVH on Xen |
On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
> a passthrough device by using gsi, see
> xen_pt_realize->xc_physdev_map_pirq and
> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
> is not allowed because currd is PVH dom0 and PVH has no
> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
> So, allow PHYSDEVOP_map_pirq when currd is dom0 no matter if
> dom0 has X86_EMU_USE_PIRQ flag and also allow
> PHYSDEVOP_unmap_pirq for the failed path to unmap pirq.
>
> What's more, in PVH dom0, the gsis don't get registered, but
> the gsi of a passthrough device must be configured for it to
> be able to be mapped into a hvm domU.
> So, add PHYSDEVOP_setup_gsi for PVH dom0, because PVH dom0
> will setup gsi during assigning a device to passthrough.
>
> Co-developed-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
>  xen/arch/x86/hvm/hypercall.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index 6ad5b4d5f1..621d789bd3 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>
>      switch ( cmd )
>      {
> +    case PHYSDEVOP_setup_gsi:

I think given the new approach on the Linux side patches, where
pciback will configure the interrupt, there's no need to expose
setup_gsi anymore?

>      case PHYSDEVOP_map_pirq:
>      case PHYSDEVOP_unmap_pirq:
> +        if ( is_hardware_domain(currd) )
> +            break;

Also Jan already pointed this out in v2: this hypercall needs to be
limited so a PVH dom0 cannot execute it against itself. IOW: refuse
the hypercall if DOMID_SELF or the passed domid matches the current
domain domid.

Thanks, Roger.
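For illustration only, the restriction requested in the review above (refuse the hypercall when a PVH dom0 targets DOMID_SELF or its own domid) can be sketched as a standalone C model. This is not the Xen implementation: `struct caller_model` and `map_pirq_allowed()` are hypothetical names invented for this sketch, `has_pirq` merely stands in for the X86_EMU_USE_PIRQ check, and the DOMID_SELF value is assumed to match the public Xen headers.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t domid_t;

/* Assumed to match the value in Xen's public headers. */
#define DOMID_SELF ((domid_t)0x7FF0U)

/* Hypothetical stand-in for the few properties of the calling domain
 * that matter for this check. */
struct caller_model {
    domid_t domain_id;
    bool is_hardware_domain; /* models is_hardware_domain(currd) */
    bool has_pirq;           /* models the X86_EMU_USE_PIRQ flag */
};

/* Returns true if the caller may issue map_pirq/unmap_pirq against
 * the given target domain, under the rule proposed in the review. */
static bool map_pirq_allowed(const struct caller_model *currd, domid_t target)
{
    /* Domains that use PIRQs keep the existing behaviour. */
    if (currd->has_pirq)
        return true;

    /* Without PIRQs, only the hardware domain gets through at all. */
    if (!currd->is_hardware_domain)
        return false;

    /* A PVH dom0 must not map a pirq for itself. */
    return target != DOMID_SELF && target != currd->domain_id;
}
```

With this model, a PVH dom0 (domid 0, no PIRQ flag) may target domain 5, but not DOMID_SELF or domid 0.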
On 2023/12/11 23:31, Roger Pau Monné wrote:
> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
>> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
>> a passthrough device by using gsi, see
>> xen_pt_realize->xc_physdev_map_pirq and
>> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
>> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
>> is not allowed because currd is PVH dom0 and PVH has no
>> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
>> So, allow PHYSDEVOP_map_pirq when currd is dom0 no matter if
>> dom0 has X86_EMU_USE_PIRQ flag and also allow
>> PHYSDEVOP_unmap_pirq for the failed path to unmap pirq.
>>
>> What's more, in PVH dom0, the gsis don't get registered, but
>> the gsi of a passthrough device must be configured for it to
>> be able to be mapped into a hvm domU.
>> So, add PHYSDEVOP_setup_gsi for PVH dom0, because PVH dom0
>> will setup gsi during assigning a device to passthrough.
>>
>> Co-developed-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>>  xen/arch/x86/hvm/hypercall.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>> index 6ad5b4d5f1..621d789bd3 100644
>> --- a/xen/arch/x86/hvm/hypercall.c
>> +++ b/xen/arch/x86/hvm/hypercall.c
>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>
>>      switch ( cmd )
>>      {
>> +    case PHYSDEVOP_setup_gsi:
>
> I think given the new approach on the Linux side patches, where
> pciback will configure the interrupt, there's no need to expose
> setup_gsi anymore?

The latest patch (the second patch of v3 on the kernel side) does setup_gsi and map_pirq for the passthrough device in pciback, so we need this case and the ones below.

>
>>      case PHYSDEVOP_map_pirq:
>>      case PHYSDEVOP_unmap_pirq:
>> +        if ( is_hardware_domain(currd) )
>> +            break;
>
> Also Jan already pointed this out in v2: this hypercall needs to be
> limited so a PVH dom0 cannot execute it against itself. IOW: refuse
> the hypercall if DOMID_SELF or the passed domid matches the current
> domain domid.

Yes, I remember Jan's suggestion, but the latest patch (the second patch of v3 on the kernel side) has changed the implementation: it now does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check.

>
> Thanks, Roger.
On 12.12.2023 07:49, Chen, Jiqian wrote:
> On 2023/12/11 23:31, Roger Pau Monné wrote:
>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
>>> --- a/xen/arch/x86/hvm/hypercall.c
>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>
>>>      switch ( cmd )
>>>      {
>>> +    case PHYSDEVOP_setup_gsi:
>>
>> I think given the new approach on the Linux side patches, where
>> pciback will configure the interrupt, there's no need to expose
>> setup_gsi anymore?
> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below.
>
>>
>>>      case PHYSDEVOP_map_pirq:
>>>      case PHYSDEVOP_unmap_pirq:
>>> +        if ( is_hardware_domain(currd) )
>>> +            break;
>>
>> Also Jan already pointed this out in v2: this hypercall needs to be
>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse
>> the hypercall if DOMID_SELF or the passed domid matches the current
>> domain domid.
> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check.

And why exactly would it do specifically the map_pirq? (Even the setup_gsi
looks questionable to me, but there might be reasons there.)

Jan
On 2023/12/12 17:30, Jan Beulich wrote:
> On 12.12.2023 07:49, Chen, Jiqian wrote:
>> On 2023/12/11 23:31, Roger Pau Monné wrote:
>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>
>>>>      switch ( cmd )
>>>>      {
>>>> +    case PHYSDEVOP_setup_gsi:
>>>
>>> I think given the new approach on the Linux side patches, where
>>> pciback will configure the interrupt, there's no need to expose
>>> setup_gsi anymore?
>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below.
>>
>>>
>>>>      case PHYSDEVOP_map_pirq:
>>>>      case PHYSDEVOP_unmap_pirq:
>>>> +        if ( is_hardware_domain(currd) )
>>>> +            break;
>>>
>>> Also Jan already pointed this out in v2: this hypercall needs to be
>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse
>>> the hypercall if DOMID_SELF or the passed domid matches the current
>>> domain domid.
>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check.
>
> And why exactly would it do specifically the map_pirq? (Even the setup_gsi
> looks questionable to me, but there might be reasons there.)

map_pirq is there to get past a failing check: in the path pci_add_dm_done -> xc_domain_irq_permission -> XEN_DOMCTL_irq_permission -> pirq_access_permitted -> domain_pirq_to_irq, the returned irq is 0.
setup_gsi is needed because the gsi is never unmasked, so it is never registered (vioapic_hwdom_map_gsi -> mp_register_gsi is never called).

>
> Jan
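As a rough illustration of the failure path described in the reply above, the following standalone C model shows why a permission check that first translates a caller-side pirq into a host irq cannot succeed for a caller that never populated a pirq table. It is a deliberate simplification and not Xen code: `struct pirq_table_model`, `model_pirq_to_irq()` and `model_irq_permission()` are hypothetical names, and the real functions named in the thread carry out further checks that are omitted here.

```c
#define MODEL_NR_PIRQS 16

/* Hypothetical per-domain pirq -> host irq table; 0 means "unmapped". */
struct pirq_table_model {
    int pirq_to_irq[MODEL_NR_PIRQS];
};

/* Models the translation step: returns the host irq, or 0 if the
 * pirq is out of range or has no mapping. */
static int model_pirq_to_irq(const struct pirq_table_model *d, int pirq)
{
    if (pirq < 0 || pirq >= MODEL_NR_PIRQS)
        return 0;
    return d->pirq_to_irq[pirq];
}

/* Models the permission request as seen from the caller: it can only
 * succeed if the caller's own pirq resolves to a host irq.
 * Returns 0 on success and -1 (standing in for an error code) otherwise. */
static int model_irq_permission(const struct pirq_table_model *caller, int pirq)
{
    int irq = model_pirq_to_irq(caller, pirq);

    return irq > 0 ? 0 : -1;
}
```

In this model a PV-style caller with a populated table succeeds, while a PVH-style caller whose table is all zeroes fails for every pirq, which is the situation the map_pirq call was meant to work around.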
On 13.12.2023 03:47, Chen, Jiqian wrote: > On 2023/12/12 17:30, Jan Beulich wrote: >> On 12.12.2023 07:49, Chen, Jiqian wrote: >>> On 2023/12/11 23:31, Roger Pau Monné wrote: >>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote: >>>>> --- a/xen/arch/x86/hvm/hypercall.c >>>>> +++ b/xen/arch/x86/hvm/hypercall.c >>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) >>>>> >>>>> switch ( cmd ) >>>>> { >>>>> + case PHYSDEVOP_setup_gsi: >>>> >>>> I think given the new approach on the Linux side patches, where >>>> pciback will configure the interrupt, there's no need to expose >>>> setup_gsi anymore? >>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below. >>> >>>> >>>>> case PHYSDEVOP_map_pirq: >>>>> case PHYSDEVOP_unmap_pirq: >>>>> + if ( is_hardware_domain(currd) ) >>>>> + break; >>>> >>>> Also Jan already pointed this out in v2: this hypercall needs to be >>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse >>>> the hypercall if DOMID_SELF or the passed domid matches the current >>>> domain domid. >>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check. >> >> And why exactly would it do specifically the map_pirq? (Even the setup_gsi >> looks questionable to me, but there might be reasons there.) > Map_pirq is to solve the check failure problem. (pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) > Setup_gsi is because the gsi is never be unmasked, so the gsi is never be registered( vioapic_hwdom_map_gsi-> mp_register_gsi is never be called). 
And it was previously made pretty clear by Roger, I think, that doing a "map" just for the purpose of granting permission is, well, at best a temporary workaround in the early development phase. If there's presently no hypercall to _only_ grant permission to IRQ, we need to add one. In fact "map" would likely better not have done two things at a time from the very beginning ... Jan
On 2023/12/13 15:03, Jan Beulich wrote:
> On 13.12.2023 03:47, Chen, Jiqian wrote:
>> On 2023/12/12 17:30, Jan Beulich wrote:
>>> On 12.12.2023 07:49, Chen, Jiqian wrote:
>>>> On 2023/12/11 23:31, Roger Pau Monné wrote:
>>>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>
>>>>>>      switch ( cmd )
>>>>>>      {
>>>>>> +    case PHYSDEVOP_setup_gsi:
>>>>>
>>>>> I think given the new approach on the Linux side patches, where
>>>>> pciback will configure the interrupt, there's no need to expose
>>>>> setup_gsi anymore?
>>>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below.
>>>>
>>>>>
>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>> +        if ( is_hardware_domain(currd) )
>>>>>> +            break;
>>>>>
>>>>> Also Jan already pointed this out in v2: this hypercall needs to be
>>>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse
>>>>> the hypercall if DOMID_SELF or the passed domid matches the current
>>>>> domain domid.
>>>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check.
>>>
>>> And why exactly would it do specifically the map_pirq? (Even the setup_gsi
>>> looks questionable to me, but there might be reasons there.)
>> Map_pirq is to solve the check failure problem. (pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0)
>> Setup_gsi is because the gsi is never be unmasked, so the gsi is never be registered( vioapic_hwdom_map_gsi-> mp_register_gsi is never be called).
>
> And it was previously made pretty clear by Roger, I think, that doing a "map"
> just for the purpose of granting permission is, well, at best a temporary
> workaround in the early development phase. If there's presently no hypercall
> to _only_ grant permission to IRQ, we need to add one.

Could you please describe this in more detail? Do you mean adding a new hypercall to grant irq access for dom0, or for domU?
It seems XEN_DOMCTL_irq_permission is already the hypercall that grants irq access from dom0 to domU (see XEN_DOMCTL_irq_permission -> irq_permit_access), so there should be no need to add another hypercall for that.
The failure here (XEN_DOMCTL_irq_permission -> pirq_access_permitted -> domain_pirq_to_irq returns an irq of 0) happens because PVH dom0 doesn't use PIRQs, so we can't get the irq from the pirq when "current" is PVH dom0.
So it seems the logic of XEN_DOMCTL_irq_permission is not suitable for a PVH dom0?
Maybe it needs to get the irq directly from the caller (domU) instead of from "current" when "current" has no PIRQ flag?

> In fact "map" would likely better not have done two things at a time from the very beginning ...
>
> Jan
On 14.12.2023 09:55, Chen, Jiqian wrote:
> On 2023/12/13 15:03, Jan Beulich wrote:
>> On 13.12.2023 03:47, Chen, Jiqian wrote:
>>> On 2023/12/12 17:30, Jan Beulich wrote:
>>>> On 12.12.2023 07:49, Chen, Jiqian wrote:
>>>>> On 2023/12/11 23:31, Roger Pau Monné wrote:
>>>>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>
>>>>>>>      switch ( cmd )
>>>>>>>      {
>>>>>>> +    case PHYSDEVOP_setup_gsi:
>>>>>>
>>>>>> I think given the new approach on the Linux side patches, where
>>>>>> pciback will configure the interrupt, there's no need to expose
>>>>>> setup_gsi anymore?
>>>>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below.
>>>>>
>>>>>>
>>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>>> +        if ( is_hardware_domain(currd) )
>>>>>>> +            break;
>>>>>>
>>>>>> Also Jan already pointed this out in v2: this hypercall needs to be
>>>>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse
>>>>>> the hypercall if DOMID_SELF or the passed domid matches the current
>>>>>> domain domid.
>>>>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check.
>>>>
>>>> And why exactly would it do specifically the map_pirq? (Even the setup_gsi
>>>> looks questionable to me, but there might be reasons there.)
>>> Map_pirq is to solve the check failure problem. (pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0)
>>> Setup_gsi is because the gsi is never be unmasked, so the gsi is never be registered( vioapic_hwdom_map_gsi-> mp_register_gsi is never be called).
>>
>> And it was previously made pretty clear by Roger, I think, that doing a "map"
>> just for the purpose of granting permission is, well, at best a temporary
>> workaround in the early development phase. If there's presently no hypercall
>> to _only_ grant permission to IRQ, we need to add one.
> Could you please describe it in detail? Do you mean to add a new hypercall to grant irq access for dom0 or domU?
> It seems XEN_DOMCTL_irq_permission is the hypercall to grant irq access from dom0 to domU(see XEN_DOMCTL_irq_permission-> irq_permit_access). There is no need to add hypercall to grant irq access.

Hmm, yes and no. May I turn your attention to
https://lists.xen.org/archives/html/xen-devel/2023-07/msg02056.html
and its earlier version
https://lists.xen.org/archives/html/xen-devel/2023-05/msg00301.html
(it's imo a shame that this series continues to be stuck)? Both make pretty
clear that without pIRQ, this domctl cannot be used in its present shape
anyway, for ...

> We failed here (XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) is because the PVH dom0 didn't use PIRQ, so we can't get irq from pirq if "current" is PVH dom0.

... this very reason. Addressing this one way or another is a necessary part
of making passthrough work with PVH Dom0. So _effectively_ there is no
hypercall allowing PVH Dom0 to grant IRQ permission.

> So, it seems the logic of XEN_DOMCTL_irq_permission is not suitable when PVH dom0?

That's my view, yes.

> Maybe it directly needs to get irq from the caller(domU) instead of "current" if the "current" has no PIRQ flag?

I don't think the IRQ mapping in the DomU is necessary to be known here.
What we want to grant is access to a host resource. That host resource is
therefore all that should need specifying for the operation to be carried
out. It just so happens that a PV Dom0 would specify the host IRQ by way of
supplying its own equivalent pIRQ.

Things are more "interesting" for MSI, though: The (Xen) IRQ may not be
known early enough. There wants to be a way of indicating that when such an
IRQ is created, permission should be granted to the domain that is going to
use that IRQ (by way of being assigned the respective device). (This aspect
may be part of why "map" presently also grants permission, yet I continue to
think that was wrong from the start. The more that access there is [likely
needlessly] granted to the domain requesting the mapping, just for it to
then further grant access to the DomU.)

Jan
On Thu, Dec 14, 2023 at 08:55:45AM +0000, Chen, Jiqian wrote:
> On 2023/12/13 15:03, Jan Beulich wrote:
> > On 13.12.2023 03:47, Chen, Jiqian wrote:
> >> On 2023/12/12 17:30, Jan Beulich wrote:
> >>> On 12.12.2023 07:49, Chen, Jiqian wrote:
> >>>> On 2023/12/11 23:31, Roger Pau Monné wrote:
> >>>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
> >>>>>> --- a/xen/arch/x86/hvm/hypercall.c
> >>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
> >>>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >>>>>>
> >>>>>>      switch ( cmd )
> >>>>>>      {
> >>>>>> +    case PHYSDEVOP_setup_gsi:
> >>>>>
> >>>>> I think given the new approach on the Linux side patches, where
> >>>>> pciback will configure the interrupt, there's no need to expose
> >>>>> setup_gsi anymore?
> >>>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below.
> >>>>
> >>>>>
> >>>>>>      case PHYSDEVOP_map_pirq:
> >>>>>>      case PHYSDEVOP_unmap_pirq:
> >>>>>> +        if ( is_hardware_domain(currd) )
> >>>>>> +            break;
> >>>>>
> >>>>> Also Jan already pointed this out in v2: this hypercall needs to be
> >>>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse
> >>>>> the hypercall if DOMID_SELF or the passed domid matches the current
> >>>>> domain domid.
> >>>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check.
> >>>
> >>> And why exactly would it do specifically the map_pirq? (Even the setup_gsi
> >>> looks questionable to me, but there might be reasons there.)
> >> Map_pirq is to solve the check failure problem. (pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0)
> >> Setup_gsi is because the gsi is never be unmasked, so the gsi is never be registered( vioapic_hwdom_map_gsi-> mp_register_gsi is never be called).
> >
> > And it was previously made pretty clear by Roger, I think, that doing a "map"
> > just for the purpose of granting permission is, well, at best a temporary
> > workaround in the early development phase. If there's presently no hypercall
> > to _only_ grant permission to IRQ, we need to add one.
> Could you please describe it in detail? Do you mean to add a new hypercall to grant irq access for dom0 or domU?
> It seems XEN_DOMCTL_irq_permission is the hypercall to grant irq access from dom0 to domU(see XEN_DOMCTL_irq_permission-> irq_permit_access). There is no need to add hypercall to grant irq access.
> We failed here (XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) is because the PVH dom0 didn't use PIRQ, so we can't get irq from pirq if "current" is PVH dom0.

One way to bodge this would be to detect whether the caller of
XEN_DOMCTL_irq_permission is a PV or an HVM domain, and in case of HVM
assume the pirq field is a GSI. I'm unsure however how that will work
with non-x86 architectures.

It would be better to introduce a new XEN_DOMCTL_gsi_permission, or
maybe XEN_DOMCTL_intr_permission that can take a struct we can use to
accommodate GSIs and other arch specific interrupt identifiers.

I'm also wondering whether the hypercall should be in a stable
interface so it could be easily used from QEMU if needed.

> So, it seems the logic of XEN_DOMCTL_irq_permission is not suitable when PVH dom0? Maybe it directly needs to get irq from the caller(domU) instead of "current" if the "current" has no PIRQ flag?

Hm, I'm kind of confused by this last sentence, as you mention "the
caller(domU)". The caller of XEN_DOMCTL_irq_permission will always be
dom0 or the hardware domain.

Thanks, Roger.
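To make the second suggestion above more concrete, here is a hedged C sketch of what a struct-based permission request covering GSIs and other arch-specific identifiers might look like. Nothing in it exists in Xen: the enum, the struct layout, the toy access bitmap and `apply_intr_permission()` are all hypothetical and only illustrate the idea of naming the host resource directly instead of a caller-side pirq.

```c
#include <stdint.h>

/* Hypothetical identifier kinds; not part of any Xen header. */
enum intr_kind {
    INTR_KIND_GSI, /* x86 Global System Interrupt */
    INTR_KIND_SPI  /* Arm Shared Peripheral Interrupt */
};

/* Hypothetical argument layout for an "intr_permission"-style domctl. */
struct intr_permission_req {
    uint16_t domid; /* domain being granted or denied access */
    uint8_t  kind;  /* one of enum intr_kind */
    uint8_t  allow; /* 1 = grant, 0 = revoke */
    union {
        uint32_t gsi;
        uint32_t spi;
    } u;
};

#define MODEL_NR_LINES 64

/* Toy access state for one domain: bit N set means line N is allowed. */
struct access_model {
    uint64_t allowed;
};

/* Applies the request to the model.
 * Returns 0 on success, -1 for an unknown kind or out-of-range line. */
static int apply_intr_permission(struct access_model *m,
                                 const struct intr_permission_req *req)
{
    uint32_t line;

    switch (req->kind) {
    case INTR_KIND_GSI:
        line = req->u.gsi;
        break;
    case INTR_KIND_SPI:
        line = req->u.spi;
        break;
    default:
        return -1;
    }

    if (line >= MODEL_NR_LINES)
        return -1;

    if (req->allow)
        m->allowed |= (uint64_t)1 << line;
    else
        m->allowed &= ~((uint64_t)1 << line);

    return 0;
}
```

The point of the tagged union is that the caller states which kind of host interrupt it means, so no translation through the caller's own pirq table is needed.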
On 14.12.2023 10:55, Roger Pau Monné wrote: > On Thu, Dec 14, 2023 at 08:55:45AM +0000, Chen, Jiqian wrote: >> On 2023/12/13 15:03, Jan Beulich wrote: >>> On 13.12.2023 03:47, Chen, Jiqian wrote: >>>> On 2023/12/12 17:30, Jan Beulich wrote: >>>>> On 12.12.2023 07:49, Chen, Jiqian wrote: >>>>>> On 2023/12/11 23:31, Roger Pau Monné wrote: >>>>>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote: >>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c >>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c >>>>>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) >>>>>>>> >>>>>>>> switch ( cmd ) >>>>>>>> { >>>>>>>> + case PHYSDEVOP_setup_gsi: >>>>>>> >>>>>>> I think given the new approach on the Linux side patches, where >>>>>>> pciback will configure the interrupt, there's no need to expose >>>>>>> setup_gsi anymore? >>>>>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below. >>>>>> >>>>>>> >>>>>>>> case PHYSDEVOP_map_pirq: >>>>>>>> case PHYSDEVOP_unmap_pirq: >>>>>>>> + if ( is_hardware_domain(currd) ) >>>>>>>> + break; >>>>>>> >>>>>>> Also Jan already pointed this out in v2: this hypercall needs to be >>>>>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse >>>>>>> the hypercall if DOMID_SELF or the passed domid matches the current >>>>>>> domain domid. >>>>>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check. >>>>> >>>>> And why exactly would it do specifically the map_pirq? (Even the setup_gsi >>>>> looks questionable to me, but there might be reasons there.) >>>> Map_pirq is to solve the check failure problem. 
(pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) >>>> Setup_gsi is because the gsi is never be unmasked, so the gsi is never be registered( vioapic_hwdom_map_gsi-> mp_register_gsi is never be called). >>> >>> And it was previously made pretty clear by Roger, I think, that doing a "map" >>> just for the purpose of granting permission is, well, at best a temporary >>> workaround in the early development phase. If there's presently no hypercall >>> to _only_ grant permission to IRQ, we need to add one. >> Could you please describe it in detail? Do you mean to add a new hypercall to grant irq access for dom0 or domU? >> It seems XEN_DOMCTL_irq_permission is the hypercall to grant irq access from dom0 to domU(see XEN_DOMCTL_irq_permission-> irq_permit_access). There is no need to add hypercall to grant irq access. >> We failed here (XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) is because the PVH dom0 didn't use PIRQ, so we can't get irq from pirq if "current" is PVH dom0. > > One way to bodge this would be to detect whether the caller of > XEN_DOMCTL_irq_permission is a PV or an HVM domain, and in case of HVM > assume the pirq field is a GSI. I'm unsure however how that will work > with non-x86 architectures. > > It would be better to introduce a new XEN_DOMCTL_gsi_permission, or > maybe XEN_DOMCTL_intr_permission that can take a struct we can use to > accommodate GSIs and other arch specific interrupt identifiers. How would you see MSI being handled then? Jan
On Thu, Dec 14, 2023 at 10:58:24AM +0100, Jan Beulich wrote: > On 14.12.2023 10:55, Roger Pau Monné wrote: > > On Thu, Dec 14, 2023 at 08:55:45AM +0000, Chen, Jiqian wrote: > >> On 2023/12/13 15:03, Jan Beulich wrote: > >>> On 13.12.2023 03:47, Chen, Jiqian wrote: > >>>> On 2023/12/12 17:30, Jan Beulich wrote: > >>>>> On 12.12.2023 07:49, Chen, Jiqian wrote: > >>>>>> On 2023/12/11 23:31, Roger Pau Monné wrote: > >>>>>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote: > >>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c > >>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c > >>>>>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) > >>>>>>>> > >>>>>>>> switch ( cmd ) > >>>>>>>> { > >>>>>>>> + case PHYSDEVOP_setup_gsi: > >>>>>>> > >>>>>>> I think given the new approach on the Linux side patches, where > >>>>>>> pciback will configure the interrupt, there's no need to expose > >>>>>>> setup_gsi anymore? > >>>>>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below. > >>>>>> > >>>>>>> > >>>>>>>> case PHYSDEVOP_map_pirq: > >>>>>>>> case PHYSDEVOP_unmap_pirq: > >>>>>>>> + if ( is_hardware_domain(currd) ) > >>>>>>>> + break; > >>>>>>> > >>>>>>> Also Jan already pointed this out in v2: this hypercall needs to be > >>>>>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse > >>>>>>> the hypercall if DOMID_SELF or the passed domid matches the current > >>>>>>> domain domid. > >>>>>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check. > >>>>> > >>>>> And why exactly would it do specifically the map_pirq? (Even the setup_gsi > >>>>> looks questionable to me, but there might be reasons there.) > >>>> Map_pirq is to solve the check failure problem. 
(pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) > >>>> Setup_gsi is because the gsi is never be unmasked, so the gsi is never be registered( vioapic_hwdom_map_gsi-> mp_register_gsi is never be called). > >>> > >>> And it was previously made pretty clear by Roger, I think, that doing a "map" > >>> just for the purpose of granting permission is, well, at best a temporary > >>> workaround in the early development phase. If there's presently no hypercall > >>> to _only_ grant permission to IRQ, we need to add one. > >> Could you please describe it in detail? Do you mean to add a new hypercall to grant irq access for dom0 or domU? > >> It seems XEN_DOMCTL_irq_permission is the hypercall to grant irq access from dom0 to domU(see XEN_DOMCTL_irq_permission-> irq_permit_access). There is no need to add hypercall to grant irq access. > >> We failed here (XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) is because the PVH dom0 didn't use PIRQ, so we can't get irq from pirq if "current" is PVH dom0. > > > > One way to bodge this would be to detect whether the caller of > > XEN_DOMCTL_irq_permission is a PV or an HVM domain, and in case of HVM > > assume the pirq field is a GSI. I'm unsure however how that will work > > with non-x86 architectures. > > > > It would be better to introduce a new XEN_DOMCTL_gsi_permission, or > > maybe XEN_DOMCTL_intr_permission that can take a struct we can use to > > accommodate GSIs and other arch specific interrupt identifiers. > > How would you see MSI being handled then? I wasn't really accounting for MSI here, as MSI is not handled by XEN_DOMCTL_irq_permission now either. My plan long term was to introduce a new hypercall (part of dm_ops possibly) in order to be able to bind MSI directly without having to 'map' it first. Roger.
On Thu, 14 Dec 2023, Roger Pau Monné wrote:
> On Thu, Dec 14, 2023 at 10:58:24AM +0100, Jan Beulich wrote:
> > On 14.12.2023 10:55, Roger Pau Monné wrote:
> > > On Thu, Dec 14, 2023 at 08:55:45AM +0000, Chen, Jiqian wrote:
> > >> On 2023/12/13 15:03, Jan Beulich wrote:
> > >>> On 13.12.2023 03:47, Chen, Jiqian wrote:
> > >>>> On 2023/12/12 17:30, Jan Beulich wrote:
> > >>>>> On 12.12.2023 07:49, Chen, Jiqian wrote:
> > >>>>>> On 2023/12/11 23:31, Roger Pau Monné wrote:
> > >>>>>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
> > >>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
> > >>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
> > >>>>>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> > >>>>>>>>
> > >>>>>>>>      switch ( cmd )
> > >>>>>>>>      {
> > >>>>>>>> +    case PHYSDEVOP_setup_gsi:
> > >>>>>>>
> > >>>>>>> I think given the new approach on the Linux side patches, where
> > >>>>>>> pciback will configure the interrupt, there's no need to expose
> > >>>>>>> setup_gsi anymore?
> > >>>>>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below.
> > >>>>>>
> > >>>>>>>
> > >>>>>>>>      case PHYSDEVOP_map_pirq:
> > >>>>>>>>      case PHYSDEVOP_unmap_pirq:
> > >>>>>>>> +        if ( is_hardware_domain(currd) )
> > >>>>>>>> +            break;
> > >>>>>>>
> > >>>>>>> Also Jan already pointed this out in v2: this hypercall needs to be
> > >>>>>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse
> > >>>>>>> the hypercall if DOMID_SELF or the passed domid matches the current
> > >>>>>>> domain domid.
> > >>>>>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check.
> > >>>>>
> > >>>>> And why exactly would it do specifically the map_pirq? (Even the setup_gsi
> > >>>>> looks questionable to me, but there might be reasons there.)
> > >>>> Map_pirq is to solve the check failure problem. (pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0)
> > >>>> Setup_gsi is because the gsi is never be unmasked, so the gsi is never be registered( vioapic_hwdom_map_gsi-> mp_register_gsi is never be called).
> > >>>
> > >>> And it was previously made pretty clear by Roger, I think, that doing a "map"
> > >>> just for the purpose of granting permission is, well, at best a temporary
> > >>> workaround in the early development phase. If there's presently no hypercall
> > >>> to _only_ grant permission to IRQ, we need to add one.
> > >> Could you please describe it in detail? Do you mean to add a new hypercall to grant irq access for dom0 or domU?
> > >> It seems XEN_DOMCTL_irq_permission is the hypercall to grant irq access from dom0 to domU(see XEN_DOMCTL_irq_permission-> irq_permit_access). There is no need to add hypercall to grant irq access.
> > >> We failed here (XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) is because the PVH dom0 didn't use PIRQ, so we can't get irq from pirq if "current" is PVH dom0.
> > >
> > > One way to bodge this would be to detect whether the caller of
> > > XEN_DOMCTL_irq_permission is a PV or an HVM domain, and in case of HVM
> > > assume the pirq field is a GSI. I'm unsure however how that will work
> > > with non-x86 architectures.

PIRQ is an x86-only concept. We have event channels but no PIRQs on ARM.
I expect RISC-V will be the same.

> > > It would be better to introduce a new XEN_DOMCTL_gsi_permission, or

"GSI" is another x86-only concept. So actually the best name was indeed
XEN_DOMCTL_irq_permission, given that it is using the more arch-neutral
"irq" terminology.

Perhaps it was always a mistake to pass PIRQs to XEN_DOMCTL_irq_permission
and we should always have passed the real interrupt number (GSI on x86, SPI
on ARM).

So your "bodge" is actually kind of OK in my opinion. Basically everyone
else (x86 HVM/PVH, ARM, RISC-V, probably PPC too) will use
XEN_DOMCTL_irq_permission with hardware interrupt numbers (GSIs, SPIs,
etc.); the only special case is x86 PV. x86 PV is the odd one out.

Given that DOMCTL is an unstable interface anyway, I feel OK making changes
to it, even better if backward compatible.
On 2023/12/15 06:49, Stefano Stabellini wrote:
> On Thu, 14 Dec 2023, Roger Pau Monné wrote:
>> On Thu, Dec 14, 2023 at 10:58:24AM +0100, Jan Beulich wrote:
>>> On 14.12.2023 10:55, Roger Pau Monné wrote:
>>>> On Thu, Dec 14, 2023 at 08:55:45AM +0000, Chen, Jiqian wrote:
>>>>> On 2023/12/13 15:03, Jan Beulich wrote:
>>>>>> On 13.12.2023 03:47, Chen, Jiqian wrote:
>>>>>>> On 2023/12/12 17:30, Jan Beulich wrote:
>>>>>>>> On 12.12.2023 07:49, Chen, Jiqian wrote:
>>>>>>>>> On 2023/12/11 23:31, Roger Pau Monné wrote:
>>>>>>>>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
>>>>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>>>>>
>>>>>>>>>>>      switch ( cmd )
>>>>>>>>>>>      {
>>>>>>>>>>> +    case PHYSDEVOP_setup_gsi:
>>>>>>>>>>
>>>>>>>>>> I think given the new approach on the Linux side patches, where
>>>>>>>>>> pciback will configure the interrupt, there's no need to expose
>>>>>>>>>> setup_gsi anymore?
>>>>>>>>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>>>>>>> +        if ( is_hardware_domain(currd) )
>>>>>>>>>>> +            break;
>>>>>>>>>>
>>>>>>>>>> Also Jan already pointed this out in v2: this hypercall needs to be
>>>>>>>>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse
>>>>>>>>>> the hypercall if DOMID_SELF or the passed domid matches the current
>>>>>>>>>> domain domid.
>>>>>>>>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check.
>>>>>>>>
>>>>>>>> And why exactly would it do specifically the map_pirq? (Even the setup_gsi
>>>>>>>> looks questionable to me, but there might be reasons there.)
>>>>>>> Map_pirq is to solve the check failure problem. (pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0)
>>>>>>> Setup_gsi is because the gsi is never be unmasked, so the gsi is never be registered( vioapic_hwdom_map_gsi-> mp_register_gsi is never be called).
>>>>>>
>>>>>> And it was previously made pretty clear by Roger, I think, that doing a "map"
>>>>>> just for the purpose of granting permission is, well, at best a temporary
>>>>>> workaround in the early development phase. If there's presently no hypercall
>>>>>> to _only_ grant permission to IRQ, we need to add one.
>>>>> Could you please describe it in detail? Do you mean to add a new hypercall to grant irq access for dom0 or domU?
>>>>> It seems XEN_DOMCTL_irq_permission is the hypercall to grant irq access from dom0 to domU(see XEN_DOMCTL_irq_permission-> irq_permit_access). There is no need to add hypercall to grant irq access.
>>>>> We failed here (XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) is because the PVH dom0 didn't use PIRQ, so we can't get irq from pirq if "current" is PVH dom0.
>>>>
>>>> One way to bodge this would be to detect whether the caller of
>>>> XEN_DOMCTL_irq_permission is a PV or an HVM domain, and in case of HVM
>>>> assume the pirq field is a GSI. I'm unsure however how that will work
>>>> with non-x86 architectures.
>
> PIRQ is an x86-only concept. We have event channels but no PIRQs on ARM.
> I expect RISC-V will be the same.
>
>
>>>> It would be better to introduce a new XEN_DOMCTL_gsi_permission, or
>
> "GSI" is another x86-only concept.
>
> So actually the best name was indeed XEN_DOMCTL_irq_permission, given
> that it is using the more arch-neutral "irq" terminology.
>
> Perhaps it was always a mistake to pass PIRQs to
> XEN_DOMCTL_irq_permission and we should always have passed the real
> interrupt number (GSI on x86, SPI on ARM).
>
> So your "bodge" is actually kind of OK in my opinion. Basically everyone
> else (x86 HVM/PVH, ARM, RISC-V, probably PPC too) will use
> XEN_DOMCTL_irq_permission with hardware interrupt numbers (GSIs, SPIs,
> etc.), the only special case is x86 PV. It is x86 PV the odd one.
>
> Given that DOMCTL is an unstable interface anyway, I feel OK making
> changes to it, even better if backward compatible.

I am trying to understand your discussion about modifying XEN_DOMCTL_irq_permission.
At the xl level, the gsi would need to be passed in instead of the pirq, and a check on the caller type would then be added to XEN_DOMCTL_irq_permission, like the implementation below?

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index d3507d13a029..f665d17afbf5 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1486,6 +1486,7 @@ static void pci_add_dm_done(libxl__egc *egc,
             goto out_no_irq;
         }
         if ((fscanf(f, "%u", &irq) == 1) && irq) {
+            int gsi = irq;
             r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
             if (r < 0) {
                 LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
@@ -1494,7 +1495,7 @@ static void pci_add_dm_done(libxl__egc *egc,
                 rc = ERROR_FAIL;
                 goto out;
             }
-            r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
+            r = xc_domain_irq_permission(ctx->xch, domid, gsi, 1);
             if (r < 0) {
                 LOGED(ERROR, domainid,
                       "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index f5a71ee5f78d..782c4a7a70a4 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -658,7 +658,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             ret = -EINVAL;
             break;
         }
-        irq = pirq_access_permitted(current->domain, pirq);
+
+        if ( is_hvm_domain(current->domain) )
+            irq = pirq;
+        else
+            irq = pirq_access_permitted(current->domain, pirq);
+
         if ( !irq || xsm_irq_permission(XSM_HOOK, d, irq, allow) )
             ret = -EPERM;
         else if ( allow )
On Thu, Dec 14, 2023 at 02:49:18PM -0800, Stefano Stabellini wrote:
> On Thu, 14 Dec 2023, Roger Pau Monné wrote:
> > On Thu, Dec 14, 2023 at 10:58:24AM +0100, Jan Beulich wrote:
> > > On 14.12.2023 10:55, Roger Pau Monné wrote:
> > > > On Thu, Dec 14, 2023 at 08:55:45AM +0000, Chen, Jiqian wrote:
> > > >> On 2023/12/13 15:03, Jan Beulich wrote:
> > > >>> On 13.12.2023 03:47, Chen, Jiqian wrote:
> > > >>>> On 2023/12/12 17:30, Jan Beulich wrote:
> > > >>>>> On 12.12.2023 07:49, Chen, Jiqian wrote:
> > > >>>>>> On 2023/12/11 23:31, Roger Pau Monné wrote:
> > > >>>>>>> On Mon, Dec 11, 2023 at 12:40:08AM +0800, Jiqian Chen wrote:
> > > >>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
> > > >>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
> > > >>>>>>>> @@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> > > >>>>>>>>
> > > >>>>>>>>      switch ( cmd )
> > > >>>>>>>>      {
> > > >>>>>>>> +    case PHYSDEVOP_setup_gsi:
> > > >>>>>>>
> > > >>>>>>> I think given the new approach on the Linux side patches, where
> > > >>>>>>> pciback will configure the interrupt, there's no need to expose
> > > >>>>>>> setup_gsi anymore?
> > > >>>>>> The latest patch(the second patch of v3 on kernel side) does setup_gsi and map_pirq for passthrough device in pciback, so we need this and below.
> > > >>>>>>
> > > >>>>>>>
> > > >>>>>>>>      case PHYSDEVOP_map_pirq:
> > > >>>>>>>>      case PHYSDEVOP_unmap_pirq:
> > > >>>>>>>> +        if ( is_hardware_domain(currd) )
> > > >>>>>>>> +            break;
> > > >>>>>>>
> > > >>>>>>> Also Jan already pointed this out in v2: this hypercall needs to be
> > > >>>>>>> limited so a PVH dom0 cannot execute it against itself. IOW: refuse
> > > >>>>>>> the hypercall if DOMID_SELF or the passed domid matches the current
> > > >>>>>>> domain domid.
> > > >>>>>> Yes, I remember Jan's suggestion, but since the latest patch(the second patch of v3 on kernel side) has change the implementation, it does setup_gsi and map_pirq for dom0 itself, so I didn't add the DOMID_SELF check.
> > > >>>>>
> > > >>>>> And why exactly would it do specifically the map_pirq? (Even the setup_gsi
> > > >>>>> looks questionable to me, but there might be reasons there.)
> > > >>>> Map_pirq is to solve the check failure problem. (pci_add_dm_done-> xc_domain_irq_permission-> XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0)
> > > >>>> Setup_gsi is because the gsi is never be unmasked, so the gsi is never be registered( vioapic_hwdom_map_gsi-> mp_register_gsi is never be called).
> > > >>>
> > > >>> And it was previously made pretty clear by Roger, I think, that doing a "map"
> > > >>> just for the purpose of granting permission is, well, at best a temporary
> > > >>> workaround in the early development phase. If there's presently no hypercall
> > > >>> to _only_ grant permission to IRQ, we need to add one.
> > > >> Could you please describe it in detail? Do you mean to add a new hypercall to grant irq access for dom0 or domU?
> > > >> It seems XEN_DOMCTL_irq_permission is the hypercall to grant irq access from dom0 to domU(see XEN_DOMCTL_irq_permission-> irq_permit_access). There is no need to add hypercall to grant irq access.
> > > >> We failed here (XEN_DOMCTL_irq_permission-> pirq_access_permitted->domain_pirq_to_irq->return irq is 0) is because the PVH dom0 didn't use PIRQ, so we can't get irq from pirq if "current" is PVH dom0.
> > > >
> > > > One way to bodge this would be to detect whether the caller of
> > > > XEN_DOMCTL_irq_permission is a PV or an HVM domain, and in case of HVM
> > > > assume the pirq field is a GSI. I'm unsure however how that will work
> > > > with non-x86 architectures.
>
> PIRQ is an x86-only concept. We have event channels but no PIRQs on ARM.
> I expect RISC-V will be the same.
>
>
> > > > It would be better to introduce a new XEN_DOMCTL_gsi_permission, or
>
> "GSI" is another x86-only concept.

Yes, that hypercall would be x86-specific.

> So actually the best name was indeed XEN_DOMCTL_irq_permission, given
> that it is using the more arch-neutral "irq" terminology.
>
> Perhaps it was always a mistake to pass PIRQs to
> XEN_DOMCTL_irq_permission and we should always have passed the real
> interrupt number (GSI on x86, SPI on ARM).

I really don't know much about Arm, but don't you also have LPIs, and
would need to add some kind of type field to
xen_domctl_irq_permission?

> So your "bodge" is actually kind of OK in my opinion. Basically everyone
> else (x86 HVM/PVH, ARM, RISC-V, probably PPC too) will use
> XEN_DOMCTL_irq_permission with hardware interrupt numbers (GSIs, SPIs,
> etc.), the only special case is x86 PV. It is x86 PV the odd one.

x86 PV could also pass the GSI if we wanted to change the interface
uniformly. AFAICT the hypercall is only used by libxl, so would
likely be fine to change.

> Given that DOMCTL is an unstable interface anyway, I feel OK making
> changes to it, even better if backward compatible.

Me calling this a 'bodge' was mostly because I think it would be nice
to take the opportunity to move the hypercall to a stable interface.

Thanks, Roger.
On 14.12.2023 23:49, Stefano Stabellini wrote:
> On Thu, 14 Dec 2023, Roger Pau Monné wrote:
>> On Thu, Dec 14, 2023 at 10:58:24AM +0100, Jan Beulich wrote:
>>> On 14.12.2023 10:55, Roger Pau Monné wrote:
>>>> One way to bodge this would be to detect whether the caller of
>>>> XEN_DOMCTL_irq_permission is a PV or an HVM domain, and in case of HVM
>>>> assume the pirq field is a GSI. I'm unsure however how that will work
>>>> with non-x86 architectures.
>
> PIRQ is an x86-only concept. We have event channels but no PIRQs on ARM.
> I expect RISC-V will be the same.
>
>
>>>> It would be better to introduce a new XEN_DOMCTL_gsi_permission, or
>
> "GSI" is another x86-only concept.

Just to mention it - going through the ACPI spec, this looks to be an
arch-neutral ACPI term. It is also used in places which to me look
pretty Arm-centric.

Jan

> So actually the best name was indeed XEN_DOMCTL_irq_permission, given
> that it is using the more arch-neutral "irq" terminology.
>
> Perhaps it was always a mistake to pass PIRQs to
> XEN_DOMCTL_irq_permission and we should always have passed the real
> interrupt number (GSI on x86, SPI on ARM).
>
> So your "bodge" is actually kind of OK in my opinion. Basically everyone
> else (x86 HVM/PVH, ARM, RISC-V, probably PPC too) will use
> XEN_DOMCTL_irq_permission with hardware interrupt numbers (GSIs, SPIs,
> etc.), the only special case is x86 PV. It is x86 PV the odd one.
>
> Given that DOMCTL is an unstable interface anyway, I feel OK making
> changes to it, even better if backward compatible.
On Fri, Dec 15, 2023 at 07:20:24AM +0000, Chen, Jiqian wrote:
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index d3507d13a029..f665d17afbf5 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1486,6 +1486,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>              goto out_no_irq;
>          }
>          if ((fscanf(f, "%u", &irq) == 1) && irq) {
> +            int gsi = irq;
>              r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
>              if (r < 0) {
>                  LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
> @@ -1494,7 +1495,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>                  rc = ERROR_FAIL;
>                  goto out;
>              }
> -            r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
> +            r = xc_domain_irq_permission(ctx->xch, domid, gsi, 1);
>              if (r < 0) {
>                  LOGED(ERROR, domainid,
>                        "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index f5a71ee5f78d..782c4a7a70a4 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -658,7 +658,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>              ret = -EINVAL;
>              break;
>          }
> -        irq = pirq_access_permitted(current->domain, pirq);
> +
> +        if ( is_hvm_domain(current->domain) )
> +            irq = pirq;
> +        else
> +            irq = pirq_access_permitted(current->domain, pirq);

You are dropping an irq_access_permitted() check here for the HVM
case, as pirq_access_permitted() translates from pirq to irq and also
checks for permissions.

This would need to be something along the lines of:

    irq = 0;
    if ( is_hvm_domain(current->domain) &&
         irq_access_permitted(current->domain, pirq) )
        irq = pirq;
    else
        irq = pirq_access_permitted(current->domain, pirq);

And then I wonder whether it wouldn't be best to uniformly use a GSI
for both PV and HVM.

Thanks, Roger.
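As a rough illustration of the point made in the review above (the HVM path must still check permissions, while the PV path translates the pirq and then checks), here is a self-contained C model. It is only a sketch, not Xen code: every `model_*` name, the `struct model_domain` layout and `MODEL_NR_IRQS` are hypothetical stand-ins, and the two helpers merely imitate what the thread says `irq_access_permitted()` and `pirq_access_permitted()` do. One deliberate difference from the snippet in the mail: an HVM caller that fails the permission check gets 0 directly instead of falling back to the PIRQ translation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_IRQS 64 /* hypothetical table size for this model */

/* Hypothetical stand-in for a Xen domain; not the real struct domain. */
struct model_domain {
    bool is_hvm;                             /* HVM/PVH caller, no PIRQs */
    bool irq_allowed[MODEL_NR_IRQS];         /* IRQs the domain may access */
    unsigned int pirq_to_irq[MODEL_NR_IRQS]; /* PV only: 0 means unmapped */
};

/* Imitates irq_access_permitted(): a plain permission lookup. */
static bool model_irq_access_permitted(const struct model_domain *d,
                                       unsigned int irq)
{
    return irq < MODEL_NR_IRQS && d->irq_allowed[irq];
}

/*
 * Imitates pirq_access_permitted(): translate pirq to irq, then check
 * permission. Returns the irq on success and 0 otherwise.
 */
static unsigned int model_pirq_access_permitted(const struct model_domain *d,
                                                unsigned int pirq)
{
    unsigned int irq;

    if (pirq >= MODEL_NR_IRQS)
        return 0;
    irq = d->pirq_to_irq[pirq];
    return (irq && model_irq_access_permitted(d, irq)) ? irq : 0;
}

/*
 * Resolve the domctl's "pirq" field to an IRQ the caller may delegate.
 * A result of 0 corresponds to the -EPERM path in the quoted diff.
 */
static unsigned int model_resolve_irq(const struct model_domain *caller,
                                      unsigned int pirq)
{
    assert(caller != NULL);

    if (caller->is_hvm)
        /* HVM/PVH: the field is taken to be a GSI, but still checked. */
        return model_irq_access_permitted(caller, pirq) ? pirq : 0;

    /* PV: the field is a PIRQ and needs translating first. */
    return model_pirq_access_permitted(caller, pirq);
}
```

In this model a PVH caller holding permission for IRQ 20 resolves the value 20 to itself, while a PV caller has to pass whichever pirq it mapped onto IRQ 20.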
On Fri, Dec 15, 2023 at 09:24:22AM +0100, Jan Beulich wrote:
> On 14.12.2023 23:49, Stefano Stabellini wrote:
> > On Thu, 14 Dec 2023, Roger Pau Monné wrote:
> >> On Thu, Dec 14, 2023 at 10:58:24AM +0100, Jan Beulich wrote:
> >>> On 14.12.2023 10:55, Roger Pau Monné wrote:
> >>>> One way to bodge this would be to detect whether the caller of
> >>>> XEN_DOMCTL_irq_permission is a PV or an HVM domain, and in case of HVM
> >>>> assume the pirq field is a GSI. I'm unsure however how that will work
> >>>> with non-x86 architectures.
> >
> > PIRQ is an x86-only concept. We have event channels but no PIRQs on ARM.
> > I expect RISC-V will be the same.
> >
> >
> >>>> It would be better to introduce a new XEN_DOMCTL_gsi_permission, or
> >
> > "GSI" is another x86-only concept.
>
> Just to mention it - going through the ACPI spec, this looks to be an
> arch-neutral ACPI term. It is also used in places which to me look
> pretty Arm-centric.

Oh, indeed, they have retrofitted GSI(V?) for Arm also, as a way to have a
"flat" uniform interrupt space. So I guess Arm would also need the
GSI type, unless the translation from GSI to SPI or whatever platform
interrupt type is done by the guest and Xen is completely agnostic to
GSIs (if that's even possible).

Thanks, Roger.
On Fri, 15 Dec 2023, Roger Pau Monné wrote:
> On Fri, Dec 15, 2023 at 09:24:22AM +0100, Jan Beulich wrote:
> > On 14.12.2023 23:49, Stefano Stabellini wrote:
> > > On Thu, 14 Dec 2023, Roger Pau Monné wrote:
> > >> On Thu, Dec 14, 2023 at 10:58:24AM +0100, Jan Beulich wrote:
> > >>> On 14.12.2023 10:55, Roger Pau Monné wrote:
> > >>>> One way to bodge this would be to detect whether the caller of
> > >>>> XEN_DOMCTL_irq_permission is a PV or an HVM domain, and in case of HVM
> > >>>> assume the pirq field is a GSI. I'm unsure however how that will work
> > >>>> with non-x86 architectures.
> > >
> > > PIRQ is an x86-only concept. We have event channels but no PIRQs on ARM.
> > > I expect RISC-V will be the same.
> > >
> > >
> > >>>> It would be better to introduce a new XEN_DOMCTL_gsi_permission, or
> > >
> > > "GSI" is another x86-only concept.
> >
> > Just to mention it - going through the ACPI spec, this looks to be an
> > arch-neutral ACPI term. It is also used in places which to me look
> > pretty Arm-centric.
>
> Oh, indeed, they have retrofitted GSI(V?) for Arm also, as a way to have a
> "flat" uniform interrupt space.

Interesting, and I am not surprised. (I don't usually work with ACPI on
ARM because none of our boards come with ACPI, they are all Device
Tree.)

> So I guess Arm would also need the
> GSI type, unless the translation from GSI to SPI or whatever platform
> interrupt type is done by the guest and Xen is completely agnostic to
> GSIs (if that's even possible).

I am guessing that GSIs on ARM must be mapped 1:1 to SPIs otherwise we
would have severe inconsistencies between ACPI and DeviceTree booting
and some boards support both.

Also to answer your question about LPIs: those are MSIs on ARM.
On 2023/12/15 16:29, Roger Pau Monné wrote:
> On Fri, Dec 15, 2023 at 07:20:24AM +0000, Chen, Jiqian wrote:
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index d3507d13a029..f665d17afbf5 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1486,6 +1486,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>              goto out_no_irq;
>>          }
>>          if ((fscanf(f, "%u", &irq) == 1) && irq) {
>> +            int gsi = irq;
>>              r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
>>              if (r < 0) {
>>                  LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
>> @@ -1494,7 +1495,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>                  rc = ERROR_FAIL;
>>                  goto out;
>>              }
>> -            r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
>> +            r = xc_domain_irq_permission(ctx->xch, domid, gsi, 1);
>>              if (r < 0) {
>>                  LOGED(ERROR, domainid,
>>                        "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
>> index f5a71ee5f78d..782c4a7a70a4 100644
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -658,7 +658,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>              ret = -EINVAL;
>>              break;
>>          }
>> -        irq = pirq_access_permitted(current->domain, pirq);
>> +
>> +        if ( is_hvm_domain(current->domain) )
>> +            irq = pirq;
>> +        else
>> +            irq = pirq_access_permitted(current->domain, pirq);
>
> You are dropping an irq_access_permitted() check here for the HVM
> case, as pirq_access_permitted() translates from pirq to irq and also
> checks for permissions.
>
> This would need to be something along the lines of:
>
>     irq = 0;
>     if ( is_hvm_domain(current->domain) &&
>          irq_access_permitted(current->domain, pirq) )
Oh, yes, this check should be added.

>         irq = pirq;
>     else
>         irq = pirq_access_permitted(current->domain, pirq);
>
> And then I wonder whether it wouldn't be best to uniformly use a GSI
> for both PV and HVM.
If we only look at the values (it seems that gsi == pirq == irq numerically in PV), the gsi could also be used uniformly for PV.
The code here would then become:

    if ( irq_access_permitted(current->domain, pirq) )
        irq = pirq;
    else
    {
        ret = -EPERM;
        break;
    }

>
> Thanks, Roger.
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 6ad5b4d5f1..621d789bd3 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -72,8 +72,11 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     switch ( cmd )
     {
+    case PHYSDEVOP_setup_gsi:
     case PHYSDEVOP_map_pirq:
     case PHYSDEVOP_unmap_pirq:
+        if ( is_hardware_domain(currd) )
+            break;
     case PHYSDEVOP_eoi:
     case PHYSDEVOP_irq_status_query:
     case PHYSDEVOP_get_free_pirq:
When running Xen with a PVH dom0 and an HVM domU, a pirq gets mapped
for a passthrough device by using its gsi, see
xen_pt_realize->xc_physdev_map_pirq and
pci_add_dm_done->xc_physdev_map_pirq. xc_physdev_map_pirq then
calls into Xen, but in hvm_physdev_op PHYSDEVOP_map_pirq
is not allowed: currd is the PVH dom0, and PVH has no
X86_EMU_USE_PIRQ flag, so the call fails at the has_pirq check.
So, allow PHYSDEVOP_map_pirq when currd is dom0, regardless of
whether dom0 has the X86_EMU_USE_PIRQ flag, and also allow
PHYSDEVOP_unmap_pirq so that the failure path can unmap the pirq.

What's more, in a PVH dom0 the gsis don't get registered, but
the gsi of a passthrough device must be configured before it
can be mapped into an HVM domU.
So, add PHYSDEVOP_setup_gsi for PVH dom0, because the PVH dom0
will set up the gsi while assigning a device for passthrough.

Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 xen/arch/x86/hvm/hypercall.c | 3 +++
 1 file changed, 3 insertions(+)
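To make the gating described in the commit message concrete, together with the review request quoted earlier in the thread (refuse the hypercall when a PVH dom0 targets itself, via DOMID_SELF or its own domid), here is a small self-contained C model. It is a sketch under stated assumptions, not the hypervisor code: all `model_*` and `MODEL_*` names are hypothetical, the result codes are placeholders rather than Xen errno values, and passing the target domid as a plain parameter is a simplification, since the real handler would have to read it from the hypercall argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the value of Xen's DOMID_SELF; redefined so the model is standalone. */
#define MODEL_DOMID_SELF 0x7FF0U

/* Hypothetical subset of the physdev commands discussed in the thread. */
enum model_cmd { MODEL_MAP_PIRQ, MODEL_UNMAP_PIRQ, MODEL_EOI };

/* Placeholder result codes; the real handler returns errno-style values. */
enum model_result {
    MODEL_ALLOW = 0,
    MODEL_NOT_SUPPORTED = -1, /* models the failing has_pirq check */
    MODEL_REFUSED = -2        /* models refusing a self-targeted call */
};

/* Hypothetical view of the calling domain (currd in the patch). */
struct model_caller {
    uint16_t domid;
    bool is_hardware_domain;
    bool has_pirq; /* models the X86_EMU_USE_PIRQ flag */
};

/* Decide whether the command may proceed past the initial gate. */
static int model_physdev_gate(const struct model_caller *currd,
                              enum model_cmd cmd, uint16_t target_domid)
{
    bool pirq_mapping_op;

    assert(currd != NULL);
    pirq_mapping_op = (cmd == MODEL_MAP_PIRQ || cmd == MODEL_UNMAP_PIRQ);

    if (pirq_mapping_op && currd->is_hardware_domain) {
        /* The hardware domain may only act on behalf of another domain. */
        if (target_domid == MODEL_DOMID_SELF || target_domid == currd->domid)
            return MODEL_REFUSED;
        return MODEL_ALLOW;
    }

    /* Every other caller still needs PIRQ support. */
    return currd->has_pirq ? MODEL_ALLOW : MODEL_NOT_SUPPORTED;
}
```

Under this model a PVH dom0 (hardware domain, no PIRQ flag) may map a pirq for domain 1 but not for itself, and commands outside the map/unmap pair keep failing the PIRQ check for it.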