
[XEN,v12,2/7] x86/pvh: Allow (un)map_pirq when dom0 is PVH

Message ID 20240708114124.407797-3-Jiqian.Chen@amd.com (mailing list archive)
State Superseded
Series Support device passthrough when dom0 is PVH on Xen

Commit Message

Chen, Jiqian July 8, 2024, 11:41 a.m. UTC
When running Xen with a PVH dom0 and an HVM domU, the toolstack maps
a pirq for a passthrough device by using its GSI; see the QEMU path
xen_pt_realize->xc_physdev_map_pirq and the libxl path
pci_add_dm_done->xc_physdev_map_pirq. xc_physdev_map_pirq then calls
into Xen, but in hvm_physdev_op() PHYSDEVOP_map_pirq is not allowed:
currd is the PVH dom0, which has no X86_EMU_USE_PIRQ flag, so the
hypercall fails at the has_pirq() check.

So, allow PHYSDEVOP_map_pirq when dom0 is PVH, and also allow
PHYSDEVOP_unmap_pirq so that the device removal path can unmap the
pirq. Add a new check to prevent (un)mapping when the subject domain
has no notion of PIRQ.

With this, the interrupt of a passthrough device can be successfully
mapped to a pirq for a domU that has a notion of PIRQ, even when
dom0 is PVH.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 xen/arch/x86/hvm/hypercall.c |  6 ++++++
 xen/arch/x86/physdev.c       | 12 ++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)
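
For reference, a minimal sketch of the toolstack side of the path the
commit message describes (the helper name is made up for illustration
and error handling is elided; xc_physdev_map_pirq() is the libxc call
that both libxl's pci_add_dm_done and QEMU's xen_pt_realize end up
using):

#include <xenctrl.h>

/* Illustrative only: map a passthrough device's GSI to a pirq for domid. */
static int demo_map_gsi_to_pirq(xc_interface *xch, uint32_t domid, int gsi)
{
    int pirq = gsi;  /* in/out: on success the mapped pirq is written back */

    /*
     * Issues PHYSDEVOP_map_pirq against the subject domain.  With a PVH
     * dom0 this is the hypercall that used to be rejected in
     * hvm_physdev_op() before it could reach do_physdev_op().
     */
    if ( xc_physdev_map_pirq(xch, domid, gsi, &pirq) < 0 )
        return -1;

    return pirq;
}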

Comments

Jan Beulich July 8, 2024, 2:58 p.m. UTC | #1
On 08.07.2024 13:41, Jiqian Chen wrote:
> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
> a passthrough device by using gsi, see qemu code
> xen_pt_realize->xc_physdev_map_pirq and libxl code
> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
> is not allowed because currd is PVH dom0 and PVH has no
> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
> 
> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq.
> And add a new check to prevent (un)map when the subject domain
> doesn't have a notion of PIRQ.
> 
> So that the interrupt of a passthrough device can be
> successfully mapped to pirq for domU with a notion of PIRQ
> when dom0 is PVH
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> Signed-off-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
Stefano Stabellini July 22, 2024, 9:37 p.m. UTC | #2
On Mon, 8 Jul 2024, Jiqian Chen wrote:
> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
> a passthrough device by using gsi, see qemu code
> xen_pt_realize->xc_physdev_map_pirq and libxl code
> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
> is not allowed because currd is PVH dom0 and PVH has no
> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
> 
> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq.
> And add a new check to prevent (un)map when the subject domain
> doesn't have a notion of PIRQ.
> 
> So that the interrupt of a passthrough device can be
> successfully mapped to pirq for domU with a notion of PIRQ
> when dom0 is PVH
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> Signed-off-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper July 30, 2024, 1:09 p.m. UTC | #3
On 08/07/2024 12:41 pm, Jiqian Chen wrote:
> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
> a passthrough device by using gsi, see qemu code
> xen_pt_realize->xc_physdev_map_pirq and libxl code
> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
> is not allowed because currd is PVH dom0 and PVH has no
> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
>
> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq.
> And add a new check to prevent (un)map when the subject domain
> doesn't have a notion of PIRQ.
>
> So that the interrupt of a passthrough device can be
> successfully mapped to pirq for domU with a notion of PIRQ
> when dom0 is PVH
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> Signed-off-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
>  xen/arch/x86/hvm/hypercall.c |  6 ++++++
>  xen/arch/x86/physdev.c       | 12 ++++++++++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index 0fab670a4871..03ada3c880bd 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -71,8 +71,14 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>      switch ( cmd )
>      {
> +        /*
> +        * Only being permitted for management of other domains.
> +        * Further restrictions are enforced in do_physdev_op.
> +        */
>      case PHYSDEVOP_map_pirq:
>      case PHYSDEVOP_unmap_pirq:
> +        break;
> +
>      case PHYSDEVOP_eoi:
>      case PHYSDEVOP_irq_status_query:
>      case PHYSDEVOP_get_free_pirq:
> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
> index d6dd622952a9..9f30a8c63a06 100644
> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          if ( !d )
>              break;
>  
> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
> +        /* Only mapping when the subject domain has a notion of PIRQ */
> +        if ( !is_hvm_domain(d) || has_pirq(d) )
> +            ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
> +        else
> +            ret = -EOPNOTSUPP;
>  
>          rcu_unlock_domain(d);
>  
> @@ -346,7 +350,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          if ( !d )
>              break;
>  
> -        ret = physdev_unmap_pirq(d, unmap.pirq);
> +        /* Only unmapping when the subject domain has a notion of PIRQ */
> +        if ( !is_hvm_domain(d) || has_pirq(d) )
> +            ret = physdev_unmap_pirq(d, unmap.pirq);
> +        else
> +            ret = -EOPNOTSUPP;
>  
>          rcu_unlock_domain(d);
>  

Gitlab is displeased with your offering.

https://gitlab.com/xen-project/xen/-/pipelines/1393459622

This breaks both {adl,zen3p}-pci-hvm-x86-64-gcc-debug, and given the:

(XEN) [    8.150305] HVM restore d1: CPU 0
libxl: error: libxl_pci.c:1491:pci_add_dm_done: Domain
1:xc_physdev_map_pirq irq=18 (error=-1): Not supported
libxl: error: libxl_pci.c:1809:device_pci_add_done: Domain
1:libxl__device_pci_add failed for PCI device 0:3:0.0 (rc -3)
libxl: error: libxl_create.c:1962:domcreate_attach_devices: Domain
1:unable to add pci devices
libxl: error: libxl_xshelp.c:206:libxl__xs_read_mandatory: xenstore read
failed: `/libxl/1/type': No such file or directory
libxl: warning: libxl_dom.c:49:libxl__domain_type: unable to get domain
type for domid=1, assuming HVM
libxl: error: libxl_domain.c:1616:domain_destroy_domid_cb: Domain
1:xc_domain_destroy failed: No such process

I'd say that we're hitting the newly introduced -EOPNOTSUPP path.

In the test scenario, dom0 is PV, and it's an HVM domU which is breaking.

The sibling *-pci-pv-* tests (a PV domU) are working fine.

Either way, I'm going to revert this for now because clearly the "the
subject domain has a notion of PIRQ" hasn't been reasoned about
correctly, and it's important to keep Gitlab CI green across the board.

~Andrew
Chen, Jiqian July 31, 2024, 1:47 a.m. UTC | #4
Hi Andrew,

On 2024/7/30 21:09, Andrew Cooper wrote:
> On 08/07/2024 12:41 pm, Jiqian Chen wrote:
>> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
>> a passthrough device by using gsi, see qemu code
>> xen_pt_realize->xc_physdev_map_pirq and libxl code
>> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
>> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
>> is not allowed because currd is PVH dom0 and PVH has no
>> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
>>
>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq.
>> And add a new check to prevent (un)map when the subject domain
>> doesn't have a notion of PIRQ.
>>
>> So that the interrupt of a passthrough device can be
>> successfully mapped to pirq for domU with a notion of PIRQ
>> when dom0 is PVH
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>>  xen/arch/x86/hvm/hypercall.c |  6 ++++++
>>  xen/arch/x86/physdev.c       | 12 ++++++++++--
>>  2 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>> index 0fab670a4871..03ada3c880bd 100644
>> --- a/xen/arch/x86/hvm/hypercall.c
>> +++ b/xen/arch/x86/hvm/hypercall.c
>> @@ -71,8 +71,14 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  
>>      switch ( cmd )
>>      {
>> +        /*
>> +        * Only being permitted for management of other domains.
>> +        * Further restrictions are enforced in do_physdev_op.
>> +        */
>>      case PHYSDEVOP_map_pirq:
>>      case PHYSDEVOP_unmap_pirq:
>> +        break;
>> +
>>      case PHYSDEVOP_eoi:
>>      case PHYSDEVOP_irq_status_query:
>>      case PHYSDEVOP_get_free_pirq:
>> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
>> index d6dd622952a9..9f30a8c63a06 100644
>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          if ( !d )
>>              break;
>>  
>> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>> +        /* Only mapping when the subject domain has a notion of PIRQ */
>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
>> +            ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>> +        else
>> +            ret = -EOPNOTSUPP;
>>  
>>          rcu_unlock_domain(d);
>>  
>> @@ -346,7 +350,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          if ( !d )
>>              break;
>>  
>> -        ret = physdev_unmap_pirq(d, unmap.pirq);
>> +        /* Only unmapping when the subject domain has a notion of PIRQ */
>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
>> +            ret = physdev_unmap_pirq(d, unmap.pirq);
>> +        else
>> +            ret = -EOPNOTSUPP;
>>  
>>          rcu_unlock_domain(d);
>>  
> 
> Gitlab is displeased with your offering.
> 
> https://gitlab.com/xen-project/xen/-/pipelines/1393459622
> 
> This breaks both {adl,zen3p}-pci-hvm-x86-64-gcc-debug, and given the:
> 
> (XEN) [    8.150305] HVM restore d1: CPU 0
> libxl: error: libxl_pci.c:1491:pci_add_dm_done: Domain
> 1:xc_physdev_map_pirq irq=18 (error=-1): Not supported
> libxl: error: libxl_pci.c:1809:device_pci_add_done: Domain
> 1:libxl__device_pci_add failed for PCI device 0:3:0.0 (rc -3)
> libxl: error: libxl_create.c:1962:domcreate_attach_devices: Domain
> 1:unable to add pci devices
> libxl: error: libxl_xshelp.c:206:libxl__xs_read_mandatory: xenstore read
> failed: `/libxl/1/type': No such file or directory
> libxl: warning: libxl_dom.c:49:libxl__domain_type: unable to get domain
> type for domid=1, assuming HVM
> libxl: error: libxl_domain.c:1616:domain_destroy_domid_cb: Domain
> 1:xc_domain_destroy failed: No such process
> 
> I'd say that we're hitting the newly introduced -EOPNOTSUPP path.
> 
> In the test scenario, dom0 is PV, and it's an HVM domU which is breaking.
> 
> The sibling *-pci-pv-* tests (a PV domU) are working fine.
> 
> Either way, I'm going to revert this for now because clearly the "the
> subject domain has a notion of PIRQ" hasn't been reasoned about
> correctly, and it's important to keep Gitlab CI green across the board.

OK, I will try to reproduce and investigate this issue, thanks.

> 
> ~Andrew
Roger Pau Monné July 31, 2024, 7:50 a.m. UTC | #5
On Mon, Jul 08, 2024 at 07:41:19PM +0800, Jiqian Chen wrote:
> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
> a passthrough device by using gsi, see qemu code
> xen_pt_realize->xc_physdev_map_pirq and libxl code
> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
> is not allowed because currd is PVH dom0 and PVH has no
> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
> 
> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq.
> And add a new check to prevent (un)map when the subject domain
> doesn't have a notion of PIRQ.
> 
> So that the interrupt of a passthrough device can be
> successfully mapped to pirq for domU with a notion of PIRQ
> when dom0 is PVH
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> Signed-off-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
>  xen/arch/x86/hvm/hypercall.c |  6 ++++++
>  xen/arch/x86/physdev.c       | 12 ++++++++++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index 0fab670a4871..03ada3c880bd 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -71,8 +71,14 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>      switch ( cmd )
>      {
> +        /*
> +        * Only being permitted for management of other domains.
> +        * Further restrictions are enforced in do_physdev_op.
> +        */
>      case PHYSDEVOP_map_pirq:
>      case PHYSDEVOP_unmap_pirq:
> +        break;
> +
>      case PHYSDEVOP_eoi:
>      case PHYSDEVOP_irq_status_query:
>      case PHYSDEVOP_get_free_pirq:
> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
> index d6dd622952a9..9f30a8c63a06 100644
> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          if ( !d )
>              break;
>  
> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
> +        /* Only mapping when the subject domain has a notion of PIRQ */
> +        if ( !is_hvm_domain(d) || has_pirq(d) )

I'm afraid this is not true.  It's fine to map interrupts to HVM
domains that don't have XENFEAT_hvm_pirqs enabled.  has_pirq() simply
allows HVM domains to route interrupts from devices (either emulated or
passed through) over event channels.

It might have worked in the past (when using a version of Xen < 4.19)
because XENFEAT_hvm_pirqs was enabled by default for HVM guests.

physdev_map_pirq() will work fine when used against domains that don't
have XENFEAT_hvm_pirqs enabled, and it needs to be kept this way.

I think you want to allow PHYSDEVOP_{,un}map_pirq for HVM domains, but
keep the code in do_physdev_op() as-is.  You will have to check
whether the current paths in do_physdev_op() are not making
assumptions about XENFEAT_hvm_pirqs being enabled when the calling
domain is of HVM type.  I don't think that's the case, but better
check.

Thanks, Roger.
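
For reference, has_pirq() boils down to a test of the X86_EMU_USE_PIRQ
emulation flag, i.e. whether XENFEAT_hvm_pirqs was enabled for the
domain (paraphrased from xen/arch/x86/include/asm/domain.h at the time
of writing; check the tree you are working against):

#define has_pirq(d)    (!!((d)->arch.emulation_flags & X86_EMU_USE_PIRQ))

So, as noted above, it describes whether the subject domain routes its
own interrupts over event channels, not whether a device model may map
a pirq targeting it.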
Jan Beulich July 31, 2024, 7:58 a.m. UTC | #6
On 31.07.2024 09:50, Roger Pau Monné wrote:
> On Mon, Jul 08, 2024 at 07:41:19PM +0800, Jiqian Chen wrote:
>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          if ( !d )
>>              break;
>>  
>> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>> +        /* Only mapping when the subject domain has a notion of PIRQ */
>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
> 
> I'm afraid this is not true.  It's fine to map interrupts to HVM
> domains that don't have XENFEAT_hvm_pirqs enabled.  has_pirq() simply
> allow HVM domains to route interrupts from devices (either emulated or
> passed through) over event channels.
> 
> It might have worked in the past (when using a version of Xen < 4.19)
> because XENFEAT_hvm_pirqs was enabled by default for HVM guests.
> 
> physdev_map_pirq() will work fine when used against domains that don't
> have XENFEAT_hvm_pirqs enabled, and it needs to be kept this way.
> 
> I think you want to allow PHYSDEVOP_{,un}map_pirq for HVM domains, but
> keep the code in do_physdev_op() as-is.  You will have to check
> whether the current paths in do_physdev_op() are not making
> assumptions about XENFEAT_hvm_pirqs being enabled when the calling
> domain is of HVM type.  I don't think that's the case, but better
> check.

Yet the goal is to disallow mapping into PVH domains. The use of
has_pirq() was aiming at that. If that predicate can't be used (anymore)
for this purpose, which one is appropriate now?

Jan
Roger Pau Monné July 31, 2024, 8:24 a.m. UTC | #7
On Wed, Jul 31, 2024 at 09:58:28AM +0200, Jan Beulich wrote:
> On 31.07.2024 09:50, Roger Pau Monné wrote:
> > On Mon, Jul 08, 2024 at 07:41:19PM +0800, Jiqian Chen wrote:
> >> --- a/xen/arch/x86/physdev.c
> >> +++ b/xen/arch/x86/physdev.c
> >> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >>          if ( !d )
> >>              break;
> >>  
> >> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
> >> +        /* Only mapping when the subject domain has a notion of PIRQ */
> >> +        if ( !is_hvm_domain(d) || has_pirq(d) )
> > 
> > I'm afraid this is not true.  It's fine to map interrupts to HVM
> > domains that don't have XENFEAT_hvm_pirqs enabled.  has_pirq() simply
> > allow HVM domains to route interrupts from devices (either emulated or
> > passed through) over event channels.
> > 
> > It might have worked in the past (when using a version of Xen < 4.19)
> > because XENFEAT_hvm_pirqs was enabled by default for HVM guests.
> > 
> > physdev_map_pirq() will work fine when used against domains that don't
> > have XENFEAT_hvm_pirqs enabled, and it needs to be kept this way.
> > 
> > I think you want to allow PHYSDEVOP_{,un}map_pirq for HVM domains, but
> > keep the code in do_physdev_op() as-is.  You will have to check
> > whether the current paths in do_physdev_op() are not making
> > assumptions about XENFEAT_hvm_pirqs being enabled when the calling
> > domain is of HVM type.  I don't think that's the case, but better
> > check.
> 
> Yet the goal is to disallow mapping into PVH domains. The use of
> has_pirq() was aiming at that. If that predicate can't be used (anymore)
> for this purpose, which one is appropriate now?

Why do you want to add such restriction now, when it's not currently
present?

It was already the case that a PV dom0 could issue
PHYSDEVOP_{,un}map_pirq operations against a PVH domU, whatever the
result of such operation be.

Thanks, Roger.
Chen, Jiqian July 31, 2024, 8:31 a.m. UTC | #8
On 2024/7/30 21:09, Andrew Cooper wrote:
> On 08/07/2024 12:41 pm, Jiqian Chen wrote:
>> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
>> a passthrough device by using gsi, see qemu code
>> xen_pt_realize->xc_physdev_map_pirq and libxl code
>> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
>> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
>> is not allowed because currd is PVH dom0 and PVH has no
>> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
>>
>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq.
>> And add a new check to prevent (un)map when the subject domain
>> doesn't have a notion of PIRQ.
>>
>> So that the interrupt of a passthrough device can be
>> successfully mapped to pirq for domU with a notion of PIRQ
>> when dom0 is PVH
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>>  xen/arch/x86/hvm/hypercall.c |  6 ++++++
>>  xen/arch/x86/physdev.c       | 12 ++++++++++--
>>  2 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>> index 0fab670a4871..03ada3c880bd 100644
>> --- a/xen/arch/x86/hvm/hypercall.c
>> +++ b/xen/arch/x86/hvm/hypercall.c
>> @@ -71,8 +71,14 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  
>>      switch ( cmd )
>>      {
>> +        /*
>> +        * Only being permitted for management of other domains.
>> +        * Further restrictions are enforced in do_physdev_op.
>> +        */
>>      case PHYSDEVOP_map_pirq:
>>      case PHYSDEVOP_unmap_pirq:
>> +        break;
>> +
>>      case PHYSDEVOP_eoi:
>>      case PHYSDEVOP_irq_status_query:
>>      case PHYSDEVOP_get_free_pirq:
>> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
>> index d6dd622952a9..9f30a8c63a06 100644
>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          if ( !d )
>>              break;
>>  
>> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>> +        /* Only mapping when the subject domain has a notion of PIRQ */
>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
>> +            ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>> +        else
>> +            ret = -EOPNOTSUPP;
>>  
>>          rcu_unlock_domain(d);
>>  
>> @@ -346,7 +350,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          if ( !d )
>>              break;
>>  
>> -        ret = physdev_unmap_pirq(d, unmap.pirq);
>> +        /* Only unmapping when the subject domain has a notion of PIRQ */
>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
>> +            ret = physdev_unmap_pirq(d, unmap.pirq);
>> +        else
>> +            ret = -EOPNOTSUPP;
>>  
>>          rcu_unlock_domain(d);
>>  
> 
> Gitlab is displeased with your offering.
> 
> https://gitlab.com/xen-project/xen/-/pipelines/1393459622
> 
> This breaks both {adl,zen3p}-pci-hvm-x86-64-gcc-debug, and given the:
> 
> (XEN) [    8.150305] HVM restore d1: CPU 0
> libxl: error: libxl_pci.c:1491:pci_add_dm_done: Domain
> 1:xc_physdev_map_pirq irq=18 (error=-1): Not supported
> libxl: error: libxl_pci.c:1809:device_pci_add_done: Domain
> 1:libxl__device_pci_add failed for PCI device 0:3:0.0 (rc -3)
> libxl: error: libxl_create.c:1962:domcreate_attach_devices: Domain
> 1:unable to add pci devices
> libxl: error: libxl_xshelp.c:206:libxl__xs_read_mandatory: xenstore read
> failed: `/libxl/1/type': No such file or directory
> libxl: warning: libxl_dom.c:49:libxl__domain_type: unable to get domain
> type for domid=1, assuming HVM
> libxl: error: libxl_domain.c:1616:domain_destroy_domid_cb: Domain
> 1:xc_domain_destroy failed: No such process

Sorry, I forgot to validate the "hvm_pirq=0" scenario for HVM guests after the V10->V11 change (removing the self-check "d == currd").

V10 version:
+        /* Prevent self-map when currd has no X86_EMU_USE_PIRQ flag */
+        if ( is_hvm_domain(d) && !has_pirq(d) && d == currd )
+        {
+            rcu_unlock_domain(d);
+            return -EOPNOTSUPP;
+        }

V11 version:
+        /* Prevent mapping when the subject domain has no X86_EMU_USE_PIRQ */
+        if ( is_hvm_domain(d) && !has_pirq(d) )
+        {
+            rcu_unlock_domain(d);
+            return -EOPNOTSUPP;
+        }

V10 is fine whether hvm_pirq is enabled or disabled.
This issue was introduced in V11: when "hvm_pirq=0" is passed for an HVM guest, has_pirq() is false, yet the toolstack still uses a pirq to route the interrupts of passthrough devices.
So it still calls xc_physdev_(un)map_pirq, which now fails at the has_pirq() check.

Hi Jan,
Should I go back to the V10 approach and only prevent self-mapping when the subject domain has no notion of PIRQ?
That would allow PHYSDEVOP_map_pirq for foreign mappings, regardless of whether dom0 or the domU has PIRQs.

> 
> I'd say that we're hitting the newly introduced -EOPNOTSUPP path.
> 
> In the test scenario, dom0 is PV, and it's an HVM domU which is breaking.
> 
> The sibling *-pci-pv-* tests (a PV domU) are working fine.
> 
> Either way, I'm going to revert this for now because clearly the "the
> subject domain has a notion of PIRQ" hasn't been reasoned about
> correctly, and it's important to keep Gitlab CI green across the board.
> 
> ~Andrew
Chen, Jiqian July 31, 2024, 8:39 a.m. UTC | #9
On 2024/7/31 15:50, Roger Pau Monné wrote:
> On Mon, Jul 08, 2024 at 07:41:19PM +0800, Jiqian Chen wrote:
>> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
>> a passthrough device by using gsi, see qemu code
>> xen_pt_realize->xc_physdev_map_pirq and libxl code
>> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
>> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
>> is not allowed because currd is PVH dom0 and PVH has no
>> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
>>
>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq.
>> And add a new check to prevent (un)map when the subject domain
>> doesn't have a notion of PIRQ.
>>
>> So that the interrupt of a passthrough device can be
>> successfully mapped to pirq for domU with a notion of PIRQ
>> when dom0 is PVH
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>>  xen/arch/x86/hvm/hypercall.c |  6 ++++++
>>  xen/arch/x86/physdev.c       | 12 ++++++++++--
>>  2 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>> index 0fab670a4871..03ada3c880bd 100644
>> --- a/xen/arch/x86/hvm/hypercall.c
>> +++ b/xen/arch/x86/hvm/hypercall.c
>> @@ -71,8 +71,14 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  
>>      switch ( cmd )
>>      {
>> +        /*
>> +        * Only being permitted for management of other domains.
>> +        * Further restrictions are enforced in do_physdev_op.
>> +        */
>>      case PHYSDEVOP_map_pirq:
>>      case PHYSDEVOP_unmap_pirq:
>> +        break;
>> +
>>      case PHYSDEVOP_eoi:
>>      case PHYSDEVOP_irq_status_query:
>>      case PHYSDEVOP_get_free_pirq:
>> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
>> index d6dd622952a9..9f30a8c63a06 100644
>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          if ( !d )
>>              break;
>>  
>> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>> +        /* Only mapping when the subject domain has a notion of PIRQ */
>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
> 
> I'm afraid this is not true.  It's fine to map interrupts to HVM
> domains that don't have XENFEAT_hvm_pirqs enabled.  has_pirq() simply
> allow HVM domains to route interrupts from devices (either emulated or
> passed through) over event channels.
> 
> It might have worked in the past (when using a version of Xen < 4.19)
> because XENFEAT_hvm_pirqs was enabled by default for HVM guests.
> 
> physdev_map_pirq() will work fine when used against domains that don't
> have XENFEAT_hvm_pirqs enabled, and it needs to be kept this way.
> 
> I think you want to allow PHYSDEVOP_{,un}map_pirq for HVM domains, but
> keep the code in do_physdev_op() as-is.  You will have to check
> whether the current paths in do_physdev_op() are not making
> assumptions about XENFEAT_hvm_pirqs being enabled when the calling
> domain is of HVM type.  I don't think that's the case, but better
> check.
If I understand correctly, you are also talking about preventing self-mapping when the domain is of HVM type and doesn't have XENFEAT_hvm_pirqs.
Change it to this?
        if ( !is_hvm_domain(d) || has_pirq(d) || d != currd )
            ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
        else
            ret = -EOPNOTSUPP;

> 
> Thanks, Roger.
Jan Beulich July 31, 2024, 8:40 a.m. UTC | #10
On 31.07.2024 10:24, Roger Pau Monné wrote:
> On Wed, Jul 31, 2024 at 09:58:28AM +0200, Jan Beulich wrote:
>> On 31.07.2024 09:50, Roger Pau Monné wrote:
>>> On Mon, Jul 08, 2024 at 07:41:19PM +0800, Jiqian Chen wrote:
>>>> --- a/xen/arch/x86/physdev.c
>>>> +++ b/xen/arch/x86/physdev.c
>>>> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>          if ( !d )
>>>>              break;
>>>>  
>>>> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>>>> +        /* Only mapping when the subject domain has a notion of PIRQ */
>>>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
>>>
>>> I'm afraid this is not true.  It's fine to map interrupts to HVM
>>> domains that don't have XENFEAT_hvm_pirqs enabled.  has_pirq() simply
>>> allow HVM domains to route interrupts from devices (either emulated or
>>> passed through) over event channels.
>>>
>>> It might have worked in the past (when using a version of Xen < 4.19)
>>> because XENFEAT_hvm_pirqs was enabled by default for HVM guests.
>>>
>>> physdev_map_pirq() will work fine when used against domains that don't
>>> have XENFEAT_hvm_pirqs enabled, and it needs to be kept this way.
>>>
>>> I think you want to allow PHYSDEVOP_{,un}map_pirq for HVM domains, but
>>> keep the code in do_physdev_op() as-is.  You will have to check
>>> whether the current paths in do_physdev_op() are not making
>>> assumptions about XENFEAT_hvm_pirqs being enabled when the calling
>>> domain is of HVM type.  I don't think that's the case, but better
>>> check.
>>
>> Yet the goal is to disallow mapping into PVH domains. The use of
>> has_pirq() was aiming at that. If that predicate can't be used (anymore)
>> for this purpose, which one is appropriate now?
> 
> Why do you want to add such restriction now, when it's not currently
> present?
> 
> It was already the case that a PV dom0 could issue
> PHYSDEVOP_{,un}map_pirq operations against a PVH domU, whatever the
> result of such operation be.

Because (a) that was wrong and (b) we'd suddenly permit a PVH DomU to
issue such for itself.

Jan
Jan Beulich July 31, 2024, 8:42 a.m. UTC | #11
On 31.07.2024 10:31, Chen, Jiqian wrote:
> On 2024/7/30 21:09, Andrew Cooper wrote:
>> On 08/07/2024 12:41 pm, Jiqian Chen wrote:
>>> If run Xen with PVH dom0 and hvm domU, hvm will map a pirq for
>>> a passthrough device by using gsi, see qemu code
>>> xen_pt_realize->xc_physdev_map_pirq and libxl code
>>> pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
>>> will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
>>> is not allowed because currd is PVH dom0 and PVH has no
>>> X86_EMU_USE_PIRQ flag, it will fail at has_pirq check.
>>>
>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq.
>>> And add a new check to prevent (un)map when the subject domain
>>> doesn't have a notion of PIRQ.
>>>
>>> So that the interrupt of a passthrough device can be
>>> successfully mapped to pirq for domU with a notion of PIRQ
>>> when dom0 is PVH
>>>
>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>>> ---
>>>  xen/arch/x86/hvm/hypercall.c |  6 ++++++
>>>  xen/arch/x86/physdev.c       | 12 ++++++++++--
>>>  2 files changed, 16 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
>>> index 0fab670a4871..03ada3c880bd 100644
>>> --- a/xen/arch/x86/hvm/hypercall.c
>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>> @@ -71,8 +71,14 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>  
>>>      switch ( cmd )
>>>      {
>>> +        /*
>>> +        * Only being permitted for management of other domains.
>>> +        * Further restrictions are enforced in do_physdev_op.
>>> +        */
>>>      case PHYSDEVOP_map_pirq:
>>>      case PHYSDEVOP_unmap_pirq:
>>> +        break;
>>> +
>>>      case PHYSDEVOP_eoi:
>>>      case PHYSDEVOP_irq_status_query:
>>>      case PHYSDEVOP_get_free_pirq:
>>> diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
>>> index d6dd622952a9..9f30a8c63a06 100644
>>> --- a/xen/arch/x86/physdev.c
>>> +++ b/xen/arch/x86/physdev.c
>>> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>          if ( !d )
>>>              break;
>>>  
>>> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>>> +        /* Only mapping when the subject domain has a notion of PIRQ */
>>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
>>> +            ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>>> +        else
>>> +            ret = -EOPNOTSUPP;
>>>  
>>>          rcu_unlock_domain(d);
>>>  
>>> @@ -346,7 +350,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>          if ( !d )
>>>              break;
>>>  
>>> -        ret = physdev_unmap_pirq(d, unmap.pirq);
>>> +        /* Only unmapping when the subject domain has a notion of PIRQ */
>>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
>>> +            ret = physdev_unmap_pirq(d, unmap.pirq);
>>> +        else
>>> +            ret = -EOPNOTSUPP;
>>>  
>>>          rcu_unlock_domain(d);
>>>  
>>
>> Gitlab is displeased with your offering.
>>
>> https://gitlab.com/xen-project/xen/-/pipelines/1393459622
>>
>> This breaks both {adl,zen3p}-pci-hvm-x86-64-gcc-debug, and given the:
>>
>> (XEN) [    8.150305] HVM restore d1: CPU 0
>> libxl: error: libxl_pci.c:1491:pci_add_dm_done: Domain
>> 1:xc_physdev_map_pirq irq=18 (error=-1): Not supported
>> libxl: error: libxl_pci.c:1809:device_pci_add_done: Domain
>> 1:libxl__device_pci_add failed for PCI device 0:3:0.0 (rc -3)
>> libxl: error: libxl_create.c:1962:domcreate_attach_devices: Domain
>> 1:unable to add pci devices
>> libxl: error: libxl_xshelp.c:206:libxl__xs_read_mandatory: xenstore read
>> failed: `/libxl/1/type': No such file or directory
>> libxl: warning: libxl_dom.c:49:libxl__domain_type: unable to get domain
>> type for domid=1, assuming HVM
>> libxl: error: libxl_domain.c:1616:domain_destroy_domid_cb: Domain
>> 1:xc_domain_destroy failed: No such process
> 
> Sorry to forget to validate the scenario of "hvm_pirq=0" for HVM guest since V10->V11(remove the self-check "d == currd").
> 
> V10 version:
> +        /* Prevent self-map when currd has no X86_EMU_USE_PIRQ flag */
> +        if ( is_hvm_domain(d) && !has_pirq(d) && d == currd )
> +        {
> +            rcu_unlock_domain(d);
> +            return -EOPNOTSUPP;
> +        }
> 
> V11 version:
> +        /* Prevent mapping when the subject domain has no X86_EMU_USE_PIRQ */
> +        if ( is_hvm_domain(d) && !has_pirq(d) )
> +        {
> +            rcu_unlock_domain(d);
> +            return -EOPNOTSUPP;
> +        }
> 
> V10 is fine for when hvm_pirq is enable or disable. 
> This issue is from V11, the cause is that when pass "hvm_pirq=0" to HVM guest, then has_pirq() is false, but it still uses the pirq to route the interrupt of passthrough devices.
> So, it still does xc_physdev_(un)map_pirq, then fails at the has_pirq() check.
> 
> Hi Jan,
> Should I need to change to V10 to only prevent the self-mapping when the subject domain has no PIRQ?
> So that it can allow PHYSDEVOP_map_pirq for foreign mapping, no matter the dom0 or the domU has PIRQ or not?

No, my position there hasn't changed. I continue to view it as wrong to
have any d == currd checks here.

Jan
Roger Pau Monné July 31, 2024, 8:51 a.m. UTC | #12
On Wed, Jul 31, 2024 at 10:40:46AM +0200, Jan Beulich wrote:
> On 31.07.2024 10:24, Roger Pau Monné wrote:
> > On Wed, Jul 31, 2024 at 09:58:28AM +0200, Jan Beulich wrote:
> >> On 31.07.2024 09:50, Roger Pau Monné wrote:
> >>> On Mon, Jul 08, 2024 at 07:41:19PM +0800, Jiqian Chen wrote:
> >>>> --- a/xen/arch/x86/physdev.c
> >>>> +++ b/xen/arch/x86/physdev.c
> >>>> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >>>>          if ( !d )
> >>>>              break;
> >>>>  
> >>>> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
> >>>> +        /* Only mapping when the subject domain has a notion of PIRQ */
> >>>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
> >>>
> >>> I'm afraid this is not true.  It's fine to map interrupts to HVM
> >>> domains that don't have XENFEAT_hvm_pirqs enabled.  has_pirq() simply
> >>> allow HVM domains to route interrupts from devices (either emulated or
> >>> passed through) over event channels.
> >>>
> >>> It might have worked in the past (when using a version of Xen < 4.19)
> >>> because XENFEAT_hvm_pirqs was enabled by default for HVM guests.
> >>>
> >>> physdev_map_pirq() will work fine when used against domains that don't
> >>> have XENFEAT_hvm_pirqs enabled, and it needs to be kept this way.
> >>>
> >>> I think you want to allow PHYSDEVOP_{,un}map_pirq for HVM domains, but
> >>> keep the code in do_physdev_op() as-is.  You will have to check
> >>> whether the current paths in do_physdev_op() are not making
> >>> assumptions about XENFEAT_hvm_pirqs being enabled when the calling
> >>> domain is of HVM type.  I don't think that's the case, but better
> >>> check.
> >>
> >> Yet the goal is to disallow mapping into PVH domains. The use of
> >> has_pirq() was aiming at that. If that predicate can't be used (anymore)
> >> for this purpose, which one is appropriate now?
> > 
> > Why do you want to add such restriction now, when it's not currently
> > present?
> > 
> > It was already the case that a PV dom0 could issue
> > PHYSDEVOP_{,un}map_pirq operations against a PVH domU, whatever the
> > result of such operation be.
> 
> Because (a) that was wrong and (b) we'd suddenly permit a PVH DomU to
> issue such for itself.

Regarding (b) a PVH domU issuing such operations would fail at the
xsm_map_domain_pirq() check in physdev_map_pirq().

I agree with (a), but I don't think enabling PVH dom0 usage of the
hypercalls should be gated on this.  As said a PV dom0 is already
capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
domU.

Thanks, Roger.
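
For context on the xsm_map_domain_pirq() point: with the default
(dummy) XSM policy this hook is an XSM_DM_PRIV check, which only
passes when the caller is the subject domain's device-model domain or
the control domain.  A rough sketch of its shape (paraphrased, not the
literal code in xen/include/xsm/dummy.h; the function name below is
made up):

static inline int sketch_xsm_map_domain_pirq(struct domain *d)
{
    /*
     * XSM_DM_PRIV: allow if the caller is d's device-model domain or the
     * control domain.  An unprivileged PVH domU targeting itself is
     * neither, so a self-issued PHYSDEVOP_map_pirq fails here.
     */
    return xsm_default_action(XSM_DM_PRIV, current->domain, d);
}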
Jan Beulich July 31, 2024, 9:02 a.m. UTC | #13
On 31.07.2024 10:51, Roger Pau Monné wrote:
> On Wed, Jul 31, 2024 at 10:40:46AM +0200, Jan Beulich wrote:
>> On 31.07.2024 10:24, Roger Pau Monné wrote:
>>> On Wed, Jul 31, 2024 at 09:58:28AM +0200, Jan Beulich wrote:
>>>> On 31.07.2024 09:50, Roger Pau Monné wrote:
>>>>> On Mon, Jul 08, 2024 at 07:41:19PM +0800, Jiqian Chen wrote:
>>>>>> --- a/xen/arch/x86/physdev.c
>>>>>> +++ b/xen/arch/x86/physdev.c
>>>>>> @@ -323,7 +323,11 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>          if ( !d )
>>>>>>              break;
>>>>>>  
>>>>>> -        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
>>>>>> +        /* Only mapping when the subject domain has a notion of PIRQ */
>>>>>> +        if ( !is_hvm_domain(d) || has_pirq(d) )
>>>>>
>>>>> I'm afraid this is not true.  It's fine to map interrupts to HVM
>>>>> domains that don't have XENFEAT_hvm_pirqs enabled.  has_pirq() simply
>>>>> allow HVM domains to route interrupts from devices (either emulated or
>>>>> passed through) over event channels.
>>>>>
>>>>> It might have worked in the past (when using a version of Xen < 4.19)
>>>>> because XENFEAT_hvm_pirqs was enabled by default for HVM guests.
>>>>>
>>>>> physdev_map_pirq() will work fine when used against domains that don't
>>>>> have XENFEAT_hvm_pirqs enabled, and it needs to be kept this way.
>>>>>
>>>>> I think you want to allow PHYSDEVOP_{,un}map_pirq for HVM domains, but
>>>>> keep the code in do_physdev_op() as-is.  You will have to check
>>>>> whether the current paths in do_physdev_op() are not making
>>>>> assumptions about XENFEAT_hvm_pirqs being enabled when the calling
>>>>> domain is of HVM type.  I don't think that's the case, but better
>>>>> check.
>>>>
>>>> Yet the goal is to disallow mapping into PVH domains. The use of
>>>> has_pirq() was aiming at that. If that predicate can't be used (anymore)
>>>> for this purpose, which one is appropriate now?
>>>
>>> Why do you want to add such restriction now, when it's not currently
>>> present?
>>>
>>> It was already the case that a PV dom0 could issue
>>> PHYSDEVOP_{,un}map_pirq operations against a PVH domU, whatever the
>>> result of such operation be.
>>
>> Because (a) that was wrong and (b) we'd suddenly permit a PVH DomU to
>> issue such for itself.
> 
> Regarding (b) a PVH domU issuing such operations would fail at the
> xsm_map_domain_pirq() check in physdev_map_pirq().

Hmm, yes, fair point.

> I agree with (a), but I don't think enabling PVH dom0 usage of the
> hypercalls should be gated on this.  As said a PV dom0 is already
> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
> domU.

Okay, I can accept that as an intermediate position. We ought to deny
such requests at some point though for PVH domains, the latest in the
course of making vPCI work there.

Jan
Roger Pau Monné July 31, 2024, 9:37 a.m. UTC | #14
On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
> On 31.07.2024 10:51, Roger Pau Monné wrote:
> > I agree with (a), but I don't think enabling PVH dom0 usage of the
> > hypercalls should be gated on this.  As said a PV dom0 is already
> > capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
> > domU.
> 
> Okay, I can accept that as an intermediate position. We ought to deny
> such requests at some point though for PVH domains, the latest in the
> course of making vPCI work there.

Hm, once physdev_map_pirq() works as intended against PVH domains, I
don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
against such domains.

Granted using vPCI for plain PCI passthrough is the best option, but I
also don't think we should limit it in the hypervisor.  Some kind of
passthrough (like when using vfio/mdev) will still need something akin
to a device model I would expect.

Thanks, Roger.
Jan Beulich July 31, 2024, 9:55 a.m. UTC | #15
On 31.07.2024 11:37, Roger Pau Monné wrote:
> On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
>> On 31.07.2024 10:51, Roger Pau Monné wrote:
>>> I agree with (a), but I don't think enabling PVH dom0 usage of the
>>> hypercalls should be gated on this.  As said a PV dom0 is already
>>> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
>>> domU.
>>
>> Okay, I can accept that as an intermediate position. We ought to deny
>> such requests at some point though for PVH domains, the latest in the
>> course of making vPCI work there.
> 
> Hm, once physdev_map_pirq() works as intended against PVH domains, I
> don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
> against such domains.

Well. If it can be made work as intended, then I certainly agree. However,
without even the concept of pIRQ in PVH I'm having a hard time seeing how
it can be made work. Iirc you were advocating for us to not introduce pIRQ
into PVH.

Maybe you're thinking of re-using the sub-ops, requiring PVH domains to
pass in GSIs? I think I suggested something along these lines also to
Jiqian, yet with the now intended exposure to !has_pirq() domains I'm
not sure this could be made work reliably.

Which reminds me of another question I had: What meaning does the pirq
field have right now, if Dom0 would issue the request against a PVH DomU?
What meaning will it have for a !has_pirq() HVM domain?

Jan
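
For reference, the hypercall argument being discussed (reproduced from
memory of xen/include/public/physdev.h, comments paraphrased; check
the header for the authoritative layout):

struct physdev_map_pirq {
    domid_t domid;       /* IN: subject domain */
    int type;            /* IN: MAP_PIRQ_TYPE_GSI, ..._MSI, ... */
    int index;           /* IN: GSI or MSI vector, depending on type */
    int pirq;            /* IN or OUT: the per-domain pirq handle */
    int bus;             /* IN */
    int devfn;           /* IN */
    int entry_nr;        /* IN */
    uint64_t table_base; /* IN */
};

For a GSI mapping, index carries the GSI and pirq is the per-domain
handle handed back to (or requested by) the caller; it is the meaning
of that handle for a !has_pirq() subject domain that the question
above is about.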
Roger Pau Monné July 31, 2024, 11:29 a.m. UTC | #16
On Wed, Jul 31, 2024 at 11:55:35AM +0200, Jan Beulich wrote:
> On 31.07.2024 11:37, Roger Pau Monné wrote:
> > On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
> >> On 31.07.2024 10:51, Roger Pau Monné wrote:
> >>> I agree with (a), but I don't think enabling PVH dom0 usage of the
> >>> hypercalls should be gated on this.  As said a PV dom0 is already
> >>> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
> >>> domU.
> >>
> >> Okay, I can accept that as an intermediate position. We ought to deny
> >> such requests at some point though for PVH domains, the latest in the
> >> course of making vPCI work there.
> > 
> > Hm, once physdev_map_pirq() works as intended against PVH domains, I
> > don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
> > against such domains.
> 
> Well. If it can be made work as intended, then I certainly agree. However,
> without even the concept of pIRQ in PVH I'm having a hard time seeing how
> it can be made work. Iirc you were advocating for us to not introduce pIRQ
> into PVH.

From what I'm seeing here the intention is to expose
PHYSDEVOP_{,un}map_pirq to PVH dom0, so there must be some notion of
pIRQs or akin in a PVH dom0?  Even if only for passthrough needs.

> Maybe you're thinking of re-using the sub-ops, requiring PVH domains to
> pass in GSIs?

I think that was one of my proposals, to either introduce a new
hypercall that takes a GSI, or to modify the PHYSDEVOP_{,un}map_pirq
in an ABI compatible way so that semantically the field could be a GSI
rather than a pIRQ.  We however would also need a way to reference an
MSI entry.

My main concern is not with pIRQs by itself, pIRQs are just an
abstract way to reference interrupts, my concern and what I wanted to
avoid on PVH is being able to route pIRQs over event channels.  IOW:
have interrupts from physical devices delivered over event channels.

> I think I suggested something along these lines also to
> Jiqian, yet with the now intended exposure to !has_pirq() domains I'm
> not sure this could be made work reliably.

I'm afraid I've been lagging behind on reviewing those series.

> Which reminds me of another question I had: What meaning does the pirq
> field have right now, if Dom0 would issue the request against a PVH DomU?
> What meaning will it have for a !has_pirq() HVM domain?

The pirq field could be a way to reference an interrupt.  It doesn't
need to be exposed to the PVH domU at all, but it's a way for the
device model to identify which interrupt should be mapped to which
domain.

Thanks, Roger.
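
Purely to illustrate the "takes a GSI" option mentioned above: nothing
like the following exists in the tree, and the name, layout and
semantics are invented for the sake of discussion only, to show what
an interface that talks in GSIs rather than pIRQs could roughly look
like:

/* Hypothetical, for discussion only -- not an existing Xen interface. */
struct physdev_map_gsi {
    /* IN */
    domid_t  domid;    /* subject domain */
    uint16_t pad;
    uint32_t gsi;      /* host GSI to be handed to the subject domain */
    /* OUT */
    uint32_t handle;   /* abstract handle, usable for a later unmap/bind */
};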
Jan Beulich July 31, 2024, 11:39 a.m. UTC | #17
On 31.07.2024 13:29, Roger Pau Monné wrote:
> On Wed, Jul 31, 2024 at 11:55:35AM +0200, Jan Beulich wrote:
>> On 31.07.2024 11:37, Roger Pau Monné wrote:
>>> On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
>>>> On 31.07.2024 10:51, Roger Pau Monné wrote:
>>>>> I agree with (a), but I don't think enabling PVH dom0 usage of the
>>>>> hypercalls should be gated on this.  As said a PV dom0 is already
>>>>> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
>>>>> domU.
>>>>
>>>> Okay, I can accept that as an intermediate position. We ought to deny
>>>> such requests at some point though for PVH domains, the latest in the
>>>> course of making vPCI work there.
>>>
>>> Hm, once physdev_map_pirq() works as intended against PVH domains, I
>>> don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
>>> against such domains.
>>
>> Well. If it can be made work as intended, then I certainly agree. However,
>> without even the concept of pIRQ in PVH I'm having a hard time seeing how
>> it can be made work. Iirc you were advocating for us to not introduce pIRQ
>> into PVH.
> 
> From what I'm seeing here the intention is to expose
> PHYSDEVOP_{,un}map_pirq to PVH dom0, so there must be some notion of
> pIRQs or akin in a PVH dom0?  Even if only for passthrough needs.

Only in so far as it is an abstract, handle-like value pertaining solely
to the target domain.

>> Maybe you're thinking of re-using the sub-ops, requiring PVH domains to
>> pass in GSIs?
> 
> I think that was one my proposals, to either introduce a new
> hypercall that takes a GSI, or to modify the PHYSDEVOP_{,un}map_pirq
> in an ABI compatible way so that semantically the field could be a GSI
> rather than a pIRQ.  We however would also need a way to reference an
> MSI entry.

Of course.

> My main concern is not with pIRQs by itself, pIRQs are just an
> abstract way to reference interrupts, my concern and what I wanted to
> avoid on PVH is being able to route pIRQs over event channels.  IOW:
> have interrupts from physical devices delivered over event channels.

Oh, I might have slightly misunderstood your intentions then.

>> I think I suggested something along these lines also to
>> Jiqian, yet with the now intended exposure to !has_pirq() domains I'm
>> not sure this could be made work reliably.
> 
> I'm afraid I've been lacking behind on reviewing those series.
> 
>> Which reminds me of another question I had: What meaning does the pirq
>> field have right now, if Dom0 would issue the request against a PVH DomU?
>> What meaning will it have for a !has_pirq() HVM domain?
> 
> The pirq field could be a way to reference an interrupt.  It doesn't
> need to be exposed to the PVH domU at all, but it's a way for the
> device model to identify which interrupt should be mapped to which
> domain.

Since pIRQ-s are per-domain, _that_ kind of association won't be
helped. But yes, as per above it could serve as an abstract handle-
like value.

Jan
Roger Pau Monné July 31, 2024, 1:03 p.m. UTC | #18
On Wed, Jul 31, 2024 at 01:39:40PM +0200, Jan Beulich wrote:
> On 31.07.2024 13:29, Roger Pau Monné wrote:
> > On Wed, Jul 31, 2024 at 11:55:35AM +0200, Jan Beulich wrote:
> >> On 31.07.2024 11:37, Roger Pau Monné wrote:
> >>> On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
> >>>> On 31.07.2024 10:51, Roger Pau Monné wrote:
> >>>>> I agree with (a), but I don't think enabling PVH dom0 usage of the
> >>>>> hypercalls should be gated on this.  As said a PV dom0 is already
> >>>>> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
> >>>>> domU.
> >>>>
> >>>> Okay, I can accept that as an intermediate position. We ought to deny
> >>>> such requests at some point though for PVH domains, the latest in the
> >>>> course of making vPCI work there.
> >>>
> >>> Hm, once physdev_map_pirq() works as intended against PVH domains, I
> >>> don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
> >>> against such domains.
> >>
> >> Well. If it can be made work as intended, then I certainly agree. However,
> >> without even the concept of pIRQ in PVH I'm having a hard time seeing how
> >> it can be made work. Iirc you were advocating for us to not introduce pIRQ
> >> into PVH.
> > 
> > From what I'm seeing here the intention is to expose
> > PHYSDEVOP_{,un}map_pirq to PVH dom0, so there must be some notion of
> > pIRQs or akin in a PVH dom0?  Even if only for passthrough needs.
> 
> Only in so far as it is an abstract, handle-like value pertaining solely
> to the target domain.
> 
> >> Maybe you're thinking of re-using the sub-ops, requiring PVH domains to
> >> pass in GSIs?
> > 
> > I think that was one my proposals, to either introduce a new
> > hypercall that takes a GSI, or to modify the PHYSDEVOP_{,un}map_pirq
> > in an ABI compatible way so that semantically the field could be a GSI
> > rather than a pIRQ.  We however would also need a way to reference an
> > MSI entry.
> 
> Of course.
> 
> > My main concern is not with pIRQs by itself, pIRQs are just an
> > abstract way to reference interrupts, my concern and what I wanted to
> > avoid on PVH is being able to route pIRQs over event channels.  IOW:
> > have interrupts from physical devices delivered over event channels.
> 
> Oh, I might have slightly misunderstood your intentions then.

My intention would be to not even use pIRQs at all, in order to avoid
the temptation of the guest itself managing interrupts using
hypercalls, hence I would have preferred that abstract interface to be
something else.

Maybe we could even expose the Xen IRQ space directly, and just use
that as interrupt handles, but since I'm not the one doing the work
I'm not sure it's fair to ask for something that would require more
changes internally to Xen.

> >> I think I suggested something along these lines also to
> >> Jiqian, yet with the now intended exposure to !has_pirq() domains I'm
> >> not sure this could be made work reliably.
> > 
> > I'm afraid I've been lacking behind on reviewing those series.
> > 
> >> Which reminds me of another question I had: What meaning does the pirq
> >> field have right now, if Dom0 would issue the request against a PVH DomU?
> >> What meaning will it have for a !has_pirq() HVM domain?
> > 
> > The pirq field could be a way to reference an interrupt.  It doesn't
> > need to be exposed to the PVH domU at all, but it's a way for the
> > device model to identify which interrupt should be mapped to which
> > domain.
> 
> Since pIRQ-s are per-domain, _that_ kind of association won't be
> helped. But yes, as per above it could serve as an abstract handle-
> like value.

I would be fine with doing the interrupt bindings based on IRQs
instead of pIRQs, but I'm afraid that would require more changes to
hypercalls and Xen internals.

At some point I need to work on a new interface to do passthrough, so
that we can remove the usage of domctls from QEMU.  That might be a
good opportunity to switch from using pIRQs.

Thanks, Roger.
Chen, Jiqian Aug. 2, 2024, 2:37 a.m. UTC | #19
On 2024/7/31 21:03, Roger Pau Monné wrote:
> On Wed, Jul 31, 2024 at 01:39:40PM +0200, Jan Beulich wrote:
>> On 31.07.2024 13:29, Roger Pau Monné wrote:
>>> On Wed, Jul 31, 2024 at 11:55:35AM +0200, Jan Beulich wrote:
>>>> On 31.07.2024 11:37, Roger Pau Monné wrote:
>>>>> On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
>>>>>> On 31.07.2024 10:51, Roger Pau Monné wrote:
>>>>>>> I agree with (a), but I don't think enabling PVH dom0 usage of the
>>>>>>> hypercalls should be gated on this.  As said a PV dom0 is already
>>>>>>> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
>>>>>>> domU.
>>>>>>
>>>>>> Okay, I can accept that as an intermediate position. We ought to deny
>>>>>> such requests at some point though for PVH domains, the latest in the
>>>>>> course of making vPCI work there.
>>>>>
>>>>> Hm, once physdev_map_pirq() works as intended against PVH domains, I
>>>>> don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
>>>>> against such domains.
>>>>
>>>> Well. If it can be made work as intended, then I certainly agree. However,
>>>> without even the concept of pIRQ in PVH I'm having a hard time seeing how
>>>> it can be made work. Iirc you were advocating for us to not introduce pIRQ
>>>> into PVH.
>>>
>>> From what I'm seeing here the intention is to expose
>>> PHYSDEVOP_{,un}map_pirq to PVH dom0, so there must be some notion of
>>> pIRQs or akin in a PVH dom0?  Even if only for passthrough needs.
>>
>> Only in so far as it is an abstract, handle-like value pertaining solely
>> to the target domain.
>>
>>>> Maybe you're thinking of re-using the sub-ops, requiring PVH domains to
>>>> pass in GSIs?
>>>
>>> I think that was one my proposals, to either introduce a new
>>> hypercall that takes a GSI, or to modify the PHYSDEVOP_{,un}map_pirq
>>> in an ABI compatible way so that semantically the field could be a GSI
>>> rather than a pIRQ.  We however would also need a way to reference an
>>> MSI entry.
>>
>> Of course.
>>
>>> My main concern is not with pIRQs by itself, pIRQs are just an
>>> abstract way to reference interrupts, my concern and what I wanted to
>>> avoid on PVH is being able to route pIRQs over event channels.  IOW:
>>> have interrupts from physical devices delivered over event channels.
>>
>> Oh, I might have slightly misunderstood your intentions then.
> 
> My intention would be to not even use pIRQs at all, in order to avoid
> the temptation of the guest itself managing interrupts using
> hypercalls, hence I would have preferred that abstract interface to be
> something else.
> 
> Maybe we could even expose the Xen IRQ space directly, and just use
> that as interrupt handles, but since I'm not the one doing the work
> I'm not sure it's fair to ask for something that would require more
> changes internally to Xen.
> 
>>>> I think I suggested something along these lines also to
>>>> Jiqian, yet with the now intended exposure to !has_pirq() domains I'm
>>>> not sure this could be made work reliably.
>>>
>>> I'm afraid I've been lacking behind on reviewing those series.
>>>
>>>> Which reminds me of another question I had: What meaning does the pirq
>>>> field have right now, if Dom0 would issue the request against a PVH DomU?
>>>> What meaning will it have for a !has_pirq() HVM domain?
>>>
>>> The pirq field could be a way to reference an interrupt.  It doesn't
>>> need to be exposed to the PVH domU at all, but it's a way for the
>>> device model to identify which interrupt should be mapped to which
>>> domain.
>>
>> Since pIRQ-s are per-domain, _that_ kind of association won't be
>> helped. But yes, as per above it could serve as an abstract handle-
>> like value.
> 
> I would be fine with doing the interrupt bindings based on IRQs
> instead of pIRQs, but I'm afraid that would require more changes to
> hypercalls and Xen internals.
> 
> At some point I need to work on a new interface to do passthrough, so
> that we can remove the usage of domctls from QEMU.  That might be a
> good opportunity to switch from using pIRQs.

Thanks for your input, but I may be a bit behind you with my knowledge and can't fully understand the discussion.
How should I modify the patch to address this?
Should I add a new hypercall specifically for passthrough?
Or if it is to prevent the (un)map from being used for PVH guests, can I just add a new function to check if the subject domain is a PVH type? Like is_pvh_domain().

> 
> Thanks, Roger.
Roger Pau Monné Aug. 2, 2024, 8:11 a.m. UTC | #20
On Fri, Aug 02, 2024 at 02:37:24AM +0000, Chen, Jiqian wrote:
> On 2024/7/31 21:03, Roger Pau Monné wrote:
> > On Wed, Jul 31, 2024 at 01:39:40PM +0200, Jan Beulich wrote:
> >> On 31.07.2024 13:29, Roger Pau Monné wrote:
> >>> On Wed, Jul 31, 2024 at 11:55:35AM +0200, Jan Beulich wrote:
> >>>> On 31.07.2024 11:37, Roger Pau Monné wrote:
> >>>>> On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
> >>>>>> On 31.07.2024 10:51, Roger Pau Monné wrote:
> >>>>>>> I agree with (a), but I don't think enabling PVH dom0 usage of the
> >>>>>>> hypercalls should be gated on this.  As said a PV dom0 is already
> >>>>>>> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
> >>>>>>> domU.
> >>>>>>
> >>>>>> Okay, I can accept that as an intermediate position. We ought to deny
> >>>>>> such requests at some point though for PVH domains, the latest in the
> >>>>>> course of making vPCI work there.
> >>>>>
> >>>>> Hm, once physdev_map_pirq() works as intended against PVH domains, I
> >>>>> don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
> >>>>> against such domains.
> >>>>
> >>>> Well. If it can be made work as intended, then I certainly agree. However,
> >>>> without even the concept of pIRQ in PVH I'm having a hard time seeing how
> >>>> it can be made work. Iirc you were advocating for us to not introduce pIRQ
> >>>> into PVH.
> >>>
> >>> From what I'm seeing here the intention is to expose
> >>> PHYSDEVOP_{,un}map_pirq to PVH dom0, so there must be some notion of
> >>> pIRQs or akin in a PVH dom0?  Even if only for passthrough needs.
> >>
> >> Only in so far as it is an abstract, handle-like value pertaining solely
> >> to the target domain.
> >>
> >>>> Maybe you're thinking of re-using the sub-ops, requiring PVH domains to
> >>>> pass in GSIs?
> >>>
> >>> I think that was one of my proposals, to either introduce a new
> >>> hypercall that takes a GSI, or to modify the PHYSDEVOP_{,un}map_pirq
> >>> in an ABI compatible way so that semantically the field could be a GSI
> >>> rather than a pIRQ.  We however would also need a way to reference an
> >>> MSI entry.
> >>
> >> Of course.
> >>
> >>> My main concern is not with pIRQs themselves; pIRQs are just an
> >>> abstract way to reference interrupts, my concern and what I wanted to
> >>> avoid on PVH is being able to route pIRQs over event channels.  IOW:
> >>> have interrupts from physical devices delivered over event channels.
> >>
> >> Oh, I might have slightly misunderstood your intentions then.
> > 
> > My intention would be to not even use pIRQs at all, in order to avoid
> > the temptation of the guest itself managing interrupts using
> > hypercalls, hence I would have preferred that abstract interface to be
> > something else.
> > 
> > Maybe we could even expose the Xen IRQ space directly, and just use
> > that as interrupt handles, but since I'm not the one doing the work
> > I'm not sure it's fair to ask for something that would require more
> > changes internally to Xen.
> > 
> >>>> I think I suggested something along these lines also to
> >>>> Jiqian, yet with the now intended exposure to !has_pirq() domains I'm
> >>>> not sure this could be made work reliably.
> >>>
> >>> I'm afraid I've been lagging behind on reviewing those series.
> >>>
> >>>> Which reminds me of another question I had: What meaning does the pirq
> >>>> field have right now, if Dom0 would issue the request against a PVH DomU?
> >>>> What meaning will it have for a !has_pirq() HVM domain?
> >>>
> >>> The pirq field could be a way to reference an interrupt.  It doesn't
> >>> need to be exposed to the PVH domU at all, but it's a way for the
> >>> device model to identify which interrupt should be mapped to which
> >>> domain.
> >>
> >> Since pIRQ-s are per-domain, _that_ kind of association won't be
> >> helped. But yes, as per above it could serve as an abstract handle-
> >> like value.
> > 
> > I would be fine with doing the interrupt bindings based on IRQs
> > instead of pIRQs, but I'm afraid that would require more changes to
> > hypercalls and Xen internals.
> > 
> > At some point I need to work on a new interface to do passthrough, so
> > that we can remove the usage of domctls from QEMU.  That might be a
> > good opportunity to switch from using pIRQs.
> 
> Thanks for your input, but I may be a bit behind you with my knowledge and can't fully understand the discussion.
> How should I modify the patch to address this?
> Should I add a new hypercall specifically for passthrough?
> Or if it is to prevent the (un)map from being used for PVH guests, can I just add a new function to check if the subject domain is a PVH type? Like is_pvh_domain().

I think that would be part of a new interface, as said before I don't
think it would be fair to force you to do all this work.  I won't
oppose the approach of attempting to re-use the existing interfaces
as much as possible.

I think this patch needs to be adjusted to drop the change to
xen/arch/x86/physdev.c, as just allowing PHYSDEVOP_{,un}map_pirq
without any change to do_physdev_op() should result in the correct
behavior.

Thanks, Roger.
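
To illustrate the direction suggested here: dropping both
xen/arch/x86/physdev.c hunks would leave only the hvm/hypercall.c change
as the whole patch. Below is a minimal sketch of what the resulting
switch in hvm_physdev_op() might then look like; it is an illustration of
the suggested adjustment, not the author's actual next revision.

    switch ( cmd )
    {
        /*
         * Only permitted for management of other domains; per the review
         * discussion, no extra subject-domain restriction is added in
         * do_physdev_op().
         */
    case PHYSDEVOP_map_pirq:
    case PHYSDEVOP_unmap_pirq:
        break;

    case PHYSDEVOP_eoi:
    case PHYSDEVOP_irq_status_query:
    case PHYSDEVOP_get_free_pirq:
        /* ... remaining cases unchanged ... */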
Chen, Jiqian Aug. 2, 2024, 8:17 a.m. UTC | #21
On 2024/8/2 16:11, Roger Pau Monné wrote:
> On Fri, Aug 02, 2024 at 02:37:24AM +0000, Chen, Jiqian wrote:
>> On 2024/7/31 21:03, Roger Pau Monné wrote:
>>> On Wed, Jul 31, 2024 at 01:39:40PM +0200, Jan Beulich wrote:
>>>> On 31.07.2024 13:29, Roger Pau Monné wrote:
>>>>> On Wed, Jul 31, 2024 at 11:55:35AM +0200, Jan Beulich wrote:
>>>>>> On 31.07.2024 11:37, Roger Pau Monné wrote:
>>>>>>> On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
>>>>>>>> On 31.07.2024 10:51, Roger Pau Monné wrote:
>>>>>>>>> I agree with (a), but I don't think enabling PVH dom0 usage of the
>>>>>>>>> hypercalls should be gated on this.  As said a PV dom0 is already
>>>>>>>>> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
>>>>>>>>> domU.
>>>>>>>>
>>>>>>>> Okay, I can accept that as an intermediate position. We ought to deny
>>>>>>>> such requests at some point though for PVH domains, the latest in the
>>>>>>>> course of making vPCI work there.
>>>>>>>
>>>>>>> Hm, once physdev_map_pirq() works as intended against PVH domains, I
>>>>>>> don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
>>>>>>> against such domains.
>>>>>>
>>>>>> Well. If it can be made work as intended, then I certainly agree. However,
>>>>>> without even the concept of pIRQ in PVH I'm having a hard time seeing how
>>>>>> it can be made work. Iirc you were advocating for us to not introduce pIRQ
>>>>>> into PVH.
>>>>>
>>>>> From what I'm seeing here the intention is to expose
>>>>> PHYSDEVOP_{,un}map_pirq to PVH dom0, so there must be some notion of
>>>>> pIRQs or akin in a PVH dom0?  Even if only for passthrough needs.
>>>>
>>>> Only in so far as it is an abstract, handle-like value pertaining solely
>>>> to the target domain.
>>>>
>>>>>> Maybe you're thinking of re-using the sub-ops, requiring PVH domains to
>>>>>> pass in GSIs?
>>>>>
> >>>>> I think that was one of my proposals, to either introduce a new
>>>>> hypercall that takes a GSI, or to modify the PHYSDEVOP_{,un}map_pirq
>>>>> in an ABI compatible way so that semantically the field could be a GSI
>>>>> rather than a pIRQ.  We however would also need a way to reference an
>>>>> MSI entry.
>>>>
>>>> Of course.
>>>>
> >>>>> My main concern is not with pIRQs themselves; pIRQs are just an
>>>>> abstract way to reference interrupts, my concern and what I wanted to
>>>>> avoid on PVH is being able to route pIRQs over event channels.  IOW:
>>>>> have interrupts from physical devices delivered over event channels.
>>>>
>>>> Oh, I might have slightly misunderstood your intentions then.
>>>
>>> My intention would be to not even use pIRQs at all, in order to avoid
>>> the temptation of the guest itself managing interrupts using
>>> hypercalls, hence I would have preferred that abstract interface to be
>>> something else.
>>>
>>> Maybe we could even expose the Xen IRQ space directly, and just use
>>> that as interrupt handles, but since I'm not the one doing the work
>>> I'm not sure it's fair to ask for something that would require more
>>> changes internally to Xen.
>>>
>>>>>> I think I suggested something along these lines also to
>>>>>> Jiqian, yet with the now intended exposure to !has_pirq() domains I'm
>>>>>> not sure this could be made work reliably.
>>>>>
> >>>>> I'm afraid I've been lagging behind on reviewing those series.
>>>>>
>>>>>> Which reminds me of another question I had: What meaning does the pirq
>>>>>> field have right now, if Dom0 would issue the request against a PVH DomU?
>>>>>> What meaning will it have for a !has_pirq() HVM domain?
>>>>>
>>>>> The pirq field could be a way to reference an interrupt.  It doesn't
>>>>> need to be exposed to the PVH domU at all, but it's a way for the
>>>>> device model to identify which interrupt should be mapped to which
>>>>> domain.
>>>>
>>>> Since pIRQ-s are per-domain, _that_ kind of association won't be
>>>> helped. But yes, as per above it could serve as an abstract handle-
>>>> like value.
>>>
>>> I would be fine with doing the interrupt bindings based on IRQs
>>> instead of pIRQs, but I'm afraid that would require more changes to
>>> hypercalls and Xen internals.
>>>
>>> At some point I need to work on a new interface to do passthrough, so
>>> that we can remove the usage of domctls from QEMU.  That might be a
>>> good opportunity to switch from using pIRQs.
>>
>> Thanks for your input, but I may be a bit behind you with my knowledge and can't fully understand the discussion.
> >> How should I modify the patch to address this?
>> Should I add a new hypercall specifically for passthrough?
>> Or if it is to prevent the (un)map from being used for PVH guests, can I just add a new function to check if the subject domain is a PVH type? Like is_pvh_domain().
> 
> I think that would be part of a new interface, as said before I don't
> think it would be fair to force you to do all this work.  I won't
> oppose the approach of attempting to re-use the existing interfaces
> as much as possible.
Thanks.

> 
> I think this patch needs to be adjusted to drop the change to
> xen/arch/x86/physdev.c, as just allowing PHYSDEVOP_{,un}map_pirq
> without any change to do_physdev_op() should result in the correct
> behavior.
Do you mean that I don't need to add any further restrictions in do_physdev_op(), just simply allow PHYSDEVOP_{,un}map_pirq in hvm_physdev_op() ?

> 
> Thanks, Roger.
Roger Pau Monné Aug. 2, 2024, 8:35 a.m. UTC | #22
On Fri, Aug 02, 2024 at 08:17:15AM +0000, Chen, Jiqian wrote:
> On 2024/8/2 16:11, Roger Pau Monné wrote:
> > On Fri, Aug 02, 2024 at 02:37:24AM +0000, Chen, Jiqian wrote:
> >> On 2024/7/31 21:03, Roger Pau Monné wrote:
> >>> On Wed, Jul 31, 2024 at 01:39:40PM +0200, Jan Beulich wrote:
> >>>> On 31.07.2024 13:29, Roger Pau Monné wrote:
> >>>>> On Wed, Jul 31, 2024 at 11:55:35AM +0200, Jan Beulich wrote:
> >>>>>> On 31.07.2024 11:37, Roger Pau Monné wrote:
> >>>>>>> On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
> >>>>>>>> On 31.07.2024 10:51, Roger Pau Monné wrote:
> >>>>>>>>> I agree with (a), but I don't think enabling PVH dom0 usage of the
> >>>>>>>>> hypercalls should be gated on this.  As said a PV dom0 is already
> >>>>>>>>> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
> >>>>>>>>> domU.
> >>>>>>>>
> >>>>>>>> Okay, I can accept that as an intermediate position. We ought to deny
> >>>>>>>> such requests at some point though for PVH domains, the latest in the
> >>>>>>>> course of making vPCI work there.
> >>>>>>>
> >>>>>>> Hm, once physdev_map_pirq() works as intended against PVH domains, I
> >>>>>>> don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
> >>>>>>> against such domains.
> >>>>>>
> >>>>>> Well. If it can be made work as intended, then I certainly agree. However,
> >>>>>> without even the concept of pIRQ in PVH I'm having a hard time seeing how
> >>>>>> it can be made work. Iirc you were advocating for us to not introduce pIRQ
> >>>>>> into PVH.
> >>>>>
> >>>>> From what I'm seeing here the intention is to expose
> >>>>> PHYSDEVOP_{,un}map_pirq to PVH dom0, so there must be some notion of
> >>>>> pIRQs or akin in a PVH dom0?  Even if only for passthrough needs.
> >>>>
> >>>> Only in so far as it is an abstract, handle-like value pertaining solely
> >>>> to the target domain.
> >>>>
> >>>>>> Maybe you're thinking of re-using the sub-ops, requiring PVH domains to
> >>>>>> pass in GSIs?
> >>>>>
> >>>>> I think that was one of my proposals, to either introduce a new
> >>>>> hypercall that takes a GSI, or to modify the PHYSDEVOP_{,un}map_pirq
> >>>>> in an ABI compatible way so that semantically the field could be a GSI
> >>>>> rather than a pIRQ.  We however would also need a way to reference an
> >>>>> MSI entry.
> >>>>
> >>>> Of course.
> >>>>
> >>>>> My main concern is not with pIRQs themselves; pIRQs are just an
> >>>>> abstract way to reference interrupts, my concern and what I wanted to
> >>>>> avoid on PVH is being able to route pIRQs over event channels.  IOW:
> >>>>> have interrupts from physical devices delivered over event channels.
> >>>>
> >>>> Oh, I might have slightly misunderstood your intentions then.
> >>>
> >>> My intention would be to not even use pIRQs at all, in order to avoid
> >>> the temptation of the guest itself managing interrupts using
> >>> hypercalls, hence I would have preferred that abstract interface to be
> >>> something else.
> >>>
> >>> Maybe we could even expose the Xen IRQ space directly, and just use
> >>> that as interrupt handles, but since I'm not the one doing the work
> >>> I'm not sure it's fair to ask for something that would require more
> >>> changes internally to Xen.
> >>>
> >>>>>> I think I suggested something along these lines also to
> >>>>>> Jiqian, yet with the now intended exposure to !has_pirq() domains I'm
> >>>>>> not sure this could be made work reliably.
> >>>>>
> >>>>> I'm afraid I've been lagging behind on reviewing those series.
> >>>>>
> >>>>>> Which reminds me of another question I had: What meaning does the pirq
> >>>>>> field have right now, if Dom0 would issue the request against a PVH DomU?
> >>>>>> What meaning will it have for a !has_pirq() HVM domain?
> >>>>>
> >>>>> The pirq field could be a way to reference an interrupt.  It doesn't
> >>>>> need to be exposed to the PVH domU at all, but it's a way for the
> >>>>> device model to identify which interrupt should be mapped to which
> >>>>> domain.
> >>>>
> >>>> Since pIRQ-s are per-domain, _that_ kind of association won't be
> >>>> helped. But yes, as per above it could serve as an abstract handle-
> >>>> like value.
> >>>
> >>> I would be fine with doing the interrupt bindings based on IRQs
> >>> instead of pIRQs, but I'm afraid that would require more changes to
> >>> hypercalls and Xen internals.
> >>>
> >>> At some point I need to work on a new interface to do passthrough, so
> >>> that we can remove the usage of domctls from QEMU.  That might be a
> >>> good opportunity to switch from using pIRQs.
> >>
> >> Thanks for your input, but I may be a bit behind you with my knowledge and can't fully understand the discussion.
> >> How should I modify the patch to address this?
> >> Should I add a new hypercall specifically for passthrough?
> >> Or if it is to prevent the (un)map from being used for PVH guests, can I just add a new function to check if the subject domain is a PVH type? Like is_pvh_domain().
> > 
> > I think that would be part of a new interface, as said before I don't
> > think it would be fair to force you to do all this work.  I won't
> > oppose the approach of attempting to re-use the existing interfaces
> > as much as possible.
> Thanks.
> 
> > 
> > I think this patch needs to be adjusted to drop the change to
> > xen/arch/x86/physdev.c, as just allowing PHYSDEVOP_{,un}map_pirq
> > without any change to do_physdev_op() should result in the correct
> > behavior.
> Do you mean that I don't need to add any further restrictions in do_physdev_op(), just simply allow PHYSDEVOP_{,un}map_pirq in hvm_physdev_op() ?

That's my understanding, yes, no further restrictions should be added.

Thanks, Roger.
Chen, Jiqian Aug. 2, 2024, 8:40 a.m. UTC | #23
Hi Jan,

On 2024/8/2 16:35, Roger Pau Monné wrote:
> On Fri, Aug 02, 2024 at 08:17:15AM +0000, Chen, Jiqian wrote:
>> On 2024/8/2 16:11, Roger Pau Monné wrote:
>>> I think this patch needs to be adjusted to drop the change to
>>> xen/arch/x86/physdev.c, as just allowing PHYSDEVOP_{,un}map_pirq
>>> without any change to do_physdev_op() should result in the correct
>>> behavior.
>> Do you mean that I don't need to add any further restrictions in do_physdev_op(), just simply allow PHYSDEVOP_{,un}map_pirq in hvm_physdev_op() ?
> 
> That's my understanding, yes, no further restrictions should be added.

Are you okay with this?

> 
> Thanks, Roger.
Jan Beulich Aug. 2, 2024, 9:17 a.m. UTC | #24
On 02.08.2024 10:40, Chen, Jiqian wrote:
> On 2024/8/2 16:35, Roger Pau Monné wrote:
>> On Fri, Aug 02, 2024 at 08:17:15AM +0000, Chen, Jiqian wrote:
>>> On 2024/8/2 16:11, Roger Pau Monné wrote:
>>>> I think this patch needs to be adjusted to drop the change to
>>>> xen/arch/x86/physdev.c, as just allowing PHYSDEVOP_{,un}map_pirq
>>>> without any change to do_physdev_op() should result in the correct
>>>> behavior.
>>> Do you mean that I don't need to add any further restrictions in do_physdev_op(), just simply allow PHYSDEVOP_{,un}map_pirq in hvm_physdev_op() ?
>>
>> That's my understanding, yes, no further restrictions should be added.
> 
> Are you okey with this?

I think I already indicated so - yes, for the time being.

Jan
Jan Beulich Aug. 2, 2024, 9:37 a.m. UTC | #25
On 02.08.2024 10:11, Roger Pau Monné wrote:
> On Fri, Aug 02, 2024 at 02:37:24AM +0000, Chen, Jiqian wrote:
>> On 2024/7/31 21:03, Roger Pau Monné wrote:
>>> On Wed, Jul 31, 2024 at 01:39:40PM +0200, Jan Beulich wrote:
>>>> On 31.07.2024 13:29, Roger Pau Monné wrote:
>>>>> On Wed, Jul 31, 2024 at 11:55:35AM +0200, Jan Beulich wrote:
>>>>>> On 31.07.2024 11:37, Roger Pau Monné wrote:
>>>>>>> On Wed, Jul 31, 2024 at 11:02:01AM +0200, Jan Beulich wrote:
>>>>>>>> On 31.07.2024 10:51, Roger Pau Monné wrote:
>>>>>>>>> I agree with (a), but I don't think enabling PVH dom0 usage of the
>>>>>>>>> hypercalls should be gated on this.  As said a PV dom0 is already
>>>>>>>>> capable of issuing PHYSDEVOP_{,un}map_pirq operations against a PVH
>>>>>>>>> domU.
>>>>>>>>
>>>>>>>> Okay, I can accept that as an intermediate position. We ought to deny
>>>>>>>> such requests at some point though for PVH domains, the latest in the
>>>>>>>> course of making vPCI work there.
>>>>>>>
>>>>>>> Hm, once physdev_map_pirq() works as intended against PVH domains, I
>>>>>>> don't see why we would prevent the usage of PHYSDEVOP_{,un}map_pirq
>>>>>>> against such domains.
>>>>>>
>>>>>> Well. If it can be made work as intended, then I certainly agree. However,
>>>>>> without even the concept of pIRQ in PVH I'm having a hard time seeing how
>>>>>> it can be made work. Iirc you were advocating for us to not introduce pIRQ
>>>>>> into PVH.
>>>>>
>>>>> From what I'm seeing here the intention is to expose
>>>>> PHYSDEVOP_{,un}map_pirq to PVH dom0, so there must be some notion of
>>>>> pIRQs or akin in a PVH dom0?  Even if only for passthrough needs.
>>>>
>>>> Only in so far as it is an abstract, handle-like value pertaining solely
>>>> to the target domain.
>>>>
>>>>>> Maybe you're thinking of re-using the sub-ops, requiring PVH domains to
>>>>>> pass in GSIs?
>>>>>
>>>>> I think that was one of my proposals, to either introduce a new
>>>>> hypercall that takes a GSI, or to modify the PHYSDEVOP_{,un}map_pirq
>>>>> in an ABI compatible way so that semantically the field could be a GSI
>>>>> rather than a pIRQ.  We however would also need a way to reference an
>>>>> MSI entry.
>>>>
>>>> Of course.
>>>>
>>>>> My main concern is not with pIRQs themselves; pIRQs are just an
>>>>> abstract way to reference interrupts, my concern and what I wanted to
>>>>> avoid on PVH is being able to route pIRQs over event channels.  IOW:
>>>>> have interrupts from physical devices delivered over event channels.
>>>>
>>>> Oh, I might have slightly misunderstood your intentions then.
>>>
>>> My intention would be to not even use pIRQs at all, in order to avoid
>>> the temptation of the guest itself managing interrupts using
>>> hypercalls, hence I would have preferred that abstract interface to be
>>> something else.
>>>
>>> Maybe we could even expose the Xen IRQ space directly, and just use
>>> that as interrupt handles, but since I'm not the one doing the work
>>> I'm not sure it's fair to ask for something that would require more
>>> changes internally to Xen.
>>>
>>>>>> I think I suggested something along these lines also to
>>>>>> Jiqian, yet with the now intended exposure to !has_pirq() domains I'm
>>>>>> not sure this could be made work reliably.
>>>>>
>>>>> I'm afraid I've been lagging behind on reviewing those series.
>>>>>
>>>>>> Which reminds me of another question I had: What meaning does the pirq
>>>>>> field have right now, if Dom0 would issue the request against a PVH DomU?
>>>>>> What meaning will it have for a !has_pirq() HVM domain?
>>>>>
>>>>> The pirq field could be a way to reference an interrupt.  It doesn't
>>>>> need to be exposed to the PVH domU at all, but it's a way for the
>>>>> device model to identify which interrupt should be mapped to which
>>>>> domain.
>>>>
>>>> Since pIRQ-s are per-domain, _that_ kind of association won't be
>>>> helped. But yes, as per above it could serve as an abstract handle-
>>>> like value.
>>>
>>> I would be fine with doing the interrupt bindings based on IRQs
>>> instead of pIRQs, but I'm afraid that would require more changes to
>>> hypercalls and Xen internals.
>>>
>>> At some point I need to work on a new interface to do passthrough, so
>>> that we can remove the usage of domctls from QEMU.  That might be a
>>> good opportunity to switch from using pIRQs.
>>
>> Thanks for your input, but I may be a bit behind you with my knowledge and can't fully understand the discussion.
>> How should I modify the patch to address this?
>> Should I add a new hypercall specifically for passthrough?
>> Or if it is to prevent the (un)map from being used for PVH guests, can I just add a new function to check if the subject domain is a PVH type? Like is_pvh_domain().
> 
> I think that would be part of a new interface, as said before I don't
> think it would be fair to force you to do all this work.  I won't
> oppose the approach of attempting to re-use the existing interfaces
> as much as possible.
> 
> I think this patch needs to be adjusted to drop the change to
> xen/arch/x86/physdev.c, as just allowing PHYSDEVOP_{,un}map_pirq
> without any change to do_physdev_op() should result in the correct
> behavior.

Plus perhaps adding a respective clarification to the description, as
to the functionality being exposed to a wider "audience" than is
(presently) necessary.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 0fab670a4871..03ada3c880bd 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -71,8 +71,14 @@  long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     switch ( cmd )
     {
+        /*
+         * Only permitted for management of other domains.
+         * Further restrictions are enforced in do_physdev_op().
+         */
     case PHYSDEVOP_map_pirq:
     case PHYSDEVOP_unmap_pirq:
+        break;
+
     case PHYSDEVOP_eoi:
     case PHYSDEVOP_irq_status_query:
     case PHYSDEVOP_get_free_pirq:
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index d6dd622952a9..9f30a8c63a06 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -323,7 +323,11 @@  ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( !d )
             break;
 
-        ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
+        /* Only mapping when the subject domain has a notion of PIRQ */
+        if ( !is_hvm_domain(d) || has_pirq(d) )
+            ret = physdev_map_pirq(d, map.type, &map.index, &map.pirq, &msi);
+        else
+            ret = -EOPNOTSUPP;
 
         rcu_unlock_domain(d);
 
@@ -346,7 +350,11 @@  ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( !d )
             break;
 
-        ret = physdev_unmap_pirq(d, unmap.pirq);
+        /* Only unmapping when the subject domain has a notion of PIRQ */
+        if ( !is_hvm_domain(d) || has_pirq(d) )
+            ret = physdev_unmap_pirq(d, unmap.pirq);
+        else
+            ret = -EOPNOTSUPP;
 
         rcu_unlock_domain(d);