[v2,3/3,FUTURE] xen/arm: enable vPCI for domUs

Message ID 20230707014754.51333-4-stewart.hildebrand@amd.com (mailing list archive)
State New, archived
Series: Kconfig for PCI passthrough on ARM

Commit Message

Stewart Hildebrand July 7, 2023, 1:47 a.m. UTC
Remove is_hardware_domain check in has_vpci, and select HAS_VPCI_GUEST_SUPPORT
in Kconfig.

[1] https://lists.xenproject.org/archives/html/xen-devel/2023-06/msg00863.html

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
As the tag implies, this patch is not intended to be merged (yet).

Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
code base. It will be used by the vPCI series [1]. This patch is intended to be
merged as part of the vPCI series.

v1->v2:
* new patch
---
 xen/arch/arm/Kconfig              | 1 +
 xen/arch/arm/include/asm/domain.h | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
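
For context, a minimal sketch of the kind of gating such an option provides
(illustrative only: vpci_available_for() is a hypothetical helper, not part of
this patch or the vPCI series):

/*
 * Hypothetical helper, for illustration only. It shows how
 * CONFIG_HAS_VPCI_GUEST_SUPPORT could widen vPCI beyond the hardware
 * domain, while CONFIG_HAS_VPCI alone keeps it dom0-only.
 */
static bool vpci_available_for(const struct domain *d)
{
    if ( !IS_ENABLED(CONFIG_HAS_VPCI) )
        return false;

#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
    return true;                  /* domUs may use vPCI too */
#else
    return is_hardware_domain(d); /* dom0-only without guest support */
#endif
}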

Comments

Julien Grall July 7, 2023, 9 a.m. UTC | #1
Hi,

On 07/07/2023 02:47, Stewart Hildebrand wrote:
> Remove is_hardware_domain check in has_vpci, and select HAS_VPCI_GUEST_SUPPORT
> in Kconfig.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2023-06/msg00863.html
> 
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> As the tag implies, this patch is not intended to be merged (yet).

Can this be included in the vPCI series or resent afterwards?

> 
> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
> code base. It will be used by the vPCI series [1]. This patch is intended to be
> merged as part of the vPCI series.
> 
> v1->v2:
> * new patch
> ---
>   xen/arch/arm/Kconfig              | 1 +
>   xen/arch/arm/include/asm/domain.h | 2 +-
>   2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 4e0cc421ad48..75dfa2f5a82d 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
>   	depends on ARM_64
>   	select HAS_PCI
>   	select HAS_VPCI
> +	select HAS_VPCI_GUEST_SUPPORT
>   	default n
>   	help
>   	  This option enables PCI device passthrough
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 1a13965a26b8..6e016b00bae1 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>   
>   #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>   
> -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
> +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })

As I mentioned in the previous patch, wouldn't this enable vPCI 
unconditionally for all domains? Shouldn't this instead be an 
optional feature selected by the toolstack?

Cheers,
Roger Pau Monné July 7, 2023, 10:06 a.m. UTC | #2
On Fri, Jul 07, 2023 at 10:00:51AM +0100, Julien Grall wrote:
> On 07/07/2023 02:47, Stewart Hildebrand wrote:
> > Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
> > code base. It will be used by the vPCI series [1]. This patch is intended to be
> > merged as part of the vPCI series.
> > 
> > v1->v2:
> > * new patch
> > ---
> >   xen/arch/arm/Kconfig              | 1 +
> >   xen/arch/arm/include/asm/domain.h | 2 +-
> >   2 files changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > index 4e0cc421ad48..75dfa2f5a82d 100644
> > --- a/xen/arch/arm/Kconfig
> > +++ b/xen/arch/arm/Kconfig
> > @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
> >   	depends on ARM_64
> >   	select HAS_PCI
> >   	select HAS_VPCI
> > +	select HAS_VPCI_GUEST_SUPPORT
> >   	default n
> >   	help
> >   	  This option enables PCI device passthrough
> > diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> > index 1a13965a26b8..6e016b00bae1 100644
> > --- a/xen/arch/arm/include/asm/domain.h
> > +++ b/xen/arch/arm/include/asm/domain.h
> > @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
> >   #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
> > -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
> > +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
> 
> As I mentioned in the previous patch, wouldn't this enable vPCI
> unconditionally for all the domain? Shouldn't this be instead an optional
> feature which would be selected by the toolstack?

I do think so; at least on x86 we signal whether vPCI should be
enabled for a domain using xen_arch_domainconfig at domain creation.

Ideally we would like to do this on a per-device basis for domUs, so
we should consider adding a new flag to xen_domctl_assign_device in
order to signal whether the assigned device should use vPCI.

Thanks, Roger.
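
For illustration, a sketch of the flag suggested above. struct
xen_domctl_assign_device and its flags field exist in
xen/include/public/domctl.h; XEN_DOMCTL_DEV_USE_VPCI is hypothetical at
this point and its value is illustrative only:

/* Existing interface (abridged), plus a hypothetical new flag. */
struct xen_domctl_assign_device {
    uint32_t dev;                      /* IN: XEN_DOMCTL_DEV_PCI, ... */
    uint32_t flags;
#define XEN_DOMCTL_DEV_RDM_RELAXED  1  /* existing flag */
#define XEN_DOMCTL_DEV_USE_VPCI     2  /* hypothetical: expose via vPCI */
    union {
        struct {
            uint32_t machine_sbdf;     /* physical segment:bus:dev.fn */
        } pci;
        /* ... */
    } u;
};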
Julien Grall July 7, 2023, 10:33 a.m. UTC | #3
Hi,

On 07/07/2023 11:06, Roger Pau Monné wrote:
> On Fri, Jul 07, 2023 at 10:00:51AM +0100, Julien Grall wrote:
>> On 07/07/2023 02:47, Stewart Hildebrand wrote:
>>> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
>>> code base. It will be used by the vPCI series [1]. This patch is intended to be
>>> merged as part of the vPCI series.
>>>
>>> v1->v2:
>>> * new patch
>>> ---
>>>    xen/arch/arm/Kconfig              | 1 +
>>>    xen/arch/arm/include/asm/domain.h | 2 +-
>>>    2 files changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index 4e0cc421ad48..75dfa2f5a82d 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
>>>    	depends on ARM_64
>>>    	select HAS_PCI
>>>    	select HAS_VPCI
>>> +	select HAS_VPCI_GUEST_SUPPORT
>>>    	default n
>>>    	help
>>>    	  This option enables PCI device passthrough
>>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>>> index 1a13965a26b8..6e016b00bae1 100644
>>> --- a/xen/arch/arm/include/asm/domain.h
>>> +++ b/xen/arch/arm/include/asm/domain.h
>>> @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>>>    #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>>> -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
>>> +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
>>
>> As I mentioned in the previous patch, wouldn't this enable vPCI
>> unconditionally for all the domain? Shouldn't this be instead an optional
>> feature which would be selected by the toolstack?
> 
> I do think so, at least on x86 we signal whether vPCI should be
> enabled for a domain using xen_arch_domainconfig at domain creation.
> 
> Ideally we would like to do this on a per-device basis for domUs, so
> we should consider adding a new flag to xen_domctl_assign_device in
> order to signal whether the assigned device should use vPCI.

I am a bit confused by this paragraph. If the device is not using 
vPCI, how will it be exposed to the domain? Are you planning to support 
both vPCI and PV PCI passthrough for the same domain?

Cheers,
Roger Pau Monné July 7, 2023, 10:47 a.m. UTC | #4
On Fri, Jul 07, 2023 at 11:33:14AM +0100, Julien Grall wrote:
> Hi,
> 
> On 07/07/2023 11:06, Roger Pau Monné wrote:
> > On Fri, Jul 07, 2023 at 10:00:51AM +0100, Julien Grall wrote:
> > > On 07/07/2023 02:47, Stewart Hildebrand wrote:
> > > > Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
> > > > code base. It will be used by the vPCI series [1]. This patch is intended to be
> > > > merged as part of the vPCI series.
> > > > 
> > > > v1->v2:
> > > > * new patch
> > > > ---
> > > >    xen/arch/arm/Kconfig              | 1 +
> > > >    xen/arch/arm/include/asm/domain.h | 2 +-
> > > >    2 files changed, 2 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > > > index 4e0cc421ad48..75dfa2f5a82d 100644
> > > > --- a/xen/arch/arm/Kconfig
> > > > +++ b/xen/arch/arm/Kconfig
> > > > @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
> > > >    	depends on ARM_64
> > > >    	select HAS_PCI
> > > >    	select HAS_VPCI
> > > > +	select HAS_VPCI_GUEST_SUPPORT
> > > >    	default n
> > > >    	help
> > > >    	  This option enables PCI device passthrough
> > > > diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> > > > index 1a13965a26b8..6e016b00bae1 100644
> > > > --- a/xen/arch/arm/include/asm/domain.h
> > > > +++ b/xen/arch/arm/include/asm/domain.h
> > > > @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
> > > >    #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
> > > > -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
> > > > +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
> > > 
> > > As I mentioned in the previous patch, wouldn't this enable vPCI
> > > unconditionally for all the domain? Shouldn't this be instead an optional
> > > feature which would be selected by the toolstack?
> > 
> > I do think so, at least on x86 we signal whether vPCI should be
> > enabled for a domain using xen_arch_domainconfig at domain creation.
> > 
> > Ideally we would like to do this on a per-device basis for domUs, so
> > we should consider adding a new flag to xen_domctl_assign_device in
> > order to signal whether the assigned device should use vPCI.
> 
> I am a bit confused with this paragraph. If the device is not using vPCI,
> how will it be exposed to the domain? Are you planning to support both vPCI
> and PV PCI passthrough for a same domain?

You could have an external device model handling it using the ioreq
interface, like we currently do passthrough for HVM guests.

Thanks, Roger.
Rahul Singh July 7, 2023, 11:04 a.m. UTC | #5
Hi Stewart,

> On 7 Jul 2023, at 2:47 am, Stewart Hildebrand <Stewart.Hildebrand@amd.com> wrote:
>
> Remove is_hardware_domain check in has_vpci, and select HAS_VPCI_GUEST_SUPPORT
> in Kconfig.
>
> [1] https://lists.xenproject.org/archives/html/xen-devel/2023-06/msg00863.html
>
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> As the tag implies, this patch is not intended to be merged (yet).
>
> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
> code base. It will be used by the vPCI series [1]. This patch is intended to be
> merged as part of the vPCI series.
>
> v1->v2:
> * new patch
> ---
> xen/arch/arm/Kconfig              | 1 +
> xen/arch/arm/include/asm/domain.h | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 4e0cc421ad48..75dfa2f5a82d 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
> depends on ARM_64
> select HAS_PCI
> select HAS_VPCI
> + select HAS_VPCI_GUEST_SUPPORT

I tested this series on top of the "SMMU handling for PCIe Passthrough on ARM" series on the N1SDP board
and observed an SMMUv3 fault.

Enable the Kconfig options PCI_PASSTHROUGH, ARM_SMMU_V3 and HAS_ITS, plus the "iommu=on" and
"pci_passthrough_enabled=on" command line parameters; after that, there is an SMMU fault
for the ITS doorbell register access from the PCI devices.

As there is no upstream support on ARM for vPCI MSI/MSI-X handling, the SMMU fault is observed.

The Linux kernel will write the ITS doorbell register address (the physical address, as the IOMMU is not
enabled in the kernel) into the PCI config space when setting up MSI-X interrupts, but there is no mapping
for it in the SMMU page tables, which is why the SMMU fault is observed. To fix this, we need to map the
ITS doorbell register in the SMMU page tables.

We can fix this by setting up the mapping for the ITS doorbell offset in the ITS code:

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 299b384250..8227a7a74b 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -682,6 +682,18 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
                                          BIT(size, UL), valid);
         if ( ret && valid )
             return ret;
+
+        if ( is_iommu_enabled(its->d) ) {
+            ret = map_mmio_regions(its->d, gaddr_to_gfn(its->doorbell_address),
+                           PFN_UP(ITS_DOORBELL_OFFSET),
+                           maddr_to_mfn(its->doorbell_address));
+            if ( ret < 0 )
+            {
+                printk(XENLOG_ERR "GICv3: Map ITS translation register d%d failed.\n",
+                        its->d->domain_id);
+                return ret;
+            }
+        }
     }
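
For reference, the helper used in the hunk above has this shape in
xen/include/xen/p2m-common.h; so the change maps the doorbell frame(s) 1:1
(guest frame == machine frame), matching a kernel that programs physical
doorbell addresses because, as noted above, the IOMMU is not enabled in the
kernel:

int map_mmio_regions(struct domain *d,
                     gfn_t start_gfn,
                     unsigned long nr,
                     mfn_t mfn);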

Also, as per Julien's request, I tried to set up the IOMMU for the PCI device without
"pci_passthrough_enabled=on" and without HAS_VPCI; everything works as expected
after applying the patches below.

To test, enable the Kconfig options HAS_PCI, ARM_SMMU_V3 and HAS_ITS and add the
below patches to make it work:

    • Set the mapping for the ITS doorbell offset in the ITS code when the IOMMU is enabled.
    • Revert the patch that added the support for pci_passthrough_on.
    • Allow MMIO mapping of the ECAM space to dom0 when vPCI is not enabled; as of now, MMIO
      mapping for ECAM is based on pci_passthrough_enabled. We need this patch if we want to
      avoid enabling HAS_VPCI.

Please find the attached patches in case you want to test at your end.



Regards,
Rahul

> default n
> help
>  This option enables PCI device passthrough
> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> index 1a13965a26b8..6e016b00bae1 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>
> #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>
> -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
> +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
>
> struct arch_vcpu_io {
>     struct instr_details dabt_instr; /* when the instruction is decoded */
> --
> 2.41.0
>
>
Julien Grall July 7, 2023, 11:16 a.m. UTC | #6
On 07/07/2023 11:47, Roger Pau Monné wrote:
> On Fri, Jul 07, 2023 at 11:33:14AM +0100, Julien Grall wrote:
>> Hi,
>>
>> On 07/07/2023 11:06, Roger Pau Monné wrote:
>>> On Fri, Jul 07, 2023 at 10:00:51AM +0100, Julien Grall wrote:
>>>> On 07/07/2023 02:47, Stewart Hildebrand wrote:
>>>>> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
>>>>> code base. It will be used by the vPCI series [1]. This patch is intended to be
>>>>> merged as part of the vPCI series.
>>>>>
>>>>> v1->v2:
>>>>> * new patch
>>>>> ---
>>>>>     xen/arch/arm/Kconfig              | 1 +
>>>>>     xen/arch/arm/include/asm/domain.h | 2 +-
>>>>>     2 files changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>>>> index 4e0cc421ad48..75dfa2f5a82d 100644
>>>>> --- a/xen/arch/arm/Kconfig
>>>>> +++ b/xen/arch/arm/Kconfig
>>>>> @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
>>>>>     	depends on ARM_64
>>>>>     	select HAS_PCI
>>>>>     	select HAS_VPCI
>>>>> +	select HAS_VPCI_GUEST_SUPPORT
>>>>>     	default n
>>>>>     	help
>>>>>     	  This option enables PCI device passthrough
>>>>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>>>>> index 1a13965a26b8..6e016b00bae1 100644
>>>>> --- a/xen/arch/arm/include/asm/domain.h
>>>>> +++ b/xen/arch/arm/include/asm/domain.h
>>>>> @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>>>>>     #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>>>>> -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
>>>>> +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
>>>>
>>>> As I mentioned in the previous patch, wouldn't this enable vPCI
>>>> unconditionally for all the domain? Shouldn't this be instead an optional
>>>> feature which would be selected by the toolstack?
>>>
>>> I do think so, at least on x86 we signal whether vPCI should be
>>> enabled for a domain using xen_arch_domainconfig at domain creation.
>>>
>>> Ideally we would like to do this on a per-device basis for domUs, so
>>> we should consider adding a new flag to xen_domctl_assign_device in
>>> order to signal whether the assigned device should use vPCI.
>>
>> I am a bit confused with this paragraph. If the device is not using vPCI,
>> how will it be exposed to the domain? Are you planning to support both vPCI
>> and PV PCI passthrough for a same domain?
> 
> You could have an external device model handling it using the ioreq
> interface, like we currently do passthrough for HVM guests.

IMHO, if one decides to use QEMU for emulating the host bridge, then 
there is limited point in also asking Xen to emulate the hostbridge for 
some other device. So what would be the use case where you would want 
this to be a per-device decision?

Cheers,
Roger Pau Monné July 7, 2023, 11:34 a.m. UTC | #7
On Fri, Jul 07, 2023 at 12:16:46PM +0100, Julien Grall wrote:
> 
> 
> On 07/07/2023 11:47, Roger Pau Monné wrote:
> > On Fri, Jul 07, 2023 at 11:33:14AM +0100, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 07/07/2023 11:06, Roger Pau Monné wrote:
> > > > On Fri, Jul 07, 2023 at 10:00:51AM +0100, Julien Grall wrote:
> > > > > On 07/07/2023 02:47, Stewart Hildebrand wrote:
> > > > > > Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
> > > > > > code base. It will be used by the vPCI series [1]. This patch is intended to be
> > > > > > merged as part of the vPCI series.
> > > > > > 
> > > > > > v1->v2:
> > > > > > * new patch
> > > > > > ---
> > > > > >     xen/arch/arm/Kconfig              | 1 +
> > > > > >     xen/arch/arm/include/asm/domain.h | 2 +-
> > > > > >     2 files changed, 2 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > > > > > index 4e0cc421ad48..75dfa2f5a82d 100644
> > > > > > --- a/xen/arch/arm/Kconfig
> > > > > > +++ b/xen/arch/arm/Kconfig
> > > > > > @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
> > > > > >     	depends on ARM_64
> > > > > >     	select HAS_PCI
> > > > > >     	select HAS_VPCI
> > > > > > +	select HAS_VPCI_GUEST_SUPPORT
> > > > > >     	default n
> > > > > >     	help
> > > > > >     	  This option enables PCI device passthrough
> > > > > > diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> > > > > > index 1a13965a26b8..6e016b00bae1 100644
> > > > > > --- a/xen/arch/arm/include/asm/domain.h
> > > > > > +++ b/xen/arch/arm/include/asm/domain.h
> > > > > > @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
> > > > > >     #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
> > > > > > -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
> > > > > > +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
> > > > > 
> > > > > As I mentioned in the previous patch, wouldn't this enable vPCI
> > > > > unconditionally for all the domain? Shouldn't this be instead an optional
> > > > > feature which would be selected by the toolstack?
> > > > 
> > > > I do think so, at least on x86 we signal whether vPCI should be
> > > > enabled for a domain using xen_arch_domainconfig at domain creation.
> > > > 
> > > > Ideally we would like to do this on a per-device basis for domUs, so
> > > > we should consider adding a new flag to xen_domctl_assign_device in
> > > > order to signal whether the assigned device should use vPCI.
> > > 
> > > I am a bit confused with this paragraph. If the device is not using vPCI,
> > > how will it be exposed to the domain? Are you planning to support both vPCI
> > > and PV PCI passthrough for a same domain?
> > 
> > You could have an external device model handling it using the ioreq
> > interface, like we currently do passthrough for HVM guests.
> 
> IMHO, if one decide to use QEMU for emulating the host bridge, then there is
> limited point to also ask Xen to emulate the hostbridge for some other
> device. So what would be the use case where you would want to be a
> per-device basis decision?

You could also emulate the bridge in Xen and then have QEMU and
vPCI handle accesses to the PCI config space for different devices.
The ioreq interface already allows registering for config space
accesses on a per SBDF basis.

XenServer currently has a use-case where generic PCI device
passthrough is handled by QEMU, while some GPUs are passed through
using a custom emulator.  So some domains effectively end up with a QEMU
instance and a custom emulator; I don't see why you couldn't
technically replace QEMU with vPCI in this scenario.

The PCI root complex might be emulated by QEMU, or ideally by Xen.
That shouldn't prevent other device models from handling accesses for
devices, as long as accesses to the ECAM region(s) are trapped and
decoded by Xen.  IOW: if we want bridges to be emulated by ioreq
servers we need to introduce a hypercall to register ECAM regions
with Xen so that it can decode accesses and forward them
appropriately.

Thanks, Roger.
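
To make the per-SBDF registration concrete, a minimal device-model sketch
using libxendevicemodel (error handling trimmed; a real device model keeps
the handle open for the server's lifetime and then services the ioreqs):

#include <stdbool.h>
#include <xendevicemodel.h>

/* Claim PCI config space accesses for one seg:bus:dev.fn slot. */
static int claim_pci_slot(domid_t domid, uint16_t seg, uint8_t bus,
                          uint8_t dev, uint8_t fn)
{
    xendevicemodel_handle *xdm = xendevicemodel_open(NULL, NULL, 0);
    ioservid_t id;
    int rc;

    if ( !xdm )
        return -1;

    /* Create an ioreq server; 0 == no buffered ioreq page. */
    rc = xendevicemodel_create_ioreq_server(xdm, domid, 0, &id);
    if ( !rc )
        /* Register for config space accesses to this SBDF only. */
        rc = xendevicemodel_map_pcidev_to_ioreq_server(xdm, domid, id,
                                                       seg, bus, dev, fn);
    if ( !rc )
        rc = xendevicemodel_set_ioreq_server_state(xdm, domid, id, true);

    xendevicemodel_close(xdm);
    return rc;
}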
Julien Grall July 7, 2023, 12:09 p.m. UTC | #8
Hi,

On 07/07/2023 12:34, Roger Pau Monné wrote:
> On Fri, Jul 07, 2023 at 12:16:46PM +0100, Julien Grall wrote:
>>
>>
>> On 07/07/2023 11:47, Roger Pau Monné wrote:
>>> On Fri, Jul 07, 2023 at 11:33:14AM +0100, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 07/07/2023 11:06, Roger Pau Monné wrote:
>>>>> On Fri, Jul 07, 2023 at 10:00:51AM +0100, Julien Grall wrote:
>>>>>> On 07/07/2023 02:47, Stewart Hildebrand wrote:
>>>>>>> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
>>>>>>> code base. It will be used by the vPCI series [1]. This patch is intended to be
>>>>>>> merged as part of the vPCI series.
>>>>>>>
>>>>>>> v1->v2:
>>>>>>> * new patch
>>>>>>> ---
>>>>>>>      xen/arch/arm/Kconfig              | 1 +
>>>>>>>      xen/arch/arm/include/asm/domain.h | 2 +-
>>>>>>>      2 files changed, 2 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>>>>>> index 4e0cc421ad48..75dfa2f5a82d 100644
>>>>>>> --- a/xen/arch/arm/Kconfig
>>>>>>> +++ b/xen/arch/arm/Kconfig
>>>>>>> @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
>>>>>>>      	depends on ARM_64
>>>>>>>      	select HAS_PCI
>>>>>>>      	select HAS_VPCI
>>>>>>> +	select HAS_VPCI_GUEST_SUPPORT
>>>>>>>      	default n
>>>>>>>      	help
>>>>>>>      	  This option enables PCI device passthrough
>>>>>>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>>>>>>> index 1a13965a26b8..6e016b00bae1 100644
>>>>>>> --- a/xen/arch/arm/include/asm/domain.h
>>>>>>> +++ b/xen/arch/arm/include/asm/domain.h
>>>>>>> @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>>>>>>>      #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>>>>>>> -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
>>>>>>> +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
>>>>>>
>>>>>> As I mentioned in the previous patch, wouldn't this enable vPCI
>>>>>> unconditionally for all the domain? Shouldn't this be instead an optional
>>>>>> feature which would be selected by the toolstack?
>>>>>
>>>>> I do think so, at least on x86 we signal whether vPCI should be
>>>>> enabled for a domain using xen_arch_domainconfig at domain creation.
>>>>>
>>>>> Ideally we would like to do this on a per-device basis for domUs, so
>>>>> we should consider adding a new flag to xen_domctl_assign_device in
>>>>> order to signal whether the assigned device should use vPCI.
>>>>
>>>> I am a bit confused with this paragraph. If the device is not using vPCI,
>>>> how will it be exposed to the domain? Are you planning to support both vPCI
>>>> and PV PCI passthrough for a same domain?
>>>
>>> You could have an external device model handling it using the ioreq
>>> interface, like we currently do passthrough for HVM guests.
>>
>> IMHO, if one decide to use QEMU for emulating the host bridge, then there is
>> limited point to also ask Xen to emulate the hostbridge for some other
>> device. So what would be the use case where you would want to be a
>> per-device basis decision?
> 
> You could also emulate the bridge in Xen and then have QEMU and
> vPCI handle accesses to the PCI config space for different devices.
> The ioreq interface already allows registering for config space
> accesses on a per SBDF basis.
> 
> XenServer currently has a use-case where generic PCI device
> passthrough is handled by QEMU, while some GPUs are passed through
> using a custom emulator.  So some domains effectively end with a QEMU
> instance and a custom emulator, I don't see why you couldn't
> technically replace QEMU with vPCI in this scenario.
> 
> The PCI root complex might be emulated by QEMU, or ideally by Xen.
> That shouldn't prevent other device models from handling accesses for
> devices, as long as accesses to the ECAM region(s) are trapped and
> decoded by Xen.  IOW: if we want bridges to be emulated by ioreq
> servers we need to introduce an hypercall to register ECAM regions
> with Xen so that it can decode accesses and forward them
> appropriately.

Thanks for the clarification. Going back to the original discussion. 
Even with this setup, I think we still need to tell at domain creation 
whether vPCI will be used (think PCI hotplug).

After that, the device assignment hypercall could have a way to say 
whether the device will be emulated by vPCI. But I don't think this is 
necessary to have from day one as the ABI will be not stable (this is a 
DOMCTL).


Cheers,
Roger Pau Monné July 7, 2023, 1:13 p.m. UTC | #9
On Fri, Jul 07, 2023 at 01:09:40PM +0100, Julien Grall wrote:
> Hi,
> 
> On 07/07/2023 12:34, Roger Pau Monné wrote:
> > On Fri, Jul 07, 2023 at 12:16:46PM +0100, Julien Grall wrote:
> > > 
> > > 
> > > On 07/07/2023 11:47, Roger Pau Monné wrote:
> > > > On Fri, Jul 07, 2023 at 11:33:14AM +0100, Julien Grall wrote:
> > > > > Hi,
> > > > > 
> > > > > On 07/07/2023 11:06, Roger Pau Monné wrote:
> > > > > > On Fri, Jul 07, 2023 at 10:00:51AM +0100, Julien Grall wrote:
> > > > > > > On 07/07/2023 02:47, Stewart Hildebrand wrote:
> > > > > > > > Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
> > > > > > > > code base. It will be used by the vPCI series [1]. This patch is intended to be
> > > > > > > > merged as part of the vPCI series.
> > > > > > > > 
> > > > > > > > v1->v2:
> > > > > > > > * new patch
> > > > > > > > ---
> > > > > > > >      xen/arch/arm/Kconfig              | 1 +
> > > > > > > >      xen/arch/arm/include/asm/domain.h | 2 +-
> > > > > > > >      2 files changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > 
> > > > > > > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > > > > > > > index 4e0cc421ad48..75dfa2f5a82d 100644
> > > > > > > > --- a/xen/arch/arm/Kconfig
> > > > > > > > +++ b/xen/arch/arm/Kconfig
> > > > > > > > @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
> > > > > > > >      	depends on ARM_64
> > > > > > > >      	select HAS_PCI
> > > > > > > >      	select HAS_VPCI
> > > > > > > > +	select HAS_VPCI_GUEST_SUPPORT
> > > > > > > >      	default n
> > > > > > > >      	help
> > > > > > > >      	  This option enables PCI device passthrough
> > > > > > > > diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> > > > > > > > index 1a13965a26b8..6e016b00bae1 100644
> > > > > > > > --- a/xen/arch/arm/include/asm/domain.h
> > > > > > > > +++ b/xen/arch/arm/include/asm/domain.h
> > > > > > > > @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
> > > > > > > >      #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
> > > > > > > > -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
> > > > > > > > +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
> > > > > > > 
> > > > > > > As I mentioned in the previous patch, wouldn't this enable vPCI
> > > > > > > unconditionally for all the domain? Shouldn't this be instead an optional
> > > > > > > feature which would be selected by the toolstack?
> > > > > > 
> > > > > > I do think so, at least on x86 we signal whether vPCI should be
> > > > > > enabled for a domain using xen_arch_domainconfig at domain creation.
> > > > > > 
> > > > > > Ideally we would like to do this on a per-device basis for domUs, so
> > > > > > we should consider adding a new flag to xen_domctl_assign_device in
> > > > > > order to signal whether the assigned device should use vPCI.
> > > > > 
> > > > > I am a bit confused with this paragraph. If the device is not using vPCI,
> > > > > how will it be exposed to the domain? Are you planning to support both vPCI
> > > > > and PV PCI passthrough for a same domain?
> > > > 
> > > > You could have an external device model handling it using the ioreq
> > > > interface, like we currently do passthrough for HVM guests.
> > > 
> > > IMHO, if one decide to use QEMU for emulating the host bridge, then there is
> > > limited point to also ask Xen to emulate the hostbridge for some other
> > > device. So what would be the use case where you would want to be a
> > > per-device basis decision?
> > 
> > You could also emulate the bridge in Xen and then have QEMU and
> > vPCI handle accesses to the PCI config space for different devices.
> > The ioreq interface already allows registering for config space
> > accesses on a per SBDF basis.
> > 
> > XenServer currently has a use-case where generic PCI device
> > passthrough is handled by QEMU, while some GPUs are passed through
> > using a custom emulator.  So some domains effectively end with a QEMU
> > instance and a custom emulator, I don't see why you couldn't
> > technically replace QEMU with vPCI in this scenario.
> > 
> > The PCI root complex might be emulated by QEMU, or ideally by Xen.
> > That shouldn't prevent other device models from handling accesses for
> > devices, as long as accesses to the ECAM region(s) are trapped and
> > decoded by Xen.  IOW: if we want bridges to be emulated by ioreq
> > servers we need to introduce an hypercall to register ECAM regions
> > with Xen so that it can decode accesses and forward them
> > appropriately.
> 
> Thanks for the clarification. Going back to the original discussion. Even
> with this setup, I think we still need to tell at domain creation whether
> vPCI will be used (think PCI hotplug).

Well, for PCI hotplug you will still need to execute a
XEN_DOMCTL_assign_device hypercall in order to assign the device, at
which point you could pass the vPCI flag.

What you likely want at domain create is whether the IOMMU should be
enabled or not, as we no longer allow late enabling of the IOMMU once
the domain has been created.

One question I have is whether Arm plans to allow exposing fully
emulated devices on the PCI config space, or that would be limited to
PCI device passthrough?

IOW: should an emulated PCI root complex be unconditionally exposed to
guests so that random ioreq servers can register for SBDF slots?

> After that, the device assignment hypercall could have a way to say whether
> the device will be emulated by vPCI. But I don't think this is necessary to
> have from day one as the ABI will be not stable (this is a DOMCTL).

Indeed, it's not a stable interface, but we might as well get
something sane if we have to plumb it through the tools.  Either if
it's a domain create flag or a device attach flag you will need some
plumbing to do at the toolstack level, at which point we might as well
use an interface that doesn't have arbitrary limits.

Thanks, Roger.
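
For reference, the domain-create knob mentioned above already exists:
XEN_DOMCTL_CDF_iommu in xen_domctl_createdomain.flags, which the hypervisor
checks via is_iommu_enabled(). A condensed sketch of both sides (abridged
from the real interfaces):

/* Toolstack side (abridged): opt in to the IOMMU at creation time. */
struct xen_domctl_createdomain config = {
    .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_iommu,
    /* ... max_vcpus, arch config, ... */
};

/* Hypervisor side (sketch): since late enabling is not allowed,
 * assignment paths simply check the creation-time flag. */
static int check_assignable(const struct domain *d)
{
    return is_iommu_enabled(d) ? 0 : -EOPNOTSUPP;
}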
Julien Grall July 7, 2023, 1:27 p.m. UTC | #10
Hi,

On 07/07/2023 14:13, Roger Pau Monné wrote:
> On Fri, Jul 07, 2023 at 01:09:40PM +0100, Julien Grall wrote:
>> Hi,
>>
>> On 07/07/2023 12:34, Roger Pau Monné wrote:
>>> On Fri, Jul 07, 2023 at 12:16:46PM +0100, Julien Grall wrote:
>>>>
>>>>
>>>> On 07/07/2023 11:47, Roger Pau Monné wrote:
>>>>> On Fri, Jul 07, 2023 at 11:33:14AM +0100, Julien Grall wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 07/07/2023 11:06, Roger Pau Monné wrote:
>>>>>>> On Fri, Jul 07, 2023 at 10:00:51AM +0100, Julien Grall wrote:
>>>>>>>> On 07/07/2023 02:47, Stewart Hildebrand wrote:
>>>>>>>>> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
>>>>>>>>> code base. It will be used by the vPCI series [1]. This patch is intended to be
>>>>>>>>> merged as part of the vPCI series.
>>>>>>>>>
>>>>>>>>> v1->v2:
>>>>>>>>> * new patch
>>>>>>>>> ---
>>>>>>>>>       xen/arch/arm/Kconfig              | 1 +
>>>>>>>>>       xen/arch/arm/include/asm/domain.h | 2 +-
>>>>>>>>>       2 files changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>>>>>>>> index 4e0cc421ad48..75dfa2f5a82d 100644
>>>>>>>>> --- a/xen/arch/arm/Kconfig
>>>>>>>>> +++ b/xen/arch/arm/Kconfig
>>>>>>>>> @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
>>>>>>>>>       	depends on ARM_64
>>>>>>>>>       	select HAS_PCI
>>>>>>>>>       	select HAS_VPCI
>>>>>>>>> +	select HAS_VPCI_GUEST_SUPPORT
>>>>>>>>>       	default n
>>>>>>>>>       	help
>>>>>>>>>       	  This option enables PCI device passthrough
>>>>>>>>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>>>>>>>>> index 1a13965a26b8..6e016b00bae1 100644
>>>>>>>>> --- a/xen/arch/arm/include/asm/domain.h
>>>>>>>>> +++ b/xen/arch/arm/include/asm/domain.h
>>>>>>>>> @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>>>>>>>>>       #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>>>>>>>>> -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
>>>>>>>>> +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
>>>>>>>>
>>>>>>>> As I mentioned in the previous patch, wouldn't this enable vPCI
>>>>>>>> unconditionally for all the domain? Shouldn't this be instead an optional
>>>>>>>> feature which would be selected by the toolstack?
>>>>>>>
>>>>>>> I do think so, at least on x86 we signal whether vPCI should be
>>>>>>> enabled for a domain using xen_arch_domainconfig at domain creation.
>>>>>>>
>>>>>>> Ideally we would like to do this on a per-device basis for domUs, so
>>>>>>> we should consider adding a new flag to xen_domctl_assign_device in
>>>>>>> order to signal whether the assigned device should use vPCI.
>>>>>>
>>>>>> I am a bit confused with this paragraph. If the device is not using vPCI,
>>>>>> how will it be exposed to the domain? Are you planning to support both vPCI
>>>>>> and PV PCI passthrough for a same domain?
>>>>>
>>>>> You could have an external device model handling it using the ioreq
>>>>> interface, like we currently do passthrough for HVM guests.
>>>>
>>>> IMHO, if one decide to use QEMU for emulating the host bridge, then there is
>>>> limited point to also ask Xen to emulate the hostbridge for some other
>>>> device. So what would be the use case where you would want to be a
>>>> per-device basis decision?
>>>
>>> You could also emulate the bridge in Xen and then have QEMU and
>>> vPCI handle accesses to the PCI config space for different devices.
>>> The ioreq interface already allows registering for config space
>>> accesses on a per SBDF basis.
>>>
>>> XenServer currently has a use-case where generic PCI device
>>> passthrough is handled by QEMU, while some GPUs are passed through
>>> using a custom emulator.  So some domains effectively end with a QEMU
>>> instance and a custom emulator, I don't see why you couldn't
>>> technically replace QEMU with vPCI in this scenario.
>>>
>>> The PCI root complex might be emulated by QEMU, or ideally by Xen.
>>> That shouldn't prevent other device models from handling accesses for
>>> devices, as long as accesses to the ECAM region(s) are trapped and
>>> decoded by Xen.  IOW: if we want bridges to be emulated by ioreq
>>> servers we need to introduce an hypercall to register ECAM regions
>>> with Xen so that it can decode accesses and forward them
>>> appropriately.
>>
>> Thanks for the clarification. Going back to the original discussion. Even
>> with this setup, I think we still need to tell at domain creation whether
>> vPCI will be used (think PCI hotplug).
> 
> Well, for PCI hotplug you will still need to execute a
> XEN_DOMCTL_assign_device hypercall in order to assign the device, at
> which point you could pass the vPCI flag.

I am probably missing something here. If you don't pass the vPCI flag at 
domain creation, wouldn't it mean that the hostbridge would not be created 
until later? Are you thinking of creating it unconditionally, or hotplugging 
it (if that's even possible)?

> 
> What you likely want at domain create is whether the IOMMU should be
> enabled or not, as we no longer allow late enabling of the IOMMU once
> the domain has been created.
> 
> One question I have is whether Arm plans to allow exposing fully
> emulated devices on the PCI config space, or that would be limited to
> PCI device passthrough?

In the longer term, I would expect to have a mix of physical and 
emulated devices (e.g. virtio).

> 
> IOW: should an emulated PCI root complex be unconditionally exposed to
> guests so that random ioreq servers can register for SBDF slots?

I would say no. The vPCI should only be added when the configuration 
requests it. This is to avoid exposing unnecessary emulation to a 
domain (not everyone will want to use a PCI hostbridge).

> 
>> After that, the device assignment hypercall could have a way to say whether
>> the device will be emulated by vPCI. But I don't think this is necessary to
>> have from day one as the ABI will be not stable (this is a DOMCTL).
> 
> Indeed, it's not a stable interface, but we might as well get
> something sane if we have to plumb it through the tools.  Either if
> it's a domain create flag or a device attach flag you will need some
> plumbing to do at the toolstack level, at which point we might as well
> use an interface that doesn't have arbitrary limits.

I think we need both flags. In your approach you seem to want to either 
have the hostbridge created unconditionally or hotplug it (if that's 
even possible).

However, I don't think we should have the vPCI unconditionally created 
and we should still allow the toolstack to say at domain creation that 
PCI will be used.

Cheers,
Roger Pau Monné July 7, 2023, 1:40 p.m. UTC | #11
On Fri, Jul 07, 2023 at 02:27:17PM +0100, Julien Grall wrote:
> Hi,
> 
> On 07/07/2023 14:13, Roger Pau Monné wrote:
> > On Fri, Jul 07, 2023 at 01:09:40PM +0100, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 07/07/2023 12:34, Roger Pau Monné wrote:
> > > > On Fri, Jul 07, 2023 at 12:16:46PM +0100, Julien Grall wrote:
> > > > > 
> > > > > 
> > > > > On 07/07/2023 11:47, Roger Pau Monné wrote:
> > > > > > On Fri, Jul 07, 2023 at 11:33:14AM +0100, Julien Grall wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On 07/07/2023 11:06, Roger Pau Monné wrote:
> > > > > > > > On Fri, Jul 07, 2023 at 10:00:51AM +0100, Julien Grall wrote:
> > > > > > > > > On 07/07/2023 02:47, Stewart Hildebrand wrote:
> > > > > > > > > > Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
> > > > > > > > > > code base. It will be used by the vPCI series [1]. This patch is intended to be
> > > > > > > > > > merged as part of the vPCI series.
> > > > > > > > > > 
> > > > > > > > > > v1->v2:
> > > > > > > > > > * new patch
> > > > > > > > > > ---
> > > > > > > > > >       xen/arch/arm/Kconfig              | 1 +
> > > > > > > > > >       xen/arch/arm/include/asm/domain.h | 2 +-
> > > > > > > > > >       2 files changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > > > > > > > > > index 4e0cc421ad48..75dfa2f5a82d 100644
> > > > > > > > > > --- a/xen/arch/arm/Kconfig
> > > > > > > > > > +++ b/xen/arch/arm/Kconfig
> > > > > > > > > > @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
> > > > > > > > > >       	depends on ARM_64
> > > > > > > > > >       	select HAS_PCI
> > > > > > > > > >       	select HAS_VPCI
> > > > > > > > > > +	select HAS_VPCI_GUEST_SUPPORT
> > > > > > > > > >       	default n
> > > > > > > > > >       	help
> > > > > > > > > >       	  This option enables PCI device passthrough
> > > > > > > > > > diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
> > > > > > > > > > index 1a13965a26b8..6e016b00bae1 100644
> > > > > > > > > > --- a/xen/arch/arm/include/asm/domain.h
> > > > > > > > > > +++ b/xen/arch/arm/include/asm/domain.h
> > > > > > > > > > @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
> > > > > > > > > >       #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
> > > > > > > > > > -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
> > > > > > > > > > +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
> > > > > > > > > 
> > > > > > > > > As I mentioned in the previous patch, wouldn't this enable vPCI
> > > > > > > > > unconditionally for all the domain? Shouldn't this be instead an optional
> > > > > > > > > feature which would be selected by the toolstack?
> > > > > > > > 
> > > > > > > > I do think so, at least on x86 we signal whether vPCI should be
> > > > > > > > enabled for a domain using xen_arch_domainconfig at domain creation.
> > > > > > > > 
> > > > > > > > Ideally we would like to do this on a per-device basis for domUs, so
> > > > > > > > we should consider adding a new flag to xen_domctl_assign_device in
> > > > > > > > order to signal whether the assigned device should use vPCI.
> > > > > > > 
> > > > > > > I am a bit confused with this paragraph. If the device is not using vPCI,
> > > > > > > how will it be exposed to the domain? Are you planning to support both vPCI
> > > > > > > and PV PCI passthrough for a same domain?
> > > > > > 
> > > > > > You could have an external device model handling it using the ioreq
> > > > > > interface, like we currently do passthrough for HVM guests.
> > > > > 
> > > > > IMHO, if one decide to use QEMU for emulating the host bridge, then there is
> > > > > limited point to also ask Xen to emulate the hostbridge for some other
> > > > > device. So what would be the use case where you would want to be a
> > > > > per-device basis decision?
> > > > 
> > > > You could also emulate the bridge in Xen and then have QEMU and
> > > > vPCI handle accesses to the PCI config space for different devices.
> > > > The ioreq interface already allows registering for config space
> > > > accesses on a per SBDF basis.
> > > > 
> > > > XenServer currently has a use-case where generic PCI device
> > > > passthrough is handled by QEMU, while some GPUs are passed through
> > > > using a custom emulator.  So some domains effectively end with a QEMU
> > > > instance and a custom emulator, I don't see why you couldn't
> > > > technically replace QEMU with vPCI in this scenario.
> > > > 
> > > > The PCI root complex might be emulated by QEMU, or ideally by Xen.
> > > > That shouldn't prevent other device models from handling accesses for
> > > > devices, as long as accesses to the ECAM region(s) are trapped and
> > > > decoded by Xen.  IOW: if we want bridges to be emulated by ioreq
> > > > servers we need to introduce an hypercall to register ECAM regions
> > > > with Xen so that it can decode accesses and forward them
> > > > appropriately.
> > > 
> > > Thanks for the clarification. Going back to the original discussion. Even
> > > with this setup, I think we still need to tell at domain creation whether
> > > vPCI will be used (think PCI hotplug).
> > 
> > Well, for PCI hotplug you will still need to execute a
> > XEN_DOMCTL_assign_device hypercall in order to assign the device, at
> > which point you could pass the vPCI flag.
> 
> I am probably missing something here. If you don't pass the vPCI flag at
> domain creation, wouldn't it mean that hostbridge would not be created until
> later? Are you thinking to make it unconditionally or hotplug it (even
> that's even possible)?

I think at domain creation, more than a vPCI flag, you want an 'emulate a
PCI bridge' flag.  Such a flag will also be needed if in the future you
want to support virtio-pci devices for example, and those have nothing
to do with vPCI.

> > 
> > What you likely want at domain create is whether the IOMMU should be
> > enabled or not, as we no longer allow late enabling of the IOMMU once
> > the domain has been created.
> > 
> > One question I have is whether Arm plans to allow exposing fully
> > emulated devices on the PCI config space, or that would be limited to
> > PCI device passthrough?
> 
> In the longer term, I would expect to have a mix of physical and emulated
> device (e.g. virtio).

That's what I would expect.

> > 
> > IOW: should an emulated PCI root complex be unconditionally exposed to
> > guests so that random ioreq servers can register for SBDF slots?
> 
> I would say no. The vPCI should only be added when the configuration
> requested it. This is to avoid exposing unnecessary emulation to a domain
> (not everyone will want to use a PCI hostbridge).

Right, then as replied above you might want a domain create flag to
signal whether to emulate a PCI bridge for the domain.

> > 
> > > After that, the device assignment hypercall could have a way to say whether
> > > the device will be emulated by vPCI. But I don't think this is necessary to
> > > have from day one as the ABI will be not stable (this is a DOMCTL).
> > 
> > Indeed, it's not a stable interface, but we might as well get
> > something sane if we have to plumb it through the tools.  Either if
> > it's a domain create flag or a device attach flag you will need some
> > plumbing to do at the toolstack level, at which point we might as well
> > use an interface that doesn't have arbitrary limits.
> 
> I think we need both flags. In your approach you seem to want to either have
> the hostbridge created unconditionally or hotplug it (if that's even
> possible).

You could in theory have hotpluggable MCFG (ECAM) regions in ACPI, but I'm
unsure how many OSes support that.  In any case, I don't think we should try
to hotplug PCI bridges.

I was thinking that for x86 PVH we might want to unconditionally
expose a PCI bridge, but it might be better to signal that from the
domain configuration and not make it mandatory.

> However, I don't think we should have the vPCI unconditionally created and
> we should still allow the toolstack to say at domain creation that PCI will
> be used.

Indeed.  I think a domain-create 'emulate a PCI bridge' flag and a vPCI
flag for XEN_DOMCTL_assign_device are required.

Thanks, Roger.
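
Summing up where the thread lands: one creation-time flag for the emulated
hostbridge plus a per-device flag at assign time. A sketch of the shape this
could take (the flag name and bit position are hypothetical here, following
the style of the existing XEN_DOMCTL_CDF_* constants):

/* Hypothetical domain-create flag: opt in to an emulated PCI
 * hostbridge/root complex for this domain. */
#define XEN_DOMCTL_CDF_vpci  (1U << 6)  /* bit position illustrative */

/* has_vpci() could then key off the creation-time option instead of
 * is_hardware_domain(), addressing the concern raised on this patch.
 * The per-device choice would use the XEN_DOMCTL_DEV_USE_VPCI flag
 * sketched earlier in the thread. */
#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && \
                       ((d)->options & XEN_DOMCTL_CDF_vpci); })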
Stewart Hildebrand July 21, 2023, 4:54 a.m. UTC | #12
On 7/7/23 07:04, Rahul Singh wrote:
> Hi Stewart,
> 
>> On 7 Jul 2023, at 2:47 am, Stewart Hildebrand <Stewart.Hildebrand@amd.com> wrote:
>>
>> Remove is_hardware_domain check in has_vpci, and select HAS_VPCI_GUEST_SUPPORT
>> in Kconfig.
>>
>> [1] https://lists.xenproject.org/archives/html/xen-devel/2023-06/msg00863.html
>>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> As the tag implies, this patch is not intended to be merged (yet).
>>
>> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
>> code base. It will be used by the vPCI series [1]. This patch is intended to be
>> merged as part of the vPCI series.
>>
>> v1->v2:
>> * new patch
>> ---
>> xen/arch/arm/Kconfig              | 1 +
>> xen/arch/arm/include/asm/domain.h | 2 +-
>> 2 files changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 4e0cc421ad48..75dfa2f5a82d 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
>> depends on ARM_64
>> select HAS_PCI
>> select HAS_VPCI
>> + select HAS_VPCI_GUEST_SUPPORT
> 
> I tested this series on top of "SMMU handling for PCIe Passthrough on ARM” series on the N1SDP board
> and observe the SMMUv3 fault.

Thanks for testing this. After a great deal of tinkering, I can reproduce the SMMU fault.

(XEN) smmu: /axi/smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0xf9030040, fsynr=0x12, cb=0

> Enable the Kconfig option PCI_PASSTHROUGH, ARM_SMMU_V3,HAS_ITS and "iommu=on”,
> "pci_passthrough_enabled=on" cmd line parameter and after that, there is an SMMU fault
> for the ITS doorbell register access from the PCI devices.
> 
> As there is no upstream support for ARM for vPCI MSI/MSI-X handling because of that SMMU fault is observed.
> 
> Linux Kernel will set the ITS doorbell register( physical address of doorbell register as IOMMU is not enabled in Kernel)
> in PCI config space to set up the MSI-X interrupts, but there is no mapping in SMMU page tables because of that SMMU
> fault is observed. To fix this we need to map the ITS doorbell register to SMMU page tables to avoid the fault.
> 
> We can fix this after setting the mapping for the ITS doorbell offset in the ITS code.
> 
> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
> index 299b384250..8227a7a74b 100644
> --- a/xen/arch/arm/vgic-v3-its.c
> +++ b/xen/arch/arm/vgic-v3-its.c
> @@ -682,6 +682,18 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
>                                           BIT(size, UL), valid);
>          if ( ret && valid )
>              return ret;
> +
> +        if ( is_iommu_enabled(its->d) ) {
> +            ret = map_mmio_regions(its->d, gaddr_to_gfn(its->doorbell_address),
> +                           PFN_UP(ITS_DOORBELL_OFFSET),
> +                           maddr_to_mfn(its->doorbell_address));
> +            if ( ret < 0 )
> +            {
> +                printk(XENLOG_ERR "GICv3: Map ITS translation register d%d failed.\n",
> +                        its->d->domain_id);
> +                return ret;
> +            }
> +        }
>      }

Thank you, this resolves the SMMU fault. If it's okay, I will include this patch in the next revision of the SMMU series (I see your Signed-off-by is already in the attachment).

> Also as per Julien's request, I tried to set up the IOMMU for the PCI device without
> "pci_passthroigh_enable=on" and without HAS_VPCI everything works as expected
> after applying below patches.
> 
> To test enable kconfig options HAS_PCI, ARM_SMMU_V3 and HAS_ITS and add below
> patches to make it work.
> 
>     • Set the mapping for the ITS doorbell offset in the ITS code when iommu is enabled.
>     • Reverted the patch that added the support for pci_passthrough_on.
>     • Allow MMIO mapping of ECAM space to dom0 when vPCI is not enabled, as of now MMIO
>       mapping for ECAM is based on pci_passthrough_enabled. We need this patch if we want to avoid
>      enabling HAS_VPCI
> 
> Please find the attached patches in case you want to test at your end.
> 
> 
> 
> Regards,
> Rahul
> 
>> default n
>> help
>>  This option enables PCI device passthrough
>> diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
>> index 1a13965a26b8..6e016b00bae1 100644
>> --- a/xen/arch/arm/include/asm/domain.h
>> +++ b/xen/arch/arm/include/asm/domain.h
>> @@ -298,7 +298,7 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>>
>> #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>>
>> -#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
>> +#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
>>
>> struct arch_vcpu_io {
>>     struct instr_details dabt_instr; /* when the instruction is decoded */
>> --
>> 2.41.0
>>
>>
>
Rahul Singh July 21, 2023, 8:41 a.m. UTC | #13
Hi Stewart,

> On 21 Jul 2023, at 5:54 am, Stewart Hildebrand <Stewart.Hildebrand@amd.com> wrote:
> 
> On 7/7/23 07:04, Rahul Singh wrote:
>> Hi Stewart,
>> 
>>> On 7 Jul 2023, at 2:47 am, Stewart Hildebrand <Stewart.Hildebrand@amd.com> wrote:
>>> 
>>> Remove is_hardware_domain check in has_vpci, and select HAS_VPCI_GUEST_SUPPORT
>>> in Kconfig.
>>> 
>>> [1] https://lists.xenproject.org/archives/html/xen-devel/2023-06/msg00863.html
>>> 
>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>> ---
>>> As the tag implies, this patch is not intended to be merged (yet).
>>> 
>>> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
>>> code base. It will be used by the vPCI series [1]. This patch is intended to be
>>> merged as part of the vPCI series.
>>> 
>>> v1->v2:
>>> * new patch
>>> ---
>>> xen/arch/arm/Kconfig              | 1 +
>>> xen/arch/arm/include/asm/domain.h | 2 +-
>>> 2 files changed, 2 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index 4e0cc421ad48..75dfa2f5a82d 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -195,6 +195,7 @@ config PCI_PASSTHROUGH
>>> depends on ARM_64
>>> select HAS_PCI
>>> select HAS_VPCI
>>> + select HAS_VPCI_GUEST_SUPPORT
>> 
>> I tested this series on top of "SMMU handling for PCIe Passthrough on ARM” series on the N1SDP board
>> and observe the SMMUv3 fault.
> 
> Thanks for testing this. After a great deal of tinkering, I can reproduce the SMMU fault.
> 
> (XEN) smmu: /axi/smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0xf9030040, fsynr=0x12, cb=0
> 
>> Enable the Kconfig option PCI_PASSTHROUGH, ARM_SMMU_V3,HAS_ITS and "iommu=on”,
>> "pci_passthrough_enabled=on" cmd line parameter and after that, there is an SMMU fault
>> for the ITS doorbell register access from the PCI devices.
>> 
>> As there is no upstream support for ARM for vPCI MSI/MSI-X handling because of that SMMU fault is observed.
>> 
>> Linux Kernel will set the ITS doorbell register( physical address of doorbell register as IOMMU is not enabled in Kernel)
>> in PCI config space to set up the MSI-X interrupts, but there is no mapping in SMMU page tables because of that SMMU
>> fault is observed. To fix this we need to map the ITS doorbell register to SMMU page tables to avoid the fault.
>> 
>> We can fix this after setting the mapping for the ITS doorbell offset in the ITS code.
>> 
>> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
>> index 299b384250..8227a7a74b 100644
>> --- a/xen/arch/arm/vgic-v3-its.c
>> +++ b/xen/arch/arm/vgic-v3-its.c
>> @@ -682,6 +682,18 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
>>                                          BIT(size, UL), valid);
>>         if ( ret && valid )
>>             return ret;
>> +
>> +        if ( is_iommu_enabled(its->d) ) {
>> +            ret = map_mmio_regions(its->d, gaddr_to_gfn(its->doorbell_address),
>> +                           PFN_UP(ITS_DOORBELL_OFFSET),
>> +                           maddr_to_mfn(its->doorbell_address));
>> +            if ( ret < 0 )
>> +            {
>> +                printk(XENLOG_ERR "GICv3: Map ITS translation register d%d failed.\n",
>> +                        its->d->domain_id);
>> +                return ret;
>> +            }
>> +        }
>>     }
> 
> Thank you, this resolves the SMMU fault. If it's okay, I will include this patch in the next revision of the SMMU series (I see your Signed-off-by is already in the attachment).

Yes, you can include this patch in your next version.
> 
>> Also as per Julien's request, I tried to set up the IOMMU for the PCI device without
>> "pci_passthroigh_enable=on" and without HAS_VPCI everything works as expected
>> after applying below patches.
>> 
>> To test enable kconfig options HAS_PCI, ARM_SMMU_V3 and HAS_ITS and add below
>> patches to make it work.
>> 
>>    • Set the mapping for the ITS doorbell offset in the ITS code when iommu is enabled.

Also, if we want to support adding PCI devices to the IOMMU without PCI passthrough
support (without HAS_VPCI and the "pci_passthrough_enabled=on" command line option),
as suggested by Julien, we also need the two patches below.

>>    • Reverted the patch that added the support for pci_passthrough_on.
>>    • Allow MMIO mapping of ECAM space to dom0 when vPCI is not enabled, as of now MMIO
>>      mapping for ECAM is based on pci_passthrough_enabled. We need this patch if we want to avoid
>>     enabling HAS_VPCI
>> 
>> Please find the attached patches in case you want to test at your end.
>> 

 
Regards,
Rahul
Stewart Hildebrand Oct. 9, 2023, 7:12 p.m. UTC | #14
On 7/7/23 05:00, Julien Grall wrote:
> Hi,
> 
> On 07/07/2023 02:47, Stewart Hildebrand wrote:
>> Remove is_hardware_domain check in has_vpci, and select HAS_VPCI_GUEST_SUPPORT
>> in Kconfig.
>>
>> [1] https://lists.xenproject.org/archives/html/xen-devel/2023-06/msg00863.html
>>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> As the tag implies, this patch is not intended to be merged (yet).
> 
> Can this be included in the vPCI series or resent afterwards?

Yes, I'll coordinate with Volodymyr. Since this has a dependency on "xen/arm: pci: introduce PCI_PASSTHROUGH Kconfig option", I'll continue to include it in this series until the prerequisites are committed.

>>
>> Note that CONFIG_HAS_VPCI_GUEST_SUPPORT is not currently used in the upstream
>> code base. It will be used by the vPCI series [1]. This patch is intended to be
>> merged as part of the vPCI series.
>>
>> v1->v2:
>> * new patch

Patch

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 4e0cc421ad48..75dfa2f5a82d 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -195,6 +195,7 @@  config PCI_PASSTHROUGH
 	depends on ARM_64
 	select HAS_PCI
 	select HAS_VPCI
+	select HAS_VPCI_GUEST_SUPPORT
 	default n
 	help
 	  This option enables PCI device passthrough
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 1a13965a26b8..6e016b00bae1 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -298,7 +298,7 @@  static inline void arch_vcpu_block(struct vcpu *v) {}
 
 #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
 
-#define has_vpci(d) ({ IS_ENABLED(CONFIG_HAS_VPCI) && is_hardware_domain(d); })
+#define has_vpci(d)    ({ (void)(d); IS_ENABLED(CONFIG_HAS_VPCI); })
 
 struct arch_vcpu_io {
     struct instr_details dabt_instr; /* when the instruction is decoded */