Message ID | 20240109215145.430207-12-stewart.hildebrand@amd.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | PCI devices passthrough on Arm, part 3 | expand |
On Tue, Jan 9, 2024 at 9:54 PM Stewart Hildebrand <stewart.hildebrand@amd.com> wrote: > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > index 37f5922f3206..b58a822847be 100644 > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -484,6 +484,14 @@ struct domain > * 2. pdev->vpci->lock > */ > rwlock_t pci_lock; > +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT > + /* > + * The bitmap which shows which device numbers are already used by the > + * virtual PCI bus topology and is used to assign a unique SBDF to the > + * next passed through virtual PCI device. > + */ > + DECLARE_BITMAP(vpci_dev_assigned_map, VPCI_MAX_VIRT_DEV); > +#endif > #endif Without digging through the whole series, how big do we expect this bitmap to be on typical systems? If it's only going to be a handful of bytes, keeping it around for all guests would be OK; but it's large, it would be better as a pointer, since it's unused on the vast majority of guests. -George
On 1/12/24 06:46, George Dunlap wrote: > On Tue, Jan 9, 2024 at 9:54 PM Stewart Hildebrand > <stewart.hildebrand@amd.com> wrote: >> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h >> index 37f5922f3206..b58a822847be 100644 >> --- a/xen/include/xen/sched.h >> +++ b/xen/include/xen/sched.h >> @@ -484,6 +484,14 @@ struct domain >> * 2. pdev->vpci->lock >> */ >> rwlock_t pci_lock; >> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT >> + /* >> + * The bitmap which shows which device numbers are already used by the >> + * virtual PCI bus topology and is used to assign a unique SBDF to the >> + * next passed through virtual PCI device. >> + */ >> + DECLARE_BITMAP(vpci_dev_assigned_map, VPCI_MAX_VIRT_DEV); >> +#endif >> #endif > > Without digging through the whole series, how big do we expect this > bitmap to be on typical systems? > > If it's only going to be a handful of bytes, keeping it around for all > guests would be OK; but it's large, it would be better as a pointer, > since it's unused on the vast majority of guests. Since the bitmap is an unsigned long type it will typically be 8 bytes, although only 4 bytes are actually used. VPCI_MAX_VIRT_DEV is currently fixed at 32, as we are only tracking D (not the whole SBDF) in the bitmap so far.
On Fri, Jan 12, 2024 at 1:50 PM Stewart Hildebrand <stewart.hildebrand@amd.com> wrote: > > On 1/12/24 06:46, George Dunlap wrote: > > On Tue, Jan 9, 2024 at 9:54 PM Stewart Hildebrand > > <stewart.hildebrand@amd.com> wrote: > >> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > >> index 37f5922f3206..b58a822847be 100644 > >> --- a/xen/include/xen/sched.h > >> +++ b/xen/include/xen/sched.h > >> @@ -484,6 +484,14 @@ struct domain > >> * 2. pdev->vpci->lock > >> */ > >> rwlock_t pci_lock; > >> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT > >> + /* > >> + * The bitmap which shows which device numbers are already used by the > >> + * virtual PCI bus topology and is used to assign a unique SBDF to the > >> + * next passed through virtual PCI device. > >> + */ > >> + DECLARE_BITMAP(vpci_dev_assigned_map, VPCI_MAX_VIRT_DEV); > >> +#endif > >> #endif > > > > Without digging through the whole series, how big do we expect this > > bitmap to be on typical systems? > > > > If it's only going to be a handful of bytes, keeping it around for all > > guests would be OK; but it's large, it would be better as a pointer, > > since it's unused on the vast majority of guests. > > Since the bitmap is an unsigned long type it will typically be 8 bytes, although only 4 bytes are actually used. VPCI_MAX_VIRT_DEV is currently fixed at 32, as we are only tracking D (not the whole SBDF) in the bitmap so far. OK, that's fine with me. (FYI I replied because I thought you needed my ack specifically for sched.h; looks like any of THE REST will do, however.) -George
On 09.01.2024 22:51, Stewart Hildebrand wrote: > --- a/xen/drivers/Kconfig > +++ b/xen/drivers/Kconfig > @@ -15,4 +15,8 @@ source "drivers/video/Kconfig" > config HAS_VPCI > bool > > +config HAS_VPCI_GUEST_SUPPORT > + bool > + depends on HAS_VPCI Wouldn't this better be "select", or even just "imply"? > --- a/xen/drivers/vpci/vpci.c > +++ b/xen/drivers/vpci/vpci.c > @@ -40,6 +40,49 @@ extern vpci_register_init_t *const __start_vpci_array[]; > extern vpci_register_init_t *const __end_vpci_array[]; > #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array) > > +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT > +static int add_virtual_device(struct pci_dev *pdev) > +{ > + struct domain *d = pdev->domain; > + unsigned int new_dev_number; > + > + if ( is_hardware_domain(d) ) > + return 0; > + > + ASSERT(rw_is_write_locked(&pdev->domain->pci_lock)); > + > + /* > + * Each PCI bus supports 32 devices/slots at max or up to 256 when > + * there are multi-function ones which are not yet supported. > + */ > + if ( pdev->info.is_extfn && !pdev->info.is_virtfn ) > + { > + gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n", > + &pdev->sbdf); The message suggests you ought to check pdev->devfn to have the low three bits clear. Yet what you check are two booleans. Further doesn't this require the multi-function bit to be emulated clear? And finally don't you then also need to disallow assignment of devices with phantom functions? > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -484,6 +484,14 @@ struct domain > * 2. pdev->vpci->lock > */ > rwlock_t pci_lock; > +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT > + /* > + * The bitmap which shows which device numbers are already used by the > + * virtual PCI bus topology and is used to assign a unique SBDF to the > + * next passed through virtual PCI device. > + */ > + DECLARE_BITMAP(vpci_dev_assigned_map, VPCI_MAX_VIRT_DEV); > +#endif > #endif With this the 2nd #endif would likely better gain a comment. > --- a/xen/include/xen/vpci.h > +++ b/xen/include/xen/vpci.h > @@ -21,6 +21,13 @@ typedef int vpci_register_init_t(struct pci_dev *dev); > > #define VPCI_ECAM_BDF(addr) (((addr) & 0x0ffff000) >> 12) > > +/* > + * Maximum number of devices supported by the virtual bus topology: > + * each PCI bus supports 32 devices/slots at max or up to 256 when > + * there are multi-function ones which are not yet supported. > + */ > +#define VPCI_MAX_VIRT_DEV (PCI_SLOT(~0) + 1) The limit being this means only bus 0 / seg 0 is supported, which I think the comment would better also say. (In add_virtual_device(), which has a similar comment, there's then at least a 2nd one saying so.) Jan
On 1/25/24 11:00, Jan Beulich wrote: > On 09.01.2024 22:51, Stewart Hildebrand wrote: >> --- a/xen/drivers/Kconfig >> +++ b/xen/drivers/Kconfig >> @@ -15,4 +15,8 @@ source "drivers/video/Kconfig" >> config HAS_VPCI >> bool >> >> +config HAS_VPCI_GUEST_SUPPORT >> + bool >> + depends on HAS_VPCI > > Wouldn't this better be "select", or even just "imply"? I prefer "select" > >> --- a/xen/drivers/vpci/vpci.c >> +++ b/xen/drivers/vpci/vpci.c >> @@ -40,6 +40,49 @@ extern vpci_register_init_t *const __start_vpci_array[]; >> extern vpci_register_init_t *const __end_vpci_array[]; >> #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array) >> >> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT >> +static int add_virtual_device(struct pci_dev *pdev) >> +{ >> + struct domain *d = pdev->domain; >> + unsigned int new_dev_number; >> + >> + if ( is_hardware_domain(d) ) >> + return 0; >> + >> + ASSERT(rw_is_write_locked(&pdev->domain->pci_lock)); >> + >> + /* >> + * Each PCI bus supports 32 devices/slots at max or up to 256 when >> + * there are multi-function ones which are not yet supported. >> + */ >> + if ( pdev->info.is_extfn && !pdev->info.is_virtfn ) >> + { >> + gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n", >> + &pdev->sbdf); > > The message suggests you ought to check pdev->devfn to have the low > three bits clear. Yet what you check are two booleans. I'll check pdev->sbdf.fn > > Further doesn't this require the multi-function bit to be emulated > clear? I consider this to be future work. The header type register, where the bit resides, is not yet emulated for domUs. I have a series in the works for emulating additional registers (including PCI_HEADER_TYPE), but I'm planning to wait to submit it until after the current series is finished so as to not delay the current series any further. > And finally don't you then also need to disallow assignment of > devices with phantom functions? No, I don't think so. My understanding is that there is no configuration space associated with the phantom SBDFs. There's no special handling required in vPCI per se, because the phantom function RIDs get mapped in the IOMMU when the device gets assigned. Future work would include exposing the PCI Express Capability, including device control register with the phantom function enable bit. I say this having only done limited testing of phantom functions on ARM, and by faking it using the pci-phantom= Xen arg because I don't have a real device with phantom function capability. > >> --- a/xen/include/xen/sched.h >> +++ b/xen/include/xen/sched.h >> @@ -484,6 +484,14 @@ struct domain >> * 2. pdev->vpci->lock >> */ >> rwlock_t pci_lock; >> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT >> + /* >> + * The bitmap which shows which device numbers are already used by the >> + * virtual PCI bus topology and is used to assign a unique SBDF to the >> + * next passed through virtual PCI device. >> + */ >> + DECLARE_BITMAP(vpci_dev_assigned_map, VPCI_MAX_VIRT_DEV); >> +#endif >> #endif > > With this the 2nd #endif would likely better gain a comment. I will add it. Actually, I see no harm in adding a comment for both of these #endifs. > >> --- a/xen/include/xen/vpci.h >> +++ b/xen/include/xen/vpci.h >> @@ -21,6 +21,13 @@ typedef int vpci_register_init_t(struct pci_dev *dev); >> >> #define VPCI_ECAM_BDF(addr) (((addr) & 0x0ffff000) >> 12) >> >> +/* >> + * Maximum number of devices supported by the virtual bus topology: >> + * each PCI bus supports 32 devices/slots at max or up to 256 when >> + * there are multi-function ones which are not yet supported. >> + */ >> +#define VPCI_MAX_VIRT_DEV (PCI_SLOT(~0) + 1) > > The limit being this means only bus 0 / seg 0 is supported, which I > think the comment would better also say. (In add_virtual_device(), > which has a similar comment, there's then at least a 2nd one saying > so.) OK, I'll adjust the comment.
diff --git a/xen/drivers/Kconfig b/xen/drivers/Kconfig index db94393f47a6..780490cf8e39 100644 --- a/xen/drivers/Kconfig +++ b/xen/drivers/Kconfig @@ -15,4 +15,8 @@ source "drivers/video/Kconfig" config HAS_VPCI bool +config HAS_VPCI_GUEST_SUPPORT + bool + depends on HAS_VPCI + endmenu diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c index a0e8b1012509..57cfabfd9ad3 100644 --- a/xen/drivers/vpci/vpci.c +++ b/xen/drivers/vpci/vpci.c @@ -40,6 +40,49 @@ extern vpci_register_init_t *const __start_vpci_array[]; extern vpci_register_init_t *const __end_vpci_array[]; #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array) +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT +static int add_virtual_device(struct pci_dev *pdev) +{ + struct domain *d = pdev->domain; + unsigned int new_dev_number; + + if ( is_hardware_domain(d) ) + return 0; + + ASSERT(rw_is_write_locked(&pdev->domain->pci_lock)); + + /* + * Each PCI bus supports 32 devices/slots at max or up to 256 when + * there are multi-function ones which are not yet supported. + */ + if ( pdev->info.is_extfn && !pdev->info.is_virtfn ) + { + gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n", + &pdev->sbdf); + return -EOPNOTSUPP; + } + new_dev_number = find_first_zero_bit(d->vpci_dev_assigned_map, + VPCI_MAX_VIRT_DEV); + if ( new_dev_number == VPCI_MAX_VIRT_DEV ) + return -ENOSPC; + + __set_bit(new_dev_number, &d->vpci_dev_assigned_map); + + /* + * Both segment and bus number are 0: + * - we emulate a single host bridge for the guest, e.g. segment 0 + * - with bus 0 the virtual devices are seen as embedded + * endpoints behind the root complex + * + * TODO: add support for multi-function devices. + */ + pdev->vpci->guest_sbdf = PCI_SBDF(0, 0, new_dev_number, 0); + + return 0; +} + +#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */ + void vpci_deassign_device(struct pci_dev *pdev) { unsigned int i; @@ -49,6 +92,12 @@ void vpci_deassign_device(struct pci_dev *pdev) if ( !has_vpci(pdev->domain) || !pdev->vpci ) return; +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT + if ( pdev->vpci->guest_sbdf.sbdf != ~0 ) + __clear_bit(pdev->vpci->guest_sbdf.dev, + &pdev->domain->vpci_dev_assigned_map); +#endif + spin_lock(&pdev->vpci->lock); while ( !list_empty(&pdev->vpci->handlers) ) { @@ -105,6 +154,13 @@ int vpci_assign_device(struct pci_dev *pdev) INIT_LIST_HEAD(&pdev->vpci->handlers); spin_lock_init(&pdev->vpci->lock); +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT + pdev->vpci->guest_sbdf.sbdf = ~0; + rc = add_virtual_device(pdev); + if ( rc ) + goto out; +#endif + for ( i = 0; i < NUM_VPCI_INIT; i++ ) { rc = __start_vpci_array[i](pdev); @@ -112,6 +168,7 @@ int vpci_assign_device(struct pci_dev *pdev) break; } + out: __maybe_unused; if ( rc ) vpci_deassign_device(pdev); diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 37f5922f3206..b58a822847be 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -484,6 +484,14 @@ struct domain * 2. pdev->vpci->lock */ rwlock_t pci_lock; +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT + /* + * The bitmap which shows which device numbers are already used by the + * virtual PCI bus topology and is used to assign a unique SBDF to the + * next passed through virtual PCI device. + */ + DECLARE_BITMAP(vpci_dev_assigned_map, VPCI_MAX_VIRT_DEV); +#endif #endif #ifdef CONFIG_HAS_PASSTHROUGH diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h index 77320a667e55..053467f04982 100644 --- a/xen/include/xen/vpci.h +++ b/xen/include/xen/vpci.h @@ -21,6 +21,13 @@ typedef int vpci_register_init_t(struct pci_dev *dev); #define VPCI_ECAM_BDF(addr) (((addr) & 0x0ffff000) >> 12) +/* + * Maximum number of devices supported by the virtual bus topology: + * each PCI bus supports 32 devices/slots at max or up to 256 when + * there are multi-function ones which are not yet supported. + */ +#define VPCI_MAX_VIRT_DEV (PCI_SLOT(~0) + 1) + #define REGISTER_VPCI_INIT(x, p) \ static vpci_register_init_t *const x##_entry \ __used_section(".data.vpci." p) = x @@ -175,6 +182,10 @@ struct vpci { struct vpci_arch_msix_entry arch; } entries[]; } *msix; +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT + /* Guest SBDF of the device. */ + pci_sbdf_t guest_sbdf; +#endif #endif };