diff mbox series

[v4,10/11] vpci: add initial support for virtual PCI bus topology

Message ID 20211105065629.940943-11-andr2000@gmail.com (mailing list archive)
State Superseded
Headers show
Series PCI devices passthrough on Arm, part 3 | expand

Commit Message

Oleksandr Andrushchenko Nov. 5, 2021, 6:56 a.m. UTC
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Assign SBDF to the PCI devices being passed through with bus 0.
The resulting topology is where PCIe devices reside on the bus 0 of the
root complex itself (embedded endpoints).
This implementation is limited to 32 devices which are allowed on
a single PCI bus.

Please note, that at the moment only function 0 of a multifunction
device can be passed through.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
Since v3:
 - make use of VPCI_INIT
 - moved all new code to vpci.c which belongs to it
 - changed open-coded 31 to PCI_SLOT(~0)
 - revisited locking: add dedicated vdev list's lock
 - added comments and code to reject multifunction devices with
   functions other than 0
 - updated comment about vpci_dev_next and made it unsigned int
 - implement roll back in case of error while assigning/deassigning devices
 - s/dom%pd/%pd
Since v2:
 - remove casts that are (a) malformed and (b) unnecessary
 - add new line for better readability
 - remove CONFIG_HAS_VPCI_GUEST_SUPPORT ifdef's as the relevant vPCI
    functions are now completely gated with this config
 - gate common code with CONFIG_HAS_VPCI_GUEST_SUPPORT
New in v2
---
 xen/drivers/vpci/vpci.c | 52 +++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/sched.h |  8 +++++++
 xen/include/xen/vpci.h  |  4 ++++
 3 files changed, 64 insertions(+)

Comments

Jan Beulich Nov. 18, 2021, 4:45 p.m. UTC | #1
On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
> Since v3:
>  - make use of VPCI_INIT
>  - moved all new code to vpci.c which belongs to it
>  - changed open-coded 31 to PCI_SLOT(~0)
>  - revisited locking: add dedicated vdev list's lock

What is this about? I can't spot any locking in the patch. In particular ...

> @@ -125,6 +128,54 @@ int vpci_add_handlers(struct pci_dev *pdev)
>  }
>  
>  #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> +int vpci_add_virtual_device(struct pci_dev *pdev)
> +{
> +    struct domain *d = pdev->domain;
> +    pci_sbdf_t sbdf;
> +    unsigned long new_dev_number;
> +
> +    /*
> +     * Each PCI bus supports 32 devices/slots at max or up to 256 when
> +     * there are multi-function ones which are not yet supported.
> +     */
> +    if ( pdev->info.is_extfn )
> +    {
> +        gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n",
> +                 &pdev->sbdf);
> +        return -EOPNOTSUPP;
> +    }
> +
> +    new_dev_number = find_first_zero_bit(&d->vpci_dev_assigned_map,
> +                                         PCI_SLOT(~0) + 1);
> +    if ( new_dev_number > PCI_SLOT(~0) )
> +        return -ENOSPC;
> +
> +    set_bit(new_dev_number, &d->vpci_dev_assigned_map);

... I wonder whether this isn't racy without any locking around it,
and without looping over test_and_set_bit(). Whereas with locking I
think you could just use __set_bit().

> +    /*
> +     * Both segment and bus number are 0:
> +     *  - we emulate a single host bridge for the guest, e.g. segment 0
> +     *  - with bus 0 the virtual devices are seen as embedded
> +     *    endpoints behind the root complex
> +     *
> +     * TODO: add support for multi-function devices.
> +     */
> +    sbdf.sbdf = 0;

I think this would be better expressed as an initializer, making it
clear to the reader that the whole object gets initialized with out
them needing to go check the type (and find that .sbdf covers the
entire object).

> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -145,6 +145,10 @@ struct vpci {
>              struct vpci_arch_msix_entry arch;
>          } entries[];
>      } *msix;
> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> +    /* Virtual SBDF of the device. */
> +    pci_sbdf_t guest_sbdf;

Would vsbdf perhaps be better in line with things like vpci or vcpu
(as well as with the comment here)?

Jan
Oleksandr Andrushchenko Nov. 24, 2021, 11:28 a.m. UTC | #2
Hi, Jan!

On 18.11.21 18:45, Jan Beulich wrote:
> On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
>> Since v3:
>>   - make use of VPCI_INIT
>>   - moved all new code to vpci.c which belongs to it
>>   - changed open-coded 31 to PCI_SLOT(~0)
>>   - revisited locking: add dedicated vdev list's lock
> What is this about? I can't spot any locking in the patch. In particular ...
I will update
>
>> @@ -125,6 +128,54 @@ int vpci_add_handlers(struct pci_dev *pdev)
>>   }
>>   
>>   #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>> +int vpci_add_virtual_device(struct pci_dev *pdev)
>> +{
>> +    struct domain *d = pdev->domain;
>> +    pci_sbdf_t sbdf;
>> +    unsigned long new_dev_number;
>> +
>> +    /*
>> +     * Each PCI bus supports 32 devices/slots at max or up to 256 when
>> +     * there are multi-function ones which are not yet supported.
>> +     */
>> +    if ( pdev->info.is_extfn )
>> +    {
>> +        gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n",
>> +                 &pdev->sbdf);
>> +        return -EOPNOTSUPP;
>> +    }
>> +
>> +    new_dev_number = find_first_zero_bit(&d->vpci_dev_assigned_map,
>> +                                         PCI_SLOT(~0) + 1);
>> +    if ( new_dev_number > PCI_SLOT(~0) )
>> +        return -ENOSPC;
>> +
>> +    set_bit(new_dev_number, &d->vpci_dev_assigned_map);
> ... I wonder whether this isn't racy without any locking around it,
Locking is going to be implemented by moving vpci->lock to the
outside, so this code will be protected
> and without looping over test_and_set_bit(). Whereas with locking I
> think you could just use __set_bit().
Although __set_bit == set_bit on Arm I see there is  a difference on x86
I wil use __set_bit
>
>> +    /*
>> +     * Both segment and bus number are 0:
>> +     *  - we emulate a single host bridge for the guest, e.g. segment 0
>> +     *  - with bus 0 the virtual devices are seen as embedded
>> +     *    endpoints behind the root complex
>> +     *
>> +     * TODO: add support for multi-function devices.
>> +     */
>> +    sbdf.sbdf = 0;
> I think this would be better expressed as an initializer,
Ok,
-    pci_sbdf_t sbdf;
+    pci_sbdf_t sbdf = { 0 };

>   making it
> clear to the reader that the whole object gets initialized with out
> them needing to go check the type (and find that .sbdf covers the
> entire object).
>
>> --- a/xen/include/xen/vpci.h
>> +++ b/xen/include/xen/vpci.h
>> @@ -145,6 +145,10 @@ struct vpci {
>>               struct vpci_arch_msix_entry arch;
>>           } entries[];
>>       } *msix;
>> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>> +    /* Virtual SBDF of the device. */
>> +    pci_sbdf_t guest_sbdf;
> Would vsbdf perhaps be better in line with things like vpci or vcpu
> (as well as with the comment here)?
This is the same as guest_addr...
@Roger what is your preference here?
>
> Jan
>
Thank you,
Oleksandr
Roger Pau Monné Nov. 24, 2021, 12:36 p.m. UTC | #3
On Wed, Nov 24, 2021 at 11:28:18AM +0000, Oleksandr Andrushchenko wrote:
> Hi, Jan!
> 
> On 18.11.21 18:45, Jan Beulich wrote:
> > On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
> >> --- a/xen/include/xen/vpci.h
> >> +++ b/xen/include/xen/vpci.h
> >> @@ -145,6 +145,10 @@ struct vpci {
> >>               struct vpci_arch_msix_entry arch;
> >>           } entries[];
> >>       } *msix;
> >> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> >> +    /* Virtual SBDF of the device. */
> >> +    pci_sbdf_t guest_sbdf;
> > Would vsbdf perhaps be better in line with things like vpci or vcpu
> > (as well as with the comment here)?
> This is the same as guest_addr...
> @Roger what is your preference here?

I'm fine with using guest_ here, but the comment should be slightly
adjusted to s/Virtual/Guest/ IMO. It's already inline with other
guest_ fields added in the series anyway.

Just to confirm, such guest_sbdf is strictly to be used by
unprivileged domains, dom0 will never get such a virtual PCI bus?

Thanks, Roger.
Oleksandr Andrushchenko Nov. 24, 2021, 12:43 p.m. UTC | #4
On 24.11.21 14:36, Roger Pau Monné wrote:
> On Wed, Nov 24, 2021 at 11:28:18AM +0000, Oleksandr Andrushchenko wrote:
>> Hi, Jan!
>>
>> On 18.11.21 18:45, Jan Beulich wrote:
>>> On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
>>>> --- a/xen/include/xen/vpci.h
>>>> +++ b/xen/include/xen/vpci.h
>>>> @@ -145,6 +145,10 @@ struct vpci {
>>>>                struct vpci_arch_msix_entry arch;
>>>>            } entries[];
>>>>        } *msix;
>>>> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>>>> +    /* Virtual SBDF of the device. */
>>>> +    pci_sbdf_t guest_sbdf;
>>> Would vsbdf perhaps be better in line with things like vpci or vcpu
>>> (as well as with the comment here)?
>> This is the same as guest_addr...
>> @Roger what is your preference here?
> I'm fine with using guest_ here, but the comment should be slightly
> adjusted to s/Virtual/Guest/ IMO. It's already inline with other
> guest_ fields added in the series anyway.
Ok, I will update the comment
>
> Just to confirm, such guest_sbdf is strictly to be used by
> unprivileged domains, dom0 will never get such a virtual PCI bus?
Right, for unprivileged domains domains only
>
> Thanks, Roger.
Thank you,
Oleksandr
diff mbox series

Patch

diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 45733300f00b..6657d236dc1a 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -101,6 +101,9 @@  int vpci_add_handlers(struct pci_dev *pdev)
 
     INIT_LIST_HEAD(&pdev->vpci->handlers);
     spin_lock_init(&pdev->vpci->lock);
+#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
+    pdev->vpci->guest_sbdf.sbdf = ~0;
+#endif
 
     header = &pdev->vpci->header;
     for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
@@ -125,6 +128,54 @@  int vpci_add_handlers(struct pci_dev *pdev)
 }
 
 #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
+int vpci_add_virtual_device(struct pci_dev *pdev)
+{
+    struct domain *d = pdev->domain;
+    pci_sbdf_t sbdf;
+    unsigned long new_dev_number;
+
+    /*
+     * Each PCI bus supports 32 devices/slots at max or up to 256 when
+     * there are multi-function ones which are not yet supported.
+     */
+    if ( pdev->info.is_extfn )
+    {
+        gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n",
+                 &pdev->sbdf);
+        return -EOPNOTSUPP;
+    }
+
+    new_dev_number = find_first_zero_bit(&d->vpci_dev_assigned_map,
+                                         PCI_SLOT(~0) + 1);
+    if ( new_dev_number > PCI_SLOT(~0) )
+        return -ENOSPC;
+
+    set_bit(new_dev_number, &d->vpci_dev_assigned_map);
+
+    /*
+     * Both segment and bus number are 0:
+     *  - we emulate a single host bridge for the guest, e.g. segment 0
+     *  - with bus 0 the virtual devices are seen as embedded
+     *    endpoints behind the root complex
+     *
+     * TODO: add support for multi-function devices.
+     */
+    sbdf.sbdf = 0;
+    sbdf.devfn = PCI_DEVFN(new_dev_number, 0);
+    pdev->vpci->guest_sbdf = sbdf;
+
+    return 0;
+
+}
+REGISTER_VPCI_INIT(vpci_add_virtual_device, VPCI_PRIORITY_MIDDLE);
+
+static void vpci_remove_virtual_device(struct domain *d,
+                                       const struct pci_dev *pdev)
+{
+    clear_bit(pdev->vpci->guest_sbdf.dev, &d->vpci_dev_assigned_map);
+    pdev->vpci->guest_sbdf.sbdf = ~0;
+}
+
 /* Notify vPCI that device is assigned to guest. */
 int vpci_assign_device(struct domain *d, struct pci_dev *pdev)
 {
@@ -150,6 +201,7 @@  int vpci_deassign_device(struct domain *d, struct pci_dev *pdev)
     if ( is_system_domain(d) || !has_vpci(d) )
         return 0;
 
+    vpci_remove_virtual_device(d, pdev);
     vpci_remove_device_handlers(pdev);
 
     return 0;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 28146ee404e6..10bff103317c 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -444,6 +444,14 @@  struct domain
 
 #ifdef CONFIG_HAS_PCI
     struct list_head pdev_list;
+#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
+    /*
+     * The bitmap which shows which device numbers are already used by the
+     * virtual PCI bus topology and is used to assign a unique SBDF to the
+     * next passed through virtual PCI device.
+     */
+    unsigned long vpci_dev_assigned_map;
+#endif
 #endif
 
 #ifdef CONFIG_HAS_PASSTHROUGH
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 143f3166a730..9cc7071bc0af 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -145,6 +145,10 @@  struct vpci {
             struct vpci_arch_msix_entry arch;
         } entries[];
     } *msix;
+#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
+    /* Virtual SBDF of the device. */
+    pci_sbdf_t guest_sbdf;
+#endif
 #endif
 };