Message ID | 7a08c41e1825095814f8c35854d3938c084b2368.1692892275.git.reinette.chatre@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | vfio/ims: Back guest interrupts from Interrupt Message Store (IMS) | expand |
On Thu, Aug 24, 2023 at 09:15:21AM -0700, Reinette Chatre wrote: > Access from a guest to a virtual device may be either 'direct-path', > where the guest interacts directly with the underlying hardware, > or 'intercepted path' where the virtual device emulates operations. > > Support emulated interrupts that can be used to handle 'intercepted > path' operations. For example, a virtual device may use 'intercepted > path' for configuration. Doing so, configuration requests intercepted > by the virtual device driver are handled within the virtual device > driver with completion signaled to the guest without interacting with > the underlying hardware. Why does this have anything to do with IMS? I thought the point here was that IMS was some back end to the MSI-X emulation - should a purely emulated interrupt logically be part of the MSI code, not IMS? Jason
Hi Jason, On 8/24/2023 9:33 AM, Jason Gunthorpe wrote: > On Thu, Aug 24, 2023 at 09:15:21AM -0700, Reinette Chatre wrote: >> Access from a guest to a virtual device may be either 'direct-path', >> where the guest interacts directly with the underlying hardware, >> or 'intercepted path' where the virtual device emulates operations. >> >> Support emulated interrupts that can be used to handle 'intercepted >> path' operations. For example, a virtual device may use 'intercepted >> path' for configuration. Doing so, configuration requests intercepted >> by the virtual device driver are handled within the virtual device >> driver with completion signaled to the guest without interacting with >> the underlying hardware. > > Why does this have anything to do with IMS? I thought the point here > was that IMS was some back end to the MSI-X emulation - should a > purely emulated interrupt logically be part of the MSI code, not IMS? You are correct, an emulated interrupt is not unique to IMS. The target usage of this library is by pure(?) VFIO devices (struct vfio_device). These are virtual devices that are composed by separate VFIO drivers. For example, a single resource of an accelerator device can be composed into a stand-alone virtual device for use by a guest. Through its API and implementation the current VFIO MSI code expects to work with actual PCI devices (struct vfio_pci_core_device). With the target usage not being an actual PCI device the VFIO MSI code was not found to be a good fit and thus this implementation does not build on current MSI support. Reinette
> From: Chatre, Reinette <reinette.chatre@intel.com> > Sent: Friday, August 25, 2023 1:19 AM > > Hi Jason, > > On 8/24/2023 9:33 AM, Jason Gunthorpe wrote: > > On Thu, Aug 24, 2023 at 09:15:21AM -0700, Reinette Chatre wrote: > >> Access from a guest to a virtual device may be either 'direct-path', > >> where the guest interacts directly with the underlying hardware, > >> or 'intercepted path' where the virtual device emulates operations. > >> > >> Support emulated interrupts that can be used to handle 'intercepted > >> path' operations. For example, a virtual device may use 'intercepted > >> path' for configuration. Doing so, configuration requests intercepted > >> by the virtual device driver are handled within the virtual device > >> driver with completion signaled to the guest without interacting with > >> the underlying hardware. > > > > Why does this have anything to do with IMS? I thought the point here > > was that IMS was some back end to the MSI-X emulation - should a > > purely emulated interrupt logically be part of the MSI code, not IMS? > > You are correct, an emulated interrupt is not unique to IMS. > > The target usage of this library is by pure(?) VFIO devices (struct > vfio_device). These are virtual devices that are composed by separate > VFIO drivers. For example, a single resource of an accelerator device > can be composed into a stand-alone virtual device for use by a guest. > > Through its API and implementation the current VFIO MSI > code expects to work with actual PCI devices (struct > vfio_pci_core_device). With the target usage not being an > actual PCI device the VFIO MSI code was not found to be a good > fit and thus this implementation does not build on current MSI > support. > This might be achieved by creating a structure vfio_pci_intr_ctx included by vfio_pci_core_device and other vfio device types. Then move vfio_pci_intr.c to operate on vfio_pci_intr_ctx instead of vfio_pci_core_device to make MSI frontend code sharable by both PCI devices or virtual devices (mdev or SIOV). Then there is only one irq_ctx. Within the ctx we can abstract backend ops, e.g. enable/disble_msi(), alloc/free_ctx(), alloc/free_irq(), etc. to accommodate pci MSI/MSI-X, IMS, or emulation. The unknown risk is whether a clear abstraction can be defined. If in the end the common library contains many if-else to handle subtle backend differences then it might not be a good choice...
Hi Kevin, On 8/24/2023 8:05 PM, Tian, Kevin wrote: >> From: Chatre, Reinette <reinette.chatre@intel.com> >> Sent: Friday, August 25, 2023 1:19 AM >> >> Hi Jason, >> >> On 8/24/2023 9:33 AM, Jason Gunthorpe wrote: >>> On Thu, Aug 24, 2023 at 09:15:21AM -0700, Reinette Chatre wrote: >>>> Access from a guest to a virtual device may be either 'direct-path', >>>> where the guest interacts directly with the underlying hardware, >>>> or 'intercepted path' where the virtual device emulates operations. >>>> >>>> Support emulated interrupts that can be used to handle 'intercepted >>>> path' operations. For example, a virtual device may use 'intercepted >>>> path' for configuration. Doing so, configuration requests intercepted >>>> by the virtual device driver are handled within the virtual device >>>> driver with completion signaled to the guest without interacting with >>>> the underlying hardware. >>> >>> Why does this have anything to do with IMS? I thought the point here >>> was that IMS was some back end to the MSI-X emulation - should a >>> purely emulated interrupt logically be part of the MSI code, not IMS? >> >> You are correct, an emulated interrupt is not unique to IMS. >> >> The target usage of this library is by pure(?) VFIO devices (struct >> vfio_device). These are virtual devices that are composed by separate >> VFIO drivers. For example, a single resource of an accelerator device >> can be composed into a stand-alone virtual device for use by a guest. >> >> Through its API and implementation the current VFIO MSI >> code expects to work with actual PCI devices (struct >> vfio_pci_core_device). With the target usage not being an >> actual PCI device the VFIO MSI code was not found to be a good >> fit and thus this implementation does not build on current MSI >> support. >> > > This might be achieved by creating a structure vfio_pci_intr_ctx > included by vfio_pci_core_device and other vfio device types. Then > move vfio_pci_intr.c to operate on vfio_pci_intr_ctx instead of > vfio_pci_core_device to make MSI frontend code sharable by both > PCI devices or virtual devices (mdev or SIOV). Thank you very much Kevin. For data there is the per-device context related to interrupts as well as the per-interrupt context. These contexts are different between the different device types and interrupt types and if I understand correctly both context types (per-device as well as per-interrupt) should be made opaque within the new vfio_pci_intr.c. Additionally, with different mechanisms (for example, the locking required) to interact with the different device types the code is also device (PCI or virtual) specific. Considering how both the data and code would be opaque to this new library it looks to me as though in some aspects this library may thus appear as a skeleton with interrupt and device specific logic contained in its users. It is not obvious to me where the MSI frontend and backend boundary is. If I understand correctly majority of the code in vfio_pci_intrs.c would move to callbacks. Potentially leaving little to be shared, vfio_pci_set_irqs_ioctl() and vfio_msihandler() seems like candidates. To clarify it sounds to me as though this design is motivated by the requirement to bring emulated interrupt support to the current INTx/MSI/MSI-X support on PCI devices? It is not clear to me how this feature will be used since interrupts need to be signaled from the driver instead of the hardware. I am missing something. > Then there is only one irq_ctx. Within the ctx we can abstract > backend ops, e.g. enable/disble_msi(), alloc/free_ctx(), alloc/free_irq(), etc. > to accommodate pci MSI/MSI-X, IMS, or emulation. > > The unknown risk is whether a clear abstraction can be defined. If > in the end the common library contains many if-else to handle subtle > backend differences then it might not be a good choice... Thank you very much for your guidance. Instead of Jason's expectation that IMS would be a backend of MSI-X this will change to IMS and MSI-X both being a backend to a new interface. It is difficult for me to envision the end result so I will work on an implementation based on my understanding of your proposal that we can use for further discussion. Reinette
On Fri, Aug 25, 2023 at 09:55:52AM -0700, Reinette Chatre wrote: > Thank you very much for your guidance. Instead of Jason's expectation that > IMS would be a backend of MSI-X this will change to IMS and MSI-X both being > a backend to a new interface. It is difficult for me to envision the end > result so I will work on an implementation based on my understanding of > your proposal that we can use for further discussion. I think the point is that emulating MSI-X is kind of tricky and should be common VFIO code, regardless of what path it takes. So I would expect some library code to do this, entry points the vfio_device can hook into its callbacks (eg config space and rw of a page) Then the other side would connect to the physical implementation, sw,ims,msi-x,future Jason
On 8/25/2023 9:59 AM, Jason Gunthorpe wrote: > On Fri, Aug 25, 2023 at 09:55:52AM -0700, Reinette Chatre wrote: > >> Thank you very much for your guidance. Instead of Jason's expectation that >> IMS would be a backend of MSI-X this will change to IMS and MSI-X both being >> a backend to a new interface. It is difficult for me to envision the end >> result so I will work on an implementation based on my understanding of >> your proposal that we can use for further discussion. > > I think the point is that emulating MSI-X is kind of tricky and should > be common VFIO code, regardless of what path it takes. > > So I would expect some library code to do this, entry points the > vfio_device can hook into its callbacks (eg config space and rw of a > page) > > Then the other side would connect to the physical implementation, > sw,ims,msi-x,future > Thank you very much Jason. I am thus hearing the same from you and Kevin. I'll work on this. Reinette
diff --git a/drivers/vfio/pci/vfio_pci_ims.c b/drivers/vfio/pci/vfio_pci_ims.c index 0926eb921351..fe5b3484ad34 100644 --- a/drivers/vfio/pci/vfio_pci_ims.c +++ b/drivers/vfio/pci/vfio_pci_ims.c @@ -17,16 +17,18 @@ #include <linux/xarray.h> /* - * IMS interrupt context. - * @name: Name of device associated with interrupt. + * Interrupt context. Used for emulated as well as IMS interrupts. + * @emulated: (IMS and emulated) true if context belongs to emulated interrupt. + * @name: (IMS and emulated) Name of device associated with interrupt. * Provided to request_irq(). - * @trigger: eventfd associated with interrupt. - * @producer: Interrupt's registered IRQ bypass producer. - * @ims_id: Interrupt index associated with IMS interrupt. - * @virq: Linux IRQ number associated with IMS interrupt. - * @icookie: Cookie used by irqchip driver. + * @trigger: (IMS and emulated) eventfd associated with interrupt. + * @producer: (IMS only) Interrupt's registered IRQ bypass producer. + * @ims_id: (IMS only) Interrupt index associated with IMS interrupt. + * @virq: (IMS only) Linux IRQ number associated with IMS interrupt. + * @icookie: (IMS only) Cookie used by irqchip driver. */ struct vfio_pci_ims_ctx { + bool emulated; char *name; struct eventfd_ctx *trigger; struct irq_bypass_producer producer; @@ -35,6 +37,31 @@ struct vfio_pci_ims_ctx { union msi_instance_cookie icookie; }; +/* + * Send signal to the eventfd. + * @vdev: VFIO device + * @vector: MSI-X vector of @vdev for which interrupt will be signaled + * + * Intended for use to send signal for emulated interrupts. + */ +void vfio_pci_ims_send_signal(struct vfio_device *vdev, unsigned int vector) +{ + struct vfio_pci_ims *ims = &vdev->ims; + struct vfio_pci_ims_ctx *ctx; + + mutex_lock(&ims->ctx_mutex); + ctx = xa_load(&ims->ctx, vector); + + if (WARN_ON_ONCE(!ctx || !ctx->emulated || !ctx->trigger)) { + mutex_unlock(&ims->ctx_mutex); + return; + } + + eventfd_signal(ctx->trigger, 1); + mutex_unlock(&ims->ctx_mutex); +} +EXPORT_SYMBOL_GPL(vfio_pci_ims_send_signal); + static irqreturn_t vfio_pci_ims_irq_handler(int irq, void *arg) { struct eventfd_ctx *trigger = arg; @@ -46,7 +73,8 @@ static irqreturn_t vfio_pci_ims_irq_handler(int irq, void *arg) /* * Free the interrupt associated with @ctx. * - * Free interrupt from the underlying PCI device's IMS domain. + * For an emulated interrupt there is nothing to do. For an IMS interrupt + * the interrupt is freed from the underlying PCI device's IMS domain. */ static void vfio_pci_ims_irq_free(struct vfio_pci_ims *ims, struct vfio_pci_ims_ctx *ctx) @@ -55,6 +83,9 @@ static void vfio_pci_ims_irq_free(struct vfio_pci_ims *ims, lockdep_assert_held(&ims->ctx_mutex); + if (ctx->emulated) + return; + irq_map.index = ctx->ims_id; irq_map.virq = ctx->virq; pci_ims_free_irq(ims->pdev, irq_map); @@ -65,7 +96,8 @@ static void vfio_pci_ims_irq_free(struct vfio_pci_ims *ims, /* * Allocate an interrupt for @ctx. * - * Allocate interrupt from the underlying PCI device's IMS domain. + * For an emulated interrupt there is nothing to do. For an IMS interrupt + * the interrupt is allocated from the underlying PCI device's IMS domain. */ static int vfio_pci_ims_irq_alloc(struct vfio_pci_ims *ims, struct vfio_pci_ims_ctx *ctx) @@ -74,6 +106,9 @@ static int vfio_pci_ims_irq_alloc(struct vfio_pci_ims *ims, lockdep_assert_held(&ims->ctx_mutex); + if (ctx->emulated) + return -EINVAL; + irq_map = pci_ims_alloc_irq(ims->pdev, &ctx->icookie, NULL); if (irq_map.index < 0) return irq_map.index; @@ -133,9 +168,11 @@ static int vfio_pci_ims_set_vector_signal(struct vfio_device *vdev, ctx = xa_load(&ims->ctx, vector); if (ctx && ctx->trigger) { - irq_bypass_unregister_producer(&ctx->producer); - free_irq(ctx->virq, ctx->trigger); - vfio_pci_ims_irq_free(ims, ctx); + if (!ctx->emulated) { + irq_bypass_unregister_producer(&ctx->producer); + free_irq(ctx->virq, ctx->trigger); + vfio_pci_ims_irq_free(ims, ctx); + } kfree(ctx->name); ctx->name = NULL; eventfd_ctx_put(ctx->trigger); @@ -163,6 +200,9 @@ static int vfio_pci_ims_set_vector_signal(struct vfio_device *vdev, ctx->trigger = trigger; + if (ctx->emulated) + return 0; + ret = vfio_pci_ims_irq_alloc(ims, ctx); if (ret < 0) goto out_put_eventfd_ctx; @@ -219,8 +259,8 @@ static int vfio_pci_ims_set_block(struct vfio_device *vdev, unsigned int start, } /* - * Manage Interrupt Message Store (IMS) interrupts on the host that are - * backing guest MSI-X vectors. + * Manage Interrupt Message Store (IMS) or emulated interrupts on the + * host that are backing guest MSI-X vectors. * * @vdev: VFIO device * @index: Type of guest vectors to set up. Must be @@ -360,6 +400,10 @@ int vfio_pci_ims_set_cookie(struct vfio_device *vdev, unsigned int vector, mutex_lock(&ims->ctx_mutex); ctx = xa_load(&ims->ctx, vector); if (ctx) { + if (WARN_ON_ONCE(ctx->emulated)) { + ret = -EINVAL; + goto out_unlock; + } ctx->icookie = *icookie; goto out_unlock; } @@ -385,5 +429,50 @@ int vfio_pci_ims_set_cookie(struct vfio_device *vdev, unsigned int vector, } EXPORT_SYMBOL_GPL(vfio_pci_ims_set_cookie); +/* + * Set range of interrupts that will be emulated instead of backed by IMS. + * + * Return: error code on failure (-EBUSY if the vector is not available, + * -ENOMEM on allocation failure), 0 on success + */ +int vfio_pci_ims_set_emulated(struct vfio_device *vdev, unsigned int start, + unsigned int count) +{ + struct vfio_pci_ims *ims = &vdev->ims; + struct vfio_pci_ims_ctx *ctx; + unsigned long i, j; + int ret = 0; + + mutex_lock(&ims->ctx_mutex); + + for (i = start; i < start + count; i++) { + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT); + if (!ctx) { + ret = -ENOMEM; + goto out_err; + } + ctx->emulated = true; + ret = xa_insert(&ims->ctx, i, ctx, GFP_KERNEL_ACCOUNT); + if (ret) { + kfree(ctx); + goto out_err; + } + } + + mutex_unlock(&ims->ctx_mutex); + return 0; + +out_err: + for (j = start; j < i; j++) { + ctx = xa_load(&ims->ctx, j); + xa_erase(&ims->ctx, j); + kfree(ctx); + } + mutex_unlock(&ims->ctx_mutex); + + return ret; +} +EXPORT_SYMBOL_GPL(vfio_pci_ims_set_emulated); + MODULE_LICENSE("GPL"); MODULE_AUTHOR("Intel Corporation"); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index aa54239bff4d..906220002ff4 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -334,6 +334,9 @@ int vfio_pci_set_ims_trigger(struct vfio_device *vdev, unsigned int index, void vfio_pci_ims_init(struct vfio_device *vdev, struct pci_dev *pdev, union msi_instance_cookie *default_cookie); void vfio_pci_ims_free(struct vfio_device *vdev); +int vfio_pci_ims_set_emulated(struct vfio_device *vdev, unsigned int start, + unsigned int count); +void vfio_pci_ims_send_signal(struct vfio_device *vdev, unsigned int vector); int vfio_pci_ims_set_cookie(struct vfio_device *vdev, unsigned int vector, union msi_instance_cookie *icookie); #else @@ -353,6 +356,17 @@ static inline void vfio_pci_ims_init(struct vfio_device *vdev, static inline void vfio_pci_ims_free(struct vfio_device *vdev) {} +static inline int vfio_pci_ims_set_emulated(struct vfio_device *vdev, + unsigned int start, + unsigned int count) +{ + return -EOPNOTSUPP; +} + +static inline void vfio_pci_ims_send_signal(struct vfio_device *vdev, + unsigned int vector) +{} + static inline int vfio_pci_ims_set_cookie(struct vfio_device *vdev, unsigned int vector, union msi_instance_cookie *icookie)
Access from a guest to a virtual device may be either 'direct-path', where the guest interacts directly with the underlying hardware, or 'intercepted path' where the virtual device emulates operations. Support emulated interrupts that can be used to handle 'intercepted path' operations. For example, a virtual device may use 'intercepted path' for configuration. Doing so, configuration requests intercepted by the virtual device driver are handled within the virtual device driver with completion signaled to the guest without interacting with the underlying hardware. Add vfio_pci_ims_set_emulated() and vfio_pci_ims_send_signal() to the VFIO PCI IMS API. vfio_pci_ims_set_emulated() configures a range of interrupts that are emulated. Any range of interrupts can be configured as emulated as long as no IMS interrupt has previously been allocated at that vector. The virtual device uses vfio_pci_ims_send_signal() to trigger interrupts in the guest. Originally-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> --- drivers/vfio/pci/vfio_pci_ims.c | 117 ++++++++++++++++++++++++++++---- include/linux/vfio.h | 14 ++++ 2 files changed, 117 insertions(+), 14 deletions(-)