Message ID | CACcEcgQq_yxvjAo7BticTw6ne8S2uUjbCFxPTnWHT24oMkxf=w@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Supporting VFIO on nVidia's Orin platform | expand |
On 2024/9/29 23:02, Michael Williamson wrote: > Hello, > > I've been trying to get VFIO working on nVidia's Orin platform (ARM64) in > order to support moving data efficiently off of an attached FPGA PCIe board > using the SMMU from a user space application. We have a full application > working on x86/x64 boxes that properly support the VFIO interface. We're > trying to port support to the Orin. > > I'm on nVidia's 5.15.36 branch. It doesn't work out of the box, as the > tegra194-pcie platform controller is lumped in the same iommu group as the > actual PCIe card. The acs override patch didn't help to separate them. > > I have a patch below that *seems* to work for us, but I will admit I do not > know the implications of what I am doing here. your below patch is to pass the vfio_dev_viable() check I suppose. If you are sure the tegra194-pcie platform controller will not issue DMA, then it is fine. If not, you should be careful about it. > Can anyone let me know if this is (and why it is) a bad idea, and what really > needs to be done? Or if this is the wrong mailing list, point me in the right > direction? this is the right place to ask. +NV folks I know. > Thanks, > Mike > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > index 818e47fc0896..a598a2204781 100644 > --- a/drivers/vfio/vfio.c > +++ b/drivers/vfio/vfio.c > @@ -638,8 +638,15 @@ static struct vfio_device > *vfio_group_get_device(struct vfio_group *group, > * breaks anything, it only does so for user owned devices downstream. Note > * that error notification via MSI can be affected for platforms that handle > * MSI within the same IOVA space as DMA. > + * > + * [MAW] - the tegra194-pcie driver is a platform PCie device controller and > + * fails the dev_is_pci() check below. Not sure if it's because its grouping > + * needs to be reworked, but I don't know how this is (or if it > should be) done. > + * This is a hack to see if we can get it going well enough to use the > + * SMMU from user space. The other two devices (for the Orin) in the group > + * are the host bridge and the PCIe card itself. > */ > -static const char * const vfio_driver_allowed[] = { "pci-stub" }; > +static const char * const vfio_driver_allowed[] = { "pci-stub", > "tegra194-pcie" }; > > static bool vfio_dev_driver_allowed(struct device *dev, > struct device_driver *drv) > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index 66bbb125d761..e34fbe17ae1a 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -45,7 +45,8 @@ > #define DRIVER_AUTHOR "Alex Williamson <alex.williamson@redhat.com>" > #define DRIVER_DESC "Type1 IOMMU driver for VFIO" > > -static bool allow_unsafe_interrupts; > +/** [MAW] - hack, need this set for Orin test, not compiled is module > currently */ > +static bool allow_unsafe_interrupts = true; > module_param_named(allow_unsafe_interrupts, > allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR); > MODULE_PARM_DESC(allow_unsafe_interrupts, > @@ -1733,8 +1734,18 @@ static int vfio_bus_type(struct device *dev, void *data) > { > struct bus_type **bus = data; > > - if (*bus && *bus != dev->bus) > + /** > + * [MAW] - hack. the orin tegra194-pcie is in this group and > + * reports in as bus-type of "platform". We will ignore it > + * in an attempt to get vfio to play along. > + */ > + if (!strcmp(dev->bus->name,"platform")) { > + return 0; > + } > + > + if (*bus && *bus != dev->bus) { > return -EINVAL; > + } > > *bus = dev->bus; >
On Mon, Sep 30, 2024 at 12:00 AM Yi Liu <yi.l.liu@intel.com> wrote: > > On 2024/9/29 23:02, Michael Williamson wrote: > > Hello, > > > > I've been trying to get VFIO working on nVidia's Orin platform (ARM64) in > > order to support moving data efficiently off of an attached FPGA PCIe board > > using the SMMU from a user space application. We have a full application > > working on x86/x64 boxes that properly support the VFIO interface. We're > > trying to port support to the Orin. > > > > I'm on nVidia's 5.15.36 branch. It doesn't work out of the box, as the > > tegra194-pcie platform controller is lumped in the same iommu group as the > > actual PCIe card. The acs override patch didn't help to separate them. > > > > I have a patch below that *seems* to work for us, but I will admit I do not > > know the implications of what I am doing here. > > your below patch is to pass the vfio_dev_viable() check I suppose. If you yes. > are sure the tegra194-pcie platform controller will not issue DMA, then it > is fine. If not, you should be careful about it. > There are multiple tegra194-pcie platform controllers defined in the dts. This one only has the one PCIe slot we are working with and nothing else. There are other instances of the tegra194-pcie controllers that have control over other PCIe devices in the system. There appear to be two main smmu controllers that support multiple masters. The tegra194-pcie platform controller tied to the PCIe bus I want has a unique smmu/iommu master ID but is sharing the main smmu controller with other devices. However, they are all assigned to different iommu groups, so I think this is OK? > > Can anyone let me know if this is (and why it is) a bad idea, and what really > > needs to be done? Or if this is the wrong mailing list, point me in the right > > direction? > > this is the right place to ask. +NV folks I know. > I appreciate the insight. Thank you. > > Thanks, > > Mike > > > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > > index 818e47fc0896..a598a2204781 100644 > > --- a/drivers/vfio/vfio.c > > +++ b/drivers/vfio/vfio.c > > @@ -638,8 +638,15 @@ static struct vfio_device > > *vfio_group_get_device(struct vfio_group *group, > > * breaks anything, it only does so for user owned devices downstream. Note > > * that error notification via MSI can be affected for platforms that handle > > * MSI within the same IOVA space as DMA. > > + * > > + * [MAW] - the tegra194-pcie driver is a platform PCie device controller and > > + * fails the dev_is_pci() check below. Not sure if it's because its grouping > > + * needs to be reworked, but I don't know how this is (or if it > > should be) done. > > + * This is a hack to see if we can get it going well enough to use the > > + * SMMU from user space. The other two devices (for the Orin) in the group > > + * are the host bridge and the PCIe card itself. > > */ > > -static const char * const vfio_driver_allowed[] = { "pci-stub" }; > > +static const char * const vfio_driver_allowed[] = { "pci-stub", > > "tegra194-pcie" }; > > > > static bool vfio_dev_driver_allowed(struct device *dev, > > struct device_driver *drv) > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > > index 66bbb125d761..e34fbe17ae1a 100644 > > --- a/drivers/vfio/vfio_iommu_type1.c > > +++ b/drivers/vfio/vfio_iommu_type1.c > > @@ -45,7 +45,8 @@ > > #define DRIVER_AUTHOR "Alex Williamson <alex.williamson@redhat.com>" > > #define DRIVER_DESC "Type1 IOMMU driver for VFIO" > > > > -static bool allow_unsafe_interrupts; > > +/** [MAW] - hack, need this set for Orin test, not compiled is module > > currently */ > > +static bool allow_unsafe_interrupts = true; > > module_param_named(allow_unsafe_interrupts, > > allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR); > > MODULE_PARM_DESC(allow_unsafe_interrupts, > > @@ -1733,8 +1734,18 @@ static int vfio_bus_type(struct device *dev, void *data) > > { > > struct bus_type **bus = data; > > > > - if (*bus && *bus != dev->bus) > > + /** > > + * [MAW] - hack. the orin tegra194-pcie is in this group and > > + * reports in as bus-type of "platform". We will ignore it > > + * in an attempt to get vfio to play along. > > + */ > > + if (!strcmp(dev->bus->name,"platform")) { > > + return 0; > > + } > > + > > + if (*bus && *bus != dev->bus) { > > return -EINVAL; > > + } > > > > *bus = dev->bus; > > > > -- > Regards, > Yi Liu
On Mon, Sep 30, 2024 at 05:27:33PM -0400, Michael Williamson wrote: > On Mon, Sep 30, 2024 at 12:00 AM Yi Liu <yi.l.liu@intel.com> wrote: > > > > On 2024/9/29 23:02, Michael Williamson wrote: > > > Hello, > > > > > > I've been trying to get VFIO working on nVidia's Orin platform (ARM64) in > > > order to support moving data efficiently off of an attached FPGA PCIe board > > > using the SMMU from a user space application. We have a full application > > > working on x86/x64 boxes that properly support the VFIO interface. We're > > > trying to port support to the Orin. > > > > > > I'm on nVidia's 5.15.36 branch. It doesn't work out of the box, as the > > > tegra194-pcie platform controller is lumped in the same iommu group as the > > > actual PCIe card. The acs override patch didn't help to separate them. > > > > > > I have a patch below that *seems* to work for us, but I will admit I do not > > > know the implications of what I am doing here. > > > > your below patch is to pass the vfio_dev_viable() check I suppose. If you > > yes. > > > are sure the tegra194-pcie platform controller will not issue DMA, then it > > is fine. If not, you should be careful about it. > > > > There are multiple tegra194-pcie platform controllers defined in the dts. This > one only has the one PCIe slot we are working with and nothing else. There > are other instances of the tegra194-pcie controllers that have control over > other PCIe devices in the system. There appear to be two main smmu controllers > that support multiple masters. The tegra194-pcie platform controller tied > to the PCIe bus I want has a unique smmu/iommu master ID but is sharing the main > smmu controller with other devices. However, they are all assigned to > different iommu groups, so I think this is OK? > > > > Can anyone let me know if this is (and why it is) a bad idea, and what really > > > needs to be done? Or if this is the wrong mailing list, point me in the right > > > direction? > > > > this is the right place to ask. +NV folks I know. > > > > I appreciate the insight. Thank you. I am not quite sure about tegra194-pcie. +Vidya who's the owner. Nicolin > > > Thanks, > > > Mike > > > > > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > > > index 818e47fc0896..a598a2204781 100644 > > > --- a/drivers/vfio/vfio.c > > > +++ b/drivers/vfio/vfio.c > > > @@ -638,8 +638,15 @@ static struct vfio_device > > > *vfio_group_get_device(struct vfio_group *group, > > > * breaks anything, it only does so for user owned devices downstream. Note > > > * that error notification via MSI can be affected for platforms that handle > > > * MSI within the same IOVA space as DMA. > > > + * > > > + * [MAW] - the tegra194-pcie driver is a platform PCie device controller and > > > + * fails the dev_is_pci() check below. Not sure if it's because its grouping > > > + * needs to be reworked, but I don't know how this is (or if it > > > should be) done. > > > + * This is a hack to see if we can get it going well enough to use the > > > + * SMMU from user space. The other two devices (for the Orin) in the group > > > + * are the host bridge and the PCIe card itself. > > > */ > > > -static const char * const vfio_driver_allowed[] = { "pci-stub" }; > > > +static const char * const vfio_driver_allowed[] = { "pci-stub", > > > "tegra194-pcie" }; > > > > > > static bool vfio_dev_driver_allowed(struct device *dev, > > > struct device_driver *drv) > > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > > > index 66bbb125d761..e34fbe17ae1a 100644 > > > --- a/drivers/vfio/vfio_iommu_type1.c > > > +++ b/drivers/vfio/vfio_iommu_type1.c > > > @@ -45,7 +45,8 @@ > > > #define DRIVER_AUTHOR "Alex Williamson <alex.williamson@redhat.com>" > > > #define DRIVER_DESC "Type1 IOMMU driver for VFIO" > > > > > > -static bool allow_unsafe_interrupts; > > > +/** [MAW] - hack, need this set for Orin test, not compiled is module > > > currently */ > > > +static bool allow_unsafe_interrupts = true; > > > module_param_named(allow_unsafe_interrupts, > > > allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR); > > > MODULE_PARM_DESC(allow_unsafe_interrupts, > > > @@ -1733,8 +1734,18 @@ static int vfio_bus_type(struct device *dev, void *data) > > > { > > > struct bus_type **bus = data; > > > > > > - if (*bus && *bus != dev->bus) > > > + /** > > > + * [MAW] - hack. the orin tegra194-pcie is in this group and > > > + * reports in as bus-type of "platform". We will ignore it > > > + * in an attempt to get vfio to play along. > > > + */ > > > + if (!strcmp(dev->bus->name,"platform")) { > > > + return 0; > > > + } > > > + > > > + if (*bus && *bus != dev->bus) { > > > return -EINVAL; > > > + } > > > > > > *bus = dev->bus; > > > > > > > -- > > Regards, > > Yi Liu > > > > -- > Mike Williamson > OpenPGP Public Key
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 818e47fc0896..a598a2204781 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -638,8 +638,15 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group, * breaks anything, it only does so for user owned devices downstream. Note * that error notification via MSI can be affected for platforms that handle * MSI within the same IOVA space as DMA. + * + * [MAW] - the tegra194-pcie driver is a platform PCie device controller and + * fails the dev_is_pci() check below. Not sure if it's because its grouping + * needs to be reworked, but I don't know how this is (or if it should be) done. + * This is a hack to see if we can get it going well enough to use the + * SMMU from user space. The other two devices (for the Orin) in the group + * are the host bridge and the PCIe card itself. */ -static const char * const vfio_driver_allowed[] = { "pci-stub" }; +static const char * const vfio_driver_allowed[] = { "pci-stub", "tegra194-pcie" }; static bool vfio_dev_driver_allowed(struct device *dev, struct device_driver *drv) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 66bbb125d761..e34fbe17ae1a 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -45,7 +45,8 @@ #define DRIVER_AUTHOR "Alex Williamson <alex.williamson@redhat.com>" #define DRIVER_DESC "Type1 IOMMU driver for VFIO" -static bool allow_unsafe_interrupts; +/** [MAW] - hack, need this set for Orin test, not compiled is module currently */ +static bool allow_unsafe_interrupts = true; module_param_named(allow_unsafe_interrupts, allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);