Message ID | 20240626-vfio_pci_mmap-v4-3-7f038870f022@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | vfio/pci: s390: Fix issues preventing VFIO_PCI_MMAP=y for s390 and enable it | expand |
On Wed, 26 Jun 2024 13:15:50 +0200 Niklas Schnelle <schnelle@linux.ibm.com> wrote: > When VFIO_PCI_MMAP is enabled for s390 in a future commit and the ISM > device is passed-through to a KVM guest QEMU attempts to eagerly mmap() > its BAR. This fails because the 256 TiB large BAR does not fit in the > virtual map. Besides this issue mmap() of the ISM device's BAR is not > useful either as even a partial mapping won't be usable from user-space > without a vfio-pci variant driver. A previous commit ensures that pdev-> > non_compliant_bars is set for ISM so use this to disallow mmap() with > the expecation that mmap() of non-compliant BARs is not advisable in the > general case either. > > Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> > --- > drivers/vfio/pci/vfio_pci_core.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c > index 987c7921affa..0e9d46575776 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -128,10 +128,9 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev) > > /* > * The PCI core shouldn't set up a resource with a > - * type but zero size. But there may be bugs that > - * cause us to do that. > + * type but zero size or non-compliant BARs. > */ > - if (!resource_size(res)) > + if (!resource_size(res) || vdev->pdev->non_compliant_bars) > goto no_mmap; > > if (resource_size(res) >= PAGE_SIZE) { > The non_compliant_bars flag causes pci_read_bases() to exit, shouldn't that mean the resource is not setup and resource_size() is zero and explicitly testing the non_compliant_bars flag is redundant? Or does s390 do this somewhere else? The non_compliant_bars flag is defined as /* Broken BARs; ignore them */ so it'd be pretty strange if they had a resource size and we chose to still expose them with read-write access... why wouldn't we just deny-list the device from use with vfio-pci? Also probably worth an explicit comment in the commit log why pci-sysfs mmap support doesn't need to be bypassed on s390. Thanks, Alex
On Thu, 2024-06-27 at 16:05 -0600, Alex Williamson wrote: > On Wed, 26 Jun 2024 13:15:50 +0200 > Niklas Schnelle <schnelle@linux.ibm.com> wrote: > > > When VFIO_PCI_MMAP is enabled for s390 in a future commit and the ISM > > device is passed-through to a KVM guest QEMU attempts to eagerly mmap() > > its BAR. This fails because the 256 TiB large BAR does not fit in the > > virtual map. Besides this issue mmap() of the ISM device's BAR is not > > useful either as even a partial mapping won't be usable from user-space > > without a vfio-pci variant driver. A previous commit ensures that pdev-> > > non_compliant_bars is set for ISM so use this to disallow mmap() with > > the expecation that mmap() of non-compliant BARs is not advisable in the > > general case either. > > > > Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> > > --- > > drivers/vfio/pci/vfio_pci_core.c | 5 ++--- > > 1 file changed, 2 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c > > index 987c7921affa..0e9d46575776 100644 > > --- a/drivers/vfio/pci/vfio_pci_core.c > > +++ b/drivers/vfio/pci/vfio_pci_core.c > > @@ -128,10 +128,9 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev) > > > > /* > > * The PCI core shouldn't set up a resource with a > > - * type but zero size. But there may be bugs that > > - * cause us to do that. > > + * type but zero size or non-compliant BARs. > > */ > > - if (!resource_size(res)) > > + if (!resource_size(res) || vdev->pdev->non_compliant_bars) > > goto no_mmap; > > > > if (resource_size(res) >= PAGE_SIZE) { > > > > The non_compliant_bars flag causes pci_read_bases() to exit, shouldn't > that mean the resource is not setup and resource_size() is zero and > explicitly testing the non_compliant_bars flag is redundant? Or does > s390 do this somewhere else? > > The non_compliant_bars flag is defined as /* Broken BARs; ignore them */ > so it'd be pretty strange if they had a resource size and we chose to > still expose them with read-write access... why wouldn't we just > deny-list the device from use with vfio-pci? > > Also probably worth an explicit comment in the commit log why pci-sysfs > mmap support doesn't need to be bypassed on s390. Thanks, > > Alex > Uhh, you're right. With pdev->non_compliant_bars we don't get the resources filled in at all due to the check in pci_read_bases(), the structs are however still allocated. Consequently IORESOURCE_MEM is unset (and resource_size(res) is 0) and the existing IORESOURCE_MEM check already ignores the ISM device here. Funnily enough this doesn't seem to cause issues for the ISM driver itself because of the way it bypasses normal PCI access and doesn't use ioremap() nor the MMIO access helpers. Honestlym I think pdev->non_compliant_bars might still be the right thing. This BAR is special enough that anything other than the ISM driver is probably served best by just ignoring it. We could then just drop this patch. I think this would also take care of pci- sysfs mmap ignoring ISM if s390x were to set HAVE_PCI_MMAP. Thanks, Niklas
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 987c7921affa..0e9d46575776 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -128,10 +128,9 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev) /* * The PCI core shouldn't set up a resource with a - * type but zero size. But there may be bugs that - * cause us to do that. + * type but zero size or non-compliant BARs. */ - if (!resource_size(res)) + if (!resource_size(res) || vdev->pdev->non_compliant_bars) goto no_mmap; if (resource_size(res) >= PAGE_SIZE) {
When VFIO_PCI_MMAP is enabled for s390 in a future commit and the ISM device is passed-through to a KVM guest QEMU attempts to eagerly mmap() its BAR. This fails because the 256 TiB large BAR does not fit in the virtual map. Besides this issue mmap() of the ISM device's BAR is not useful either as even a partial mapping won't be usable from user-space without a vfio-pci variant driver. A previous commit ensures that pdev-> non_compliant_bars is set for ISM so use this to disallow mmap() with the expecation that mmap() of non-compliant BARs is not advisable in the general case either. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> --- drivers/vfio/pci/vfio_pci_core.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)