diff mbox series

[v2,8/9] vfio: Check compatibility of CPU and IOMMU address space width

Message ID 20250130134346.1754143-9-clg@redhat.com (mailing list archive)
State New
Headers show
Series vfio: Improve error reporting when MMIO region mapping fails | expand

Commit Message

Cédric Le Goater Jan. 30, 2025, 1:43 p.m. UTC
Print a warning if IOMMU address space width is smaller than the
physical address width. In this case, PCI peer-to-peer transactions on
BARs are not supported and failures of device MMIO regions are to be
expected.

This can occur with the 39-bit IOMMU address space width as found on
consumer grade processors or when using a vIOMMU device with default
settings.

Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 hw/vfio/common.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Alex Williamson Jan. 30, 2025, 6:58 p.m. UTC | #1
On Thu, 30 Jan 2025 14:43:45 +0100
Cédric Le Goater <clg@redhat.com> wrote:

> Print a warning if IOMMU address space width is smaller than the
> physical address width. In this case, PCI peer-to-peer transactions on
> BARs are not supported and failures of device MMIO regions are to be
> expected.
> 
> This can occur with the 39-bit IOMMU address space width as found on
> consumer grade processors or when using a vIOMMU device with default
> settings.
> 
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
>  hw/vfio/common.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 5c9d8657d746ce30af5ae8f9122101e086a61ef5..e5ef93c589b2bed68f790608868ec3c7779d4cb8 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -44,6 +44,8 @@
>  #include "migration/qemu-file.h"
>  #include "system/tpm.h"
>  
> +#include "hw/core/cpu.h"
> +
>  VFIODeviceList vfio_device_list =
>      QLIST_HEAD_INITIALIZER(vfio_device_list);
>  static QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces =
> @@ -1546,12 +1548,28 @@ retry:
>      return info;
>  }
>  
> +static bool vfio_device_check_address_space(VFIODevice *vbasedev, Error **errp)
> +{
> +    uint32_t cpu_aw_bits = cpu_get_phys_bits(first_cpu);
> +    uint32_t iommu_aw_bits = vfio_device_get_aw_bits(vbasedev);
> +
> +    if (cpu_aw_bits && cpu_aw_bits > iommu_aw_bits) {

Should we be testing the 64-bit MMIO window and maximum RAM GPA rather
than the vCPU physical address width?

Maybe we're just stuck with an indirect solution currently.  AIUI,
we're testing the vCPU physical address width because (obviously) all
devices and memory need to be addressable within that address space.
However, as we've explored offline, there are bare metal systems where
the CPU physical address width exceeds the IOMMU address width and this
is not a fundamental problem.  It places restrictions on the maximum
RAM physical address and the location of the 64-bit MMIO space.

RAM tends not to be a problem on these bare metal systems since they
physically cannot hold enough RAM to exceed the IOMMU width and, for
the most part, we map RAM starting from the bottom of the address
space.  So long as the VMM doesn't decide to map RAM at the top of the
physical address space, this doesn't become a problem.

However, we've decided to do exactly that for the 64-bit MMIO window.
It's not that the vCPU width being greater than the IOMMU width is a
fundamental problem, it's that we've chosen a 64-bit MMIO policy that
makes this become a problem and QEMU only has a convenient mechanism to
check the host IOMMU width when a vfio device is present.  Arguably,
when a vIOMMU is present guest firmware should be enlightened to
understand the address width of that vIOMMU when placing the 64-bit
MMIO window.  I'd say the failure to do so is a current firmware bug.

If the vIOMMU address width were honored, we could recommend users set
that to match the host, regardless of the vCPU physical address width.
We also already have a failure condition if the vIOMMU address width
exceeds the vfio-device (ie. indirectly the host) IOMMU width.

Of course without a vIOMMU, given our 64-bit MMIO policy, we don't have
a good solution without specifying the 64-bit window or implementing a
more conservative placement.

Not sure how much of this is immediately solvable and some indication
to the user how they can resolve the issue, such as implemented here, is
better than none, but maybe we can elaborate in a comment that this is
really more of a workaround for the current behavior of firmware
relative to the 64-bit MMIO placement policy.  Thanks,

Alex

> +        error_setg(errp, "Host physical address space (%u) is larger than "
> +                   "the host IOMMU address space (%u).", cpu_aw_bits,
> +                   iommu_aw_bits);
> +        vfio_device_error_append(vbasedev, errp);
> +        return false;
> +    }
> +    return true;
> +}
> +
>  bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>                          AddressSpace *as, Error **errp)
>  {
>      const VFIOIOMMUClass *ops =
>          VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
>      HostIOMMUDevice *hiod = NULL;
> +    Error *local_err = NULL;
>  
>      if (vbasedev->iommufd) {
>          ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
> @@ -1571,6 +1589,9 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>          return false;
>      }
>  
> +    if (!vfio_device_check_address_space(vbasedev, &local_err)) {
> +        warn_report_err(local_err);
> +    }
>      return true;
>  }
>
diff mbox series

Patch

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 5c9d8657d746ce30af5ae8f9122101e086a61ef5..e5ef93c589b2bed68f790608868ec3c7779d4cb8 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -44,6 +44,8 @@ 
 #include "migration/qemu-file.h"
 #include "system/tpm.h"
 
+#include "hw/core/cpu.h"
+
 VFIODeviceList vfio_device_list =
     QLIST_HEAD_INITIALIZER(vfio_device_list);
 static QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces =
@@ -1546,12 +1548,28 @@  retry:
     return info;
 }
 
+static bool vfio_device_check_address_space(VFIODevice *vbasedev, Error **errp)
+{
+    uint32_t cpu_aw_bits = cpu_get_phys_bits(first_cpu);
+    uint32_t iommu_aw_bits = vfio_device_get_aw_bits(vbasedev);
+
+    if (cpu_aw_bits && cpu_aw_bits > iommu_aw_bits) {
+        error_setg(errp, "Host physical address space (%u) is larger than "
+                   "the host IOMMU address space (%u).", cpu_aw_bits,
+                   iommu_aw_bits);
+        vfio_device_error_append(vbasedev, errp);
+        return false;
+    }
+    return true;
+}
+
 bool vfio_attach_device(char *name, VFIODevice *vbasedev,
                         AddressSpace *as, Error **errp)
 {
     const VFIOIOMMUClass *ops =
         VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
     HostIOMMUDevice *hiod = NULL;
+    Error *local_err = NULL;
 
     if (vbasedev->iommufd) {
         ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
@@ -1571,6 +1589,9 @@  bool vfio_attach_device(char *name, VFIODevice *vbasedev,
         return false;
     }
 
+    if (!vfio_device_check_address_space(vbasedev, &local_err)) {
+        warn_report_err(local_err);
+    }
     return true;
 }