Message ID | 20250130134346.1754143-9-clg@redhat.com (mailing list archive)
---|---
State | New
Series | vfio: Improve error reporting when MMIO region mapping fails
On Thu, 30 Jan 2025 14:43:45 +0100
Cédric Le Goater <clg@redhat.com> wrote:

> Print a warning if IOMMU address space width is smaller than the
> physical address width. In this case, PCI peer-to-peer transactions on
> BARs are not supported and failures of device MMIO regions are to be
> expected.
>
> This can occur with the 39-bit IOMMU address space width as found on
> consumer grade processors or when using a vIOMMU device with default
> settings.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
>  hw/vfio/common.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 5c9d8657d746ce30af5ae8f9122101e086a61ef5..e5ef93c589b2bed68f790608868ec3c7779d4cb8 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -44,6 +44,8 @@
>  #include "migration/qemu-file.h"
>  #include "system/tpm.h"
>
> +#include "hw/core/cpu.h"
> +
>  VFIODeviceList vfio_device_list =
>      QLIST_HEAD_INITIALIZER(vfio_device_list);
>  static QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces =
> @@ -1546,12 +1548,28 @@ retry:
>      return info;
>  }
>
> +static bool vfio_device_check_address_space(VFIODevice *vbasedev, Error **errp)
> +{
> +    uint32_t cpu_aw_bits = cpu_get_phys_bits(first_cpu);
> +    uint32_t iommu_aw_bits = vfio_device_get_aw_bits(vbasedev);
> +
> +    if (cpu_aw_bits && cpu_aw_bits > iommu_aw_bits) {

Should we be testing the 64-bit MMIO window and maximum RAM GPA rather than
the vCPU physical address width? Maybe we're just stuck with an indirect
solution currently.

AIUI, we're testing the vCPU physical address width because (obviously) all
devices and memory need to be addressable within that address space.
However, as we've explored offline, there are bare metal systems where the
CPU physical address width exceeds the IOMMU address width and this is not
a fundamental problem. It places restrictions on the maximum RAM physical
address and the location of the 64-bit MMIO space.

RAM tends not to be a problem on these bare metal systems since they
physically cannot hold enough RAM to exceed the IOMMU width and, for the
most part, we map RAM starting from the bottom of the address space. So
long as the VMM doesn't decide to map RAM at the top of the physical
address space, this doesn't become a problem.

However, we've decided to do exactly that for the 64-bit MMIO window. It's
not that the vCPU width being greater than the IOMMU width is a fundamental
problem, it's that we've chosen a 64-bit MMIO policy that makes this become
a problem and QEMU only has a convenient mechanism to check the host IOMMU
width when a vfio device is present.

Arguably, when a vIOMMU is present guest firmware should be enlightened to
understand the address width of that vIOMMU when placing the 64-bit MMIO
window. I'd say the failure to do so is a current firmware bug. If the
vIOMMU address width were honored, we could recommend users set that to
match the host, regardless of the vCPU physical address width. We also
already have a failure condition if the vIOMMU address width exceeds the
vfio-device (ie. indirectly the host) IOMMU width.

Of course without a vIOMMU, given our 64-bit MMIO policy, we don't have a
good solution without specifying the 64-bit window or implementing a more
conservative placement.
Not sure how much of this is immediately solvable and some indication to
the user how they can resolve the issue, such as implemented here, is
better than none, but maybe we can elaborate in a comment that this is
really more of a workaround for the current behavior of firmware relative
to the 64-bit MMIO placement policy. Thanks,

Alex

> +        error_setg(errp, "Host physical address space (%u) is larger than "
> +                   "the host IOMMU address space (%u).", cpu_aw_bits,
> +                   iommu_aw_bits);
> +        vfio_device_error_append(vbasedev, errp);
> +        return false;
> +    }
> +    return true;
> +}
> +
>  bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>                          AddressSpace *as, Error **errp)
>  {
>      const VFIOIOMMUClass *ops =
>          VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
>      HostIOMMUDevice *hiod = NULL;
> +    Error *local_err = NULL;
>
>      if (vbasedev->iommufd) {
>          ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
> @@ -1571,6 +1589,9 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>          return false;
>      }
>
> +    if (!vfio_device_check_address_space(vbasedev, &local_err)) {
> +        warn_report_err(local_err);
> +    }
>      return true;
>  }
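To make the 64-bit MMIO placement problem concrete, here is a small standalone C sketch. It is not QEMU code: the 46-bit vCPU physical address width, 39-bit host IOMMU width, 512 GiB window size, and the top-of-address-space placement are assumed example values chosen to match the scenario described in the review.

```c
/*
 * Standalone illustration of the address-width mismatch discussed above.
 * Not QEMU code: the widths and window size are assumed example values
 * (e.g. a 46-bit vCPU physical address space vs. a 39-bit host IOMMU,
 * as found on consumer-grade processors).
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const unsigned cpu_aw_bits   = 46;            /* assumed vCPU physical address width */
    const unsigned iommu_aw_bits = 39;            /* assumed host IOMMU address width */
    const uint64_t mmio64_size   = 512ULL << 30;  /* assumed 512 GiB 64-bit MMIO window */

    uint64_t cpu_limit   = 1ULL << cpu_aw_bits;    /* first GPA beyond the vCPU width */
    uint64_t iommu_limit = 1ULL << iommu_aw_bits;  /* first GPA the IOMMU cannot map */

    /* The placement policy discussed above: put the 64-bit MMIO window
     * at the top of the vCPU physical address space. */
    uint64_t mmio64_base = cpu_limit - mmio64_size;

    printf("64-bit MMIO window: [0x%016" PRIx64 ", 0x%016" PRIx64 ")\n",
           mmio64_base, cpu_limit);
    printf("IOMMU can only map GPAs below 0x%016" PRIx64 "\n", iommu_limit);

    if (mmio64_base >= iommu_limit) {
        /* BARs placed in this window cannot be mapped through the IOMMU,
         * so VFIO DMA mappings needed for peer-to-peer fail. */
        printf("Window lies entirely above the IOMMU limit: "
               "P2P mappings of these BARs will fail.\n");
    } else {
        printf("Window is reachable by the IOMMU.\n");
    }
    return 0;
}
```

With these example numbers the whole window sits above what a 39-bit IOMMU can map, which is exactly the situation the patch's new warning is meant to flag.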
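For comparison, below is a rough standalone sketch of the direction the review raises: checking the guest addresses that actually need IOMMU mappings (the maximum RAM GPA and the end of the 64-bit MMIO window) against the IOMMU width, instead of comparing vCPU and IOMMU address widths. This is not a proposed patch and does not use QEMU APIs; the layout values in `main()` are assumed examples.

```c
/*
 * Sketch of the alternative check discussed in the review, as a
 * standalone illustration. All values are assumed examples; the real
 * check would need the machine's actual RAM and 64-bit MMIO layout.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool gpas_fit_iommu(uint64_t max_ram_gpa, uint64_t mmio64_end,
                           unsigned iommu_aw_bits)
{
    uint64_t iommu_limit = 1ULL << iommu_aw_bits;

    /* Both RAM and 64-bit MMIO BARs must be mappable through the IOMMU
     * for VFIO DMA and peer-to-peer to work. */
    return max_ram_gpa < iommu_limit && mmio64_end <= iommu_limit;
}

int main(void)
{
    /* Assumed example layout: 64 GiB of RAM mapped from the bottom of the
     * address space, 64-bit MMIO window ending at the top of a 46-bit
     * vCPU physical address space. */
    uint64_t max_ram_gpa = (64ULL << 30) - 1;
    uint64_t mmio64_end  = 1ULL << 46;

    printf("fits 39-bit IOMMU: %d\n", gpas_fit_iommu(max_ram_gpa, mmio64_end, 39));
    printf("fits 48-bit IOMMU: %d\n", gpas_fit_iommu(max_ram_gpa, mmio64_end, 48));
    return 0;
}
```

This form only warns when some GPA that must be DMA-mapped is actually out of reach, rather than whenever the vCPU width exceeds the IOMMU width, which is the distinction the review draws between the indirect check in the patch and the underlying placement policy.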