Message ID | 20230526231558.1660396-1-alex.williamson@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | vfio/pci: Atomic Ops completer support | expand |
Hi Alex, Thanks for taking the time to implement support for Atomic Op completer support properly :). I have tested out these patches and the kernel patch, and apart from a minor issue with patch 2 everything works fine; ROCm programs that use device->host atomic operations work properly. Something that I have been thinking about, are there any implications involved with enabling this feature automatically with no possibility of turning it off? I have no use case for that, though, and I cant really think of a reason other than preventing the guest from finding out hardware details about the host. Thanks, Robin Voetter, Stream HPC
On Wed, 31 May 2023 23:55:41 +0200 Robin Voetter <robin@streamhpc.com> wrote: > Hi Alex, > > Thanks for taking the time to implement support for Atomic Op completer > support properly :). I have tested out these patches and the kernel > patch, and apart from a minor issue with patch 2 everything works fine; Yes, Cedric noted the extra semicolon as well, I'm about to post that patch as non-RFC since it has no dependencies on anything else. > ROCm programs that use device->host atomic operations work properly. Great, thanks for testing! > Something that I have been thinking about, are there any implications > involved with enabling this feature automatically with no possibility of > turning it off? I have no use case for that, though, and I cant really > think of a reason other than preventing the guest from finding out > hardware details about the host. Not sure I follow, are you suggesting that Atomic Ops completion support is proactively enabled in the VM to match the host, regardless of attached assigned devices? An obvious problem there is migration. If I start a VM on a host with Atomic Ops support and want to migrate to a host without Atomic Ops support, config space of the root ports is now different and the migration fails. QEMU would also require elevated privileges to read config space to determine the host support, and then what does it do if only some of the PCIe hierarchy supports Atomic Ops. Policy decisions like that are generally left to management tools, so there would always be a means to enable or disable the feature. In fact, that's specifically why I test that the Atomic Op completer bits are unset in the root port before changing them so that this automatic enablement could live alongside a command line option to statically enable some bits. That does however remind me that it is often good with these sorts of "clever" automatic features to have an opt-out, so I'll likely add an x-no-rp-atomics device option in the next version to provide that. Thanks, Alex
On 1/6/23 00:24, Alex Williamson wrote: > On Wed, 31 May 2023 23:55:41 +0200 > Robin Voetter <robin@streamhpc.com> wrote: > >> Hi Alex, >> >> Thanks for taking the time to implement support for Atomic Op completer >> support properly :). I have tested out these patches and the kernel >> patch, and apart from a minor issue with patch 2 everything works fine; > > Yes, Cedric noted the extra semicolon as well, I'm about to post that > patch as non-RFC since it has no dependencies on anything else. > Policy decisions like that are generally left to management tools, so > there would always be a means to enable or disable the feature. In > fact, that's specifically why I test that the Atomic Op completer bits > are unset in the root port before changing them so that this automatic > enablement could live alongside a command line option to statically > enable some bits. > > That does however remind me that it is often good with these sorts of > "clever" automatic features to have an opt-out, so I'll likely add an > x-no-rp-atomics device option in the next version to provide that. I was just going to suggest that, just in case :)
On 27/5/23 01:15, Alex Williamson wrote: > This RFC proposes to allow a vfio-pci device to manipulate the PCI > Express capability of an associated root port to enable Atomic Op > completer support as equivalent to host capabilities. This would [...] > While it's not exactly standard practice to modify root port device > capabilities runtime, it also does not seem to be precluded by the PCIe > Spec (6.0.1). The Atomic Op completion bits of the DEVCAP2 register > are defined as Read-only: > > 7.4 Configuration Register Types > Read-only - Register bits are read-only and cannot be altered by software. > Where explicitly defined, these bits are used to reflect changing > hardware state, and as a result bit values can be observed to > change at run time. Register bit default values and bits that > cannot change value at run time, are permitted to be hard-coded, > initialized by system/device firmware, or initialized by hardware > mechanisms such as pin strapping or nonvolatile storage. > Initialization by system firmware is permitted only for system- > integrated devices. If the optional feature that would Set the > bits is not implemented, the bits must be hardwired to Zero. > > Here "altered by software" is relative to guest writes to the config > space register, whereas in this implementation we're acting as hardware > and the bits are changing to reflect a change in runtime capabilities. > The spec does include a HwInit register type which would restrict the > value from changing at runtime outside of resets. Therefore while it > would not be advised to update these bits arbitrarily, it does seem safe > and compatible with guest software to update the value on device attach > and detach. From my previous (short) PCIe experience, this is also my understanding.
On 6/1/23 00:24, Alex Williamson wrote: > On Wed, 31 May 2023 23:55:41 +0200 > Robin Voetter <robin@streamhpc.com> wrote: >> Something that I have been thinking about, are there any implications >> involved with enabling this feature automatically with no possibility of >> turning it off? I have no use case for that, though, and I cant really >> think of a reason other than preventing the guest from finding out >> hardware details about the host. > > Not sure I follow, are you suggesting that Atomic Ops completion > support is proactively enabled in the VM to match the host, regardless > of attached assigned devices? An obvious problem there is migration. > If I start a VM on a host with Atomic Ops support and want to migrate > to a host without Atomic Ops support, config space of the root ports is > now different and the migration fails. QEMU would also require > elevated privileges to read config space to determine the host support, > and then what does it do if only some of the PCIe hierarchy supports > Atomic Ops. > > Policy decisions like that are generally left to management tools, so > there would always be a means to enable or disable the feature. In > fact, that's specifically why I test that the Atomic Op completer bits > are unset in the root port before changing them so that this automatic > enablement could live alongside a command line option to statically > enable some bits. > > That does however remind me that it is often good with these sorts of > "clever" automatic features to have an opt-out, so I'll likely add an > x-no-rp-atomics device option in the next version to provide that. Yes, something like that that is exactly what I was thinking about. Thanks, Robin Voetter
Hi Markus, Marc-André, > On 1/6/23 00:24, Alex Williamson wrote: >> Policy decisions like that are generally left to management tools, so >> there would always be a means to enable or disable the feature. In >> fact, that's specifically why I test that the Atomic Op completer bits >> are unset in the root port before changing them so that this automatic >> enablement could live alongside a command line option to statically >> enable some bits. >> >> That does however remind me that it is often good with these sorts of >> "clever" automatic features to have an opt-out, so I'll likely add an >> x-no-rp-atomics device option in the next version to provide that. Still thinking about this series I remembered since commit a3c45b3e62 ("qapi: New special feature flag "unstable"") the "x-" prefix is obsolete (superseded) in QAPI generated code. I wonder about device properties: $ git grep -F '"x-' hw | wc -l 130 $ git grep -F '"x-' hw/vfio/pci.c hw/vfio/pci.c:2987: error_setg(errp, "x-balloon-allowed only potentially compatible " hw/vfio/pci.c:3335: DEFINE_PROP_ON_OFF_AUTO("x-pre-copy-dirty-page-tracking", VFIOPCIDevice, hw/vfio/pci.c:3342: DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice, hw/vfio/pci.c:3344: DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features, hw/vfio/pci.c:3346: DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features, hw/vfio/pci.c:3348: DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features, hw/vfio/pci.c:3350: DEFINE_PROP_BOOL("x-enable-migration", VFIOPCIDevice, hw/vfio/pci.c:3352: DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), hw/vfio/pci.c:3353: DEFINE_PROP_BOOL("x-balloon-allowed", VFIOPCIDevice, hw/vfio/pci.c:3355: DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false), hw/vfio/pci.c:3356: DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false), hw/vfio/pci.c:3357: DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false), hw/vfio/pci.c:3358: DEFINE_PROP_BOOL("x-no-geforce-quirks", VFIOPCIDevice, hw/vfio/pci.c:3360: DEFINE_PROP_BOOL("x-no-kvm-ioeventfd", VFIOPCIDevice, no_kvm_ioeventfd, hw/vfio/pci.c:3362: DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", VFIOPCIDevice, no_vfio_ioeventfd, hw/vfio/pci.c:3364: DEFINE_PROP_UINT32("x-pci-vendor-id", VFIOPCIDevice, vendor_id, PCI_ANY_ID), hw/vfio/pci.c:3365: DEFINE_PROP_UINT32("x-pci-device-id", VFIOPCIDevice, device_id, PCI_ANY_ID), hw/vfio/pci.c:3366: DEFINE_PROP_UINT32("x-pci-sub-vendor-id", VFIOPCIDevice, hw/vfio/pci.c:3368: DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice, hw/vfio/pci.c:3370: DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0), hw/vfio/pci.c:3371: DEFINE_PROP_UNSIGNED_NODEFAULT("x-nv-gpudirect-clique", VFIOPCIDevice, hw/vfio/pci.c:3374: DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo, Is there a plan to use something similar for QOM properties? Is it OK to keep using the "x-" prefix there? Thanks, Phil.