[v1,0/6] Check and sync host IOMMU cap/ecap with vIOMMU

Message ID 20240228094432.1092748-1-zhenzhong.duan@intel.com (mailing list archive)

Duan, Zhenzhong Feb. 28, 2024, 9:44 a.m. UTC
Hi,

Based on Joao's suggestion, the iommufd nesting prerequisite series [1]
has been further split into a host IOMMU device abstraction part [2] and
a vIOMMU check/sync part. This series implements the second part.

This series lets the vIOMMU obtain host IOMMU cap/ecap information through
a new set/unset_iommu_device interface, so that the vIOMMU can check it
against, or sync it into, its own cap/ecap configuration.

It works by having the device side, e.g., VFIO, register either an
IOMMULegacyDevice or an IOMMUFDDevice with the vIOMMU; the registered object
carries the data needed to achieve that. Currently only VFIO devices are
supported, but the interface could also be used for other devices, e.g., VDPA.

For a coldplugged device, we can get its host IOMMU cap/ecap during QEMU init,
then check it and sync it into the vIOMMU cap/ecap.
For a hotplugged device, the vIOMMU cap/ecap is frozen, so we can only check
against it and are not allowed to update it. If the check fails, the hotplug
fails.
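
A minimal sketch of the coldplug/hotplug rule described above. Everything in
it is an assumption made for illustration: ToyVIOMMU, toy_set_iommu_device and
the cap_frozen flag are hypothetical names, not the structures or callbacks
from the series, and a single address-width value stands in for the full
cap/ecap registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; these are not the QEMU structures. */
typedef struct {
    uint8_t mgaw;     /* address width the vIOMMU advertises to the guest */
    bool cap_frozen;  /* true once init is done and cap may have been read */
} ToyVIOMMU;

/*
 * Called when a device registers its host IOMMU info with the vIOMMU.
 * Returns 0 on success, -1 if the device must be rejected.
 */
static int toy_set_iommu_device(ToyVIOMMU *v, uint8_t host_mgaw)
{
    assert(v != NULL);

    if (host_mgaw >= v->mgaw) {
        return 0;            /* check passes, nothing to sync */
    }
    if (v->cap_frozen) {
        return -1;           /* hotplug: check only, so the plug fails */
    }
    v->mgaw = host_mgaw;     /* coldplug: sync the vIOMMU down to the host */
    return 0;
}
```

In this sketch a coldplugged device with a narrower host width lowers the
advertised width, while the same device hotplugged after the freeze is
rejected and the advertised width stays untouched.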

This is also a prerequisite for the upcoming iommufd nesting series:
'intel_iommu: Enable stage-1 translation'.

I didn't implement cap/ecap sync for the legacy VFIO backend; I would like to
see what Eric wants to put in IOMMULegacyDevice for virtio-iommu and whether
I can reuse some of it.

Because the community's suggestions are becoming clear, I'd like to remove
the RFC tag from this version.

QEMU code can be found at:
https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_part2_v1

[1] https://lore.kernel.org/qemu-devel/20240201072818.327930-1-zhenzhong.duan@intel.com
[2] https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg06314.html

Thanks
Zhenzhong

Changelog:
v1:
- convert HostIOMMUDevice to sub object pointer in vtd_check_hdev

rfcv2:
- introduce common abstract HostIOMMUDevice and sub struct for different BEs (Eric, Cédric)
- remove iommufd_device.[ch] (Cédric)
- remove duplicate iommufd/devid define from VFIODevice (Eric)
- drop the p in aliased_pbus and aliased_pdevfn (Eric)
- assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn (Cédric, Eric)
- use errp in iommufd_device_get_info (Eric)
- split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
- move VTDHostIOMMUDevice declaration to intel_iommu_internal.h (Cédric)
- make '(vtd->cap_reg >> 16) & 0x3fULL' a macro and add the missing '+1' (Cédric)
- block migration if vIOMMU cap/ecap is updated based on host IOMMU cap/ecap
- add R-B
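
As an illustration of the MGAW macro item above, here is a small sketch. It
assumes, as the '+1' in the changelog suggests, that the 6-bit field at bit 16
of the capability register stores the address width minus one. The TOY_* names
and the encode helper are hypothetical; the macro name used in the series is
not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the macro name used in the series is not shown here. */
#define TOY_CAP_MGAW_SHIFT  16
#define TOY_CAP_MGAW_MASK   0x3fULL

/* Decode: the field stores (width - 1), hence the '+1'. */
#define TOY_CAP_MGAW(cap) \
    ((((cap) >> TOY_CAP_MGAW_SHIFT) & TOY_CAP_MGAW_MASK) + 1)

/* Encode a width of 1..64 bits back into a capability value. */
static inline uint64_t toy_cap_set_mgaw(uint64_t cap, unsigned width)
{
    assert(width >= 1 && width <= 64);
    cap &= ~(TOY_CAP_MGAW_MASK << TOY_CAP_MGAW_SHIFT);
    return cap | (((uint64_t)width - 1) << TOY_CAP_MGAW_SHIFT);
}
```

Keeping the '+1' inside the macro means callers always work with the real
width and cannot forget the adjustment at individual call sites.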


Yi Liu (2):
  intel_iommu: Add set/unset_iommu_device callback
  intel_iommu: Add a framework to check and sync host IOMMU cap/ecap

Zhenzhong Duan (4):
  intel_iommu: Extract out vtd_cap_init to initialize cap/ecap
  intel_iommu: Implement check and sync mechanism in iommufd mode
  intel_iommu: Use mgaw instead of s->aw_bits
  intel_iommu: Block migration if cap is updated

 hw/i386/intel_iommu_internal.h |   9 ++
 include/hw/i386/intel_iommu.h  |   4 +
 hw/i386/acpi-build.c           |   3 +-
 hw/i386/intel_iommu.c          | 287 ++++++++++++++++++++++++++-------
 4 files changed, 245 insertions(+), 58 deletions(-)

Jason Wang March 4, 2024, 4:17 a.m. UTC | #1
On Wed, Feb 28, 2024 at 5:46 PM Zhenzhong Duan <zhenzhong.duan@intel.com> wrote:
>
> Hi,
>
> Based on Joao's suggestion, the iommufd nesting prerequisite series [1]
> is further splitted to host IOMMU device abstract part [2] and vIOMMU
> check/sync part. This series implements the 2nd part.
>
> This enables vIOMMU to get host IOMMU cap/ecap information by implementing
> a new set/unset_iommu_device interface, then vIOMMU could check or sync
> with vIOMMU's own cap/ecap config.

Does it mean that it would suppress the cap/ecap config from the qemu
command line? If yes, I wonder how to maintain the migration
compatibility.

Thanks

Duan, Zhenzhong March 4, 2024, 6:13 a.m. UTC | #2
>-----Original Message-----
>From: Jason Wang <jasowang@redhat.com>
>Subject: Re: [PATCH v1 0/6] Check and sync host IOMMU cap/ecap with vIOMMU
>
>Does it mean that it would suppress the cap/ecap config from the qemu
>command line?

No. cap/ecap has two kinds of bits: those not controlled by the command line,
e.g., MGAW, and those that are, e.g., SAGAW, which we initialize through
aw_bits.

I give the QEMU command line higher priority. We only allow updating bits that
are not tied to any command-line option, e.g., updating the vIOMMU MGAW with
the host MGAW.

If a cmdline-controlled cap/ecap bit is incompatible between the host and the
vIOMMU, VFIO device hotplug should fail.
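
A rough sketch of that priority rule, under stated assumptions: ToyCaps,
cmdline_mask and toy_check_and_sync are hypothetical names, and capabilities
are modeled as plain feature bits, which is a simplification (fields such as
MGAW are numeric widths rather than flags).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; not the intel_iommu.c data structures. */
typedef struct {
    uint64_t cap;           /* capability bits the vIOMMU advertises */
    uint64_t cmdline_mask;  /* bits pinned by command-line options */
} ToyCaps;

/*
 * Compare the vIOMMU bits with the host bits.
 * Returns false if the device has to be rejected.
 */
static bool toy_check_and_sync(ToyCaps *v, uint64_t host_cap,
                               bool allow_update)
{
    assert(v != NULL);

    /* Bits the user asked for but the host lacks: always a failure. */
    if (v->cap & v->cmdline_mask & ~host_cap) {
        return false;
    }

    /* Bits not tied to the command line may be dropped, if allowed. */
    uint64_t free_missing = v->cap & ~v->cmdline_mask & ~host_cap;
    if (free_missing) {
        if (!allow_update) {
            return false;
        }
        v->cap &= ~free_missing;
    }
    return true;
}
```

Here allow_update stands for "cap/ecap is not frozen yet"; a mismatch in a
cmdline-controlled bit fails regardless of that flag.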

> If yes, I wonder how to maintain the migration compatibility.

If cap/ecap is updated for the above reason, the patch below blocks migration:

[PATCH v1 6/6] intel_iommu: Block migration if cap is updated

Thanks
Zhenzhong
