mbox series

[v5,00/13] iommufd: Add vIOMMU infrastructure (Part-2: vDEVICE)

Message ID cover.1729897278.git.nicolinc@nvidia.com (mailing list archive)
Headers show
Series iommufd: Add vIOMMU infrastructure (Part-2: vDEVICE) | expand

Message

Nicolin Chen Oct. 25, 2024, 11:50 p.m. UTC
Following the previous vIOMMU series, this adds another vDEVICE structure,
representing the association from an iommufd_device to an iommufd_viommu.
This gives the whole architecture a new "v" layer:
  _______________________________________________________________________
 |                      iommufd (with vIOMMU/vDEVICE)                    |
 |                        _____________      _____________               |
 |                       |             |    |             |              |
 |      |----------------|    vIOMMU   |<---|   vDEVICE   |<------|      |
 |      |                |             |    |_____________|       |      |
 |      |     ______     |             |     _____________     ___|____  |
 |      |    |      |    |             |    |             |   |        | |
 |      |    | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
 |      |    |______|    |_____________|    |_____________|   |________| |
 |______|________|______________|__________________|_______________|_____|
        |        |              |                  |               |
  ______v_____   |        ______v_____       ______v_____       ___v__
 |   struct   |  |  PFN  |  (paging)  |     |  (nested)  |     |struct|
 |iommu_device|  |------>|iommu_domain|<----|iommu_domain|<----|device|
 |____________|   storage|____________|     |____________|     |______|

This vDEVICE object is used to collect and store all vIOMMU-related device
information/attributes in a VM. As an initial series for vDEVICE, add only
the virt_id to the vDEVICE, which is a vIOMMU specific device ID in a VM:
e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d to
a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID
of the device against the physical IOMMU instance. This is essential for a
vIOMMU-based invalidation, where the request contains a device's vID for a
device cache flush, e.g. ATC invalidation.

Therefore, with this vDEVICE object, support a vIOMMU-based invalidation,
by reusing IOMMUFD_CMD_HWPT_INVALIDATE for a vIOMMU object to flush cache
with a given driver data.

As for the implementation of the series, add driver support in ARM SMMUv3
for a real world use case.

This series is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v5

For testing, try this "with-rmr" branch:
https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v5-with-rmr
Paring QEMU branch for testing:
https://github.com/nicolinc/qemu/commits/wip/for_iommufd_viommu_p2-v5

Changelog
v5
 * Dropped driver-allocated vDEVICE support
 * Changed vdev_to_dev helper to iommufd_viommu_find_dev
v4
 https://lore.kernel.org/all/cover.1729555967.git.nicolinc@nvidia.com/
 * Added missing brackets in switch-case
 * Fixed the unreleased idev refcount issue
 * Reworked the iommufd_vdevice_alloc allocator
 * Dropped support for IOMMU_VIOMMU_TYPE_DEFAULT
 * Added missing TEST_LENGTH and fail_nth coverages
 * Added a verification to the driver-allocated vDEVICE object
 * Added an iommufd_vdevice_abort for a missing mutex protection
 * Added a u64 structure arm_vsmmu_invalidation_cmd for user command
   conversion
v3
 https://lore.kernel.org/all/cover.1728491532.git.nicolinc@nvidia.com/
 * Added Jason's Reviewed-by
 * Split this invalidation part out of the part-1 series
 * Repurposed VDEV_ID ioctl to a wider vDEVICE structure and ioctl
 * Reduced viommu_api functions by allowing drivers to access viommu
   and vdevice structure directly
 * Dropped vdevs_rwsem by using xa_lock instead
 * Dropped arm_smmu_cache_invalidate_user
v2
 https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/
 * Limited vdev_id to one per idev
 * Added a rw_sem to protect the vdev_id list
 * Reworked driver-level APIs with proper lockings
 * Added a new viommu_api file for IOMMUFD_DRIVER config
 * Dropped useless iommu_dev point from the viommu structure
 * Added missing index numnbers to new types in the uAPI header
 * Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT one
 * Reworked mock_viommu_cache_invalidate() using the new iommu helper
 * Reordered details of set/unset_vdev_id handlers for proper lockings
v1
 https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/

Thanks!
Nicolin

Jason Gunthorpe (2):
  iommu: Add iommu_copy_struct_from_full_user_array helper
  iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTED

Nicolin Chen (11):
  iommufd/viommu: Add IOMMUFD_OBJ_VDEVICE and IOMMU_VDEVICE_ALLOC ioctl
  iommufd/selftest: Add IOMMU_VDEVICE_ALLOC test coverage
  iommu/viommu: Add cache_invalidate to iommufd_viommu_ops
  iommufd/hw_pagetable: Enforce invalidation op on vIOMMU-based
    hwpt_nested
  iommufd: Allow hwpt_id to carry viommu_id for IOMMU_HWPT_INVALIDATE
  iommufd/viommu: Add iommufd_viommu_find_dev helper
  iommufd/selftest: Add mock_viommu_cache_invalidate
  iommufd/selftest: Add IOMMU_TEST_OP_DEV_CHECK_CACHE test command
  iommufd/selftest: Add vIOMMU coverage for IOMMU_HWPT_INVALIDATE ioctl
  Documentation: userspace-api: iommufd: Update vDEVICE
  iommu/arm-smmu-v3: Add arm_vsmmu_cache_invalidate

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   9 +-
 drivers/iommu/iommufd/iommufd_private.h       |  20 ++
 drivers/iommu/iommufd/iommufd_test.h          |  30 +++
 include/linux/iommu.h                         |  48 ++++-
 include/linux/iommufd.h                       |  21 ++
 include/uapi/linux/iommufd.h                  |  61 +++++-
 tools/testing/selftests/iommu/iommufd_utils.h |  83 +++++++
 .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c     | 161 +++++++++++++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  32 ++-
 drivers/iommu/iommufd/device.c                |  11 +
 drivers/iommu/iommufd/driver.c                |  13 ++
 drivers/iommu/iommufd/hw_pagetable.c          |  36 +++-
 drivers/iommu/iommufd/main.c                  |   7 +
 drivers/iommu/iommufd/selftest.c              |  98 ++++++++-
 drivers/iommu/iommufd/viommu.c                | 101 +++++++++
 tools/testing/selftests/iommu/iommufd.c       | 204 +++++++++++++++++-
 .../selftests/iommu/iommufd_fail_nth.c        |   4 +
 Documentation/userspace-api/iommufd.rst       |  41 +++-
 18 files changed, 942 insertions(+), 38 deletions(-)

Comments

Tian, Kevin Oct. 28, 2024, 3:03 a.m. UTC | #1
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, October 26, 2024 7:50 AM
> 
> Following the previous vIOMMU series, this adds another vDEVICE structure,
> representing the association from an iommufd_device to an
> iommufd_viommu.
> This gives the whole architecture a new "v" layer:
> 
> ________________________________________________________________
> _______
>  |                      iommufd (with vIOMMU/vDEVICE)                    |
>  |                        _____________      _____________               |
>  |                       |             |    |             |              |
>  |      |----------------|    vIOMMU   |<---|   vDEVICE   |<------|      |
>  |      |                |             |    |_____________|       |      |
>  |      |     ______     |             |     _____________     ___|____  |
>  |      |    |      |    |             |    |             |   |        | |
>  |      |    | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
>  |      |    |______|    |_____________|    |_____________|   |________| |
> 
> |______|________|______________|__________________|_____________
> __|_____|
>         |        |              |                  |               |
>   ______v_____   |        ______v_____       ______v_____       ___v__
>  |   struct   |  |  PFN  |  (paging)  |     |  (nested)  |     |struct|
>  |iommu_device|  |------>|iommu_domain|<----|iommu_domain|<----
> |device|
>  |____________|   storage|____________|     |____________|     |______|
> 
> This vDEVICE object is used to collect and store all vIOMMU-related device
> information/attributes in a VM. As an initial series for vDEVICE, add only
> the virt_id to the vDEVICE, which is a vIOMMU specific device ID in a VM:
> e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d

s/vID/vRID/ for VT-d

> to
> a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID
> of the device against the physical IOMMU instance. This is essential for a
> vIOMMU-based invalidation, where the request contains a device's vID for a
> device cache flush, e.g. ATC invalidation.

probably connect this to vCMDQ passthrough? otherwise for sw-based
invalidation the userspace can always replace vID with pID before
submitting the request.

> 
> Therefore, with this vDEVICE object, support a vIOMMU-based invalidation,
> by reusing IOMMUFD_CMD_HWPT_INVALIDATE for a vIOMMU object to
> flush cache
> with a given driver data.
> 
> As for the implementation of the series, add driver support in ARM SMMUv3
> for a real world use case.
> 
> This series is on Github:
> https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v5
> 
> For testing, try this "with-rmr" branch:
> https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v5-
> with-rmr
> Paring QEMU branch for testing:
> https://github.com/nicolinc/qemu/commits/wip/for_iommufd_viommu_p2-
> v5
> 
> Changelog
> v5
>  * Dropped driver-allocated vDEVICE support
>  * Changed vdev_to_dev helper to iommufd_viommu_find_dev
> v4
>  https://lore.kernel.org/all/cover.1729555967.git.nicolinc@nvidia.com/
>  * Added missing brackets in switch-case
>  * Fixed the unreleased idev refcount issue
>  * Reworked the iommufd_vdevice_alloc allocator
>  * Dropped support for IOMMU_VIOMMU_TYPE_DEFAULT
>  * Added missing TEST_LENGTH and fail_nth coverages
>  * Added a verification to the driver-allocated vDEVICE object
>  * Added an iommufd_vdevice_abort for a missing mutex protection
>  * Added a u64 structure arm_vsmmu_invalidation_cmd for user command
>    conversion
> v3
>  https://lore.kernel.org/all/cover.1728491532.git.nicolinc@nvidia.com/
>  * Added Jason's Reviewed-by
>  * Split this invalidation part out of the part-1 series
>  * Repurposed VDEV_ID ioctl to a wider vDEVICE structure and ioctl
>  * Reduced viommu_api functions by allowing drivers to access viommu
>    and vdevice structure directly
>  * Dropped vdevs_rwsem by using xa_lock instead
>  * Dropped arm_smmu_cache_invalidate_user
> v2
>  https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/
>  * Limited vdev_id to one per idev
>  * Added a rw_sem to protect the vdev_id list
>  * Reworked driver-level APIs with proper lockings
>  * Added a new viommu_api file for IOMMUFD_DRIVER config
>  * Dropped useless iommu_dev point from the viommu structure
>  * Added missing index numnbers to new types in the uAPI header
>  * Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT
> one
>  * Reworked mock_viommu_cache_invalidate() using the new iommu helper
>  * Reordered details of set/unset_vdev_id handlers for proper lockings
> v1
>  https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/
> 
> Thanks!
> Nicolin
> 
> Jason Gunthorpe (2):
>   iommu: Add iommu_copy_struct_from_full_user_array helper
>   iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTED
> 
> Nicolin Chen (11):
>   iommufd/viommu: Add IOMMUFD_OBJ_VDEVICE and
> IOMMU_VDEVICE_ALLOC ioctl
>   iommufd/selftest: Add IOMMU_VDEVICE_ALLOC test coverage
>   iommu/viommu: Add cache_invalidate to iommufd_viommu_ops
>   iommufd/hw_pagetable: Enforce invalidation op on vIOMMU-based
>     hwpt_nested
>   iommufd: Allow hwpt_id to carry viommu_id for
> IOMMU_HWPT_INVALIDATE
>   iommufd/viommu: Add iommufd_viommu_find_dev helper
>   iommufd/selftest: Add mock_viommu_cache_invalidate
>   iommufd/selftest: Add IOMMU_TEST_OP_DEV_CHECK_CACHE test
> command
>   iommufd/selftest: Add vIOMMU coverage for IOMMU_HWPT_INVALIDATE
> ioctl
>   Documentation: userspace-api: iommufd: Update vDEVICE
>   iommu/arm-smmu-v3: Add arm_vsmmu_cache_invalidate
> 
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   9 +-
>  drivers/iommu/iommufd/iommufd_private.h       |  20 ++
>  drivers/iommu/iommufd/iommufd_test.h          |  30 +++
>  include/linux/iommu.h                         |  48 ++++-
>  include/linux/iommufd.h                       |  21 ++
>  include/uapi/linux/iommufd.h                  |  61 +++++-
>  tools/testing/selftests/iommu/iommufd_utils.h |  83 +++++++
>  .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c     | 161 +++++++++++++-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  32 ++-
>  drivers/iommu/iommufd/device.c                |  11 +
>  drivers/iommu/iommufd/driver.c                |  13 ++
>  drivers/iommu/iommufd/hw_pagetable.c          |  36 +++-
>  drivers/iommu/iommufd/main.c                  |   7 +
>  drivers/iommu/iommufd/selftest.c              |  98 ++++++++-
>  drivers/iommu/iommufd/viommu.c                | 101 +++++++++
>  tools/testing/selftests/iommu/iommufd.c       | 204 +++++++++++++++++-
>  .../selftests/iommu/iommufd_fail_nth.c        |   4 +
>  Documentation/userspace-api/iommufd.rst       |  41 +++-
>  18 files changed, 942 insertions(+), 38 deletions(-)
> 
> --
> 2.43.0
Jason Gunthorpe Oct. 28, 2024, 2:17 p.m. UTC | #2
> > to
> > a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID
> > of the device against the physical IOMMU instance. This is essential for a
> > vIOMMU-based invalidation, where the request contains a device's vID for a
> > device cache flush, e.g. ATC invalidation.
> 
> probably connect this to vCMDQ passthrough? otherwise for sw-based
> invalidation the userspace can always replace vID with pID before
> submitting the request.

You can't just do that, the ID in the invalidation command has to be
validated by the kernel.

At that point you may as well just use the vID instead of inventing a
new means to validate raw pIDs.

VT-D avoided this so far because it is pushing the invalidation
through the device object and the device object provides the ID
directly. No IDs in commands, no switching devices during a command
batch.

Jason
Tian, Kevin Oct. 29, 2024, 8:51 a.m. UTC | #3
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Monday, October 28, 2024 10:17 PM
> 
> > > to
> > > a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID
> > > of the device against the physical IOMMU instance. This is essential for a
> > > vIOMMU-based invalidation, where the request contains a device's vID
> for a
> > > device cache flush, e.g. ATC invalidation.
> >
> > probably connect this to vCMDQ passthrough? otherwise for sw-based
> > invalidation the userspace can always replace vID with pID before
> > submitting the request.
> 
> You can't just do that, the ID in the invalidation command has to be
> validated by the kernel.

sure the ID must be validated to match the iommufd_device but not
exactly going through a vID indirectly.

> 
> At that point you may as well just use the vID instead of inventing a
> new means to validate raw pIDs.

w/o VCMDQ stuff validating raw pID sounds the natural way while
vID is more like a new means and not mandatory.

I'm fine with this design but just didn't feel the above description
is accurate.