
[v3,10/11] Documentation: userspace-api: iommufd: Update vIOMMU

Message ID 0b56b2a4e38e8f4cf3a96c4fb2ccbbf4b5c67da8.1728491453.git.nicolinc@nvidia.com
State New
Series cover-letter: iommufd: Add vIOMMU infrastructure (Part-1)

Commit Message

Nicolin Chen Oct. 9, 2024, 4:38 p.m. UTC
With the introduction of the new object and its infrastructure, update the
doc to reflect that and add a new graph.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 Documentation/userspace-api/iommufd.rst | 66 ++++++++++++++++++++++++-
 1 file changed, 65 insertions(+), 1 deletion(-)

Comments

Jason Gunthorpe Oct. 17, 2024, 7:12 p.m. UTC | #1
On Wed, Oct 09, 2024 at 09:38:10AM -0700, Nicolin Chen wrote:
> With the introduction of the new object and its infrastructure, update the
> doc to reflect that and add a new graph.
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  Documentation/userspace-api/iommufd.rst | 66 ++++++++++++++++++++++++-
>  1 file changed, 65 insertions(+), 1 deletion(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

Patch

diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
index 2deba93bf159..37eb1adda57b 100644
--- a/Documentation/userspace-api/iommufd.rst
+++ b/Documentation/userspace-api/iommufd.rst
@@ -63,6 +63,37 @@  Following IOMMUFD objects are exposed to userspace:
   space usually has mappings from guest-level I/O virtual addresses to guest-
   level physical addresses.
 
+ - IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
+   passed to or shared with a VM. It may comprise HW-accelerated virtualization
+   features and SW resources used by the VM. For example:
+   * Security namespace for guest-owned IDs, e.g. guest-controlled cache tags
+   * Access to a sharable nesting parent pagetable across physical IOMMUs
+   * Virtualization of various platform IDs, e.g. RIDs and others
+   * Delivery of paravirtualized invalidation
+   * Direct assigned invalidation queues
+   * Direct assigned interrupts
+   * Non-affiliated event reporting
+   Such a vIOMMU object generally has access to a nesting parent pagetable to
+   support some HW-accelerated virtualization features. So, a vIOMMU object
+   must be created given a nesting parent HWPT_PAGING object, which it then
+   encapsulates. Therefore, a vIOMMU object can be used to allocate an
+   HWPT_NESTED object in place of the encapsulated HWPT_PAGING.
+
+   .. note::
+
+      The name "vIOMMU" isn't necessarily identical to a virtualized IOMMU in a
+      VM. A VM can have one giant virtualized IOMMU running on a machine having
+      multiple physical IOMMUs, in which case the VMM will dispatch the requests
+      or configurations from this single virtualized IOMMU instance to multiple
+      vIOMMU objects created for individual slices of different physical IOMMUs.
+      In other words, a vIOMMU object is always a representation of one physical
+      IOMMU, not necessarily of a virtualized IOMMU. For VMMs that want the full
+      virtualization features from physical IOMMUs, it is suggested to build the
+      same number of virtualized IOMMUs as the number of physical IOMMUs, so the
+      passed-through devices would be connected to their own virtualized IOMMUs
+      backed by corresponding vIOMMU objects, in which case a guest OS would do
+      the "dispatch" naturally instead of relying on VMM trapping.
+
 All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.
 
 The diagrams below show relationships between user-visible objects and kernel
@@ -101,6 +132,25 @@  creating the objects and links::
            |------------>|iommu_domain|<----|iommu_domain|<----|device|
                          |____________|     |____________|     |______|
 
+  _______________________________________________________________________
+ |                      iommufd (with vIOMMU)                            |
+ |                                                                       |
+ |                             [5]                                       |
+ |                        _____________                                  |
+ |                       |             |                                 |
+ |        [1]            |    vIOMMU   |          [4]             [2]    |
+ |  ________________     |             |     _____________     ________  |
+ | |                |    |     [3]     |    |             |   |        | |
+ | |      IOAS      |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
+ | |________________|    |_____________|    |_____________|   |________| |
+ |         |                    |                  |               |     |
+ |_________|____________________|__________________|_______________|_____|
+           |                    |                  |               |
+           |              ______v_____       ______v_____       ___v__
+           | PFN storage |  (paging)  |     |  (nested)  |     |struct|
+           |------------>|iommu_domain|<----|iommu_domain|<----|device|
+                         |____________|     |____________|     |______|
+
 1. IOMMUFD_OBJ_IOAS is created via the IOMMU_IOAS_ALLOC uAPI. An iommufd can
    hold multiple IOAS objects. IOAS is the most generic object and does not
    expose interfaces that are specific to single IOMMU drivers. All operations
@@ -132,7 +182,8 @@  creating the objects and links::
      flag is set.
 
 4. IOMMUFD_OBJ_HWPT_NESTED can be only manually created via the IOMMU_HWPT_ALLOC
-   uAPI, provided an hwpt_id via @pt_id to associate the new HWPT_NESTED object
+   uAPI, provided an hwpt_id or a viommu_id of a vIOMMU object encapsulating a
+   nesting parent HWPT_PAGING via @pt_id to associate the new HWPT_NESTED object
    to the corresponding HWPT_PAGING object. The associating HWPT_PAGING object
    must be a nesting parent manually allocated via the same uAPI previously with
    an IOMMU_HWPT_ALLOC_NEST_PARENT flag, otherwise the allocation will fail. The
@@ -149,6 +200,18 @@  creating the objects and links::
       created via the same IOMMU_HWPT_ALLOC uAPI. The difference is at the type
       of the object passed in via the @pt_id field of struct iommufd_hwpt_alloc.
 
+5. IOMMUFD_OBJ_VIOMMU can be only manually created via the IOMMU_VIOMMU_ALLOC
+   uAPI, provided a dev_id (for the device's physical IOMMU to back the vIOMMU)
+   and an hwpt_id (to associate the vIOMMU with a nesting parent HWPT_PAGING).
+   The iommufd core will link the vIOMMU object to the struct iommu_device that
+   the struct device is behind. An IOMMU driver can implement a viommu_alloc op
+   to allocate its own vIOMMU data structure embedding the core-level structure
+   iommufd_viommu and some driver-specific data. If necessary, the driver can
+   also configure its HW virtualization feature for that vIOMMU (and thus for
+   the VM). Successful completion of this operation sets up the linkages between
+   the vIOMMU object and the HWPT_PAGING; this vIOMMU object can then be used as
+   a nesting parent object to allocate an HWPT_NESTED object described above.
+
 A device can only bind to an iommufd due to DMA ownership claim and attach to at
 most one IOAS object (no support of PASID yet).
 
@@ -161,6 +224,7 @@  User visible objects are backed by following datastructures:
 - iommufd_device for IOMMUFD_OBJ_DEVICE.
 - iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING.
 - iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED.
+- iommufd_viommu for IOMMUFD_OBJ_VIOMMU.
 
 Several terminologies when looking at these datastructures:
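
The allocation rules that items 4 and 5 of the patch describe can be sketched as a small user-space toy model. This is only an illustration and is not kernel code: it does not call the real ioctls, and every `toy_*` name, the fixed-size object table, and the numeric IDs are assumptions made up for the example. It models only two rules from the text: a vIOMMU must be created from a dev_id plus a nesting parent HWPT_PAGING, and an HWPT_NESTED accepts either that hwpt_id or a viommu_id as its pt_id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy object types named after the objects in the diagram (values arbitrary). */
enum toy_type { TOY_NONE, TOY_IOAS, TOY_DEVICE, TOY_HWPT_PAGING,
                TOY_HWPT_NESTED, TOY_VIOMMU };

struct toy_obj {
    enum toy_type type;
    bool nest_parent;  /* HWPT_PAGING only: allocated as a nesting parent */
    uint32_t hwpt_id;  /* VIOMMU/HWPT_NESTED: id of the parent HWPT_PAGING */
    uint32_t dev_id;   /* VIOMMU only: device whose physical IOMMU backs it */
};

#define TOY_MAX 16
struct toy_ctx {
    struct toy_obj objs[TOY_MAX]; /* index is the object id; 0 means invalid */
    uint32_t next_id;             /* must start at 1 */
};

/* Return the object with this id if it has the expected type, else NULL. */
static struct toy_obj *toy_get(struct toy_ctx *c, uint32_t id, enum toy_type t)
{
    if (id == 0 || id >= c->next_id || c->objs[id].type != t)
        return NULL;
    return &c->objs[id];
}

/* Hand out a new object id of the given type; 0 when the table is full. */
static uint32_t toy_new(struct toy_ctx *c, enum toy_type t)
{
    if (c->next_id >= TOY_MAX)
        return 0;
    c->objs[c->next_id] = (struct toy_obj){ .type = t };
    return c->next_id++;
}

static uint32_t toy_hwpt_paging_alloc(struct toy_ctx *c, uint32_t ioas_id,
                                      bool nest_parent)
{
    uint32_t id = toy_get(c, ioas_id, TOY_IOAS) ? toy_new(c, TOY_HWPT_PAGING) : 0;

    if (id)
        c->objs[id].nest_parent = nest_parent;
    return id;
}

/* Item 5: needs a dev_id and the hwpt_id of a nesting parent HWPT_PAGING. */
static uint32_t toy_viommu_alloc(struct toy_ctx *c, uint32_t dev_id,
                                 uint32_t hwpt_id)
{
    struct toy_obj *parent = toy_get(c, hwpt_id, TOY_HWPT_PAGING);
    uint32_t id;

    if (!toy_get(c, dev_id, TOY_DEVICE) || !parent || !parent->nest_parent)
        return 0;
    id = toy_new(c, TOY_VIOMMU);
    if (id) {
        c->objs[id].dev_id = dev_id;
        c->objs[id].hwpt_id = hwpt_id; /* the vIOMMU encapsulates the parent */
    }
    return id;
}

/* Item 4: pt_id is either a nesting parent hwpt_id or a viommu_id. */
static uint32_t toy_hwpt_nested_alloc(struct toy_ctx *c, uint32_t pt_id)
{
    struct toy_obj *viommu = toy_get(c, pt_id, TOY_VIOMMU);
    uint32_t parent_id = viommu ? viommu->hwpt_id : pt_id;
    struct toy_obj *parent = toy_get(c, parent_id, TOY_HWPT_PAGING);
    uint32_t id;

    if (!parent || !parent->nest_parent)
        return 0;
    id = toy_new(c, TOY_HWPT_NESTED);
    if (id)
        c->objs[id].hwpt_id = parent_id;
    return id;
}

/* Walk the [1]..[5] objects from the diagram; returns 0 if all rules hold. */
int toy_demo(void)
{
    struct toy_ctx c = { .next_id = 1 };
    uint32_t ioas = toy_new(&c, TOY_IOAS);
    uint32_t dev = toy_new(&c, TOY_DEVICE);
    uint32_t plain = toy_hwpt_paging_alloc(&c, ioas, false);
    uint32_t parent = toy_hwpt_paging_alloc(&c, ioas, true);
    uint32_t viommu = toy_viommu_alloc(&c, dev, parent);
    uint32_t nested = toy_hwpt_nested_alloc(&c, viommu);

    assert(viommu && nested);
    assert(c.objs[nested].hwpt_id == parent);   /* resolved via the vIOMMU */
    assert(toy_hwpt_nested_alloc(&c, parent));  /* a plain hwpt_id still works */
    assert(!toy_viommu_alloc(&c, dev, plain));  /* not a nesting parent */
    assert(!toy_hwpt_nested_alloc(&c, ioas));   /* wrong object type */
    return 0;
}
```

Calling `toy_demo()` from a `main()` exercises both accepted pt_id forms and two rejected cases. Driver-specific data, the viommu_alloc op, and object destruction are deliberately left out of the model.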