Message ID | 20230921075138.124099-11-yi.l.liu@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | iommufd: Add nesting infrastructure | expand |
On Thu, Sep 21, 2023 at 12:51:31AM -0700, Yi Liu wrote: > IOMMU_HWPT_ALLOC already supports iommu_domain allocation for usersapce. > But it can only allocate a hw_pagetable that associates to a given IOAS, > i.e. only a kernel-managed hw_pagetable of IOMMU_HWPT_TYPE_DEFAULT type. > > IOMMU drivers can now support user-managed hw_pagetables, for two-stage > translation use cases, that require user data input from the user space. > > Extend the IOMMU_HWPT_ALLOC ioctl to accept non-default hwpt_type with a > type specified user data. Also, update the @pt_id to accept hwpt_id too > besides an ioas_id. Then, pass them to the downstream alloc_fn(). > > Co-developed-by: Nicolin Chen <nicolinc@nvidia.com> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > --- > drivers/iommu/iommufd/hw_pagetable.c | 19 ++++++++++++++++++- > include/uapi/linux/iommufd.h | 23 +++++++++++++++++++++-- > 2 files changed, 39 insertions(+), 3 deletions(-) Can we also come with a small vt-d patch that does implement an op for this? Or is it too big? It would be nice if we could wrap IOMMU_HWPT_ALLOC into one self-contained series and another series for invalidate. Jason
On Fri, Oct 13, 2023 at 12:19:23PM -0300, Jason Gunthorpe wrote: > On Thu, Sep 21, 2023 at 12:51:31AM -0700, Yi Liu wrote: > > IOMMU_HWPT_ALLOC already supports iommu_domain allocation for usersapce. > > But it can only allocate a hw_pagetable that associates to a given IOAS, > > i.e. only a kernel-managed hw_pagetable of IOMMU_HWPT_TYPE_DEFAULT type. > > > > IOMMU drivers can now support user-managed hw_pagetables, for two-stage > > translation use cases, that require user data input from the user space. > > > > Extend the IOMMU_HWPT_ALLOC ioctl to accept non-default hwpt_type with a > > type specified user data. Also, update the @pt_id to accept hwpt_id too > > besides an ioas_id. Then, pass them to the downstream alloc_fn(). > > > > Co-developed-by: Nicolin Chen <nicolinc@nvidia.com> > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > --- > > drivers/iommu/iommufd/hw_pagetable.c | 19 ++++++++++++++++++- > > include/uapi/linux/iommufd.h | 23 +++++++++++++++++++++-- > > 2 files changed, 39 insertions(+), 3 deletions(-) > > Can we also come with a small vt-d patch that does implement an op for > this? Or is it too big? > > It would be nice if we could wrap IOMMU_HWPT_ALLOC into one > self-contained series and another series for invalidate. We now only use IOMMU_HWPT_ALLOC for nested domain allocations, which won't be supported until the cache_invalidate_user ops is preset? /* e.g. the following piece is in iommufd_user_managed_hwpt_alloc */ + /* Driver is buggy by missing cache_invalidate_user in domain_ops */ + if (WARN_ON_ONCE(!hwpt->domain->ops->cache_invalidate_user)) { + rc = -EINVAL; + goto out_abort; + } Thanks Nic
On Fri, Oct 13, 2023 at 01:58:59PM -0700, Nicolin Chen wrote: > On Fri, Oct 13, 2023 at 12:19:23PM -0300, Jason Gunthorpe wrote: > > On Thu, Sep 21, 2023 at 12:51:31AM -0700, Yi Liu wrote: > > > IOMMU_HWPT_ALLOC already supports iommu_domain allocation for usersapce. > > > But it can only allocate a hw_pagetable that associates to a given IOAS, > > > i.e. only a kernel-managed hw_pagetable of IOMMU_HWPT_TYPE_DEFAULT type. > > > > > > IOMMU drivers can now support user-managed hw_pagetables, for two-stage > > > translation use cases, that require user data input from the user space. > > > > > > Extend the IOMMU_HWPT_ALLOC ioctl to accept non-default hwpt_type with a > > > type specified user data. Also, update the @pt_id to accept hwpt_id too > > > besides an ioas_id. Then, pass them to the downstream alloc_fn(). > > > > > > Co-developed-by: Nicolin Chen <nicolinc@nvidia.com> > > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > > --- > > > drivers/iommu/iommufd/hw_pagetable.c | 19 ++++++++++++++++++- > > > include/uapi/linux/iommufd.h | 23 +++++++++++++++++++++-- > > > 2 files changed, 39 insertions(+), 3 deletions(-) > > > > Can we also come with a small vt-d patch that does implement an op for > > this? Or is it too big? > > > > It would be nice if we could wrap IOMMU_HWPT_ALLOC into one > > self-contained series and another series for invalidate. > > We now only use IOMMU_HWPT_ALLOC for nested domain allocations, > which won't be supported until the cache_invalidate_user ops is > preset? > > /* e.g. the following piece is in iommufd_user_managed_hwpt_alloc */ > > + /* Driver is buggy by missing cache_invalidate_user in domain_ops */ > + if (WARN_ON_ONCE(!hwpt->domain->ops->cache_invalidate_user)) { > + rc = -EINVAL; > + goto out_abort; > + } > Hm. That hunk could migrate to the invalidate series. I'm just leeary of doing the invalidation too considering how complicated it is Jason
On Fri, Oct 13, 2023 at 09:07:09PM -0300, Jason Gunthorpe wrote: > On Fri, Oct 13, 2023 at 01:58:59PM -0700, Nicolin Chen wrote: > > On Fri, Oct 13, 2023 at 12:19:23PM -0300, Jason Gunthorpe wrote: > > > On Thu, Sep 21, 2023 at 12:51:31AM -0700, Yi Liu wrote: > > > > IOMMU_HWPT_ALLOC already supports iommu_domain allocation for usersapce. > > > > But it can only allocate a hw_pagetable that associates to a given IOAS, > > > > i.e. only a kernel-managed hw_pagetable of IOMMU_HWPT_TYPE_DEFAULT type. > > > > > > > > IOMMU drivers can now support user-managed hw_pagetables, for two-stage > > > > translation use cases, that require user data input from the user space. > > > > > > > > Extend the IOMMU_HWPT_ALLOC ioctl to accept non-default hwpt_type with a > > > > type specified user data. Also, update the @pt_id to accept hwpt_id too > > > > besides an ioas_id. Then, pass them to the downstream alloc_fn(). > > > > > > > > Co-developed-by: Nicolin Chen <nicolinc@nvidia.com> > > > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> > > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > > > --- > > > > drivers/iommu/iommufd/hw_pagetable.c | 19 ++++++++++++++++++- > > > > include/uapi/linux/iommufd.h | 23 +++++++++++++++++++++-- > > > > 2 files changed, 39 insertions(+), 3 deletions(-) > > > > > > Can we also come with a small vt-d patch that does implement an op for > > > this? Or is it too big? > > > > > > It would be nice if we could wrap IOMMU_HWPT_ALLOC into one > > > self-contained series and another series for invalidate. > > > > We now only use IOMMU_HWPT_ALLOC for nested domain allocations, > > which won't be supported until the cache_invalidate_user ops is > > preset? > > > > /* e.g. the following piece is in iommufd_user_managed_hwpt_alloc */ > > > > + /* Driver is buggy by missing cache_invalidate_user in domain_ops */ > > + if (WARN_ON_ONCE(!hwpt->domain->ops->cache_invalidate_user)) { > > + rc = -EINVAL; > > + goto out_abort; > > + } > > > > Hm. That hunk could migrate to the invalidate series. > > I'm just leeary of doing the invalidation too considering how > complicated it is OK. Let's see how Yi/Kevin/Baolu reply about the feasibility with the VT-d driver. Then Yi and I can accordingly separate the allocation part into a smaller series. Thanks Nic
On 2023/10/14 08:51, Nicolin Chen wrote: > On Fri, Oct 13, 2023 at 09:07:09PM -0300, Jason Gunthorpe wrote: >> On Fri, Oct 13, 2023 at 01:58:59PM -0700, Nicolin Chen wrote: >>> On Fri, Oct 13, 2023 at 12:19:23PM -0300, Jason Gunthorpe wrote: >>>> On Thu, Sep 21, 2023 at 12:51:31AM -0700, Yi Liu wrote: >>>>> IOMMU_HWPT_ALLOC already supports iommu_domain allocation for usersapce. >>>>> But it can only allocate a hw_pagetable that associates to a given IOAS, >>>>> i.e. only a kernel-managed hw_pagetable of IOMMU_HWPT_TYPE_DEFAULT type. >>>>> >>>>> IOMMU drivers can now support user-managed hw_pagetables, for two-stage >>>>> translation use cases, that require user data input from the user space. >>>>> >>>>> Extend the IOMMU_HWPT_ALLOC ioctl to accept non-default hwpt_type with a >>>>> type specified user data. Also, update the @pt_id to accept hwpt_id too >>>>> besides an ioas_id. Then, pass them to the downstream alloc_fn(). >>>>> >>>>> Co-developed-by: Nicolin Chen <nicolinc@nvidia.com> >>>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> >>>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com> >>>>> --- >>>>> drivers/iommu/iommufd/hw_pagetable.c | 19 ++++++++++++++++++- >>>>> include/uapi/linux/iommufd.h | 23 +++++++++++++++++++++-- >>>>> 2 files changed, 39 insertions(+), 3 deletions(-) >>>> >>>> Can we also come with a small vt-d patch that does implement an op for >>>> this? Or is it too big? >>>> >>>> It would be nice if we could wrap IOMMU_HWPT_ALLOC into one >>>> self-contained series and another series for invalidate. >>> >>> We now only use IOMMU_HWPT_ALLOC for nested domain allocations, >>> which won't be supported until the cache_invalidate_user ops is >>> preset? >>> >>> /* e.g. the following piece is in iommufd_user_managed_hwpt_alloc */ >>> >>> + /* Driver is buggy by missing cache_invalidate_user in domain_ops */ >>> + if (WARN_ON_ONCE(!hwpt->domain->ops->cache_invalidate_user)) { >>> + rc = -EINVAL; >>> + goto out_abort; >>> + } >>> >> >> Hm. That hunk could migrate to the invalidate series. >> >> I'm just leeary of doing the invalidation too considering how >> complicated it is > > OK. Let's see how Yi/Kevin/Baolu reply about the feasibility > with the VT-d driver. Then Yi and I can accordingly separate > the allocation part into a smaller series. Current nesting series actually extends HWPT_ALLOC ioctl to accept user data for allocating domain with vendor specific data. Nested translation happens to be the usage of it. But nesting requires invalidation. If we want to do further split, then this new series would be just "extending HWPT_ALLOC to accept vendor specific data from userspace". But it will lack of a user if nesting is separated. Is this acceptable? @Jason BTW. Do we still have unsolved issue on the invalidation path?
On Mon, Oct 16, 2023 at 03:03:04PM +0800, Yi Liu wrote: > On 2023/10/14 08:51, Nicolin Chen wrote: > > On Fri, Oct 13, 2023 at 09:07:09PM -0300, Jason Gunthorpe wrote: > > > On Fri, Oct 13, 2023 at 01:58:59PM -0700, Nicolin Chen wrote: > > > > On Fri, Oct 13, 2023 at 12:19:23PM -0300, Jason Gunthorpe wrote: > > > > > On Thu, Sep 21, 2023 at 12:51:31AM -0700, Yi Liu wrote: > > > > > > IOMMU_HWPT_ALLOC already supports iommu_domain allocation for usersapce. > > > > > > But it can only allocate a hw_pagetable that associates to a given IOAS, > > > > > > i.e. only a kernel-managed hw_pagetable of IOMMU_HWPT_TYPE_DEFAULT type. > > > > > > > > > > > > IOMMU drivers can now support user-managed hw_pagetables, for two-stage > > > > > > translation use cases, that require user data input from the user space. > > > > > > > > > > > > Extend the IOMMU_HWPT_ALLOC ioctl to accept non-default hwpt_type with a > > > > > > type specified user data. Also, update the @pt_id to accept hwpt_id too > > > > > > besides an ioas_id. Then, pass them to the downstream alloc_fn(). > > > > > > > > > > > > Co-developed-by: Nicolin Chen <nicolinc@nvidia.com> > > > > > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> > > > > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > > > > > --- > > > > > > drivers/iommu/iommufd/hw_pagetable.c | 19 ++++++++++++++++++- > > > > > > include/uapi/linux/iommufd.h | 23 +++++++++++++++++++++-- > > > > > > 2 files changed, 39 insertions(+), 3 deletions(-) > > > > > > > > > > Can we also come with a small vt-d patch that does implement an op for > > > > > this? Or is it too big? > > > > > > > > > > It would be nice if we could wrap IOMMU_HWPT_ALLOC into one > > > > > self-contained series and another series for invalidate. > > > > > > > > We now only use IOMMU_HWPT_ALLOC for nested domain allocations, > > > > which won't be supported until the cache_invalidate_user ops is > > > > preset? > > > > > > > > /* e.g. the following piece is in iommufd_user_managed_hwpt_alloc */ > > > > > > > > + /* Driver is buggy by missing cache_invalidate_user in domain_ops */ > > > > + if (WARN_ON_ONCE(!hwpt->domain->ops->cache_invalidate_user)) { > > > > + rc = -EINVAL; > > > > + goto out_abort; > > > > + } > > > > > > > > > > Hm. That hunk could migrate to the invalidate series. > > > > > > I'm just leeary of doing the invalidation too considering how > > > complicated it is > > > > OK. Let's see how Yi/Kevin/Baolu reply about the feasibility > > with the VT-d driver. Then Yi and I can accordingly separate > > the allocation part into a smaller series. > > Current nesting series actually extends HWPT_ALLOC ioctl to accept user > data for allocating domain with vendor specific data. Nested translation > happens to be the usage of it. But nesting requires invalidation. If we > want to do further split, then this new series would be just "extending > HWPT_ALLOC to accept vendor specific data from userspace". But it will > lack of a user if nesting is separated. Is this acceptable? @Jason I'd still like to include the nesting allocation and attach parts though, even if they are not usable without invalidation .. > BTW. Do we still have unsolved issue on the invalidation path? I'm not sure, there were so many different versions of it we need to go back over it and check the dirver implementations again Jason
On Mon, Oct 16, 2023 at 08:59:07AM -0300, Jason Gunthorpe wrote: > On Mon, Oct 16, 2023 at 03:03:04PM +0800, Yi Liu wrote: > > Current nesting series actually extends HWPT_ALLOC ioctl to accept user > > data for allocating domain with vendor specific data. Nested translation > > happens to be the usage of it. But nesting requires invalidation. If we > > want to do further split, then this new series would be just "extending > > HWPT_ALLOC to accept vendor specific data from userspace". But it will > > lack of a user if nesting is separated. Is this acceptable? @Jason > > I'd still like to include the nesting allocation and attach parts > though, even if they are not usable without invalidation .. This is the latest series that I reworked (in bottom-up order): iommu: Add a pair of helper to copy struct iommu_user_data{_array} iommufd: Add IOMMU_HWPT_INVALIDATE iommufd: Add a nested HW pagetable object iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable iommufd: Rename IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING iommufd/device: Add helpers to enforce/remove device reserved regions iommu: Add IOMMU_DOMAIN_NESTED and cache_invalidate_user op iommu: Pass in parent domain with user_data to domain_alloc_user op Perhaps we can have a preparatory series to merge first: iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable iommufd: Rename IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING iommufd/device: Add helpers to enforce/remove device reserved regions Then next cycle would be basically 4 patches + selftests: iommufd: Add IOMMU_HWPT_INVALIDATE iommufd: Add a nested HW pagetable object iommu: Add IOMMU_DOMAIN_NESTED and cache_invalidate_user op iommu: Pass in parent domain with user_data to domain_alloc_user op The preparatory series doesn't involve functional changes yet have a good amount of pieces to simplify the "nested HW pagetable" that is basically nested_alloc/abort/destroy. > > BTW. Do we still have unsolved issue on the invalidation path? > > I'm not sure, there were so many different versions of it we need to > go back over it and check the dirver implementations again Only this v4 has the latest array-based invalidation design. And it should be straightforward for drivers to define entry/request structures. It might be a bit rush to review/finalize it at the stage of rc6 though. Thanks Nic
On 2023/10/17 02:44, Nicolin Chen wrote: > On Mon, Oct 16, 2023 at 08:59:07AM -0300, Jason Gunthorpe wrote: >> On Mon, Oct 16, 2023 at 03:03:04PM +0800, Yi Liu wrote: >>> Current nesting series actually extends HWPT_ALLOC ioctl to accept user >>> data for allocating domain with vendor specific data. Nested translation >>> happens to be the usage of it. But nesting requires invalidation. If we >>> want to do further split, then this new series would be just "extending >>> HWPT_ALLOC to accept vendor specific data from userspace". But it will >>> lack of a user if nesting is separated. Is this acceptable? @Jason >> >> I'd still like to include the nesting allocation and attach parts >> though, even if they are not usable without invalidation .. > > This is the latest series that I reworked (in bottom-up order): > iommu: Add a pair of helper to copy struct iommu_user_data{_array} > iommufd: Add IOMMU_HWPT_INVALIDATE > iommufd: Add a nested HW pagetable object > iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED > iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable > iommufd: Rename IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING > iommufd/device: Add helpers to enforce/remove device reserved regions > iommu: Add IOMMU_DOMAIN_NESTED and cache_invalidate_user op > iommu: Pass in parent domain with user_data to domain_alloc_user op following Jason's comment, it looks like we can just split the cache invalidation path out. Then the above looks good after removing "iommufd: Add IOMMU_HWPT_INVALIDATE" and also the cache_invalidate_user callback in "iommu: Add IOMMU_DOMAIN_NESTED and cache_invalidate_user op". Is it? @Jason > Perhaps we can have a preparatory series to merge first: > iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED > iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable > iommufd: Rename IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING > iommufd/device: Add helpers to enforce/remove device reserved regions > > Then next cycle would be basically 4 patches + selftests: > iommufd: Add IOMMU_HWPT_INVALIDATE > iommufd: Add a nested HW pagetable object > iommu: Add IOMMU_DOMAIN_NESTED and cache_invalidate_user op > iommu: Pass in parent domain with user_data to domain_alloc_user op > > The preparatory series doesn't involve functional changes yet have > a good amount of pieces to simplify the "nested HW pagetable" that > is basically nested_alloc/abort/destroy. > >>> BTW. Do we still have unsolved issue on the invalidation path? >> >> I'm not sure, there were so many different versions of it we need to >> go back over it and check the dirver implementations again > > Only this v4 has the latest array-based invalidation design. And > it should be straightforward for drivers to define entry/request > structures. It might be a bit rush to review/finalize it at the > stage of rc6 though. yes, before v4, the cache invalidation path is simple and vendor drivers have their own handling.
On Tue, Oct 17, 2023 at 04:55:12PM +0800, Yi Liu wrote: > On 2023/10/17 02:44, Nicolin Chen wrote: > > On Mon, Oct 16, 2023 at 08:59:07AM -0300, Jason Gunthorpe wrote: > > > On Mon, Oct 16, 2023 at 03:03:04PM +0800, Yi Liu wrote: > > > > Current nesting series actually extends HWPT_ALLOC ioctl to accept user > > > > data for allocating domain with vendor specific data. Nested translation > > > > happens to be the usage of it. But nesting requires invalidation. If we > > > > want to do further split, then this new series would be just "extending > > > > HWPT_ALLOC to accept vendor specific data from userspace". But it will > > > > lack of a user if nesting is separated. Is this acceptable? @Jason > > > > > > I'd still like to include the nesting allocation and attach parts > > > though, even if they are not usable without invalidation .. > > > > This is the latest series that I reworked (in bottom-up order): > > iommu: Add a pair of helper to copy struct iommu_user_data{_array} > > iommufd: Add IOMMU_HWPT_INVALIDATE > > iommufd: Add a nested HW pagetable object > > iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED > > iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable > > iommufd: Rename IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING > > iommufd/device: Add helpers to enforce/remove device reserved regions > > iommu: Add IOMMU_DOMAIN_NESTED and cache_invalidate_user op > > iommu: Pass in parent domain with user_data to domain_alloc_user op > > following Jason's comment, it looks like we can just split the cache > invalidation path out. Then the above looks good after removing > "iommufd: Add IOMMU_HWPT_INVALIDATE" and also the cache_invalidate_user > callback in "iommu: Add IOMMU_DOMAIN_NESTED and cache_invalidate_user op". > Is it? @Jason If it can make sense, sure. It would be nice to be finished with the alloc path > > Only this v4 has the latest array-based invalidation design. And > > it should be straightforward for drivers to define entry/request > > structures. It might be a bit rush to review/finalize it at the > > stage of rc6 though. > > yes, before v4, the cache invalidation path is simple and vendor > drivers have their own handling. Have driver implementations of v4 been done to look at? Jason
On Tue, Oct 17, 2023 at 12:50:11PM -0300, Jason Gunthorpe wrote: > On Tue, Oct 17, 2023 at 04:55:12PM +0800, Yi Liu wrote: > > On 2023/10/17 02:44, Nicolin Chen wrote: > > > On Mon, Oct 16, 2023 at 08:59:07AM -0300, Jason Gunthorpe wrote: > > > > On Mon, Oct 16, 2023 at 03:03:04PM +0800, Yi Liu wrote: > > > > > Current nesting series actually extends HWPT_ALLOC ioctl to accept user > > > > > data for allocating domain with vendor specific data. Nested translation > > > > > happens to be the usage of it. But nesting requires invalidation. If we > > > > > want to do further split, then this new series would be just "extending > > > > > HWPT_ALLOC to accept vendor specific data from userspace". But it will > > > > > lack of a user if nesting is separated. Is this acceptable? @Jason > > > > > > > > I'd still like to include the nesting allocation and attach parts > > > > though, even if they are not usable without invalidation .. > > > > > > This is the latest series that I reworked (in bottom-up order): > > > iommu: Add a pair of helper to copy struct iommu_user_data{_array} > > > iommufd: Add IOMMU_HWPT_INVALIDATE > > > iommufd: Add a nested HW pagetable object > > > iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED > > > iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable > > > iommufd: Rename IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING > > > iommufd/device: Add helpers to enforce/remove device reserved regions > > > iommu: Add IOMMU_DOMAIN_NESTED and cache_invalidate_user op > > > iommu: Pass in parent domain with user_data to domain_alloc_user op > > > > following Jason's comment, it looks like we can just split the cache > > invalidation path out. Then the above looks good after removing > > "iommufd: Add IOMMU_HWPT_INVALIDATE" and also the cache_invalidate_user > > callback in "iommu: Add IOMMU_DOMAIN_NESTED and cache_invalidate_user op". > > Is it? @Jason > > If it can make sense, sure. It would be nice to be finished with the > alloc path I can do the split today. Shall we have a domain_alloc_user op in VT-d driver? Can we accept a core series only? I understood it's better to have though... > > > Only this v4 has the latest array-based invalidation design. And > > > it should be straightforward for drivers to define entry/request > > > structures. It might be a bit rush to review/finalize it at the > > > stage of rc6 though. > > > > yes, before v4, the cache invalidation path is simple and vendor > > drivers have their own handling. > > Have driver implementations of v4 been done to look at? I think so: https://lore.kernel.org/linux-iommu/20230921075431.125239-10-yi.l.liu@intel.com/ Thanks Nicolin
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 90fd65859e28..ab25de149ae6 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -300,6 +300,7 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) bool flag); struct iommufd_hw_pagetable *hwpt, *parent; struct iommu_hwpt_alloc *cmd = ucmd->cmd; + struct iommu_user_data *data = NULL; struct iommufd_object *pt_obj; struct iommufd_device *idev; struct iommufd_ioas *ioas; @@ -308,6 +309,11 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) if (cmd->flags & ~IOMMU_HWPT_ALLOC_NEST_PARENT || cmd->__reserved) return -EOPNOTSUPP; + if (!cmd->data_len && cmd->hwpt_type != IOMMU_HWPT_TYPE_DEFAULT) + return -EINVAL; + if (cmd->flags & IOMMU_HWPT_ALLOC_NEST_PARENT && + cmd->hwpt_type != IOMMU_HWPT_TYPE_DEFAULT) + return -EINVAL; idev = iommufd_get_device(ucmd, cmd->dev_id); if (IS_ERR(idev)) @@ -340,9 +346,19 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) goto out_put_pt; } + if (cmd->data_len) { + data = kzalloc(sizeof(*data), GFP_KERNEL); + if (!data) { + rc = -ENOMEM; + goto out_put_pt; + } + data->uptr = u64_to_user_ptr(cmd->data_uptr); + data->len = cmd->data_len; + } + mutex_lock(mutex); hwpt = alloc_fn(ucmd->ictx, pt_obj, idev, cmd->flags, - IOMMU_HWPT_TYPE_DEFAULT, NULL, false); + cmd->hwpt_type, data, false); if (IS_ERR(hwpt)) { rc = PTR_ERR(hwpt); goto out_unlock; @@ -359,6 +375,7 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) iommufd_object_abort_and_destroy(ucmd->ictx, &hwpt->obj); out_unlock: mutex_unlock(mutex); + kfree(data); out_put_pt: iommufd_put_object(pt_obj); out_put_idev: diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 3c8660fe9bb1..c46b1f772f20 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -370,15 +370,31 @@ enum iommu_hwpt_type { * @size: sizeof(struct iommu_hwpt_alloc) * @flags: Combination of enum iommufd_hwpt_alloc_flags * @dev_id: The device to allocate this HWPT for - * @pt_id: The IOAS to connect this HWPT to + * @pt_id: The IOAS or HWPT to connect this HWPT to * @out_hwpt_id: The ID of the new HWPT * @__reserved: Must be 0 + * @hwpt_type: One of enum iommu_hwpt_type + * @data_len: Length of the type specific data + * @data_uptr: User pointer to the type specific data * * Explicitly allocate a hardware page table object. This is the same object * type that is returned by iommufd_device_attach() and represents the * underlying iommu driver's iommu_domain kernel object. * - * A HWPT will be created with the IOVA mappings from the given IOAS. + * A kernel-managed HWPT will be created with the mappings from the given + * IOAS via the @pt_id. The @hwpt_type for this allocation can be set to + * either IOMMU_HWPT_TYPE_DEFAULT or a pre-defined type corresponding to + * an I/O page table type supported by the underlying IOMMU hardware. + * + * A user-managed HWPT will be created from a given parent HWPT via the + * @pt_id, in which the parent HWPT must be allocated previously via the + * same ioctl from a given IOAS (@pt_id). In this case, the @hwpt_type + * must be set to a pre-defined type corresponding to an I/O page table + * type supported by the underlying IOMMU hardware. + * + * If the @hwpt_type is set to IOMMU_HWPT_TYPE_DEFAULT, @data_len and + * @data_uptr should be zero. Otherwise, both @data_len and @data_uptr + * must be given. */ struct iommu_hwpt_alloc { __u32 size; @@ -387,6 +403,9 @@ struct iommu_hwpt_alloc { __u32 pt_id; __u32 out_hwpt_id; __u32 __reserved; + __u32 hwpt_type; + __u32 data_len; + __aligned_u64 data_uptr; }; #define IOMMU_HWPT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_ALLOC)