Message ID | 20230401151833.124749-17-yi.l.liu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add vfio_device cdev for iommufd support | expand |
On Sat, 1 Apr 2023 08:18:24 -0700 Yi Liu <yi.l.liu@intel.com> wrote: > From: Nicolin Chen <nicolinc@nvidia.com> > > Previously, the detach routine is only done by the destroy(). And it was > called by vfio_iommufd_emulated_unbind() when the device runs close(), so > all the mappings in iopt were cleaned in that setup, when the call trace > reaches this detach() routine. > > Now, there's a need of a detach uAPI, meaning that it does not only need > a new iommufd_access_detach() API, but also requires access->ops->unmap() > call as a cleanup. So add one. > > However, leaving that unprotected can introduce some potential of a race > condition during the pin_/unpin_pages() call, where access->ioas->iopt is > getting referenced. So, add an ioas_lock to protect the context of iopt > referencings. > > Also, to allow the iommufd_access_unpin_pages() callback to happen via > this unmap() call, add an ioas_unpin pointer, so the unpin routine won't > be affected by the "access->ioas = NULL" trick. > > Reviewed-by: Kevin Tian <kevin.tian@intel.com> > Tested-by: Terrence Xu <terrence.xu@intel.com> > Tested-by: Yanting Jiang <yanting.jiang@intel.com> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > --- > drivers/iommu/iommufd/device.c | 76 +++++++++++++++++++++++-- > drivers/iommu/iommufd/iommufd_private.h | 2 + > include/linux/iommufd.h | 1 + > 3 files changed, 74 insertions(+), 5 deletions(-) Does this need to go in via iommufd first? There seems to be quite a bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it doesn't apply). Thanks, Alex
On Tue, Apr 04, 2023 at 04:45:12PM -0600, Alex Williamson wrote: > On Sat, 1 Apr 2023 08:18:24 -0700 > Yi Liu <yi.l.liu@intel.com> wrote: > > > From: Nicolin Chen <nicolinc@nvidia.com> > > > > Previously, the detach routine is only done by the destroy(). And it was > > called by vfio_iommufd_emulated_unbind() when the device runs close(), so > > all the mappings in iopt were cleaned in that setup, when the call trace > > reaches this detach() routine. > > > > Now, there's a need of a detach uAPI, meaning that it does not only need > > a new iommufd_access_detach() API, but also requires access->ops->unmap() > > call as a cleanup. So add one. > > > > However, leaving that unprotected can introduce some potential of a race > > condition during the pin_/unpin_pages() call, where access->ioas->iopt is > > getting referenced. So, add an ioas_lock to protect the context of iopt > > referencings. > > > > Also, to allow the iommufd_access_unpin_pages() callback to happen via > > this unmap() call, add an ioas_unpin pointer, so the unpin routine won't > > be affected by the "access->ioas = NULL" trick. > > > > Reviewed-by: Kevin Tian <kevin.tian@intel.com> > > Tested-by: Terrence Xu <terrence.xu@intel.com> > > Tested-by: Yanting Jiang <yanting.jiang@intel.com> > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > --- > > drivers/iommu/iommufd/device.c | 76 +++++++++++++++++++++++-- > > drivers/iommu/iommufd/iommufd_private.h | 2 + > > include/linux/iommufd.h | 1 + > > 3 files changed, 74 insertions(+), 5 deletions(-) > > Does this need to go in via iommufd first? There seems to be quite a > bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it > doesn't apply). Thanks, I think it is best to stay with this series, Yi has to rebase it Jason
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Wednesday, April 5, 2023 7:56 PM > > On Tue, Apr 04, 2023 at 04:45:12PM -0600, Alex Williamson wrote: > > On Sat, 1 Apr 2023 08:18:24 -0700 > > Yi Liu <yi.l.liu@intel.com> wrote: > > > > > From: Nicolin Chen <nicolinc@nvidia.com> > > > > > > Previously, the detach routine is only done by the destroy(). And it was > > > called by vfio_iommufd_emulated_unbind() when the device runs close(), so > > > all the mappings in iopt were cleaned in that setup, when the call trace > > > reaches this detach() routine. > > > > > > Now, there's a need of a detach uAPI, meaning that it does not only need > > > a new iommufd_access_detach() API, but also requires access->ops->unmap() > > > call as a cleanup. So add one. > > > > > > However, leaving that unprotected can introduce some potential of a race > > > condition during the pin_/unpin_pages() call, where access->ioas->iopt is > > > getting referenced. So, add an ioas_lock to protect the context of iopt > > > referencings. > > > > > > Also, to allow the iommufd_access_unpin_pages() callback to happen via > > > this unmap() call, add an ioas_unpin pointer, so the unpin routine won't > > > be affected by the "access->ioas = NULL" trick. > > > > > > Reviewed-by: Kevin Tian <kevin.tian@intel.com> > > > Tested-by: Terrence Xu <terrence.xu@intel.com> > > > Tested-by: Yanting Jiang <yanting.jiang@intel.com> > > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > > --- > > > drivers/iommu/iommufd/device.c | 76 +++++++++++++++++++++++-- > > > drivers/iommu/iommufd/iommufd_private.h | 2 + > > > include/linux/iommufd.h | 1 + > > > 3 files changed, 74 insertions(+), 5 deletions(-) > > > > Does this need to go in via iommufd first? There seems to be quite a > > bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it > > doesn't apply). Thanks, > > I think it is best to stay with this series, Yi has to rebase it The rebased version is here. Shall I resend a version which is rebased on top of vfio_mdev_ops? https://github.com/yiliu1765/iommufd/commit/d3d8f65c82fe2ca2a7b1a635f4b40b2a0971daa9 Regards, Yi Liu
On Wed, Apr 05, 2023 at 02:10:19PM +0000, Liu, Yi L wrote: > > From: Jason Gunthorpe <jgg@nvidia.com> > > Sent: Wednesday, April 5, 2023 7:56 PM > > > > On Tue, Apr 04, 2023 at 04:45:12PM -0600, Alex Williamson wrote: > > > On Sat, 1 Apr 2023 08:18:24 -0700 > > > Yi Liu <yi.l.liu@intel.com> wrote: > > > > > > > From: Nicolin Chen <nicolinc@nvidia.com> > > > > > > > > Previously, the detach routine is only done by the destroy(). And it was > > > > called by vfio_iommufd_emulated_unbind() when the device runs close(), so > > > > all the mappings in iopt were cleaned in that setup, when the call trace > > > > reaches this detach() routine. > > > > > > > > Now, there's a need of a detach uAPI, meaning that it does not only need > > > > a new iommufd_access_detach() API, but also requires access->ops->unmap() > > > > call as a cleanup. So add one. > > > > > > > > However, leaving that unprotected can introduce some potential of a race > > > > condition during the pin_/unpin_pages() call, where access->ioas->iopt is > > > > getting referenced. So, add an ioas_lock to protect the context of iopt > > > > referencings. > > > > > > > > Also, to allow the iommufd_access_unpin_pages() callback to happen via > > > > this unmap() call, add an ioas_unpin pointer, so the unpin routine won't > > > > be affected by the "access->ioas = NULL" trick. > > > > > > > > Reviewed-by: Kevin Tian <kevin.tian@intel.com> > > > > Tested-by: Terrence Xu <terrence.xu@intel.com> > > > > Tested-by: Yanting Jiang <yanting.jiang@intel.com> > > > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> > > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > > > --- > > > > drivers/iommu/iommufd/device.c | 76 +++++++++++++++++++++++-- > > > > drivers/iommu/iommufd/iommufd_private.h | 2 + > > > > include/linux/iommufd.h | 1 + > > > > 3 files changed, 74 insertions(+), 5 deletions(-) > > > > > > Does this need to go in via iommufd first? There seems to be quite a > > > bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it > > > doesn't apply). Thanks, > > > > I think it is best to stay with this series, Yi has to rebase it > > The rebased version is here. Shall I resend a version which is rebased on > top of vfio_mdev_ops? > > https://github.com/yiliu1765/iommufd/commit/d3d8f65c82fe2ca2a7b1a635f4b40b2a0971daa9 When you post the v10 it should be based on top of the vfio_mdev_ops and the hot reset series. Jason
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Wednesday, April 5, 2023 10:29 PM > > > > > > > > Does this need to go in via iommufd first? There seems to be quite a > > > > bit of churn in iommufd/device.c vs the vfio_mdev_ops branch (ie. it > > > > doesn't apply). Thanks, > > > > > > I think it is best to stay with this series, Yi has to rebase it > > > > The rebased version is here. Shall I resend a version which is rebased on > > top of vfio_mdev_ops? > > > > > https://github.com/yiliu1765/iommufd/commit/d3d8f65c82fe2ca2a7b1a635f4b40b2a > 0971daa9 > > When you post the v10 it should be based on top of the vfio_mdev_ops > and the hot reset series. yes. At least, I see the hot reset series needs to be refreshed w.r.t. the comments from Alex and Eric. Regards, Yi Liu
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 04a57aa1ae2c..0eaae60f3537 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -474,6 +474,7 @@ iommufd_access_create(struct iommufd_ctx *ictx, iommufd_ctx_get(ictx); iommufd_object_finalize(ictx, &access->obj); *id = access->obj.id; + mutex_init(&access->ioas_lock); return access; } EXPORT_SYMBOL_NS_GPL(iommufd_access_create, IOMMUFD); @@ -493,26 +494,66 @@ void iommufd_access_destroy(struct iommufd_access *access) } EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD); +static void __iommufd_access_detach(struct iommufd_access *access) +{ + struct iommufd_ioas *cur_ioas = access->ioas; + + lockdep_assert_held(&access->ioas_lock); + /* + * Set ioas to NULL to block any further iommufd_access_pin_pages(). + * iommufd_access_unpin_pages() can continue using access->ioas_unpin. + */ + access->ioas = NULL; + + if (access->ops->unmap) { + mutex_unlock(&access->ioas_lock); + access->ops->unmap(access->data, 0, ULONG_MAX); + mutex_lock(&access->ioas_lock); + } + iopt_remove_access(&cur_ioas->iopt, access); + refcount_dec(&cur_ioas->obj.users); +} + +void iommufd_access_detach(struct iommufd_access *access) +{ + mutex_lock(&access->ioas_lock); + if (WARN_ON(!access->ioas)) + goto out; + __iommufd_access_detach(access); +out: + access->ioas_unpin = NULL; + mutex_unlock(&access->ioas_lock); +} +EXPORT_SYMBOL_NS_GPL(iommufd_access_detach, IOMMUFD); + int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id) { struct iommufd_ioas *new_ioas; int rc = 0; - if (access->ioas != NULL && access->ioas->obj.id != ioas_id) + mutex_lock(&access->ioas_lock); + if (access->ioas != NULL && access->ioas->obj.id != ioas_id) { + mutex_unlock(&access->ioas_lock); return -EINVAL; + } new_ioas = iommufd_get_ioas(access->ictx, ioas_id); - if (IS_ERR(new_ioas)) + if (IS_ERR(new_ioas)) { + mutex_unlock(&access->ioas_lock); return PTR_ERR(new_ioas); + } rc = iopt_add_access(&new_ioas->iopt, access); if (rc) { + mutex_unlock(&access->ioas_lock); iommufd_put_object(&new_ioas->obj); return rc; } iommufd_ref_to_users(&new_ioas->obj); access->ioas = new_ioas; + access->ioas_unpin = new_ioas; + mutex_unlock(&access->ioas_lock); return 0; } EXPORT_SYMBOL_NS_GPL(iommufd_access_attach, IOMMUFD); @@ -567,8 +608,8 @@ void iommufd_access_notify_unmap(struct io_pagetable *iopt, unsigned long iova, void iommufd_access_unpin_pages(struct iommufd_access *access, unsigned long iova, unsigned long length) { - struct io_pagetable *iopt = &access->ioas->iopt; struct iopt_area_contig_iter iter; + struct io_pagetable *iopt; unsigned long last_iova; struct iopt_area *area; @@ -576,6 +617,13 @@ void iommufd_access_unpin_pages(struct iommufd_access *access, WARN_ON(check_add_overflow(iova, length - 1, &last_iova))) return; + mutex_lock(&access->ioas_lock); + if (!access->ioas_unpin) { + mutex_unlock(&access->ioas_lock); + return; + } + iopt = &access->ioas_unpin->iopt; + down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) iopt_area_remove_access( @@ -585,6 +633,7 @@ void iommufd_access_unpin_pages(struct iommufd_access *access, min(last_iova, iopt_area_last_iova(area)))); up_read(&iopt->iova_rwsem); WARN_ON(!iopt_area_contig_done(&iter)); + mutex_unlock(&access->ioas_lock); } EXPORT_SYMBOL_NS_GPL(iommufd_access_unpin_pages, IOMMUFD); @@ -630,8 +679,8 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, unsigned long length, struct page **out_pages, unsigned int flags) { - struct io_pagetable *iopt = &access->ioas->iopt; struct iopt_area_contig_iter iter; + struct io_pagetable *iopt; unsigned long last_iova; struct iopt_area *area; int rc; @@ -646,6 +695,13 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, if (check_add_overflow(iova, length - 1, &last_iova)) return -EOVERFLOW; + mutex_lock(&access->ioas_lock); + if (!access->ioas) { + mutex_unlock(&access->ioas_lock); + return -ENOENT; + } + iopt = &access->ioas->iopt; + down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) { unsigned long last = min(last_iova, iopt_area_last_iova(area)); @@ -676,6 +732,7 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, } up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); return 0; err_remove: @@ -690,6 +747,7 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, iopt_area_last_iova(area)))); } up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD); @@ -709,8 +767,8 @@ EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD); int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, void *data, size_t length, unsigned int flags) { - struct io_pagetable *iopt = &access->ioas->iopt; struct iopt_area_contig_iter iter; + struct io_pagetable *iopt; struct iopt_area *area; unsigned long last_iova; int rc; @@ -720,6 +778,13 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, if (check_add_overflow(iova, length - 1, &last_iova)) return -EOVERFLOW; + mutex_lock(&access->ioas_lock); + if (!access->ioas) { + mutex_unlock(&access->ioas_lock); + return -ENOENT; + } + iopt = &access->ioas->iopt; + down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) { unsigned long last = min(last_iova, iopt_area_last_iova(area)); @@ -746,6 +811,7 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, rc = -ENOENT; err_out: up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_rw, IOMMUFD); diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 2e6e8e217cce..ec2ce3ef187d 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -263,6 +263,8 @@ struct iommufd_access { struct iommufd_object obj; struct iommufd_ctx *ictx; struct iommufd_ioas *ioas; + struct iommufd_ioas *ioas_unpin; + struct mutex ioas_lock; const struct iommufd_access_ops *ops; void *data; unsigned long iova_alignment; diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index ac96df406833..9e0e8894dacc 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -47,6 +47,7 @@ iommufd_access_create(struct iommufd_ctx *ictx, const struct iommufd_access_ops *ops, void *data, u32 *id); void iommufd_access_destroy(struct iommufd_access *access); int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id); +void iommufd_access_detach(struct iommufd_access *access); void iommufd_ctx_get(struct iommufd_ctx *ictx);