Message ID: 20220906124458.46461-8-baolu.lu@linux.intel.com
State:      Superseded
Series:     iommu: SVA and IOPF refactoring
On Tue, Sep 06, 2022 at 08:44:52PM +0800, Lu Baolu wrote:
> +static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
> +{
> +	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
> +	struct iommu_domain *domain;
> +
> +	/* Domain type specific cleanup: */
> +	domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
> +	if (domain) {
> +		switch (domain->type) {
> +		case IOMMU_DOMAIN_SVA:
> +			intel_svm_remove_dev_pasid(dev, pasid);
> +			break;
> +		default:
> +			/* should never reach here */
> +			WARN_ON(1);
> +			break;

This is eventually going to need a lot more cleaning up to split out
the PASID from the SVM stuff.

SVA should *only* be a set of predefined handlers (in the core code!)
for the generic PRI mechanism, it shouldn't be entangled deeply into
PASID or the drivers like this.

When we get done with this, the flow should have the core code attach
a SVA domain to a PASID with PRI enabled and the core code should
supply a generic PRI implementation that does the mmu_notifier stuff.

Also, stuff like this:

	/* We mandate that no page faults may be outstanding
	 * for the PASID when intel_svm_unbind_mm() is called.
	 * If that is not obeyed, subtle errors will happen.
	 * Let's make them less subtle... */

Are going to be problematic for VFIO as well. In a VFIO world the
entire RID and its entire PASID table has to be owned by VFIO and
never shared - so these sequencing issues should be solvable.

But this is all for further series..

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason
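For concreteness, that core-owned flow might look roughly like the
sketch below. Everything in it is an assumption for illustration:
iommu_sva_binding, iommu_sva_bind_core() and iommu_sva_domain_alloc()
are hypothetical names, not the API the series defines, and
iommu_sva_mn_ops is sketched after a later message in this thread.

#include <linux/iommu.h>
#include <linux/mmu_notifier.h>
#include <linux/slab.h>

/* Hypothetical core-side state for one SVA binding; the driver only
 * contributes the domain via its domain_alloc(IOMMU_DOMAIN_SVA) path. */
struct iommu_sva_binding {
	struct iommu_domain *domain;
	struct mmu_notifier notifier;	/* generic handlers, owned by the core */
	struct device *dev;
	ioasid_t pasid;
};

static int iommu_sva_bind_core(struct iommu_sva_binding *b,
			       struct device *dev, struct mm_struct *mm,
			       ioasid_t pasid)
{
	int ret;

	b->domain = iommu_sva_domain_alloc(dev, mm);	/* assumed helper */
	if (!b->domain)
		return -ENOMEM;
	b->dev = dev;
	b->pasid = pasid;

	/* The core, not the driver, registers the mmu_notifier and
	 * supplies the generic PRI/invalidation handlers. */
	b->notifier.ops = &iommu_sva_mn_ops;
	ret = mmu_notifier_register(&b->notifier, mm);
	if (ret)
		goto err_free_domain;

	/* Driver work shrinks to installing a PASID entry with PRI on. */
	ret = b->domain->ops->set_dev_pasid(b->domain, dev, pasid);
	if (ret)
		goto err_unregister;
	return 0;

err_unregister:
	mmu_notifier_unregister(&b->notifier, mm);
err_free_domain:
	b->domain->ops->free(b->domain);
	return ret;
}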
On 2022/9/22 23:49, Jason Gunthorpe wrote:
> On Tue, Sep 06, 2022 at 08:44:52PM +0800, Lu Baolu wrote:
>> +static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
>> +{
>> +	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
>> +	struct iommu_domain *domain;
>> +
>> +	/* Domain type specific cleanup: */
>> +	domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
>> +	if (domain) {
>> +		switch (domain->type) {
>> +		case IOMMU_DOMAIN_SVA:
>> +			intel_svm_remove_dev_pasid(dev, pasid);
>> +			break;
>> +		default:
>> +			/* should never reach here */
>> +			WARN_ON(1);
>> +			break;
>
> This is eventually going to need a lot more cleaning up to split out
> the PASID from the SVM stuff.
>
> SVA should *only* be a set of predefined handlers (in the core code!)
> for the generic PRI mechanism, it shouldn't be entangled deeply into
> PASID or the drivers like this.
>
> When we get done with this, the flow should have the core code attach
> a SVA domain to a PASID with PRI enabled and the core code should
> supply a generic PRI implementation that does the mmu_notifier
> stuff.

Yes. Agreed. At the beginning of this project, I wanted to consolidate
the mm notifications into the core. However, ARM SMMUv3 and Intel
handle the mm notifications a little differently. Then I decided to do
this work separately from the current series.

>
> Also, stuff like this:
>
>	/* We mandate that no page faults may be outstanding
>	 * for the PASID when intel_svm_unbind_mm() is called.
>	 * If that is not obeyed, subtle errors will happen.
>	 * Let's make them less subtle... */
>
> Are going to be problematic for VFIO as well. In a VFIO world the
> entire RID and its entire PASID table has to be owned by VFIO and
> never shared - so these sequencing issues should be solvable.
>
> But this is all for further series..

Yes. These all sound like interesting future series.

>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

I really appreciate your time.

Best regards,
baolu
On Fri, Sep 23, 2022 at 10:21:51AM +0800, Baolu Lu wrote:
> At the beginning of this project, I wanted to consolidate the mm
> notifications into the core. However, ARM SMMUv3 and Intel handle the mm
> notifications a little differently. Then I decided to do this work
> separately from the current series.

It doesn't really look different..

The SVA iommu domain needs two new ops triggered by the notifier:

 - 'stop using the mm, substitute a blocking domain' tied to release
 - Cache invalidate a range, maybe this is just iotlb_sync_map()

And we could even think about the first as the core code literally
attaching a dummy blocking domain and releasing the SVA domain. There
is no reason we need to have every driver do this tricky bit of
lifecycle management.

Jason
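Those two notifier-driven ops could be wired up in core code roughly as
below, continuing the hypothetical iommu_sva_binding sketch from
earlier in the thread. The cache_invalidate domain op and the
iommu_blocking_domain() helper are assumptions, and
iommu_attach_device_pasid() follows this series' direction but its
final form may differ.

/* 'Stop using the mm': on mm release, swap in a blocking domain so no
 * further DMA can reach the dying page tables. */
static void iommu_sva_mn_release(struct mmu_notifier *mn,
				 struct mm_struct *mm)
{
	struct iommu_sva_binding *b =
		container_of(mn, struct iommu_sva_binding, notifier);

	/* iommu_blocking_domain() is an assumed core helper returning a
	 * dummy domain that blocks all DMA for this device. */
	iommu_attach_device_pasid(iommu_blocking_domain(b->dev),
				  b->dev, b->pasid);
}

/* Cache invalidate a range: forward mm invalidations to the IOTLB.
 * The cache_invalidate op is an assumption; as suggested above, it may
 * end up being iotlb_sync_map() instead. */
static void iommu_sva_mn_invalidate_range(struct mmu_notifier *mn,
					  struct mm_struct *mm,
					  unsigned long start,
					  unsigned long end)
{
	struct iommu_sva_binding *b =
		container_of(mn, struct iommu_sva_binding, notifier);

	b->domain->ops->cache_invalidate(b->domain, start, end - start);
}

static const struct mmu_notifier_ops iommu_sva_mn_ops = {
	.release	  = iommu_sva_mn_release,
	.invalidate_range = iommu_sva_mn_invalidate_range,
};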
On 2022/9/23 20:15, Jason Gunthorpe wrote:
> On Fri, Sep 23, 2022 at 10:21:51AM +0800, Baolu Lu wrote:
>
>> At the beginning of this project, I wanted to consolidate the mm
>> notifications into the core. However, ARM SMMUv3 and Intel handle the mm
>> notifications a little differently. Then I decided to do this work
>> separately from the current series.
>
> It doesn't really look different..

They are essentially the same, but slightly different. For example, arm
smmuv3 provides .free_notifier, and I don't think it could be merged
into the release callback.

>
> The SVA iommu domain needs two new ops triggered by the notifier:
>
>  - 'stop using the mm, substitute a blocking domain' tied to release
>  - Cache invalidate a range, maybe this is just iotlb_sync_map()
>
> And we could even think about the first as the core code literally
> attaching a dummy blocking domain and releasing the SVA domain. There
> is no reason we need to have every driver do this tricky bit of
> lifecycle management.

Yes, I have similar ideas. We can discuss them further with real code
later.

Best regards,
baolu
On Fri, Sep 23, 2022 at 08:41:56PM +0800, Baolu Lu wrote:
> On 2022/9/23 20:15, Jason Gunthorpe wrote:
>> On Fri, Sep 23, 2022 at 10:21:51AM +0800, Baolu Lu wrote:
>>
>>> At the beginning of this project, I wanted to consolidate the mm
>>> notifications into the core. However, ARM SMMUv3 and Intel handle the mm
>>> notifications a little differently. Then I decided to do this work
>>> separately from the current series.
>>
>> It doesn't really look different..
>
> They are essentially the same, but slightly different. For example, arm
> smmuv3 provides .free_notifier, and I don't think it could be merged
> into the release callback.

free_notifier allows the use of mmu_notifier_put() instead of
mmu_notifier_unregister(), which avoids a synchronize_rcu() penalty on
teardown. Intel should copy the same design.

Jason
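For reference, this is the pattern drivers/iommu/arm/arm-smmu-v3/
arm-smmu-v3-sva.c already follows. Transplanted onto the Intel driver's
existing notifier (struct intel_svm embeds a struct mmu_notifier named
notifier, with intel_mm_release and intel_invalidate_range as its
handlers), it would look something like this sketch; treat the exact
Intel-side names as assumptions:

/* With free_notifier, the final kfree() is deferred past the RCU grace
 * period by mmu_notifier_put(), so the unbind path no longer pays for
 * the synchronize_rcu() inside mmu_notifier_unregister(). */
static void intel_mm_free_notifier(struct mmu_notifier *mn)
{
	kfree(container_of(mn, struct intel_svm, notifier));
}

static const struct mmu_notifier_ops intel_mmuops = {
	.release	  = intel_mm_release,		/* existing handler */
	.invalidate_range = intel_invalidate_range,	/* existing handler */
	.free_notifier	  = intel_mm_free_notifier,	/* deferred free */
};

Teardown then drops the reference instead of unregistering:

	mmu_notifier_put(&svm->notifier);	/* no synchronize_rcu() here */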
On 2022/9/23 21:07, Jason Gunthorpe wrote:
> On Fri, Sep 23, 2022 at 08:41:56PM +0800, Baolu Lu wrote:
>> On 2022/9/23 20:15, Jason Gunthorpe wrote:
>>> On Fri, Sep 23, 2022 at 10:21:51AM +0800, Baolu Lu wrote:
>>>
>>>> At the beginning of this project, I wanted to consolidate the mm
>>>> notifications into the core. However, ARM SMMUv3 and Intel handle the mm
>>>> notifications a little differently. Then I decided to do this work
>>>> separately from the current series.
>>>
>>> It doesn't really look different..
>>
>> They are essentially the same, but slightly different. For example, arm
>> smmuv3 provides .free_notifier, and I don't think it could be merged
>> into the release callback.
>
> free_notifier allows the use of mmu_notifier_put() instead of
> mmu_notifier_unregister(), which avoids a synchronize_rcu() penalty on
> teardown. Intel should copy the same design.

Ah! Thanks for the guidance. I will head in this direction.

Best regards,
baolu
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 91312aabdd53..8310bc2f5f0d 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -751,6 +751,8 @@ void intel_svm_unbind(struct iommu_sva *handle);
 u32 intel_svm_get_pasid(struct iommu_sva *handle);
 int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
 			    struct iommu_page_response *msg);
+struct iommu_domain *intel_svm_domain_alloc(void);
+void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid);
 
 struct intel_svm_dev {
 	struct list_head list;
@@ -776,6 +778,14 @@ struct intel_svm {
 };
 #else
 static inline void intel_svm_check(struct intel_iommu *iommu) {}
+static inline struct iommu_domain *intel_svm_domain_alloc(void)
+{
+	return NULL;
+}
+
+static inline void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+}
 #endif
 
 #ifdef CONFIG_INTEL_IOMMU_DEBUGFS
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 64d30895a4c8..bca38dd79569 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4156,6 +4156,8 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 		return domain;
 	case IOMMU_DOMAIN_IDENTITY:
 		return &si_domain->domain;
+	case IOMMU_DOMAIN_SVA:
+		return intel_svm_domain_alloc();
 	default:
 		return NULL;
 	}
@@ -4746,6 +4748,28 @@ static void intel_iommu_iotlb_sync_map(struct iommu_domain *domain,
 	__mapping_notify_one(info->iommu, dmar_domain, pfn, pages);
 }
 
+static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+	struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
+	struct iommu_domain *domain;
+
+	/* Domain type specific cleanup: */
+	domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
+	if (domain) {
+		switch (domain->type) {
+		case IOMMU_DOMAIN_SVA:
+			intel_svm_remove_dev_pasid(dev, pasid);
+			break;
+		default:
+			/* should never reach here */
+			WARN_ON(1);
+			break;
+		}
+	}
+
+	intel_pasid_tear_down_entry(iommu, dev, pasid, false);
+}
+
 const struct iommu_ops intel_iommu_ops = {
 	.capable		= intel_iommu_capable,
 	.domain_alloc		= intel_iommu_domain_alloc,
@@ -4758,6 +4782,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.dev_disable_feat	= intel_iommu_dev_disable_feat,
 	.is_attach_deferred	= intel_iommu_is_attach_deferred,
 	.def_domain_type	= device_def_domain_type,
+	.remove_dev_pasid	= intel_iommu_remove_dev_pasid,
 	.pgsize_bitmap		= SZ_4K,
 #ifdef CONFIG_INTEL_IOMMU_SVM
 	.sva_bind		= intel_svm_bind,
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 2420fa5c2360..8fe7cff78356 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -928,3 +928,50 @@ int intel_svm_page_response(struct device *dev,
 	mutex_unlock(&pasid_mutex);
 	return ret;
 }
+
+void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+	mutex_lock(&pasid_mutex);
+	intel_svm_unbind_mm(dev, pasid);
+	mutex_unlock(&pasid_mutex);
+}
+
+static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
+				   struct device *dev, ioasid_t pasid)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct intel_iommu *iommu = info->iommu;
+	struct mm_struct *mm = domain->mm;
+	struct iommu_sva *sva;
+	int ret = 0;
+
+	mutex_lock(&pasid_mutex);
+	sva = intel_svm_bind_mm(iommu, dev, mm);
+	if (IS_ERR(sva))
+		ret = PTR_ERR(sva);
+	mutex_unlock(&pasid_mutex);
+
+	return ret;
+}
+
+static void intel_svm_domain_free(struct iommu_domain *domain)
+{
+	kfree(to_dmar_domain(domain));
+}
+
+static const struct iommu_domain_ops intel_svm_domain_ops = {
+	.set_dev_pasid		= intel_svm_set_dev_pasid,
+	.free			= intel_svm_domain_free
+};
+
+struct iommu_domain *intel_svm_domain_alloc(void)
+{
+	struct dmar_domain *domain;
+
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain)
+		return NULL;
+	domain->domain.ops = &intel_svm_domain_ops;
+
+	return &domain->domain;
+}
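To connect the pieces of this patch: a sketch of the call flow the core
is expected to drive through the new ops. The ops are invoked directly
here only to show where each call lands; in reality the core goes
through its own helpers, and the core setting domain->mm is an
assumption based on intel_svm_set_dev_pasid() reading it.

static int example_sva_attach(struct device *dev, struct mm_struct *mm,
			      ioasid_t pasid)
{
	struct iommu_domain *domain;
	int ret;

	/* -> intel_svm_domain_alloc() via the new IOMMU_DOMAIN_SVA case */
	domain = intel_iommu_domain_alloc(IOMMU_DOMAIN_SVA);
	if (!domain)
		return -ENOMEM;
	domain->mm = mm;	/* assumed to be set by the core */

	/* -> intel_svm_set_dev_pasid() -> intel_svm_bind_mm() */
	ret = domain->ops->set_dev_pasid(domain, dev, pasid);
	if (ret)
		domain->ops->free(domain);	/* -> intel_svm_domain_free() */
	return ret;
}

On detach, the core calls the new remove_dev_pasid op, which does the
domain-type-specific cleanup (intel_svm_unbind_mm() under pasid_mutex)
and then tears down the PASID entry unconditionally, before the domain
is freed.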