Message ID | 20240102143834.146165-11-yi.l.liu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Add iommufd nesting (part 2/2) |
On Tue, Jan 02, 2024 at 06:38:34AM -0800, Yi Liu wrote:

> +static void intel_nested_flush_cache(struct dmar_domain *domain, u64 addr,
> +				     unsigned long npages, bool ih, u32 *error)
> +{
> +	struct iommu_domain_info *info;
> +	unsigned long i;
> +	unsigned mask;
> +	u32 fault;
> +
> +	xa_for_each(&domain->iommu_array, i, info)
> +		qi_flush_piotlb(info->iommu,
> +				domain_id_iommu(domain, info->iommu),
> +				IOMMU_NO_PASID, addr, npages, ih, NULL);

This locking on the xarray is messed up throughout the driver. There
could be a concurrent detach at this point which will free info and
UAF this.

This seems to be systemic issue, so I'm going to ignore it here, but
please make a series to fix it completely.

xarray is probably a bad data structure to manage attachment, a linked
list is going to use less memory in most cases and you need a mutex
lock anyhow.

Jason

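The suggestion above (a linked list of attachments guarded by one lock) can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not driver code: `demo_domain`, `attach_info` and every `demo_*` function are hypothetical names, a pthread mutex stands in for whatever lock the driver would actually use, and the "flush" is just a counter increment. The point it shows is that detach frees an entry while holding the same lock the walker holds, so the walker can never touch freed memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct iommu_domain_info; not the driver's type. */
struct attach_info {
	struct attach_info *next;
	int iommu_id;     /* which IOMMU this attachment refers to */
	int refcnt;       /* attached devices behind that IOMMU */
	int flush_count;  /* demonstration only: counts "flushes" */
};

struct demo_domain {
	pthread_mutex_t lock;      /* protects the list and every entry on it */
	struct attach_info *head;
};

static void demo_domain_init(struct demo_domain *d)
{
	pthread_mutex_init(&d->lock, NULL);
	d->head = NULL;
}

/* Attach: take another reference on an existing entry, or add a new one. */
static int demo_attach(struct demo_domain *d, int iommu_id)
{
	struct attach_info *info;
	int ret = 0;

	pthread_mutex_lock(&d->lock);
	for (info = d->head; info; info = info->next)
		if (info->iommu_id == iommu_id)
			break;
	if (info) {
		info->refcnt++;
	} else if ((info = calloc(1, sizeof(*info))) != NULL) {
		info->iommu_id = iommu_id;
		info->refcnt = 1;
		info->next = d->head;
		d->head = info;
	} else {
		ret = -1;
	}
	pthread_mutex_unlock(&d->lock);
	return ret;
}

/* Detach: unlink and free under the same lock the walker takes. */
static void demo_detach(struct demo_domain *d, int iommu_id)
{
	struct attach_info **pp, *info;

	pthread_mutex_lock(&d->lock);
	for (pp = &d->head; (info = *pp) != NULL; pp = &info->next) {
		if (info->iommu_id != iommu_id)
			continue;
		assert(info->refcnt > 0);
		if (--info->refcnt == 0) {
			*pp = info->next;
			free(info);
		}
		break;
	}
	pthread_mutex_unlock(&d->lock);
}

/* Walk: no entry can be freed while the lock is held. Returns entries visited. */
static int demo_flush_all(struct demo_domain *d)
{
	struct attach_info *info;
	int n = 0;

	pthread_mutex_lock(&d->lock);
	for (info = d->head; info; info = info->next) {
		info->flush_count++;  /* stand-in for the real invalidation */
		n++;
	}
	pthread_mutex_unlock(&d->lock);
	return n;
}
```

Holding a sleeping lock across the whole walk is the simple option; whether that is acceptable on the real invalidation path is a design question this sketch does not answer.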
On 2024/1/3 02:44, Jason Gunthorpe wrote:
> On Tue, Jan 02, 2024 at 06:38:34AM -0800, Yi Liu wrote:
>
>> +static void intel_nested_flush_cache(struct dmar_domain *domain, u64 addr,
>> +				     unsigned long npages, bool ih, u32 *error)
>> +{
>> +	struct iommu_domain_info *info;
>> +	unsigned long i;
>> +	unsigned mask;
>> +	u32 fault;
>> +
>> +	xa_for_each(&domain->iommu_array, i, info)
>> +		qi_flush_piotlb(info->iommu,
>> +				domain_id_iommu(domain, info->iommu),
>> +				IOMMU_NO_PASID, addr, npages, ih, NULL);
>
> This locking on the xarray is messed up throughout the driver. There
> could be a concurrent detach at this point which will free info and
> UAF this.

hmmm, xa_for_each() takes and releases rcu lock, and according to the
domain_detach_iommu(), info is freed after xa_erase(). For an existing
info stored in xarray, xa_erase() should return after rcu lock is released.
is it? Any idea? @Baolu

void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
{
	struct iommu_domain_info *info;

	spin_lock(&iommu->lock);
	info = xa_load(&domain->iommu_array, iommu->seq_id);
	if (--info->refcnt == 0) {
		clear_bit(info->did, iommu->domain_ids);
		xa_erase(&domain->iommu_array, iommu->seq_id);
		domain->nid = NUMA_NO_NODE;
		domain_update_iommu_cap(domain);
		kfree(info);
	}
	spin_unlock(&iommu->lock);
}

> This seems to be systemic issue, so I'm going to ignore it here, but
> please make a series to fix it completely.

yeah, this writing is the same with other places that reference the
iommu_array. If there is real problem, may check with Baolu and Kevin.

> xarray is probably a bad data structure to manage attachment, a linked
> list is going to use less memory in most cases and you need a mutex
> lock anyhow.

below is the commit that introduces iommu_array.

commit ba949f4cd4c39c587e9b722ac7eb7f7e8a42dace
Author: Lu Baolu <baolu.lu@linux.intel.com>
Date:   Tue Jul 12 08:09:05 2022 +0800

    iommu/vt-d: Refactor iommu information of each domain

    When a DMA domain is attached to a device, it needs to allocate a domain
    ID from its IOMMU. Currently, the domain ID information is stored in two
    static arrays embedded in the domain structure. This can lead to memory
    waste when the driver is running on a small platform.

    This optimizes these static arrays by replacing them with an xarray and
    consuming memory on demand.

    Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
    Reviewed-by: Kevin Tian <kevin.tian@intel.com>
    Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
    Link: https://lore.kernel.org/r/20220702015610.2849494-4-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>

On 2024/1/3 9:33, Yi Liu wrote:
> On 2024/1/3 02:44, Jason Gunthorpe wrote:
>> On Tue, Jan 02, 2024 at 06:38:34AM -0800, Yi Liu wrote:
>>
>>> +static void intel_nested_flush_cache(struct dmar_domain *domain, u64
>>> addr,
>>> +				     unsigned long npages, bool ih, u32 *error)
>>> +{
>>> +	struct iommu_domain_info *info;
>>> +	unsigned long i;
>>> +	unsigned mask;
>>> +	u32 fault;
>>> +
>>> +	xa_for_each(&domain->iommu_array, i, info)
>>> +		qi_flush_piotlb(info->iommu,
>>> +				domain_id_iommu(domain, info->iommu),
>>> +				IOMMU_NO_PASID, addr, npages, ih, NULL);
>>
>> This locking on the xarray is messed up throughout the driver. There
>> could be a concurrent detach at this point which will free info and
>> UAF this.
>
> hmmm, xa_for_each() takes and releases rcu lock, and according to the
> domain_detach_iommu(), info is freed after xa_erase(). For an existing
> info stored in xarray, xa_erase() should return after rcu lock is released.
> is it? Any idea? @Baolu

I once thought locking for xarray is self-contained. I need more thought
on this before taking further action.

Best regards,
baolu

On Wed, Jan 03, 2024 at 11:06:19AM +0800, Baolu Lu wrote:
> On 2024/1/3 9:33, Yi Liu wrote:
> > On 2024/1/3 02:44, Jason Gunthorpe wrote:
> > > On Tue, Jan 02, 2024 at 06:38:34AM -0800, Yi Liu wrote:
> > >
> > > > +static void intel_nested_flush_cache(struct dmar_domain
> > > > *domain, u64 addr,
> > > > +				     unsigned long npages, bool ih, u32 *error)
> > > > +{
> > > > +	struct iommu_domain_info *info;
> > > > +	unsigned long i;
> > > > +	unsigned mask;
> > > > +	u32 fault;
> > > > +
> > > > +	xa_for_each(&domain->iommu_array, i, info)
> > > > +		qi_flush_piotlb(info->iommu,
> > > > +				domain_id_iommu(domain, info->iommu),
> > > > +				IOMMU_NO_PASID, addr, npages, ih, NULL);
> > >
> > > This locking on the xarray is messed up throughout the driver. There
> > > could be a concurrent detach at this point which will free info and
> > > UAF this.
> >
> > hmmm, xa_for_each() takes and releases rcu lock, and according to the
> > domain_detach_iommu(), info is freed after xa_erase(). For an existing
> > info stored in xarray, xa_erase() should return after rcu lock is released.
> > is it? Any idea? @Baolu
>
> I once thought locking for xarray is self-contained. I need more thought
> on this before taking further action.

The locking of xarray itself is self-contained, but once it returns a
value then the user has to provide locking to protect the value.

In this case the xarray storage memory itself will not UAF but the
info pointer to memory returned from the xarray will.

I've been thinking arm/amd/intel all need the same datastructure here,
and it is a bit complicated. We should try to make a library to handle
it..

It is straightforward except for the RCU list walk for invalidation..

Jason

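The distinction drawn above, between the container's own locking and the lifetime of the value it hands back, can be shown with a small userspace sketch. Everything here is an assumption made for illustration: `demo_table`, `demo_info` and the `demo_*` functions are hypothetical, a fixed array plus a pthread mutex stands in for the xarray, and reference counting is only one possible way for the caller to protect the value (the thread does not settle on a fix). A lookup takes its reference while the table lock is held, so an erase that runs afterwards cannot free the object until the reader drops it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define DEMO_SLOTS 8

/* Hypothetical value type stored in the table. */
struct demo_info {
	int refs;  /* protected by demo_table.lock */
	int did;   /* payload, e.g. a domain ID */
};

struct demo_table {
	pthread_mutex_t lock;               /* protects slots and refcounts */
	struct demo_info *slot[DEMO_SLOTS];
	int freed;                          /* demonstration only: counts frees */
};

static void demo_table_init(struct demo_table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	for (int i = 0; i < DEMO_SLOTS; i++)
		t->slot[i] = NULL;
	t->freed = 0;
}

/* Store a new value; the table itself holds one reference. */
static int demo_store(struct demo_table *t, int idx, int did)
{
	struct demo_info *info;
	int ret = -1;

	if (idx < 0 || idx >= DEMO_SLOTS)
		return -1;
	info = calloc(1, sizeof(*info));
	if (!info)
		return -1;
	info->refs = 1;
	info->did = did;

	pthread_mutex_lock(&t->lock);
	if (!t->slot[idx]) {
		t->slot[idx] = info;
		ret = 0;
	}
	pthread_mutex_unlock(&t->lock);
	if (ret)
		free(info);  /* slot was already occupied */
	return ret;
}

/* Lookup that pins the value: the reference is taken under the lock. */
static struct demo_info *demo_get(struct demo_table *t, int idx)
{
	struct demo_info *info;

	assert(idx >= 0 && idx < DEMO_SLOTS);
	pthread_mutex_lock(&t->lock);
	info = t->slot[idx];
	if (info)
		info->refs++;
	pthread_mutex_unlock(&t->lock);
	return info;
}

/* Drop a reference; the last one frees the value. */
static void demo_put(struct demo_table *t, struct demo_info *info)
{
	pthread_mutex_lock(&t->lock);
	assert(info->refs > 0);
	if (--info->refs == 0) {
		free(info);
		t->freed++;
	}
	pthread_mutex_unlock(&t->lock);
}

/* Erase: empty the slot, then drop only the table's own reference. */
static void demo_erase(struct demo_table *t, int idx)
{
	struct demo_info *info;

	assert(idx >= 0 && idx < DEMO_SLOTS);
	pthread_mutex_lock(&t->lock);
	info = t->slot[idx];
	t->slot[idx] = NULL;
	pthread_mutex_unlock(&t->lock);
	if (info)
		demo_put(t, info);
}
```

A reader that calls `demo_get()`, then races with `demo_erase()`, still sees a valid object until its own `demo_put()`. Without the reference (a plain lookup followed by use), the erase path would free the object underneath the reader, which is the use-after-free described in the message above.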
diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c
index b5a5563ab32c..f1f86437939c 100644
--- a/drivers/iommu/intel/nested.c
+++ b/drivers/iommu/intel/nested.c
@@ -73,9 +73,116 @@ static void intel_nested_domain_free(struct iommu_domain *domain)
 	kfree(to_dmar_domain(domain));
 }
 
+static void nested_flush_dev_iotlb(struct dmar_domain *domain, u64 addr,
+				   unsigned mask, u32 *fault)
+{
+	struct device_domain_info *info;
+	unsigned long flags;
+	u16 sid, qdep;
+
+	spin_lock_irqsave(&domain->lock, flags);
+	list_for_each_entry(info, &domain->devices, link) {
+		if (!info->ats_enabled)
+			continue;
+		sid = info->bus << 8 | info->devfn;
+		qdep = info->ats_qdep;
+		qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
+				   qdep, addr, mask, fault);
+		quirk_extra_dev_tlb_flush(info, addr, mask,
+					  IOMMU_NO_PASID, qdep);
+	}
+	spin_unlock_irqrestore(&domain->lock, flags);
+}
+
+static void intel_nested_flush_cache(struct dmar_domain *domain, u64 addr,
+				     unsigned long npages, bool ih, u32 *error)
+{
+	struct iommu_domain_info *info;
+	unsigned long i;
+	unsigned mask;
+	u32 fault;
+
+	xa_for_each(&domain->iommu_array, i, info)
+		qi_flush_piotlb(info->iommu,
+				domain_id_iommu(domain, info->iommu),
+				IOMMU_NO_PASID, addr, npages, ih, NULL);
+
+	if (!domain->has_iotlb_device)
+		return;
+
+	if (npages == U64_MAX)
+		mask = 64 - VTD_PAGE_SHIFT;
+	else
+		mask = ilog2(__roundup_pow_of_two(npages));
+
+	nested_flush_dev_iotlb(domain, addr, mask, &fault);
+
+	*error = 0;
+	/*
+	 * Invalidation queue error (i.e. IQE) will not be reported to user
+	 * as it's caused only by driver internal bug.
+	 */
+	if (fault & DMA_FSTS_ICE)
+		*error |= IOMMU_HWPT_INVALIDATE_VTD_S1_ICE;
+	if (fault & DMA_FSTS_ITE)
+		*error |= IOMMU_HWPT_INVALIDATE_VTD_S1_ITE;
+}
+
+static int intel_nested_cache_invalidate_user(struct iommu_domain *domain,
+					      struct iommu_user_data_array *array)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	struct iommu_hwpt_vtd_s1_invalidate inv_entry;
+	u32 processed = 0;
+	int ret = 0;
+	u32 index;
+
+	if (array->type != IOMMU_HWPT_INVALIDATE_DATA_VTD_S1) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	for (index = 0; index < array->entry_num; index++) {
+		ret = iommu_copy_struct_from_user_array(&inv_entry, array,
+							IOMMU_HWPT_INVALIDATE_DATA_VTD_S1,
+							index, hw_error);
+		if (ret)
+			break;
+
+		if (inv_entry.flags & ~IOMMU_VTD_INV_FLAGS_LEAF) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		if (!IS_ALIGNED(inv_entry.addr, VTD_PAGE_SIZE) ||
+		    ((inv_entry.npages == U64_MAX) && inv_entry.addr)) {
+			ret = -EINVAL;
+			break;
+		}
+
+		intel_nested_flush_cache(dmar_domain, inv_entry.addr,
+					 inv_entry.npages,
+					 inv_entry.flags & IOMMU_VTD_INV_FLAGS_LEAF,
+					 &inv_entry.hw_error);
+
+		ret = iommu_respond_struct_to_user_array(array, index,
+							 (void *)&inv_entry,
+							 sizeof(inv_entry));
+		if (ret)
+			break;
+
+		processed++;
+	}
+
+out:
+	array->entry_num = processed;
+	return ret;
+}
+
 static const struct iommu_domain_ops intel_nested_domain_ops = {
 	.attach_dev		= intel_nested_attach_dev,
 	.free			= intel_nested_domain_free,
+	.cache_invalidate_user	= intel_nested_cache_invalidate_user,
 };
 
 struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *parent,
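
The arithmetic in the patch above can be checked outside the kernel with a short userspace sketch. The helper names (`demo_inv_mask`, `demo_entry_valid`, `demo_sid`) and `DEMO_PAGE_SHIFT` are hypothetical; the page shift of 12 (4 KiB pages) is an assumption standing in for `VTD_PAGE_SHIFT`, and the loop in `demo_inv_mask` is a portable substitute for `ilog2(__roundup_pow_of_two(npages))`, valid for `npages >= 1`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12  /* assumption: mirrors VTD_PAGE_SHIFT (4 KiB pages) */
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)

/*
 * Address mask order for a device-TLB flush: the smallest order such that
 * 2^order pages cover npages. UINT64_MAX means "flush everything" and maps
 * to the full address width minus the page shift, as in the patch.
 */
static unsigned demo_inv_mask(uint64_t npages)
{
	unsigned order = 0;

	if (npages == UINT64_MAX)
		return 64 - DEMO_PAGE_SHIFT;
	assert(npages >= 1);
	while (order < 63 && (UINT64_C(1) << order) < npages)
		order++;
	return order;
}

/*
 * Same checks the patch applies to each user entry: the address must be
 * page aligned, and a flush-all request must start at address 0.
 */
static bool demo_entry_valid(uint64_t addr, uint64_t npages)
{
	if (addr & (DEMO_PAGE_SIZE - 1))
		return false;
	if (npages == UINT64_MAX && addr != 0)
		return false;
	return true;
}

/* PCI source ID as built in the patch: bus number in the high byte. */
static uint16_t demo_sid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}
```

For example, a request for 3 pages is rounded up to 4 pages (order 2), so the device-TLB flush may cover more than was asked for. The sketch rejects `npages == 0` with an assertion; the patch shown here has no such check before `__roundup_pow_of_two()`.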