Message ID | 20220106022053.2406748-2-baolu.lu@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Scrap iommu_attach/detach_group() interfaces | expand |
On Thu, Jan 06, 2022 at 10:20:46AM +0800, Lu Baolu wrote: > Expose an interface to replace the domain of an iommu group for frameworks > like vfio which claims the ownership of the whole iommu group. > > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> > include/linux/iommu.h | 10 ++++++++++ > drivers/iommu/iommu.c | 37 +++++++++++++++++++++++++++++++++++++ > 2 files changed, 47 insertions(+) > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h > index 408a6d2b3034..66ebce3d1e11 100644 > +++ b/include/linux/iommu.h > @@ -677,6 +677,9 @@ void iommu_device_unuse_dma_api(struct device *dev); > int iommu_group_set_dma_owner(struct iommu_group *group, void *owner); > void iommu_group_release_dma_owner(struct iommu_group *group); > bool iommu_group_dma_owner_claimed(struct iommu_group *group); > +int iommu_group_replace_domain(struct iommu_group *group, > + struct iommu_domain *old, > + struct iommu_domain *new); > > #else /* CONFIG_IOMMU_API */ > > @@ -1090,6 +1093,13 @@ static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group) > { > return false; > } > + > +static inline int > +iommu_group_replace_domain(struct iommu_group *group, struct iommu_domain *old, > + struct iommu_domain *new) > +{ > + return -ENODEV; > +} > #endif /* CONFIG_IOMMU_API */ > > /** > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 72a95dea688e..ab8ab95969f5 100644 > +++ b/drivers/iommu/iommu.c > @@ -3431,3 +3431,40 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group) > return user; > } > EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed); > + > +/** > + * iommu_group_replace_domain() - Replace group's domain > + * @group: The group. > + * @old: The previous attached domain. NULL for none. > + * @new: The new domain about to be attached. > + * > + * This is to support backward compatibility for vfio which manages the dma > + * ownership in iommu_group level. This should mention it can only be used with iommu_group_set_dma_owner() > + if (old) > + __iommu_detach_group(old, group); > + > + if (new) { > + ret = __iommu_attach_group(new, group); > + if (ret && old) > + __iommu_attach_group(old, group); > + } The sketchy error unwind here gives me some pause for sure. Maybe we should define that on error this leaves the domain as NULL Complicates vfio a tiny bit to cope with this failure but seems cleaner than leaving it indeterminate. Jason
On 1/7/22 1:06 AM, Jason Gunthorpe wrote: > On Thu, Jan 06, 2022 at 10:20:46AM +0800, Lu Baolu wrote: >> Expose an interface to replace the domain of an iommu group for frameworks >> like vfio which claims the ownership of the whole iommu group. >> >> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> >> include/linux/iommu.h | 10 ++++++++++ >> drivers/iommu/iommu.c | 37 +++++++++++++++++++++++++++++++++++++ >> 2 files changed, 47 insertions(+) >> >> diff --git a/include/linux/iommu.h b/include/linux/iommu.h >> index 408a6d2b3034..66ebce3d1e11 100644 >> +++ b/include/linux/iommu.h >> @@ -677,6 +677,9 @@ void iommu_device_unuse_dma_api(struct device *dev); >> int iommu_group_set_dma_owner(struct iommu_group *group, void *owner); >> void iommu_group_release_dma_owner(struct iommu_group *group); >> bool iommu_group_dma_owner_claimed(struct iommu_group *group); >> +int iommu_group_replace_domain(struct iommu_group *group, >> + struct iommu_domain *old, >> + struct iommu_domain *new); >> >> #else /* CONFIG_IOMMU_API */ >> >> @@ -1090,6 +1093,13 @@ static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group) >> { >> return false; >> } >> + >> +static inline int >> +iommu_group_replace_domain(struct iommu_group *group, struct iommu_domain *old, >> + struct iommu_domain *new) >> +{ >> + return -ENODEV; >> +} >> #endif /* CONFIG_IOMMU_API */ >> >> /** >> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c >> index 72a95dea688e..ab8ab95969f5 100644 >> +++ b/drivers/iommu/iommu.c >> @@ -3431,3 +3431,40 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group) >> return user; >> } >> EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed); >> + >> +/** >> + * iommu_group_replace_domain() - Replace group's domain >> + * @group: The group. >> + * @old: The previous attached domain. NULL for none. >> + * @new: The new domain about to be attached. >> + * >> + * This is to support backward compatibility for vfio which manages the dma >> + * ownership in iommu_group level. > > This should mention it can only be used with iommu_group_set_dma_owner() Sure. > >> + if (old) >> + __iommu_detach_group(old, group); >> + >> + if (new) { >> + ret = __iommu_attach_group(new, group); >> + if (ret && old) >> + __iommu_attach_group(old, group); >> + } > > The sketchy error unwind here gives me some pause for sure. Maybe we > should define that on error this leaves the domain as NULL > > Complicates vfio a tiny bit to cope with this failure but seems > cleaner than leaving it indeterminate. Fair enough. > > Jason > Best regards, baolu
On 2022-01-06 02:20, Lu Baolu wrote: > Expose an interface to replace the domain of an iommu group for frameworks > like vfio which claims the ownership of the whole iommu group. But if the underlying point is the new expectation that iommu_{attach,detach}_device() operate on the device's whole group where relevant, why should we invent some special mechanism for VFIO to be needlessly inconsistent? I said before that it's trivial for VFIO to resolve a suitable device if it needs to; by now I've actually written the patch ;) https://gitlab.arm.com/linux-arm/linux-rm/-/commit/9f37d8c17c9b606abc96e1f1001c0b97c8b93ed5 Robin. > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> > --- > include/linux/iommu.h | 10 ++++++++++ > drivers/iommu/iommu.c | 37 +++++++++++++++++++++++++++++++++++++ > 2 files changed, 47 insertions(+) > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h > index 408a6d2b3034..66ebce3d1e11 100644 > --- a/include/linux/iommu.h > +++ b/include/linux/iommu.h > @@ -677,6 +677,9 @@ void iommu_device_unuse_dma_api(struct device *dev); > int iommu_group_set_dma_owner(struct iommu_group *group, void *owner); > void iommu_group_release_dma_owner(struct iommu_group *group); > bool iommu_group_dma_owner_claimed(struct iommu_group *group); > +int iommu_group_replace_domain(struct iommu_group *group, > + struct iommu_domain *old, > + struct iommu_domain *new); > > #else /* CONFIG_IOMMU_API */ > > @@ -1090,6 +1093,13 @@ static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group) > { > return false; > } > + > +static inline int > +iommu_group_replace_domain(struct iommu_group *group, struct iommu_domain *old, > + struct iommu_domain *new) > +{ > + return -ENODEV; > +} > #endif /* CONFIG_IOMMU_API */ > > /** > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 72a95dea688e..ab8ab95969f5 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -3431,3 +3431,40 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group) > return user; > } > EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed); > + > +/** > + * iommu_group_replace_domain() - Replace group's domain > + * @group: The group. > + * @old: The previous attached domain. NULL for none. > + * @new: The new domain about to be attached. > + * > + * This is to support backward compatibility for vfio which manages the dma > + * ownership in iommu_group level. > + */ > +int iommu_group_replace_domain(struct iommu_group *group, > + struct iommu_domain *old, > + struct iommu_domain *new) > +{ > + int ret = 0; > + > + mutex_lock(&group->mutex); > + if (!group->owner || group->domain != old) { > + ret = -EPERM; > + goto unlock_out; > + } > + > + if (old) > + __iommu_detach_group(old, group); > + > + if (new) { > + ret = __iommu_attach_group(new, group); > + if (ret && old) > + __iommu_attach_group(old, group); > + } > + > +unlock_out: > + mutex_unlock(&group->mutex); > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(iommu_group_replace_domain);
On Mon, Feb 14, 2022 at 12:09:36PM +0000, Robin Murphy wrote: > On 2022-01-06 02:20, Lu Baolu wrote: > > Expose an interface to replace the domain of an iommu group for frameworks > > like vfio which claims the ownership of the whole iommu group. > > But if the underlying point is the new expectation that > iommu_{attach,detach}_device() operate on the device's whole group where > relevant, why should we invent some special mechanism for VFIO to be > needlessly inconsistent? > > I said before that it's trivial for VFIO to resolve a suitable device if it > needs to; by now I've actually written the patch ;) > > https://gitlab.arm.com/linux-arm/linux-rm/-/commit/9f37d8c17c9b606abc96e1f1001c0b97c8b93ed5 Er, how does locking work there? What keeps busdev from being concurrently unplugged? How can iommu_group_get() be safely called on this pointer? All of the above only works normally inside a probe/remove context where the driver core is blocking concurrent unplug and descruction. I think I said this last time you brought it up that lifetime was the challenge with this idea. Jason
On 2022-02-14 12:45, Jason Gunthorpe wrote: > On Mon, Feb 14, 2022 at 12:09:36PM +0000, Robin Murphy wrote: >> On 2022-01-06 02:20, Lu Baolu wrote: >>> Expose an interface to replace the domain of an iommu group for frameworks >>> like vfio which claims the ownership of the whole iommu group. >> >> But if the underlying point is the new expectation that >> iommu_{attach,detach}_device() operate on the device's whole group where >> relevant, why should we invent some special mechanism for VFIO to be >> needlessly inconsistent? >> >> I said before that it's trivial for VFIO to resolve a suitable device if it >> needs to; by now I've actually written the patch ;) >> >> https://gitlab.arm.com/linux-arm/linux-rm/-/commit/9f37d8c17c9b606abc96e1f1001c0b97c8b93ed5 > > Er, how does locking work there? What keeps busdev from being > concurrently unplugged? Same thing that prevents the bus pointer from suddenly becoming invalid in the current code, I guess :) But yes, holding a group reference alone can't prevent the group itself from changing, and the finer points of locking still need working out - there's a reason you got a link to a WIP branch in my tree rather than a proper patch in your inbox (TBH at the moment that one represents about a 5:1 ratio of time spent on the reasoning behind the commit message vs. the implementation itself). > How can iommu_group_get() be safely called on > this pointer? VFIO hardly needs to retrieve the iommu_group from a device which it derived from the iommu_group it holds in the first place. What matters is being able to call *other* device-based IOMMU API interfaces in the long term. And once a robust solution for that is in place, it should inevitably work for a device-based attach interface too. > All of the above only works normally inside a probe/remove context > where the driver core is blocking concurrent unplug and descruction. > > I think I said this last time you brought it up that lifetime was the > challenge with this idea. Indeed, but it's a challenge that needs tackling, because the bus-based interfaces need to go away. So either we figure it out now and let this attach interface rework benefit immediately, or I spend three times as long solving it on my own and end up deleting iommu_group_replace_domain() in about 6 months' time anyway. Thanks, Robin.
On Mon, Feb 14, 2022 at 02:10:19PM +0000, Robin Murphy wrote: > On 2022-02-14 12:45, Jason Gunthorpe wrote: > > On Mon, Feb 14, 2022 at 12:09:36PM +0000, Robin Murphy wrote: > > > On 2022-01-06 02:20, Lu Baolu wrote: > > > > Expose an interface to replace the domain of an iommu group for frameworks > > > > like vfio which claims the ownership of the whole iommu group. > > > > > > But if the underlying point is the new expectation that > > > iommu_{attach,detach}_device() operate on the device's whole group where > > > relevant, why should we invent some special mechanism for VFIO to be > > > needlessly inconsistent? > > > > > > I said before that it's trivial for VFIO to resolve a suitable device if it > > > needs to; by now I've actually written the patch ;) > > > > > > https://gitlab.arm.com/linux-arm/linux-rm/-/commit/9f37d8c17c9b606abc96e1f1001c0b97c8b93ed5 > > > > Er, how does locking work there? What keeps busdev from being > > concurrently unplugged? > > Same thing that prevents the bus pointer from suddenly becoming invalid in > the current code, I guess :) Oooh, yes, that does look broken now too. :( > > How can iommu_group_get() be safely called on > > this pointer? > > What matters is being able to call *other* device-based IOMMU API > interfaces in the long term. Yes, this is what I mean, those are the ones that call iommu_group_get(). > > All of the above only works normally inside a probe/remove context > > where the driver core is blocking concurrent unplug and descruction. > > > > I think I said this last time you brought it up that lifetime was the > > challenge with this idea. > > Indeed, but it's a challenge that needs tackling, because the bus-based > interfaces need to go away. So either we figure it out now and let this > attach interface rework benefit immediately, or I spend three times as long IMHO your path is easier if you let VFIO stay with the group interface and use something like: domain = iommu_group_alloc_domain(group) Which is what VFIO is trying to accomplish. Since Lu removed the only other user of iommu_group_for_each_dev() it means we can de-export that interface. This works better because the iommu code can hold the internal group while it finds the bus/device and then invokes the driver op. We don't have a lifetime problem anymore under that lock. The remaining VFIO use of bus for iommu_capable() is better done against the domain or the group object, as appropriate. In the bigger picture, VFIO should stop doing 'iommu_group_alloc_domain' by moving the domain alloc to VFIO_GROUP_GET_DEVICE_FD where we have a struct device to use. We've already been experimenting with this for iommufd and the subtle difference in the uapi doesn't seem relevant. > solving it on my own and end up deleting > iommu_group_replace_domain() in about 6 months' time anyway. I expect this API to remain until we figure out a solution to the PPC problem, and come up with an alternative way to change the attached domain on the fly. Jason
On 2022-02-14 14:56, Jason Gunthorpe via iommu wrote: > On Mon, Feb 14, 2022 at 02:10:19PM +0000, Robin Murphy wrote: >> On 2022-02-14 12:45, Jason Gunthorpe wrote: >>> On Mon, Feb 14, 2022 at 12:09:36PM +0000, Robin Murphy wrote: >>>> On 2022-01-06 02:20, Lu Baolu wrote: >>>>> Expose an interface to replace the domain of an iommu group for frameworks >>>>> like vfio which claims the ownership of the whole iommu group. >>>> >>>> But if the underlying point is the new expectation that >>>> iommu_{attach,detach}_device() operate on the device's whole group where >>>> relevant, why should we invent some special mechanism for VFIO to be >>>> needlessly inconsistent? >>>> >>>> I said before that it's trivial for VFIO to resolve a suitable device if it >>>> needs to; by now I've actually written the patch ;) >>>> >>>> https://gitlab.arm.com/linux-arm/linux-rm/-/commit/9f37d8c17c9b606abc96e1f1001c0b97c8b93ed5 >>> >>> Er, how does locking work there? What keeps busdev from being >>> concurrently unplugged? >> >> Same thing that prevents the bus pointer from suddenly becoming invalid in >> the current code, I guess :) > > Oooh, yes, that does look broken now too. :( > >>> How can iommu_group_get() be safely called on >>> this pointer? >> >> What matters is being able to call *other* device-based IOMMU API >> interfaces in the long term. > > Yes, this is what I mean, those are the ones that call > iommu_group_get(). > >>> All of the above only works normally inside a probe/remove context >>> where the driver core is blocking concurrent unplug and descruction. >>> >>> I think I said this last time you brought it up that lifetime was the >>> challenge with this idea. >> >> Indeed, but it's a challenge that needs tackling, because the bus-based >> interfaces need to go away. So either we figure it out now and let this >> attach interface rework benefit immediately, or I spend three times as long > > IMHO your path is easier if you let VFIO stay with the group interface > and use something like: > > domain = iommu_group_alloc_domain(group) > > Which is what VFIO is trying to accomplish. Since Lu removed the only > other user of iommu_group_for_each_dev() it means we can de-export > that interface. > > This works better because the iommu code can hold the internal group > while it finds the bus/device and then invokes the driver op. We don't > have a lifetime problem anymore under that lock. That's certainly one of the cleaner possibilities - per the theme of this thread I'm not hugely keen on proliferating special VFIO-specific versions of IOMMU APIs, but trying to take the dev->mutex might be a bit heavy-handed and risky, and getting at the vfio_group->device_lock a bit fiddly, so if I can't come up with anything nicer or more general it might be a fair compromise. > The remaining VFIO use of bus for iommu_capable() is better done > against the domain or the group object, as appropriate. Indeed, although half the implementations of .capable are nonsense already, so I'm treating that one as a secondary priority for the moment (with an aim to come back afterwards and just try to kill it off as far as possible). RDMA and VFIO shouldn't be a serious concern for the kind of systems with heterogeneous IOMMUs at this point. > In the bigger picture, VFIO should stop doing > 'iommu_group_alloc_domain' by moving the domain alloc to > VFIO_GROUP_GET_DEVICE_FD where we have a struct device to use. > > We've already been experimenting with this for iommufd and the subtle > difference in the uapi doesn't seem relevant. > >> solving it on my own and end up deleting >> iommu_group_replace_domain() in about 6 months' time anyway. > > I expect this API to remain until we figure out a solution to the PPC > problem, and come up with an alternative way to change the attached > domain on the fly. I though PPC wasn't using the IOMMU API at all... or is that the problem? Thanks, Robin.
On Mon, Feb 14, 2022 at 04:38:23PM +0000, Robin Murphy wrote: > > This works better because the iommu code can hold the internal group > > while it finds the bus/device and then invokes the driver op. We don't > > have a lifetime problem anymore under that lock. > > That's certainly one of the cleaner possibilities - per the theme of this > thread I'm not hugely keen on proliferating special VFIO-specific > versions IMHO this is still a net better than VFIO open coding buggy versions as it has done. > of IOMMU APIs, but trying to take the dev->mutex might be a bit heavy-handed > and risky, The container->group lock is held during this code, and the container->group_lock is taken during probe under the dev_mutex. Acquiring the dev_mutex inside the group_lock should not be done. > and getting at the vfio_group->device_lock a bit fiddly, so if I > can't come up with anything nicer or more general it might be a fair > compromise. Actually that doesn't look so bad. A 'vfio allocate domain from group' function that used the above trick looks OK to me right now. If we could move the iommu_capable() test to a domain that would make this pretty nice - getting the bus safely is a bit more of a PITA - I'm much less keen on holding the device_lock for the whole function. > > The remaining VFIO use of bus for iommu_capable() is better done > > against the domain or the group object, as appropriate. > > Indeed, although half the implementations of .capable are nonsense already, > so I'm treating that one as a secondary priority for the moment (with an aim > to come back afterwards and just try to kill it off as far as possible). > RDMA and VFIO shouldn't be a serious concern for the kind of systems with > heterogeneous IOMMUs at this point. Well, lets see: drivers/infiniband/hw/usnic/usnic_uiom.c: if (!iommu_capable(dev->bus, IOMMU_CAP_CACHE_COHERENCY)) { drivers/vhost/vdpa.c: if (!iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY)) These are kind of hacky ways to say "userspace can actually do DMA because we don't need privileged cache flush instructions on this platform". I would love it if these could be moved to some struct device API - I've aruged with Christoph a couple of times we need something like that.. drivers/vfio/vfio_iommu_type1.c: if (iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY)) This is doing the above, and also the no-snoop mess that Intel has mixed in. How to exactly properly expose their special no-snoop behavior is definitely something that should be on the domain. drivers/pci/controller/vmd.c: if (iommu_capable(vmd->dev->dev.bus, IOMMU_CAP_INTR_REMAP) || drivers/vfio/vfio_iommu_type1.c: iommu_capable(bus, IOMMU_CAP_INTR_REMAP); Not sure about VMD, but the VFIO one is a security statement. It could be quite happy as a domain query, or a flag 'require secure MSI interrupts' as input to attach_domain. > > > solving it on my own and end up deleting > > > iommu_group_replace_domain() in about 6 months' time anyway. > > > > I expect this API to remain until we figure out a solution to the PPC > > problem, and come up with an alternative way to change the attached > > domain on the fly. > > I though PPC wasn't using the IOMMU API at all... or is that the problem? It needs it both ways, a way to get all the DMA security properties from Lu's series without currently using an iommu_domain to get them. So the design is to attach a NULL domain for PPC and leave it like that. It is surely eventually fixable to introduce a domain to PPC, I would just prefer we not make anything contingent on actually doing that. :\ Jason
diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 408a6d2b3034..66ebce3d1e11 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -677,6 +677,9 @@ void iommu_device_unuse_dma_api(struct device *dev); int iommu_group_set_dma_owner(struct iommu_group *group, void *owner); void iommu_group_release_dma_owner(struct iommu_group *group); bool iommu_group_dma_owner_claimed(struct iommu_group *group); +int iommu_group_replace_domain(struct iommu_group *group, + struct iommu_domain *old, + struct iommu_domain *new); #else /* CONFIG_IOMMU_API */ @@ -1090,6 +1093,13 @@ static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group) { return false; } + +static inline int +iommu_group_replace_domain(struct iommu_group *group, struct iommu_domain *old, + struct iommu_domain *new) +{ + return -ENODEV; +} #endif /* CONFIG_IOMMU_API */ /** diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 72a95dea688e..ab8ab95969f5 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -3431,3 +3431,40 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group) return user; } EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed); + +/** + * iommu_group_replace_domain() - Replace group's domain + * @group: The group. + * @old: The previous attached domain. NULL for none. + * @new: The new domain about to be attached. + * + * This is to support backward compatibility for vfio which manages the dma + * ownership in iommu_group level. + */ +int iommu_group_replace_domain(struct iommu_group *group, + struct iommu_domain *old, + struct iommu_domain *new) +{ + int ret = 0; + + mutex_lock(&group->mutex); + if (!group->owner || group->domain != old) { + ret = -EPERM; + goto unlock_out; + } + + if (old) + __iommu_detach_group(old, group); + + if (new) { + ret = __iommu_attach_group(new, group); + if (ret && old) + __iommu_attach_group(old, group); + } + +unlock_out: + mutex_unlock(&group->mutex); + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_group_replace_domain);
Expose an interface to replace the domain of an iommu group for frameworks like vfio which claims the ownership of the whole iommu group. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> --- include/linux/iommu.h | 10 ++++++++++ drivers/iommu/iommu.c | 37 +++++++++++++++++++++++++++++++++++++ 2 files changed, 47 insertions(+)