Message ID | f91cc278a173a95969af16c46442f18b639d4ea9.1729897352.git.nicolinc@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Series | iommufd: Add vIOMMU infrastructure (Part-1) |
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, October 26, 2024 7:50 AM
>
> For an iommu_dev that can unplug (so far only this selftest does so), the
> viommu->iommu_dev pointer has no guarantee of its life cycle after it is
> copied from the idev->dev->iommu->iommu_dev.
>
> Track the user count of the iommu_dev. Postpone the exit routine using a
> completion, if refcount is unbalanced. The refcount inc/dec will be added
> in the following patch.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

On Fri, Oct 25, 2024 at 04:49:49PM -0700, Nicolin Chen wrote:
> For an iommu_dev that can unplug (so far only this selftest does so), the
> viommu->iommu_dev pointer has no guarantee of its life cycle after it is
> copied from the idev->dev->iommu->iommu_dev.
>
> Track the user count of the iommu_dev. Postpone the exit routine using a
> completion, if refcount is unbalanced. The refcount inc/dec will be added
> in the following patch.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  drivers/iommu/iommufd/selftest.c | 32 ++++++++++++++++++++++++--------
>  1 file changed, 24 insertions(+), 8 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Since this is built into the iommufd module it can't be unloaded
without also unloading iommufd, which is impossible as long as any
iommufd FDs are open. So I expect that the WARN_ON can never happen.

Jason

On Tue, Oct 29, 2024 at 12:34:38PM -0300, Jason Gunthorpe wrote:
> On Fri, Oct 25, 2024 at 04:49:49PM -0700, Nicolin Chen wrote:
> > For an iommu_dev that can unplug (so far only this selftest does so), the
> > viommu->iommu_dev pointer has no guarantee of its life cycle after it is
> > copied from the idev->dev->iommu->iommu_dev.
> >
> > Track the user count of the iommu_dev. Postpone the exit routine using a
> > completion, if refcount is unbalanced. The refcount inc/dec will be added
> > in the following patch.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > ---
> >  drivers/iommu/iommufd/selftest.c | 32 ++++++++++++++++++++++++--------
> >  1 file changed, 24 insertions(+), 8 deletions(-)
>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
>
> Since this is built into the iommufd module it can't be unloaded
> without also unloading iommufd, which is impossible as long as any
> iommufd FDs are open. So I expect that the WARN_ON can never happen.

Hmm, I assume we still need this patch then?
Could a faulty "--force" possibly trigger it?

Nicolin

On Tue, Oct 29, 2024 at 09:02:58AM -0700, Nicolin Chen wrote:
> On Tue, Oct 29, 2024 at 12:34:38PM -0300, Jason Gunthorpe wrote:
> > On Fri, Oct 25, 2024 at 04:49:49PM -0700, Nicolin Chen wrote:
> > > For an iommu_dev that can unplug (so far only this selftest does so), the
> > > viommu->iommu_dev pointer has no guarantee of its life cycle after it is
> > > copied from the idev->dev->iommu->iommu_dev.
> > >
> > > Track the user count of the iommu_dev. Postpone the exit routine using a
> > > completion, if refcount is unbalanced. The refcount inc/dec will be added
> > > in the following patch.
> > >
> > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > > ---
> > >  drivers/iommu/iommufd/selftest.c | 32 ++++++++++++++++++++++++--------
> > >  1 file changed, 24 insertions(+), 8 deletions(-)
> >
> > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> >
> > Since this is built into the iommufd module it can't be unloaded
> > without also unloading iommufd, which is impossible as long as any
> > iommufd FDs are open. So I expect that the WARN_ON can never happen.
>
> Hmm, I assume we still need this patch then?

I was thinking, I think it still is a reasonable example of what it
might look like

You might include the above remark as a comment above the WARN_ON
though.

> Could a faulty "--force" possibly trigger it?

I'm not sure, I suspect not?

Jason

diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 92d753985640..2d33b35da704 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -533,14 +533,17 @@ static bool mock_domain_capable(struct device *dev, enum iommu_cap cap)
 
 static struct iopf_queue *mock_iommu_iopf_queue;
 
-static struct iommu_device mock_iommu_device = {
-};
+static struct mock_iommu_device {
+	struct iommu_device iommu_dev;
+	struct completion complete;
+	refcount_t users;
+} mock_iommu;
 
 static struct iommu_device *mock_probe_device(struct device *dev)
 {
 	if (dev->bus != &iommufd_mock_bus_type.bus)
 		return ERR_PTR(-ENODEV);
-	return &mock_iommu_device;
+	return &mock_iommu.iommu_dev;
 }
 
 static void mock_domain_page_response(struct device *dev, struct iopf_fault *evt,
@@ -1556,24 +1559,27 @@ int __init iommufd_test_init(void)
 	if (rc)
 		goto err_platform;
 
-	rc = iommu_device_sysfs_add(&mock_iommu_device,
+	rc = iommu_device_sysfs_add(&mock_iommu.iommu_dev,
 				    &selftest_iommu_dev->dev, NULL, "%s",
 				    dev_name(&selftest_iommu_dev->dev));
 	if (rc)
 		goto err_bus;
 
-	rc = iommu_device_register_bus(&mock_iommu_device, &mock_ops,
+	rc = iommu_device_register_bus(&mock_iommu.iommu_dev, &mock_ops,
 				  &iommufd_mock_bus_type.bus,
 				  &iommufd_mock_bus_type.nb);
 	if (rc)
 		goto err_sysfs;
 
+	refcount_set(&mock_iommu.users, 1);
+	init_completion(&mock_iommu.complete);
+
 	mock_iommu_iopf_queue = iopf_queue_alloc("mock-iopfq");
 
 	return 0;
 
 err_sysfs:
-	iommu_device_sysfs_remove(&mock_iommu_device);
+	iommu_device_sysfs_remove(&mock_iommu.iommu_dev);
 err_bus:
 	bus_unregister(&iommufd_mock_bus_type.bus);
 err_platform:
@@ -1583,6 +1589,15 @@ int __init iommufd_test_init(void)
 	return rc;
 }
 
+static void iommufd_test_wait_for_users(void)
+{
+	if (refcount_dec_and_test(&mock_iommu.users))
+		return;
+	/* Time out waiting for iommu device user count to become 0 */
+	WARN_ON(!wait_for_completion_timeout(&mock_iommu.complete,
+					     msecs_to_jiffies(10000)));
+}
+
 void iommufd_test_exit(void)
 {
 	if (mock_iommu_iopf_queue) {
@@ -1590,8 +1605,9 @@ void iommufd_test_exit(void)
 		mock_iommu_iopf_queue = NULL;
 	}
 
-	iommu_device_sysfs_remove(&mock_iommu_device);
-	iommu_device_unregister_bus(&mock_iommu_device,
+	iommufd_test_wait_for_users();
+	iommu_device_sysfs_remove(&mock_iommu.iommu_dev);
+	iommu_device_unregister_bus(&mock_iommu.iommu_dev,
 				    &iommufd_mock_bus_type.bus,
 				    &iommufd_mock_bus_type.nb);
 	bus_unregister(&iommufd_mock_bus_type.bus);

For an iommu_dev that can unplug (so far only this selftest does so), the
viommu->iommu_dev pointer has no guarantee of its life cycle after it is
copied from the idev->dev->iommu->iommu_dev.

Track the user count of the iommu_dev. Postpone the exit routine using a
completion, if refcount is unbalanced. The refcount inc/dec will be added
in the following patch.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/iommufd/selftest.c | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)
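
For readers who want to try the "refcount plus completion" pattern outside the kernel, here is a rough user-space sketch. It is not the kernel code and does not use the kernel API: struct mock_dev, mock_dev_get(), mock_dev_put() and mock_dev_wait_for_users() are hypothetical names, a C11 atomic_int stands in for refcount_t, and a mutex, condition variable and flag stand in for struct completion. It also assumes that the put side signals the completion when the count reaches zero; the inc/dec callers are only added in the following patch, so that part is a guess at the shape rather than a copy of it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for the patch's struct mock_iommu_device. */
struct mock_dev {
	atomic_int users;	/* plays the role of refcount_t users */
	pthread_mutex_t lock;	/* lock + cond + done emulate struct completion */
	pthread_cond_t cond;
	bool done;
};

static void mock_dev_init(struct mock_dev *d)
{
	atomic_init(&d->users, 1);	/* initial reference, dropped at exit */
	pthread_mutex_init(&d->lock, NULL);
	pthread_cond_init(&d->cond, NULL);
	d->done = false;
}

/* Take a reference; taking one after the count hit zero is a bug. */
static void mock_dev_get(struct mock_dev *d)
{
	int prev = atomic_fetch_add(&d->users, 1);

	assert(prev > 0);
}

/* Drop a reference; the last put signals the waiter (assumed behaviour). */
static void mock_dev_put(struct mock_dev *d)
{
	if (atomic_fetch_sub(&d->users, 1) == 1) {
		pthread_mutex_lock(&d->lock);
		d->done = true;
		pthread_cond_signal(&d->cond);
		pthread_mutex_unlock(&d->lock);
	}
}

/*
 * Analogue of iommufd_test_wait_for_users(): drop the initial reference and,
 * if other users remain, wait up to timeout_ms for the last put.
 * Returns true if the count reached zero, false on timeout.
 */
static bool mock_dev_wait_for_users(struct mock_dev *d, int timeout_ms)
{
	struct timespec ts;
	bool ok;
	int rc = 0;

	if (atomic_fetch_sub(&d->users, 1) == 1)
		return true;	/* we held the last reference */

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline */
	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_sec += timeout_ms / 1000;
	ts.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
	if (ts.tv_nsec >= 1000000000L) {
		ts.tv_sec++;
		ts.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&d->lock);
	while (!d->done && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&d->cond, &d->lock, &ts);
	ok = d->done;
	pthread_mutex_unlock(&d->lock);
	return ok;
}

/* Example user: holds its reference for about 50 ms, then drops it. */
static void *mock_dev_user_thread(void *arg)
{
	struct timespec nap = { .tv_sec = 0, .tv_nsec = 50 * 1000000L };

	nanosleep(&nap, NULL);
	mock_dev_put(arg);
	return NULL;
}
```

A caller would run mock_dev_init(), take a reference with mock_dev_get() before handing the object to mock_dev_user_thread(), and call mock_dev_wait_for_users() from its teardown path. A false return corresponds to the case where the patch's WARN_ON would fire.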