Message ID | 20180518191025.3187.29141.stgit@gimli.home (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 5/19/2018 12:40 AM, Alex Williamson wrote: > When we create an mdev device, we check for duplicates against the > parent device and return -EEXIST if found, but the mdev device > namespace is global since we'll link all devices from the bus. We do > catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > with it comes a kernel warning and stack trace for trying to create > duplicate sysfs links, which makes it an undesirable response. > > Therefore we should really be looking for duplicates across all mdev > parent devices, or as implemented here, against our mdev device list. > Using mdev_list to prevent duplicates means that we can remove > mdev_parent.lock, but in order not to serialize mdev device creation > and removal globally, we add mdev_device.active which allows UUIDs to > be reserved such that we can drop the mdev_list_lock before the mdev > device is fully in place. > > Two behavioral notes; first, mdev_parent.lock had the side-effect of > serializing mdev create and remove ops per parent device. This was > an implementation detail, not an intentional guarantee provided to > the mdev vendor drivers. Vendor drivers can trivially provide this > serialization internally if necessary. Second, review comments note > the new -EAGAIN behavior when the device, and in particular the remove > attribute, becomes visible in sysfs. If a remove is triggered prior > to completion of mdev_device_create() the user will see a -EAGAIN > error. While the errno is different, receiving an error during this > period is not, the previous implementation returned -ENODEV for the > same condition. Furthermore, the consistency to the user is improved > in the case where mdev_device_remove_ops() returns error. Previously > concurrent calls to mdev_device_remove() could see the device > disappear with -ENODEV and return in the case of error. Now a user > would see -EAGAIN while the device is in this transitory state. > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Looks good to me. Reviewed by: Kirti Wankhede <kwankhede@nvidia.com> > --- > Documentation/vfio-mediated-device.txt | 5 ++ > drivers/vfio/mdev/mdev_core.c | 102 +++++++++++--------------------- > drivers/vfio/mdev/mdev_private.h | 2 - > 3 files changed, 42 insertions(+), 67 deletions(-) > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt > index 1b3950346532..c3f69bcaf96e 100644 > --- a/Documentation/vfio-mediated-device.txt > +++ b/Documentation/vfio-mediated-device.txt > @@ -145,6 +145,11 @@ The functions in the mdev_parent_ops structure are as follows: > * create: allocate basic resources in a driver for a mediated device > * remove: free resources in a driver when a mediated device is destroyed > > +(Note that mdev-core provides no implicit serialization of create/remove > +callbacks per mdev parent device, per mdev type, or any other categorization. > +Vendor drivers are expected to be fully asynchronous in this respect or > +provide their own internal resource protection.) > + > The callbacks in the mdev_parent_ops structure are as follows: > > * open: open callback of mediated device > diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c > index 126991046eb7..0212f0ee8aea 100644 > --- a/drivers/vfio/mdev/mdev_core.c > +++ b/drivers/vfio/mdev/mdev_core.c > @@ -66,34 +66,6 @@ uuid_le mdev_uuid(struct mdev_device *mdev) > } > EXPORT_SYMBOL(mdev_uuid); > > -static int _find_mdev_device(struct device *dev, void *data) > -{ > - struct mdev_device *mdev; > - > - if (!dev_is_mdev(dev)) > - return 0; > - > - mdev = to_mdev_device(dev); > - > - if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0) > - return 1; > - > - return 0; > -} > - > -static bool mdev_device_exist(struct mdev_parent *parent, uuid_le uuid) > -{ > - struct device *dev; > - > - dev = device_find_child(parent->dev, &uuid, _find_mdev_device); > - if (dev) { > - put_device(dev); > - return true; > - } > - > - return false; > -} > - > /* Should be called holding parent_list_lock */ > static struct mdev_parent *__find_parent_device(struct device *dev) > { > @@ -221,7 +193,6 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops) > } > > kref_init(&parent->ref); > - mutex_init(&parent->lock); > > parent->dev = dev; > parent->ops = ops; > @@ -297,6 +268,10 @@ static void mdev_device_release(struct device *dev) > { > struct mdev_device *mdev = to_mdev_device(dev); > > + mutex_lock(&mdev_list_lock); > + list_del(&mdev->next); > + mutex_unlock(&mdev_list_lock); > + > dev_dbg(&mdev->dev, "MDEV: destroying\n"); > kfree(mdev); > } > @@ -304,7 +279,7 @@ static void mdev_device_release(struct device *dev) > int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) > { > int ret; > - struct mdev_device *mdev; > + struct mdev_device *mdev, *tmp; > struct mdev_parent *parent; > struct mdev_type *type = to_mdev_type(kobj); > > @@ -312,21 +287,28 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) > if (!parent) > return -EINVAL; > > - mutex_lock(&parent->lock); > + mutex_lock(&mdev_list_lock); > > /* Check for duplicate */ > - if (mdev_device_exist(parent, uuid)) { > - ret = -EEXIST; > - goto create_err; > + list_for_each_entry(tmp, &mdev_list, next) { > + if (!uuid_le_cmp(tmp->uuid, uuid)) { > + mutex_unlock(&mdev_list_lock); > + ret = -EEXIST; > + goto mdev_fail; > + } > } > > mdev = kzalloc(sizeof(*mdev), GFP_KERNEL); > if (!mdev) { > + mutex_unlock(&mdev_list_lock); > ret = -ENOMEM; > - goto create_err; > + goto mdev_fail; > } > > memcpy(&mdev->uuid, &uuid, sizeof(uuid_le)); > + list_add(&mdev->next, &mdev_list); > + mutex_unlock(&mdev_list_lock); > + > mdev->parent = parent; > kref_init(&mdev->ref); > > @@ -338,35 +320,28 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) > ret = device_register(&mdev->dev); > if (ret) { > put_device(&mdev->dev); > - goto create_err; > + goto mdev_fail; > } > > ret = mdev_device_create_ops(kobj, mdev); > if (ret) > - goto create_failed; > + goto create_fail; > > ret = mdev_create_sysfs_files(&mdev->dev, type); > if (ret) { > mdev_device_remove_ops(mdev, true); > - goto create_failed; > + goto create_fail; > } > > mdev->type_kobj = kobj; > + mdev->active = true; > dev_dbg(&mdev->dev, "MDEV: created\n"); > > - mutex_unlock(&parent->lock); > - > - mutex_lock(&mdev_list_lock); > - list_add(&mdev->next, &mdev_list); > - mutex_unlock(&mdev_list_lock); > - > - return ret; > + return 0; > > -create_failed: > +create_fail: > device_unregister(&mdev->dev); > - > -create_err: > - mutex_unlock(&parent->lock); > +mdev_fail: > mdev_put_parent(parent); > return ret; > } > @@ -377,44 +352,39 @@ int mdev_device_remove(struct device *dev, bool force_remove) > struct mdev_parent *parent; > struct mdev_type *type; > int ret; > - bool found = false; > > mdev = to_mdev_device(dev); > > mutex_lock(&mdev_list_lock); > list_for_each_entry(tmp, &mdev_list, next) { > - if (tmp == mdev) { > - found = true; > + if (tmp == mdev) > break; > - } > } > > - if (found) > - list_del(&mdev->next); > + if (tmp != mdev) { > + mutex_unlock(&mdev_list_lock); > + return -ENODEV; > + } > > - mutex_unlock(&mdev_list_lock); > + if (!mdev->active) { > + mutex_unlock(&mdev_list_lock); > + return -EAGAIN; > + } > > - if (!found) > - return -ENODEV; > + mdev->active = false; > + mutex_unlock(&mdev_list_lock); > > type = to_mdev_type(mdev->type_kobj); > parent = mdev->parent; > - mutex_lock(&parent->lock); > > ret = mdev_device_remove_ops(mdev, force_remove); > if (ret) { > - mutex_unlock(&parent->lock); > - > - mutex_lock(&mdev_list_lock); > - list_add(&mdev->next, &mdev_list); > - mutex_unlock(&mdev_list_lock); > - > + mdev->active = true; > return ret; > } > > mdev_remove_sysfs_files(dev, type); > device_unregister(dev); > - mutex_unlock(&parent->lock); > mdev_put_parent(parent); > > return 0; > diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h > index a9cefd70a705..b5819b7d7ef7 100644 > --- a/drivers/vfio/mdev/mdev_private.h > +++ b/drivers/vfio/mdev/mdev_private.h > @@ -20,7 +20,6 @@ struct mdev_parent { > struct device *dev; > const struct mdev_parent_ops *ops; > struct kref ref; > - struct mutex lock; > struct list_head next; > struct kset *mdev_types_kset; > struct list_head type_list; > @@ -34,6 +33,7 @@ struct mdev_device { > struct kref ref; > struct list_head next; > struct kobject *type_kobj; > + bool active; > }; > > #define to_mdev_device(dev) container_of(dev, struct mdev_device, dev) >
On Fri, 18 May 2018 13:10:25 -0600 Alex Williamson <alex.williamson@redhat.com> wrote: > When we create an mdev device, we check for duplicates against the > parent device and return -EEXIST if found, but the mdev device > namespace is global since we'll link all devices from the bus. We do > catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > with it comes a kernel warning and stack trace for trying to create > duplicate sysfs links, which makes it an undesirable response. > > Therefore we should really be looking for duplicates across all mdev > parent devices, or as implemented here, against our mdev device list. > Using mdev_list to prevent duplicates means that we can remove > mdev_parent.lock, but in order not to serialize mdev device creation > and removal globally, we add mdev_device.active which allows UUIDs to > be reserved such that we can drop the mdev_list_lock before the mdev > device is fully in place. > > Two behavioral notes; first, mdev_parent.lock had the side-effect of > serializing mdev create and remove ops per parent device. This was > an implementation detail, not an intentional guarantee provided to > the mdev vendor drivers. Vendor drivers can trivially provide this > serialization internally if necessary. Second, review comments note > the new -EAGAIN behavior when the device, and in particular the remove > attribute, becomes visible in sysfs. If a remove is triggered prior > to completion of mdev_device_create() the user will see a -EAGAIN > error. While the errno is different, receiving an error during this > period is not, the previous implementation returned -ENODEV for the > same condition. Furthermore, the consistency to the user is improved > in the case where mdev_device_remove_ops() returns error. Previously > concurrent calls to mdev_device_remove() could see the device > disappear with -ENODEV and return in the case of error. Now a user > would see -EAGAIN while the device is in this transitory state. > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > --- > Documentation/vfio-mediated-device.txt | 5 ++ > drivers/vfio/mdev/mdev_core.c | 102 +++++++++++--------------------- > drivers/vfio/mdev/mdev_private.h | 2 - > 3 files changed, 42 insertions(+), 67 deletions(-) Reviewed-by: Cornelia Huck <cohuck@redhat.com> I think it is better to deal with any possible vendor driver implications on top of this (I still believe that vfio-ccw is fine).
[Cc +GVT-g maintainers/lists] On Tue, 22 May 2018 10:13:46 +0200 Cornelia Huck <cohuck@redhat.com> wrote: > On Fri, 18 May 2018 13:10:25 -0600 > Alex Williamson <alex.williamson@redhat.com> wrote: > > > When we create an mdev device, we check for duplicates against the > > parent device and return -EEXIST if found, but the mdev device > > namespace is global since we'll link all devices from the bus. We do > > catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > > with it comes a kernel warning and stack trace for trying to create > > duplicate sysfs links, which makes it an undesirable response. > > > > Therefore we should really be looking for duplicates across all mdev > > parent devices, or as implemented here, against our mdev device list. > > Using mdev_list to prevent duplicates means that we can remove > > mdev_parent.lock, but in order not to serialize mdev device creation > > and removal globally, we add mdev_device.active which allows UUIDs to > > be reserved such that we can drop the mdev_list_lock before the mdev > > device is fully in place. > > > > Two behavioral notes; first, mdev_parent.lock had the side-effect of > > serializing mdev create and remove ops per parent device. This was > > an implementation detail, not an intentional guarantee provided to > > the mdev vendor drivers. Vendor drivers can trivially provide this > > serialization internally if necessary. Second, review comments note > > the new -EAGAIN behavior when the device, and in particular the remove > > attribute, becomes visible in sysfs. If a remove is triggered prior > > to completion of mdev_device_create() the user will see a -EAGAIN > > error. While the errno is different, receiving an error during this > > period is not, the previous implementation returned -ENODEV for the > > same condition. Furthermore, the consistency to the user is improved > > in the case where mdev_device_remove_ops() returns error. Previously > > concurrent calls to mdev_device_remove() could see the device > > disappear with -ENODEV and return in the case of error. Now a user > > would see -EAGAIN while the device is in this transitory state. > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > --- > > Documentation/vfio-mediated-device.txt | 5 ++ > > drivers/vfio/mdev/mdev_core.c | 102 +++++++++++--------------------- > > drivers/vfio/mdev/mdev_private.h | 2 - > > 3 files changed, 42 insertions(+), 67 deletions(-) > > Reviewed-by: Cornelia Huck <cohuck@redhat.com> > > I think it is better to deal with any possible vendor driver > implications on top of this (I still believe that vfio-ccw is fine). Thanks Cornelia. So if vfio-ccw is fine, presumably NVIDIA is fine, then this leaves GVT-g to see if there's any fallout. Zhenyu & Zhi, I've linked the series under discussion here below[1]. The question to you is the first of the two behavioral notes listed above, does GVT-g have any dependency on the mdev core providing serialization per mdev parent device for the create and remove callbacks within the mdev_parent_ops? This was never an intended feature of the implementation and as noted it should be trivial for for an mdev vendor driver to provide equivalent course grained serialization if necessary. Of course it would be better to implement that sooner rather than later if required. I see that __intel_gvt_create_vgpu() makes use of gvt->lock, which would seem to already provide this level of per-parent locking. The remove path makes use of this same lock, so I think we're ok, but looking for an explicit ack so there are no surprises. I'd like to queue this series for v4.18. Thanks, Alex [1] https://lkml.org/lkml/2018/5/18/1035
On 2018.05.22 09:53:37 -0600, Alex Williamson wrote: > [Cc +GVT-g maintainers/lists] > > On Tue, 22 May 2018 10:13:46 +0200 > Cornelia Huck <cohuck@redhat.com> wrote: > > > On Fri, 18 May 2018 13:10:25 -0600 > > Alex Williamson <alex.williamson@redhat.com> wrote: > > > > > When we create an mdev device, we check for duplicates against the > > > parent device and return -EEXIST if found, but the mdev device > > > namespace is global since we'll link all devices from the bus. We do > > > catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > > > with it comes a kernel warning and stack trace for trying to create > > > duplicate sysfs links, which makes it an undesirable response. > > > > > > Therefore we should really be looking for duplicates across all mdev > > > parent devices, or as implemented here, against our mdev device list. > > > Using mdev_list to prevent duplicates means that we can remove > > > mdev_parent.lock, but in order not to serialize mdev device creation > > > and removal globally, we add mdev_device.active which allows UUIDs to > > > be reserved such that we can drop the mdev_list_lock before the mdev > > > device is fully in place. > > > > > > Two behavioral notes; first, mdev_parent.lock had the side-effect of > > > serializing mdev create and remove ops per parent device. This was > > > an implementation detail, not an intentional guarantee provided to > > > the mdev vendor drivers. Vendor drivers can trivially provide this > > > serialization internally if necessary. Second, review comments note > > > the new -EAGAIN behavior when the device, and in particular the remove > > > attribute, becomes visible in sysfs. If a remove is triggered prior > > > to completion of mdev_device_create() the user will see a -EAGAIN > > > error. While the errno is different, receiving an error during this > > > period is not, the previous implementation returned -ENODEV for the > > > same condition. Furthermore, the consistency to the user is improved > > > in the case where mdev_device_remove_ops() returns error. Previously > > > concurrent calls to mdev_device_remove() could see the device > > > disappear with -ENODEV and return in the case of error. Now a user > > > would see -EAGAIN while the device is in this transitory state. > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > --- > > > Documentation/vfio-mediated-device.txt | 5 ++ > > > drivers/vfio/mdev/mdev_core.c | 102 +++++++++++--------------------- > > > drivers/vfio/mdev/mdev_private.h | 2 - > > > 3 files changed, 42 insertions(+), 67 deletions(-) > > > > Reviewed-by: Cornelia Huck <cohuck@redhat.com> > > > > I think it is better to deal with any possible vendor driver > > implications on top of this (I still believe that vfio-ccw is fine). > > Thanks Cornelia. So if vfio-ccw is fine, presumably NVIDIA is fine, > then this leaves GVT-g to see if there's any fallout. Zhenyu & Zhi, > I've linked the series under discussion here below[1]. The question to > you is the first of the two behavioral notes listed above, does GVT-g > have any dependency on the mdev core providing serialization per mdev > parent device for the create and remove callbacks within the > mdev_parent_ops? This was never an intended feature of the > implementation and as noted it should be trivial for for an mdev vendor > driver to provide equivalent course grained serialization if > necessary. Of course it would be better to implement that sooner > rather than later if required. > > I see that __intel_gvt_create_vgpu() makes use of gvt->lock, which > would seem to already provide this level of per-parent locking. The > remove path makes use of this same lock, so I think we're ok, but > looking for an explicit ack so there are no surprises. I'd like > to queue this series for v4.18. Thanks, > yeah, we don't expect mdev core for parent serialization for create and remove of mdev device. Series look good to me. Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> > Alex > > [1] https://lkml.org/lkml/2018/5/18/1035
diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt index 1b3950346532..c3f69bcaf96e 100644 --- a/Documentation/vfio-mediated-device.txt +++ b/Documentation/vfio-mediated-device.txt @@ -145,6 +145,11 @@ The functions in the mdev_parent_ops structure are as follows: * create: allocate basic resources in a driver for a mediated device * remove: free resources in a driver when a mediated device is destroyed +(Note that mdev-core provides no implicit serialization of create/remove +callbacks per mdev parent device, per mdev type, or any other categorization. +Vendor drivers are expected to be fully asynchronous in this respect or +provide their own internal resource protection.) + The callbacks in the mdev_parent_ops structure are as follows: * open: open callback of mediated device diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c index 126991046eb7..0212f0ee8aea 100644 --- a/drivers/vfio/mdev/mdev_core.c +++ b/drivers/vfio/mdev/mdev_core.c @@ -66,34 +66,6 @@ uuid_le mdev_uuid(struct mdev_device *mdev) } EXPORT_SYMBOL(mdev_uuid); -static int _find_mdev_device(struct device *dev, void *data) -{ - struct mdev_device *mdev; - - if (!dev_is_mdev(dev)) - return 0; - - mdev = to_mdev_device(dev); - - if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0) - return 1; - - return 0; -} - -static bool mdev_device_exist(struct mdev_parent *parent, uuid_le uuid) -{ - struct device *dev; - - dev = device_find_child(parent->dev, &uuid, _find_mdev_device); - if (dev) { - put_device(dev); - return true; - } - - return false; -} - /* Should be called holding parent_list_lock */ static struct mdev_parent *__find_parent_device(struct device *dev) { @@ -221,7 +193,6 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops) } kref_init(&parent->ref); - mutex_init(&parent->lock); parent->dev = dev; parent->ops = ops; @@ -297,6 +268,10 @@ static void mdev_device_release(struct device *dev) { struct mdev_device *mdev = to_mdev_device(dev); + mutex_lock(&mdev_list_lock); + list_del(&mdev->next); + mutex_unlock(&mdev_list_lock); + dev_dbg(&mdev->dev, "MDEV: destroying\n"); kfree(mdev); } @@ -304,7 +279,7 @@ static void mdev_device_release(struct device *dev) int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) { int ret; - struct mdev_device *mdev; + struct mdev_device *mdev, *tmp; struct mdev_parent *parent; struct mdev_type *type = to_mdev_type(kobj); @@ -312,21 +287,28 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) if (!parent) return -EINVAL; - mutex_lock(&parent->lock); + mutex_lock(&mdev_list_lock); /* Check for duplicate */ - if (mdev_device_exist(parent, uuid)) { - ret = -EEXIST; - goto create_err; + list_for_each_entry(tmp, &mdev_list, next) { + if (!uuid_le_cmp(tmp->uuid, uuid)) { + mutex_unlock(&mdev_list_lock); + ret = -EEXIST; + goto mdev_fail; + } } mdev = kzalloc(sizeof(*mdev), GFP_KERNEL); if (!mdev) { + mutex_unlock(&mdev_list_lock); ret = -ENOMEM; - goto create_err; + goto mdev_fail; } memcpy(&mdev->uuid, &uuid, sizeof(uuid_le)); + list_add(&mdev->next, &mdev_list); + mutex_unlock(&mdev_list_lock); + mdev->parent = parent; kref_init(&mdev->ref); @@ -338,35 +320,28 @@ int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) ret = device_register(&mdev->dev); if (ret) { put_device(&mdev->dev); - goto create_err; + goto mdev_fail; } ret = mdev_device_create_ops(kobj, mdev); if (ret) - goto create_failed; + goto create_fail; ret = mdev_create_sysfs_files(&mdev->dev, type); if (ret) { mdev_device_remove_ops(mdev, true); - goto create_failed; + goto create_fail; } mdev->type_kobj = kobj; + mdev->active = true; dev_dbg(&mdev->dev, "MDEV: created\n"); - mutex_unlock(&parent->lock); - - mutex_lock(&mdev_list_lock); - list_add(&mdev->next, &mdev_list); - mutex_unlock(&mdev_list_lock); - - return ret; + return 0; -create_failed: +create_fail: device_unregister(&mdev->dev); - -create_err: - mutex_unlock(&parent->lock); +mdev_fail: mdev_put_parent(parent); return ret; } @@ -377,44 +352,39 @@ int mdev_device_remove(struct device *dev, bool force_remove) struct mdev_parent *parent; struct mdev_type *type; int ret; - bool found = false; mdev = to_mdev_device(dev); mutex_lock(&mdev_list_lock); list_for_each_entry(tmp, &mdev_list, next) { - if (tmp == mdev) { - found = true; + if (tmp == mdev) break; - } } - if (found) - list_del(&mdev->next); + if (tmp != mdev) { + mutex_unlock(&mdev_list_lock); + return -ENODEV; + } - mutex_unlock(&mdev_list_lock); + if (!mdev->active) { + mutex_unlock(&mdev_list_lock); + return -EAGAIN; + } - if (!found) - return -ENODEV; + mdev->active = false; + mutex_unlock(&mdev_list_lock); type = to_mdev_type(mdev->type_kobj); parent = mdev->parent; - mutex_lock(&parent->lock); ret = mdev_device_remove_ops(mdev, force_remove); if (ret) { - mutex_unlock(&parent->lock); - - mutex_lock(&mdev_list_lock); - list_add(&mdev->next, &mdev_list); - mutex_unlock(&mdev_list_lock); - + mdev->active = true; return ret; } mdev_remove_sysfs_files(dev, type); device_unregister(dev); - mutex_unlock(&parent->lock); mdev_put_parent(parent); return 0; diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h index a9cefd70a705..b5819b7d7ef7 100644 --- a/drivers/vfio/mdev/mdev_private.h +++ b/drivers/vfio/mdev/mdev_private.h @@ -20,7 +20,6 @@ struct mdev_parent { struct device *dev; const struct mdev_parent_ops *ops; struct kref ref; - struct mutex lock; struct list_head next; struct kset *mdev_types_kset; struct list_head type_list; @@ -34,6 +33,7 @@ struct mdev_device { struct kref ref; struct list_head next; struct kobject *type_kobj; + bool active; }; #define to_mdev_device(dev) container_of(dev, struct mdev_device, dev)
When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> --- Documentation/vfio-mediated-device.txt | 5 ++ drivers/vfio/mdev/mdev_core.c | 102 +++++++++++--------------------- drivers/vfio/mdev/mdev_private.h | 2 - 3 files changed, 42 insertions(+), 67 deletions(-)