Message ID | 1553658345-43995-7-git-send-email-parav@mellanox.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | vfio/mdev: Improve vfio/mdev core module | expand |
On Tue, 26 Mar 2019 22:45:44 -0500 Parav Pandit <parav@mellanox.com> wrote: > device_for_each_child() stops executing callback function for remaining > child devices, if callback hits an error. > Each child mdev device is independent of each other. > While unregistering parent device, mdev core must remove all child mdev > devices. > Therefore, mdev_device_remove_cb() always returns success so that s/always returns/must always return/ ? > device_for_each_child doesn't abort if one child removal hits error. > > While at it, improve remove and unregister functions for below simplicity. > > There isn't need to pass forced flag pointer during mdev parent > removal which invokes mdev_device_remove(). So simplify the flow. > > mdev_device_remove() is called from two paths. > 1. mdev_unregister_driver() > mdev_device_remove_cb() > mdev_device_remove() > 2. remove_store() > mdev_device_remove() > > When device is removed by user using remote_store(), device under > removal is mdev device. > When device is removed during parent device removal using generic child > iterator, mdev check is already done using dev_is_mdev(). Isn't there still a possible race condition (which you seem to address with the following patch)? IOW, you cannot remove that loop-under-mutex yet? > > Hence, remove the unnecessary loop in mdev_device_remove(). > > Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> > Signed-off-by: Parav Pandit <parav@mellanox.com> > --- > drivers/vfio/mdev/mdev_core.c | 23 +++++------------------ > 1 file changed, 5 insertions(+), 18 deletions(-) > > diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c > index 836d319..aefcf34 100644 > --- a/drivers/vfio/mdev/mdev_core.c > +++ b/drivers/vfio/mdev/mdev_core.c > @@ -149,10 +149,10 @@ static int mdev_device_remove_ops(struct mdev_device *mdev, bool force_remove) > Maybe add /* only called during parent device unregistration */ to avoid headscratching in the future? > static int mdev_device_remove_cb(struct device *dev, void *data) > { > - if (!dev_is_mdev(dev)) > - return 0; > + if (dev_is_mdev(dev)) > + mdev_device_remove(dev, true); > > - return mdev_device_remove(dev, data ? *(bool *)data : true); > + return 0; > } > > /* > @@ -240,7 +240,6 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops) > void mdev_unregister_device(struct device *dev) > { > struct mdev_parent *parent; > - bool force_remove = true; > > mutex_lock(&parent_list_lock); > parent = __find_parent_device(dev); > @@ -254,8 +253,7 @@ void mdev_unregister_device(struct device *dev) > list_del(&parent->next); > class_compat_remove_link(mdev_bus_compat_class, dev, NULL); > > - device_for_each_child(dev, (void *)&force_remove, > - mdev_device_remove_cb); > + device_for_each_child(dev, NULL, mdev_device_remove_cb); > > parent_remove_sysfs_files(parent); > Up to this chunk, the patch looks good to me. > @@ -348,24 +346,13 @@ int mdev_device_create(struct kobject *kobj, > > int mdev_device_remove(struct device *dev, bool force_remove) > { > - struct mdev_device *mdev, *tmp; > + struct mdev_device *mdev; > struct mdev_parent *parent; > struct mdev_type *type; > int ret; > > mdev = to_mdev_device(dev); > - > mutex_lock(&mdev_list_lock); > - list_for_each_entry(tmp, &mdev_list, next) { > - if (tmp == mdev) > - break; > - } > - > - if (tmp != mdev) { > - mutex_unlock(&mdev_list_lock); > - return -ENODEV; > - } > - > if (!mdev->active) { > mutex_unlock(&mdev_list_lock); > return -EAGAIN;
> -----Original Message----- > From: Cornelia Huck <cohuck@redhat.com> > Sent: Monday, April 1, 2019 12:39 PM > To: Parav Pandit <parav@mellanox.com> > Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > kwankhede@nvidia.com; alex.williamson@redhat.com; cjia@nvidia.com > Subject: Re: [PATCHv1 6/7] vfio/mdev: Fix aborting mdev child device > removal if one fails > > On Tue, 26 Mar 2019 22:45:44 -0500 > Parav Pandit <parav@mellanox.com> wrote: > > > device_for_each_child() stops executing callback function for > > remaining child devices, if callback hits an error. > > Each child mdev device is independent of each other. > > While unregistering parent device, mdev core must remove all child > > mdev devices. > > Therefore, mdev_device_remove_cb() always returns success so that > > s/always returns/must always return/ ? > Must always return. :-) > > device_for_each_child doesn't abort if one child removal hits error. > > > > While at it, improve remove and unregister functions for below simplicity. > > > > There isn't need to pass forced flag pointer during mdev parent > > removal which invokes mdev_device_remove(). So simplify the flow. > > > > mdev_device_remove() is called from two paths. > > 1. mdev_unregister_driver() > > mdev_device_remove_cb() > > mdev_device_remove() > > 2. remove_store() > > mdev_device_remove() > > > > When device is removed by user using remote_store(), device under > > removal is mdev device. > > When device is removed during parent device removal using generic > > child iterator, mdev check is already done using dev_is_mdev(). > > Isn't there still a possible race condition (which you seem to address with > the following patch)? IOW, you cannot remove that loop-under-mutex yet? The loop checks if the remove() is called on the mdev or not. This is already checked from both the paths from remove is invoked. I didn't remove the 'active' check. So it should be fine. > > > > Hence, remove the unnecessary loop in mdev_device_remove(). > > > > Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") > > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > --- > > drivers/vfio/mdev/mdev_core.c | 23 +++++------------------ > > 1 file changed, 5 insertions(+), 18 deletions(-) > > > > diff --git a/drivers/vfio/mdev/mdev_core.c > > b/drivers/vfio/mdev/mdev_core.c index 836d319..aefcf34 100644 > > --- a/drivers/vfio/mdev/mdev_core.c > > +++ b/drivers/vfio/mdev/mdev_core.c > > @@ -149,10 +149,10 @@ static int mdev_device_remove_ops(struct > > mdev_device *mdev, bool force_remove) > > > > Maybe add > > /* only called during parent device unregistration */ > > to avoid headscratching in the future? > > > static int mdev_device_remove_cb(struct device *dev, void *data) { > > - if (!dev_is_mdev(dev)) > > - return 0; > > + if (dev_is_mdev(dev)) > > + mdev_device_remove(dev, true); > > > > - return mdev_device_remove(dev, data ? *(bool *)data : true); > > + return 0; > > } > > > > /* > > @@ -240,7 +240,6 @@ int mdev_register_device(struct device *dev, const > > struct mdev_parent_ops *ops) void mdev_unregister_device(struct > > device *dev) { > > struct mdev_parent *parent; > > - bool force_remove = true; > > > > mutex_lock(&parent_list_lock); > > parent = __find_parent_device(dev); > > @@ -254,8 +253,7 @@ void mdev_unregister_device(struct device *dev) > > list_del(&parent->next); > > class_compat_remove_link(mdev_bus_compat_class, dev, NULL); > > > > - device_for_each_child(dev, (void *)&force_remove, > > - mdev_device_remove_cb); > > + device_for_each_child(dev, NULL, mdev_device_remove_cb); > > > > parent_remove_sysfs_files(parent); > > > > Up to this chunk, the patch looks good to me. > > > @@ -348,24 +346,13 @@ int mdev_device_create(struct kobject *kobj, > > > > int mdev_device_remove(struct device *dev, bool force_remove) { > > - struct mdev_device *mdev, *tmp; > > + struct mdev_device *mdev; > > struct mdev_parent *parent; > > struct mdev_type *type; > > int ret; > > > > mdev = to_mdev_device(dev); > > - > > mutex_lock(&mdev_list_lock); > > - list_for_each_entry(tmp, &mdev_list, next) { > > - if (tmp == mdev) > > - break; > > - } > > - > > - if (tmp != mdev) { > > - mutex_unlock(&mdev_list_lock); > > - return -ENODEV; > > - } > > - > > if (!mdev->active) { > > mutex_unlock(&mdev_list_lock); > > return -EAGAIN;
On Tue, 2 Apr 2019 19:59:58 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Cornelia Huck <cohuck@redhat.com> > > Sent: Monday, April 1, 2019 12:39 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > > kwankhede@nvidia.com; alex.williamson@redhat.com; cjia@nvidia.com > > Subject: Re: [PATCHv1 6/7] vfio/mdev: Fix aborting mdev child device > > removal if one fails > > > > On Tue, 26 Mar 2019 22:45:44 -0500 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > device_for_each_child() stops executing callback function for > > > remaining child devices, if callback hits an error. > > > Each child mdev device is independent of each other. > > > While unregistering parent device, mdev core must remove all child > > > mdev devices. > > > Therefore, mdev_device_remove_cb() always returns success so that > > > > s/always returns/must always return/ ? > > > Must always return. > :-) > > > > device_for_each_child doesn't abort if one child removal hits error. > > > > > > While at it, improve remove and unregister functions for below simplicity. > > > > > > There isn't need to pass forced flag pointer during mdev parent > > > removal which invokes mdev_device_remove(). So simplify the flow. > > > > > > mdev_device_remove() is called from two paths. > > > 1. mdev_unregister_driver() > > > mdev_device_remove_cb() > > > mdev_device_remove() > > > 2. remove_store() > > > mdev_device_remove() > > > > > > When device is removed by user using remote_store(), device under > > > removal is mdev device. > > > When device is removed during parent device removal using generic > > > child iterator, mdev check is already done using dev_is_mdev(). > > > > Isn't there still a possible race condition (which you seem to address with > > the following patch)? IOW, you cannot remove that loop-under-mutex yet? > > The loop checks if the remove() is called on the mdev or not. > This is already checked from both the paths from remove is invoked. > I didn't remove the 'active' check. So it should be fine. I believe the loop was actually trying to sanitize the mdev pointer, for example if it's not in our list of devices we should not even de-reference 'active'. I think maybe this was more fallout from allowing remove to fail. For instance, it seems like manipulating active within the list lock critical section should provide us with mutual exclusion, the mdev object should be valid until the sysfs remove attribute is removed, but remove_store() itself removes that attribute allowing mdev_remove_sysfs_files() to skip over it, but mdev_remove_device() can fail on the remove_store() path causing it to recreate the remove attribute. Now we're in trouble because I'm not sure if recreating the sysfs attribute ever takes a reference to the device. If it does, it's at least racy. Is it time to put the nail in the coffin of these remove failure paths? It seems too fundamental to our code base that drivers cannot do this. Thanks, Alex > > > > > > Hence, remove the unnecessary loop in mdev_device_remove(). > > > > > > Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") > > > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> > > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > > --- > > > drivers/vfio/mdev/mdev_core.c | 23 +++++------------------ > > > 1 file changed, 5 insertions(+), 18 deletions(-) > > > > > > diff --git a/drivers/vfio/mdev/mdev_core.c > > > b/drivers/vfio/mdev/mdev_core.c index 836d319..aefcf34 100644 > > > --- a/drivers/vfio/mdev/mdev_core.c > > > +++ b/drivers/vfio/mdev/mdev_core.c > > > @@ -149,10 +149,10 @@ static int mdev_device_remove_ops(struct > > > mdev_device *mdev, bool force_remove) > > > > > > > Maybe add > > > > /* only called during parent device unregistration */ > > > > to avoid headscratching in the future? > > > > > static int mdev_device_remove_cb(struct device *dev, void *data) { > > > - if (!dev_is_mdev(dev)) > > > - return 0; > > > + if (dev_is_mdev(dev)) > > > + mdev_device_remove(dev, true); > > > > > > - return mdev_device_remove(dev, data ? *(bool *)data : true); > > > + return 0; > > > } > > > > > > /* > > > @@ -240,7 +240,6 @@ int mdev_register_device(struct device *dev, const > > > struct mdev_parent_ops *ops) void mdev_unregister_device(struct > > > device *dev) { > > > struct mdev_parent *parent; > > > - bool force_remove = true; > > > > > > mutex_lock(&parent_list_lock); > > > parent = __find_parent_device(dev); > > > @@ -254,8 +253,7 @@ void mdev_unregister_device(struct device *dev) > > > list_del(&parent->next); > > > class_compat_remove_link(mdev_bus_compat_class, dev, NULL); > > > > > > - device_for_each_child(dev, (void *)&force_remove, > > > - mdev_device_remove_cb); > > > + device_for_each_child(dev, NULL, mdev_device_remove_cb); > > > > > > parent_remove_sysfs_files(parent); > > > > > > > Up to this chunk, the patch looks good to me. > > > > > @@ -348,24 +346,13 @@ int mdev_device_create(struct kobject *kobj, > > > > > > int mdev_device_remove(struct device *dev, bool force_remove) { > > > - struct mdev_device *mdev, *tmp; > > > + struct mdev_device *mdev; > > > struct mdev_parent *parent; > > > struct mdev_type *type; > > > int ret; > > > > > > mdev = to_mdev_device(dev); > > > - > > > mutex_lock(&mdev_list_lock); > > > - list_for_each_entry(tmp, &mdev_list, next) { > > > - if (tmp == mdev) > > > - break; > > > - } > > > - > > > - if (tmp != mdev) { > > > - mutex_unlock(&mdev_list_lock); > > > - return -ENODEV; > > > - } > > > - > > > if (!mdev->active) { > > > mutex_unlock(&mdev_list_lock); > > > return -EAGAIN; >
> -----Original Message----- > From: Alex Williamson <alex.williamson@redhat.com> > Sent: Tuesday, April 2, 2019 5:33 PM > To: Parav Pandit <parav@mellanox.com> > Cc: Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux- > kernel@vger.kernel.org; kwankhede@nvidia.com; cjia@nvidia.com > Subject: Re: [PATCHv1 6/7] vfio/mdev: Fix aborting mdev child device > removal if one fails > > On Tue, 2 Apr 2019 19:59:58 +0000 > Parav Pandit <parav@mellanox.com> wrote: > > > > -----Original Message----- > > > From: Cornelia Huck <cohuck@redhat.com> > > > Sent: Monday, April 1, 2019 12:39 PM > > > To: Parav Pandit <parav@mellanox.com> > > > Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > > > kwankhede@nvidia.com; alex.williamson@redhat.com; cjia@nvidia.com > > > Subject: Re: [PATCHv1 6/7] vfio/mdev: Fix aborting mdev child device > > > removal if one fails > > > > > > On Tue, 26 Mar 2019 22:45:44 -0500 > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > device_for_each_child() stops executing callback function for > > > > remaining child devices, if callback hits an error. > > > > Each child mdev device is independent of each other. > > > > While unregistering parent device, mdev core must remove all child > > > > mdev devices. > > > > Therefore, mdev_device_remove_cb() always returns success so that > > > > > > s/always returns/must always return/ ? > > > > > Must always return. > > :-) > > > > > > device_for_each_child doesn't abort if one child removal hits error. > > > > > > > > While at it, improve remove and unregister functions for below > simplicity. > > > > > > > > There isn't need to pass forced flag pointer during mdev parent > > > > removal which invokes mdev_device_remove(). So simplify the flow. > > > > > > > > mdev_device_remove() is called from two paths. > > > > 1. mdev_unregister_driver() > > > > mdev_device_remove_cb() > > > > mdev_device_remove() > > > > 2. remove_store() > > > > mdev_device_remove() > > > > > > > > When device is removed by user using remote_store(), device under > > > > removal is mdev device. > > > > When device is removed during parent device removal using generic > > > > child iterator, mdev check is already done using dev_is_mdev(). > > > > > > Isn't there still a possible race condition (which you seem to > > > address with the following patch)? IOW, you cannot remove that loop- > under-mutex yet? > > > > The loop checks if the remove() is called on the mdev or not. > > This is already checked from both the paths from remove is invoked. > > I didn't remove the 'active' check. So it should be fine. > > I believe the loop was actually trying to sanitize the mdev pointer, for > example if it's not in our list of devices we should not even de-reference > 'active'. I think maybe this was more fallout from allowing remove to fail. > For instance, it seems like manipulating active within the list lock critical > section should provide us with mutual exclusion, the mdev object should be > valid until the sysfs remove attribute is removed, but remove_store() itself > removes that attribute allowing mdev_remove_sysfs_files() to skip over it, > but > mdev_remove_device() can fail on the remove_store() path causing it to > recreate the remove attribute. Now we're in trouble because I'm not sure if > recreating the sysfs attribute ever takes a reference to the device. If it does, > it's at least racy. Is it time to put the nail in the coffin of these remove > failure paths? It seems too fundamental to our code base that drivers > cannot do this. Thanks, > Yes, I agree. We should follow the right remove/create sequence. + we need this for power management too anyway. There is no point in re-inventing the device model differently. If this series looks fine/merged, I can send v1 of the patch that fixes the callback order. Or you want to update this series? I haven't had chance to go through other email thread yet. > Alex > > > > > > > > > Hence, remove the unnecessary loop in mdev_device_remove(). > > > > > > > > Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") > > > > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> > > > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > > > --- > > > > drivers/vfio/mdev/mdev_core.c | 23 +++++------------------ > > > > 1 file changed, 5 insertions(+), 18 deletions(-) > > > > > > > > diff --git a/drivers/vfio/mdev/mdev_core.c > > > > b/drivers/vfio/mdev/mdev_core.c index 836d319..aefcf34 100644 > > > > --- a/drivers/vfio/mdev/mdev_core.c > > > > +++ b/drivers/vfio/mdev/mdev_core.c > > > > @@ -149,10 +149,10 @@ static int mdev_device_remove_ops(struct > > > > mdev_device *mdev, bool force_remove) > > > > > > > > > > Maybe add > > > > > > /* only called during parent device unregistration */ > > > > > > to avoid headscratching in the future? > > > > > > > static int mdev_device_remove_cb(struct device *dev, void *data) { > > > > - if (!dev_is_mdev(dev)) > > > > - return 0; > > > > + if (dev_is_mdev(dev)) > > > > + mdev_device_remove(dev, true); > > > > > > > > - return mdev_device_remove(dev, data ? *(bool *)data : true); > > > > + return 0; > > > > } > > > > > > > > /* > > > > @@ -240,7 +240,6 @@ int mdev_register_device(struct device *dev, > > > > const struct mdev_parent_ops *ops) void > > > > mdev_unregister_device(struct device *dev) { > > > > struct mdev_parent *parent; > > > > - bool force_remove = true; > > > > > > > > mutex_lock(&parent_list_lock); > > > > parent = __find_parent_device(dev); @@ -254,8 +253,7 @@ void > > > > mdev_unregister_device(struct device *dev) > > > > list_del(&parent->next); > > > > class_compat_remove_link(mdev_bus_compat_class, dev, NULL); > > > > > > > > - device_for_each_child(dev, (void *)&force_remove, > > > > - mdev_device_remove_cb); > > > > + device_for_each_child(dev, NULL, mdev_device_remove_cb); > > > > > > > > parent_remove_sysfs_files(parent); > > > > > > > > > > Up to this chunk, the patch looks good to me. > > > > > > > @@ -348,24 +346,13 @@ int mdev_device_create(struct kobject *kobj, > > > > > > > > int mdev_device_remove(struct device *dev, bool force_remove) { > > > > - struct mdev_device *mdev, *tmp; > > > > + struct mdev_device *mdev; > > > > struct mdev_parent *parent; > > > > struct mdev_type *type; > > > > int ret; > > > > > > > > mdev = to_mdev_device(dev); > > > > - > > > > mutex_lock(&mdev_list_lock); > > > > - list_for_each_entry(tmp, &mdev_list, next) { > > > > - if (tmp == mdev) > > > > - break; > > > > - } > > > > - > > > > - if (tmp != mdev) { > > > > - mutex_unlock(&mdev_list_lock); > > > > - return -ENODEV; > > > > - } > > > > - > > > > if (!mdev->active) { > > > > mutex_unlock(&mdev_list_lock); > > > > return -EAGAIN; > >
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c index 836d319..aefcf34 100644 --- a/drivers/vfio/mdev/mdev_core.c +++ b/drivers/vfio/mdev/mdev_core.c @@ -149,10 +149,10 @@ static int mdev_device_remove_ops(struct mdev_device *mdev, bool force_remove) static int mdev_device_remove_cb(struct device *dev, void *data) { - if (!dev_is_mdev(dev)) - return 0; + if (dev_is_mdev(dev)) + mdev_device_remove(dev, true); - return mdev_device_remove(dev, data ? *(bool *)data : true); + return 0; } /* @@ -240,7 +240,6 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops) void mdev_unregister_device(struct device *dev) { struct mdev_parent *parent; - bool force_remove = true; mutex_lock(&parent_list_lock); parent = __find_parent_device(dev); @@ -254,8 +253,7 @@ void mdev_unregister_device(struct device *dev) list_del(&parent->next); class_compat_remove_link(mdev_bus_compat_class, dev, NULL); - device_for_each_child(dev, (void *)&force_remove, - mdev_device_remove_cb); + device_for_each_child(dev, NULL, mdev_device_remove_cb); parent_remove_sysfs_files(parent); @@ -348,24 +346,13 @@ int mdev_device_create(struct kobject *kobj, int mdev_device_remove(struct device *dev, bool force_remove) { - struct mdev_device *mdev, *tmp; + struct mdev_device *mdev; struct mdev_parent *parent; struct mdev_type *type; int ret; mdev = to_mdev_device(dev); - mutex_lock(&mdev_list_lock); - list_for_each_entry(tmp, &mdev_list, next) { - if (tmp == mdev) - break; - } - - if (tmp != mdev) { - mutex_unlock(&mdev_list_lock); - return -ENODEV; - } - if (!mdev->active) { mutex_unlock(&mdev_list_lock); return -EAGAIN;