Message ID | 1230b0d2ba99ad546d72ab079e76cb1b3df32afb.1613216412.git.matti.vaittinen@fi.rohmeurope.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Add managed version of delayed work init | expand |
On Sat, Feb 13, 2021 at 01:58:44PM +0200, Matti Vaittinen wrote: > A few drivers which need a delayed work-queue must cancel work at exit. > Some of those implement remove solely for this purpose. Help drivers > to avoid unnecessary remove and error-branch implementation by adding > managed verision of delayed work initialization > > Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> That's not a good idea. As this would kick in when the device is removed from the system, not when it is unbound from the driver, right? > --- > drivers/base/devres.c | 33 +++++++++++++++++++++++++++++++++ > include/linux/device.h | 5 +++++ > 2 files changed, 38 insertions(+) > > diff --git a/drivers/base/devres.c b/drivers/base/devres.c > index fb9d5289a620..2879595bb5a4 100644 > --- a/drivers/base/devres.c > +++ b/drivers/base/devres.c > @@ -1231,3 +1231,36 @@ void devm_free_percpu(struct device *dev, void __percpu *pdata) > (void *)pdata)); > } > EXPORT_SYMBOL_GPL(devm_free_percpu); > + > +static void dev_delayed_work_drop(struct device *dev, void *res) > +{ > + cancel_delayed_work_sync(*(struct delayed_work **)res); > +} > + > +/** > + * devm_delayed_work_autocancel - Resource-managed work allocation > + * @dev: Device which lifetime work is bound to > + * @pdata: work to be cancelled when device exits > + * > + * Initialize work which is automatically cancelled when device exits. There is no such thing in the driver model as "when device exits". Please use the proper terminology as I do not understand what you think this is doing here... > + * A few drivers need delayed work which must be cancelled before driver > + * is unload to avoid accessing removed resources. > + * devm_delayed_work_autocancel() can be used to omit the explicit > + * cancelleation when driver is unload. > + */ > +int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, > + void (*worker)(struct work_struct *work)) > +{ > + struct delayed_work **ptr; > + > + ptr = devres_alloc(dev_delayed_work_drop, sizeof(*ptr), GFP_KERNEL); > + if (!ptr) > + return -ENOMEM; > + > + INIT_DELAYED_WORK(w, worker); > + *ptr = w; > + devres_add(dev, ptr); > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(devm_delayed_work_autocancel); > diff --git a/include/linux/device.h b/include/linux/device.h > index 1779f90eeb4c..192456198de7 100644 > --- a/include/linux/device.h > +++ b/include/linux/device.h > @@ -27,6 +27,7 @@ > #include <linux/uidgid.h> > #include <linux/gfp.h> > #include <linux/overflow.h> > +#include <linux/workqueue.h> > #include <linux/device/bus.h> > #include <linux/device/class.h> > #include <linux/device/driver.h> > @@ -249,6 +250,10 @@ void __iomem *devm_of_iomap(struct device *dev, > struct device_node *node, int index, > resource_size_t *size); > > +/* delayed work which is cancelled when driver exits */ Not when the "driver exits". There is two different lifespans here (well 3). Code and data*2. Don't confuse them as that will just cause lots of problems. The move toward more and more "devm" functions is not the way to go as they just more and more make things easier to get wrong. APIs should be impossible to get wrong, this one is going to be almost impossible to get right. thanks, greg k-h
On Sat, 2021-02-13 at 13:16 +0100, Greg Kroah-Hartman wrote: > On Sat, Feb 13, 2021 at 01:58:44PM +0200, Matti Vaittinen wrote: > > A few drivers which need a delayed work-queue must cancel work at > > exit. > > Some of those implement remove solely for this purpose. Help > > drivers > > to avoid unnecessary remove and error-branch implementation by > > adding > > managed verision of delayed work initialization > > > > Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> > > That's not a good idea. As this would kick in when the device is > removed from the system, not when it is unbound from the driver, > right? > > > --- > > drivers/base/devres.c | 33 +++++++++++++++++++++++++++++++++ > > include/linux/device.h | 5 +++++ > > 2 files changed, 38 insertions(+) > > > > diff --git a/drivers/base/devres.c b/drivers/base/devres.c > > index fb9d5289a620..2879595bb5a4 100644 > > --- a/drivers/base/devres.c > > +++ b/drivers/base/devres.c > > @@ -1231,3 +1231,36 @@ void devm_free_percpu(struct device *dev, > > void __percpu *pdata) > > (void *)pdata)); > > } > > EXPORT_SYMBOL_GPL(devm_free_percpu); > > + > > +static void dev_delayed_work_drop(struct device *dev, void *res) > > +{ > > + cancel_delayed_work_sync(*(struct delayed_work **)res); > > +} > > + > > +/** > > + * devm_delayed_work_autocancel - Resource-managed work allocation > > + * @dev: Device which lifetime work is bound to > > + * @pdata: work to be cancelled when device exits > > + * > > + * Initialize work which is automatically cancelled when device > > exits. > > There is no such thing in the driver model as "when device exits". > Please use the proper terminology as I do not understand what you > think > this is doing here... > > > + * A few drivers need delayed work which must be cancelled before > > driver > > + * is unload to avoid accessing removed resources. > > + * devm_delayed_work_autocancel() can be used to omit the explicit > > + * cancelleation when driver is unload. > > + */ > > +int devm_delayed_work_autocancel(struct device *dev, struct > > delayed_work *w, > > + void (*worker)(struct work_struct > > *work)) > > +{ > > + struct delayed_work **ptr; > > + > > + ptr = devres_alloc(dev_delayed_work_drop, sizeof(*ptr), > > GFP_KERNEL); > > + if (!ptr) > > + return -ENOMEM; > > + > > + INIT_DELAYED_WORK(w, worker); > > + *ptr = w; > > + devres_add(dev, ptr); > > + > > + return 0; > > +} > > +EXPORT_SYMBOL_GPL(devm_delayed_work_autocancel); > > diff --git a/include/linux/device.h b/include/linux/device.h > > index 1779f90eeb4c..192456198de7 100644 > > --- a/include/linux/device.h > > +++ b/include/linux/device.h > > @@ -27,6 +27,7 @@ > > #include <linux/uidgid.h> > > #include <linux/gfp.h> > > #include <linux/overflow.h> > > +#include <linux/workqueue.h> > > #include <linux/device/bus.h> > > #include <linux/device/class.h> > > #include <linux/device/driver.h> > > @@ -249,6 +250,10 @@ void __iomem *devm_of_iomap(struct device > > *dev, > > struct device_node *node, int index, > > resource_size_t *size); > > > > +/* delayed work which is cancelled when driver exits */ > > Not when the "driver exits". > > There is two different lifespans here (well 3). Code and > data*2. Don't > confuse them as that will just cause lots of problems. > > The move toward more and more "devm" functions is not the way to go > as > they just more and more make things easier to get wrong. > > APIs should be impossible to get wrong, this one is going to be > almost > impossible to get right. Thanks for prompt reply. I guess I must've misunderstood some of this concept. Frankly to say, I don't understand how the devm based irq management works and this does not. Maybe I'd better study this further then. Best Regards Matti Vaittinen
On Sat, Feb 13, 2021 at 12:26:59PM +0000, Vaittinen, Matti wrote: > > On Sat, 2021-02-13 at 13:16 +0100, Greg Kroah-Hartman wrote: > > On Sat, Feb 13, 2021 at 01:58:44PM +0200, Matti Vaittinen wrote: > > > A few drivers which need a delayed work-queue must cancel work at > > > exit. > > > Some of those implement remove solely for this purpose. Help > > > drivers > > > to avoid unnecessary remove and error-branch implementation by > > > adding > > > managed verision of delayed work initialization > > > > > > Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> > > > > That's not a good idea. As this would kick in when the device is > > removed from the system, not when it is unbound from the driver, > > right? > > > > > --- > > > drivers/base/devres.c | 33 +++++++++++++++++++++++++++++++++ > > > include/linux/device.h | 5 +++++ > > > 2 files changed, 38 insertions(+) > > > > > > diff --git a/drivers/base/devres.c b/drivers/base/devres.c > > > index fb9d5289a620..2879595bb5a4 100644 > > > --- a/drivers/base/devres.c > > > +++ b/drivers/base/devres.c > > > @@ -1231,3 +1231,36 @@ void devm_free_percpu(struct device *dev, > > > void __percpu *pdata) > > > (void *)pdata)); > > > } > > > EXPORT_SYMBOL_GPL(devm_free_percpu); > > > + > > > +static void dev_delayed_work_drop(struct device *dev, void *res) > > > +{ > > > + cancel_delayed_work_sync(*(struct delayed_work **)res); > > > +} > > > + > > > +/** > > > + * devm_delayed_work_autocancel - Resource-managed work allocation > > > + * @dev: Device which lifetime work is bound to > > > + * @pdata: work to be cancelled when device exits > > > + * > > > + * Initialize work which is automatically cancelled when device > > > exits. > > > > There is no such thing in the driver model as "when device exits". > > Please use the proper terminology as I do not understand what you > > think > > this is doing here... > > > > > + * A few drivers need delayed work which must be cancelled before > > > driver > > > + * is unload to avoid accessing removed resources. > > > + * devm_delayed_work_autocancel() can be used to omit the explicit > > > + * cancelleation when driver is unload. > > > + */ > > > +int devm_delayed_work_autocancel(struct device *dev, struct > > > delayed_work *w, > > > + void (*worker)(struct work_struct > > > *work)) > > > +{ > > > + struct delayed_work **ptr; > > > + > > > + ptr = devres_alloc(dev_delayed_work_drop, sizeof(*ptr), > > > GFP_KERNEL); > > > + if (!ptr) > > > + return -ENOMEM; > > > + > > > + INIT_DELAYED_WORK(w, worker); > > > + *ptr = w; > > > + devres_add(dev, ptr); > > > + > > > + return 0; > > > +} > > > +EXPORT_SYMBOL_GPL(devm_delayed_work_autocancel); > > > diff --git a/include/linux/device.h b/include/linux/device.h > > > index 1779f90eeb4c..192456198de7 100644 > > > --- a/include/linux/device.h > > > +++ b/include/linux/device.h > > > @@ -27,6 +27,7 @@ > > > #include <linux/uidgid.h> > > > #include <linux/gfp.h> > > > #include <linux/overflow.h> > > > +#include <linux/workqueue.h> > > > #include <linux/device/bus.h> > > > #include <linux/device/class.h> > > > #include <linux/device/driver.h> > > > @@ -249,6 +250,10 @@ void __iomem *devm_of_iomap(struct device > > > *dev, > > > struct device_node *node, int index, > > > resource_size_t *size); > > > > > > +/* delayed work which is cancelled when driver exits */ > > > > Not when the "driver exits". > > > > There is two different lifespans here (well 3). Code and > > data*2. Don't > > confuse them as that will just cause lots of problems. > > > > The move toward more and more "devm" functions is not the way to go > > as > > they just more and more make things easier to get wrong. > > > > APIs should be impossible to get wrong, this one is going to be > > almost > > impossible to get right. > > Thanks for prompt reply. I guess I must've misunderstood some of this > concept. Frankly to say, I don't understand how the devm based irq > management works and this does not. Maybe I'd better study this further > then. The devm based irq management works horribly and should not be used at all. That is the main offender in the "an api that is impossible to use correctly" category. Honestly, I think it should just be removed as there is almost no real need for it that I can determine, other than to cause bugs. thanks, greg k-h
Hi, On 2/13/21 1:16 PM, Greg Kroah-Hartman wrote: > On Sat, Feb 13, 2021 at 01:58:44PM +0200, Matti Vaittinen wrote: >> A few drivers which need a delayed work-queue must cancel work at exit. >> Some of those implement remove solely for this purpose. Help drivers >> to avoid unnecessary remove and error-branch implementation by adding >> managed verision of delayed work initialization >> >> Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> > > That's not a good idea. As this would kick in when the device is > removed from the system, not when it is unbound from the driver, right? Erm, no devm managed resources get released when the driver is detached: drivers/base/dd.c: __device_release_driver() calls devres_release_all(dev); > >> --- >> drivers/base/devres.c | 33 +++++++++++++++++++++++++++++++++ >> include/linux/device.h | 5 +++++ >> 2 files changed, 38 insertions(+) >> >> diff --git a/drivers/base/devres.c b/drivers/base/devres.c >> index fb9d5289a620..2879595bb5a4 100644 >> --- a/drivers/base/devres.c >> +++ b/drivers/base/devres.c >> @@ -1231,3 +1231,36 @@ void devm_free_percpu(struct device *dev, void __percpu *pdata) >> (void *)pdata)); >> } >> EXPORT_SYMBOL_GPL(devm_free_percpu); >> + >> +static void dev_delayed_work_drop(struct device *dev, void *res) >> +{ >> + cancel_delayed_work_sync(*(struct delayed_work **)res); >> +} >> + >> +/** >> + * devm_delayed_work_autocancel - Resource-managed work allocation >> + * @dev: Device which lifetime work is bound to >> + * @pdata: work to be cancelled when device exits >> + * >> + * Initialize work which is automatically cancelled when device exits. > > There is no such thing in the driver model as "when device exits". > Please use the proper terminology as I do not understand what you think > this is doing here... I agree that this needs better wording I always talk about driver-unbinding because sysfs has /sys/bus/*/drivers/*/bind and /sys/bus/*/drivers/*/unbind attributes. But I see that the relevant driver-core functions all call it driver detaching, so lets be consistent and use that here too. > >> + * A few drivers need delayed work which must be cancelled before driver >> + * is unload to avoid accessing removed resources. >> + * devm_delayed_work_autocancel() can be used to omit the explicit >> + * cancelleation when driver is unload. >> + */ >> +int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, >> + void (*worker)(struct work_struct *work)) >> +{ >> + struct delayed_work **ptr; >> + >> + ptr = devres_alloc(dev_delayed_work_drop, sizeof(*ptr), GFP_KERNEL); >> + if (!ptr) >> + return -ENOMEM; >> + >> + INIT_DELAYED_WORK(w, worker); >> + *ptr = w; >> + devres_add(dev, ptr); >> + >> + return 0; >> +} >> +EXPORT_SYMBOL_GPL(devm_delayed_work_autocancel); >> diff --git a/include/linux/device.h b/include/linux/device.h >> index 1779f90eeb4c..192456198de7 100644 >> --- a/include/linux/device.h >> +++ b/include/linux/device.h >> @@ -27,6 +27,7 @@ >> #include <linux/uidgid.h> >> #include <linux/gfp.h> >> #include <linux/overflow.h> >> +#include <linux/workqueue.h> >> #include <linux/device/bus.h> >> #include <linux/device/class.h> >> #include <linux/device/driver.h> >> @@ -249,6 +250,10 @@ void __iomem *devm_of_iomap(struct device *dev, >> struct device_node *node, int index, >> resource_size_t *size); >> >> +/* delayed work which is cancelled when driver exits */ > > Not when the "driver exits". Right this should be detached not exits. > There is two different lifespans here (well 3). Code and data*2. Don't > confuse them as that will just cause lots of problems. > > The move toward more and more "devm" functions is not the way to go as > they just more and more make things easier to get wrong. > > APIs should be impossible to get wrong, this one is going to be almost > impossible to get right. I have to disagree here devm generally makes it easier to get things right, it is when some devm functions are missing and devm and non devm resources are mixed that things get tricky. Lets look for example at the drivers/extcon/extcon-intel-int3496.c code from patch 2/7 from this set. The removed driver-remove function looks like this: -static int int3496_remove(struct platform_device *pdev) -{ - struct int3496_data *data = platform_get_drvdata(pdev); - - devm_free_irq(&pdev->dev, data->usb_id_irq, data); - cancel_delayed_work_sync(&data->work); - - return 0; -} - This is a good example where the mix of devm and non devm (the workqueue) resources makes things tricky. The IRQ must be freed first to avoid the work potentially getting re-queued after the sync cancel. In this case using devm for the IRQ may cause the driver author to forget about this, leaving a race. Bit with the new proposed devm_delayed_work_autocancel() function things will just work. This work gets queued by the IRQ handler, so the work must be initialized (1) *before* devm_request_irq() gets called. Any different order would be a bug in the probe function since then the IRQ might run before the work is initialized. Since devm unrolls / releases resources in reverse order, this means that it will automatically free the IRQ (which was requested later) before cancelling the work. So by switching to the new devm_delayed_work_autocancel() function we avoid a case where a driver author can cause a race on driver detach because it is relying on devm to free the IRQ, which may cause it to requeue a just cancelled work. IOW introducing this function (and using it where appropriate) actually removes a possible class of bugs. patch 2/7 actually has a nice example of this, drivers/extcon/extcon-gpio.c also uses a delayed work queued by an interrupt, together with devm managing the interrupt, yet the removed driver_remove callback: -static int gpio_extcon_remove(struct platform_device *pdev) -{ - struct gpio_extcon_data *data = platform_get_drvdata(pdev); - - cancel_delayed_work_sync(&data->work); - - return 0; -} - Is missing the explicit free on the IRQ which is necessary to avoid the race. One the one hand this illustrates your (Greg's) argument that devm managed IRQs may be a bad idea. OTOH it shows that if we have devm managed IRQs anyways that then also having devm managed autocancel works is a good idea, since this RFC patch-set not only results in some cleanup, but is actually fixing at least 1 driver detach race condition. Regards, Hans 1) devm_delayed_work_autocancel() replaces INIT_DELAYED_WORK()
On Sat, Feb 13, 2021 at 02:18:06PM +0100, Hans de Goede wrote: > Hi, > > On 2/13/21 1:16 PM, Greg Kroah-Hartman wrote: > > On Sat, Feb 13, 2021 at 01:58:44PM +0200, Matti Vaittinen wrote: > >> A few drivers which need a delayed work-queue must cancel work at exit. > >> Some of those implement remove solely for this purpose. Help drivers > >> to avoid unnecessary remove and error-branch implementation by adding > >> managed verision of delayed work initialization > >> > >> Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> > > > > That's not a good idea. As this would kick in when the device is > > removed from the system, not when it is unbound from the driver, right? > > Erm, no devm managed resources get released when the driver is detached: > drivers/base/dd.c: __device_release_driver() calls devres_release_all(dev); Then why do you have to manually call devm_free_irq() in release callbacks? I thought that was the primary problem with those things. I can understand devm_ calls handling resources, but callbacks and workqueues feels like a big stretch. > > There is two different lifespans here (well 3). Code and data*2. Don't > > confuse them as that will just cause lots of problems. > > > > The move toward more and more "devm" functions is not the way to go as > > they just more and more make things easier to get wrong. > > > > APIs should be impossible to get wrong, this one is going to be almost > > impossible to get right. > > I have to disagree here devm generally makes it easier to get things right, > it is when some devm functions are missing and devm and non devm resources > are mixed that things get tricky. > > Lets look for example at the drivers/extcon/extcon-intel-int3496.c code > from patch 2/7 from this set. The removed driver-remove function looks like > this: > > -static int int3496_remove(struct platform_device *pdev) > -{ > - struct int3496_data *data = platform_get_drvdata(pdev); > - > - devm_free_irq(&pdev->dev, data->usb_id_irq, data); > - cancel_delayed_work_sync(&data->work); > - > - return 0; > -} > - > > This is a good example where the mix of devm and non devm (the workqueue) > resources makes things tricky. The IRQ must be freed first to avoid the > work potentially getting re-queued after the sync cancel. > > In this case using devm for the IRQ may cause the driver author to forget > about this, leaving a race. > > Bit with the new proposed devm_delayed_work_autocancel() function things > will just work. > > This work gets queued by the IRQ handler, so the work must be initialized (1) > *before* devm_request_irq() gets called. Any different order would be a > bug in the probe function since then the IRQ might run before the work > is initialized. How are we now going to audit the order of these calls to ensure that this is done correctly? That still feels like it is ripe for bugs in a much easier way than without these functions. > Since devm unrolls / releases resources in reverse order, this means that > it will automatically free the IRQ (which was requested later) before > cancelling the work. > > So by switching to the new devm_delayed_work_autocancel() function we avoid > a case where a driver author can cause a race on driver detach because it is > relying on devm to free the IRQ, which may cause it to requeue a just > cancelled work. > > IOW introducing this function (and using it where appropriate) actually > removes a possible class of bugs. > > patch 2/7 actually has a nice example of this, drivers/extcon/extcon-gpio.c > also uses a delayed work queued by an interrupt, together with devm managing > the interrupt, yet the removed driver_remove callback: > > -static int gpio_extcon_remove(struct platform_device *pdev) > -{ > - struct gpio_extcon_data *data = platform_get_drvdata(pdev); > - > - cancel_delayed_work_sync(&data->work); > - > - return 0; > -} > - > > Is missing the explicit free on the IRQ which is necessary to avoid > the race. One the one hand this illustrates your (Greg's) argument that > devm managed IRQs may be a bad idea. I still think it is :) > OTOH it shows that if we have devm managed IRQs anyways that then also > having devm managed autocancel works is a good idea, since this RFC patch-set > not only results in some cleanup, but is actually fixing at least 1 driver > detach race condition. Fixing bugs is good, but the abstraction away from resource management that the devm_ calls cause is worrying as the "magic" behind them can be wrong, as seen here. thanks, greg k-h
Hi, On 2/13/21 2:33 PM, Greg Kroah-Hartman wrote: > On Sat, Feb 13, 2021 at 02:18:06PM +0100, Hans de Goede wrote: >> Hi, >> >> On 2/13/21 1:16 PM, Greg Kroah-Hartman wrote: >>> On Sat, Feb 13, 2021 at 01:58:44PM +0200, Matti Vaittinen wrote: >>>> A few drivers which need a delayed work-queue must cancel work at exit. >>>> Some of those implement remove solely for this purpose. Help drivers >>>> to avoid unnecessary remove and error-branch implementation by adding >>>> managed verision of delayed work initialization >>>> >>>> Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> >>> >>> That's not a good idea. As this would kick in when the device is >>> removed from the system, not when it is unbound from the driver, right? >> >> Erm, no devm managed resources get released when the driver is detached: >> drivers/base/dd.c: __device_release_driver() calls devres_release_all(dev); > > Then why do you have to manually call devm_free_irq() in release > callbacks? That is only necessary when there are none devm managed resources / resource-ish things which get manually released in the remove function (because they are not devm managed), while the IRQ handler depends on these. > I thought that was the primary problem with those things. If all resources / resource-ish things used by the IRQ handler are devm managed then there is no need to have a devm_free_irq() call in a drivers release (which is called remove) callback. > I can understand devm_ calls handling resources, but callbacks and > workqueues feels like a big stretch. Things get hairy wrt devm resource handling when some of the resources are not managed by devm. Having devm managed sync-cancel of work items is IMHO actually good to have because it helps avoiding mixing of devm managed resources with non devm managed resources. Some drivers are actually already doing devm managed workqueue cancellation by using the generic devm_add_action() helper which allows calling a driver supplied cleanup function when devres_release_all() runs. This is also useful to make sure the workqueue is cancelled at the right time during tear-down on error-exits from probe(), which also runs devres_release_all(). See drivers/power/supply/axp288_charger.c for an example of this. Note that no-one is going to force people to use this new devm_delayed_work_autocancel() functionality. IMHO it is a useful tool to have in our toolbox. > >>> There is two different lifespans here (well 3). Code and data*2. Don't >>> confuse them as that will just cause lots of problems. >>> >>> The move toward more and more "devm" functions is not the way to go as >>> they just more and more make things easier to get wrong. >>> >>> APIs should be impossible to get wrong, this one is going to be almost >>> impossible to get right. >> >> I have to disagree here devm generally makes it easier to get things right, >> it is when some devm functions are missing and devm and non devm resources >> are mixed that things get tricky. >> >> Lets look for example at the drivers/extcon/extcon-intel-int3496.c code >> from patch 2/7 from this set. The removed driver-remove function looks like >> this: >> >> -static int int3496_remove(struct platform_device *pdev) >> -{ >> - struct int3496_data *data = platform_get_drvdata(pdev); >> - >> - devm_free_irq(&pdev->dev, data->usb_id_irq, data); >> - cancel_delayed_work_sync(&data->work); >> - >> - return 0; >> -} >> - >> >> This is a good example where the mix of devm and non devm (the workqueue) >> resources makes things tricky. The IRQ must be freed first to avoid the >> work potentially getting re-queued after the sync cancel. >> >> In this case using devm for the IRQ may cause the driver author to forget >> about this, leaving a race. >> >> Bit with the new proposed devm_delayed_work_autocancel() function things >> will just work. >> >> This work gets queued by the IRQ handler, so the work must be initialized (1) >> *before* devm_request_irq() gets called. Any different order would be a >> bug in the probe function since then the IRQ might run before the work >> is initialized. > > How are we now going to audit the order of these calls to ensure that > this is done correctly? That still feels like it is ripe for bugs in a > much easier way than without these functions. We already need to audit probe() functions to make sure that all resources which the IRQ handlers need are allocated before the IRQ gets requested. This is why requesting the IRQ is usually one of the last things which driver's probe functions do. Using devm (and only devm without mixing and matching) avoids to have to _also_ audit the probe-error-exit teardown and driver-release paths. Those will automatically be done in the right order since devm will release everything in reverse order of allocation. So this reduces the amount of auditing work we need to do. In my experience the probe-error-exit teardown and driver-release paths are full of bugs, bugs which are often never caught because most drivers bind once and then stay bound until reboot / shutdown so this paths are often not exercised during testing. So automating this teardown is a good thing to do, it removes a whole class of bugs. >> Since devm unrolls / releases resources in reverse order, this means that >> it will automatically free the IRQ (which was requested later) before >> cancelling the work. >> >> So by switching to the new devm_delayed_work_autocancel() function we avoid >> a case where a driver author can cause a race on driver detach because it is >> relying on devm to free the IRQ, which may cause it to requeue a just >> cancelled work. >> >> IOW introducing this function (and using it where appropriate) actually >> removes a possible class of bugs. >> >> patch 2/7 actually has a nice example of this, drivers/extcon/extcon-gpio.c >> also uses a delayed work queued by an interrupt, together with devm managing >> the interrupt, yet the removed driver_remove callback: >> >> -static int gpio_extcon_remove(struct platform_device *pdev) >> -{ >> - struct gpio_extcon_data *data = platform_get_drvdata(pdev); >> - >> - cancel_delayed_work_sync(&data->work); >> - >> - return 0; >> -} >> - >> >> Is missing the explicit free on the IRQ which is necessary to avoid >> the race. One the one hand this illustrates your (Greg's) argument that >> devm managed IRQs may be a bad idea. > > I still think it is :) I understand and I'm not under the illusion that I'm going to change your mind about this :) >> OTOH it shows that if we have devm managed IRQs anyways that then also >> having devm managed autocancel works is a good idea, since this RFC patch-set >> not only results in some cleanup, but is actually fixing at least 1 driver >> detach race condition. > > Fixing bugs is good, but the abstraction away from resource management > that the devm_ calls cause is worrying as the "magic" behind them can be > wrong, as seen here. Here being the gpio_extcon_remove() example ? Yes that is bad, but again that is due to mixing manual and devm managed resource management. Having this new devm_delayed_work_autocancel() helper will allow a bunch of drivers to move away from mixing the 2, which is a good thing in my book. As I said above I believe that having devm_delayed_work_autocancel() (1) in our toolbox will be a good thing to have. Driver authors can then choose to use it; or they can choose to not use it if they don't like it. I know that the reason why I did not use it in the drivers/extcon/extcon-intel-int3496.c driver is because it was not available if it had been available then I would definitely have used it, as it avoids the mixing of resource-management styles which that driver is currently doing. And I think that that is what this is ultimately about, there are 2 styles of resource-management: 1. manual 2. devm based And they both have their pros and cons, problems mostly arise when mixing them and adding new devm helpers for commonly used cleanup patterns is a good thing as it helps to get rid of mixing these 2 styles in a single driver. Regards, Hans
Hi, On 2/13/21 3:38 PM, Hans de Goede wrote: > Hi, > > On 2/13/21 2:33 PM, Greg Kroah-Hartman wrote: <snip> > Having this new devm_delayed_work_autocancel() helper will allow a > bunch of drivers to move away from mixing the 2, which is a good thing > in my book. > > As I said above I believe that having devm_delayed_work_autocancel() (1) > in our toolbox will be a good thing to have. Driver authors can then choose > to use it; or they can choose to not use it if they don't like it. > > I know that the reason why I did not use it in the > drivers/extcon/extcon-intel-int3496.c driver is because it was not available > if it had been available then I would definitely have used it, as it avoids the > mixing of resource-management styles which that driver is currently doing. > > And I think that that is what this is ultimately about, there are 2 styles > of resource-management: > > 1. manual > 2. devm based > > And they both have their pros and cons, problems mostly arise when mixing them > and adding new devm helpers for commonly used cleanup patterns is a good thing > as it helps to get rid of mixing these 2 styles in a single driver. I just noticed that I forgot to fill in the (1) footnote above: 1) And we probably will want one for non delayed work items to: devm_work_autocancel(), but lets cross that bridge when we get there. Also when reviewing: "[RFC PATCH 2/7] extconn: Clean-up few drivers by using managed work init" I noticed that the extcon-qcom-spmi-misc.c and extcon-palmas.c follow the same standard pattern of having an IRQ which queues a delayed work and they both miss the devm_free_irq call before the cancel_delayed_work_sync() on driver release. So just patch 2/7 here fixes 3 driver-release race conditions (fixing 3/4 drivers which it touches) as a bonus to the code-cleanup which it does. So as this clearly seems to be fixing a bunch of bugs, by simply completely removing the buggy code driver remove callbacks, this really seems like a good idea to me. Regards, Hans
Hi, On 2/13/21 12:58 PM, Matti Vaittinen wrote: > A few drivers which need a delayed work-queue must cancel work at exit. > Some of those implement remove solely for this purpose. Help drivers > to avoid unnecessary remove and error-branch implementation by adding > managed verision of delayed work initialization > > Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> > --- > drivers/base/devres.c | 33 +++++++++++++++++++++++++++++++++ > include/linux/device.h | 5 +++++ > 2 files changed, 38 insertions(+) > > diff --git a/drivers/base/devres.c b/drivers/base/devres.c > index fb9d5289a620..2879595bb5a4 100644 > --- a/drivers/base/devres.c > +++ b/drivers/base/devres.c > @@ -1231,3 +1231,36 @@ void devm_free_percpu(struct device *dev, void __percpu *pdata) > (void *)pdata)); > } > EXPORT_SYMBOL_GPL(devm_free_percpu); > + > +static void dev_delayed_work_drop(struct device *dev, void *res) > +{ > + cancel_delayed_work_sync(*(struct delayed_work **)res); > +} > + > +/** > + * devm_delayed_work_autocancel - Resource-managed work allocation > + * @dev: Device which lifetime work is bound to > + * @pdata: work to be cancelled when device exits > + * > + * Initialize work which is automatically cancelled when device exits. > + * A few drivers need delayed work which must be cancelled before driver > + * is unload to avoid accessing removed resources. > + * devm_delayed_work_autocancel() can be used to omit the explicit > + * cancelleation when driver is unload. > + */ > +int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, > + void (*worker)(struct work_struct *work)) > +{ > + struct delayed_work **ptr; > + > + ptr = devres_alloc(dev_delayed_work_drop, sizeof(*ptr), GFP_KERNEL); > + if (!ptr) > + return -ENOMEM; > + > + INIT_DELAYED_WORK(w, worker); > + *ptr = w; > + devres_add(dev, ptr); > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(devm_delayed_work_autocancel); This is a bit of a micro-optimization I must admit, but I think you can just make this a static inline using devm_add_action, which would avoid growing the base kernel image and avoid adding yet another symbol to the exported symbols table. I think something like this should work: static int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, void (*worker)(struct work_struct *work)) { INIT_DELAYED_WORK(w, worker); return devm_add_action(dev, (void (*action)(void *))cancel_delayed_work_sync, w); } I'm not sure about the cast, that may need something like this instead: typedef void (*devm_action_func)(void *); static int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, void (*worker)(struct work_struct *work)) { INIT_DELAYED_WORK(w, worker); return devm_add_action(dev, (devm_action_func)cancel_delayed_work_sync, w); } As said this is a bit of a micro-optimization. But if we want to add others too, say one for non delayed works, then at some point in time this is going to start to add up (a bit) wrt symbol-table size and base kernel-image size. And if you go the static inline route, I guess you could add this in include/linux/workqueue.h instead and putting workqueue devm cleanup helpers there seems better then putting random devm cleanup helpers in include/linux/device.h . Regards, Hans > diff --git a/include/linux/device.h b/include/linux/device.h > index 1779f90eeb4c..192456198de7 100644 > --- a/include/linux/device.h > +++ b/include/linux/device.h > @@ -27,6 +27,7 @@ > #include <linux/uidgid.h> > #include <linux/gfp.h> > #include <linux/overflow.h> > +#include <linux/workqueue.h> > #include <linux/device/bus.h> > #include <linux/device/class.h> > #include <linux/device/driver.h> > @@ -249,6 +250,10 @@ void __iomem *devm_of_iomap(struct device *dev, > struct device_node *node, int index, > resource_size_t *size); > > +/* delayed work which is cancelled when driver exits */ > +int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, > + void (*worker)(struct work_struct *work)); > + > /* allows to add/remove a custom action to devres stack */ > int devm_add_action(struct device *dev, void (*action)(void *), void *data); > void devm_remove_action(struct device *dev, void (*action)(void *), void *data); >
On 2/13/21 7:03 AM, Hans de Goede wrote: [ ... ] > > I think something like this should work: > > static int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, > void (*worker)(struct work_struct *work)) { > INIT_DELAYED_WORK(w, worker); > return devm_add_action(dev, (void (*action)(void *))cancel_delayed_work_sync, w); > } > > I'm not sure about the cast, that may need something like this instead: > > typedef void (*devm_action_func)(void *); > > static int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, > void (*worker)(struct work_struct *work)) { > INIT_DELAYED_WORK(w, worker); > return devm_add_action(dev, (devm_action_func)cancel_delayed_work_sync, w); Unfortunately, you can not type cast function pointers in C. It is against the C ABI. I am sure it is done in a few places in the kernel anyway, but those are wrong. This is the reason why many calls to devm_add_action() point to functions such as static void visconti_clk_disable_unprepare(void *data) { clk_disable_unprepare(data); } which could otherwise be handled using typecasts. Guenter
Hi, On 2/13/21 4:27 PM, Guenter Roeck wrote: > On 2/13/21 7:03 AM, Hans de Goede wrote: > [ ... ] >> >> I think something like this should work: >> >> static int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, >> void (*worker)(struct work_struct *work)) { >> INIT_DELAYED_WORK(w, worker); >> return devm_add_action(dev, (void (*action)(void *))cancel_delayed_work_sync, w); >> } >> >> I'm not sure about the cast, that may need something like this instead: >> >> typedef void (*devm_action_func)(void *); >> >> static int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, >> void (*worker)(struct work_struct *work)) { >> INIT_DELAYED_WORK(w, worker); >> return devm_add_action(dev, (devm_action_func)cancel_delayed_work_sync, w); > > Unfortunately, you can not type cast function pointers in C. It is against the C ABI. > I am sure it is done in a few places in the kernel anyway, but those are wrong. I see, bummer. > This is the reason why many calls to devm_add_action() point to functions such as > > static void visconti_clk_disable_unprepare(void *data) > { > clk_disable_unprepare(data); > } > > which could otherwise be handled using typecasts. Hmm, wouldn't something like this be a candidate for adding a: devm_clk_prepare_enable() helper? This seems better then having the driver(s) make + error check separate clk_prepare_enable() + devm_add_action_or_reset() calls ? I must admit I'm guilty myself of just using devm_add_action() sometimes when a specific devm helper is missing, but this whole discussion makes me think that it would be good to have some extra devm helpers for common cases / driver cleanup patterns. If we add a devm_clk_prepare_enable() helper that should probably be added to drivers/clk/clk-devres.c and not to drivers/base/devres.c . I also still wonder if we cannot find a better place for this new devm_delayed_work_autocancel() helper but nothing comes to mind. Regards, Hans
On 2/13/21 7:59 AM, Hans de Goede wrote: > Hi, > > On 2/13/21 4:27 PM, Guenter Roeck wrote: >> On 2/13/21 7:03 AM, Hans de Goede wrote: >> [ ... ] >>> >>> I think something like this should work: >>> >>> static int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, >>> void (*worker)(struct work_struct *work)) { >>> INIT_DELAYED_WORK(w, worker); >>> return devm_add_action(dev, (void (*action)(void *))cancel_delayed_work_sync, w); >>> } >>> >>> I'm not sure about the cast, that may need something like this instead: >>> >>> typedef void (*devm_action_func)(void *); >>> >>> static int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, >>> void (*worker)(struct work_struct *work)) { >>> INIT_DELAYED_WORK(w, worker); >>> return devm_add_action(dev, (devm_action_func)cancel_delayed_work_sync, w); >> >> Unfortunately, you can not type cast function pointers in C. It is against the C ABI. >> I am sure it is done in a few places in the kernel anyway, but those are wrong. > > I see, bummer. > >> This is the reason why many calls to devm_add_action() point to functions such as >> >> static void visconti_clk_disable_unprepare(void *data) >> { >> clk_disable_unprepare(data); >> } >> >> which could otherwise be handled using typecasts. > > Hmm, wouldn't something like this be a candidate for adding a: > > devm_clk_prepare_enable() helper? > > This seems better then having the driver(s) make + error check separate > clk_prepare_enable() + devm_add_action_or_reset() calls ? > I don't really want to go there anymore. The maintainer rejected it several times. Guenter
On Sat, 2021-02-13 at 14:18 +0100, Hans de Goede wrote: > Hi, > > On 2/13/21 1:16 PM, Greg Kroah-Hartman wrote: > > On Sat, Feb 13, 2021 at 01:58:44PM +0200, Matti Vaittinen wrote: > > > +/** > > > + * devm_delayed_work_autocancel - Resource-managed work > > > allocation > > > + * @dev: Device which lifetime work is bound to > > > + * @pdata: work to be cancelled when device exits > > > + * > > > + * Initialize work which is automatically cancelled when device > > > exits. > > > > There is no such thing in the driver model as "when device exits". > > Please use the proper terminology as I do not understand what you > > think > > this is doing here... > > I agree that this needs better wording I always talk about driver- > unbinding > because sysfs has /sys/bus/*/drivers/*/bind and > /sys/bus/*/drivers/*/unbind > attributes. But I see that the relevant driver-core functions all > call it > driver detaching, so lets be consistent and use that here too. //Snip > > > @@ -249,6 +250,10 @@ void __iomem *devm_of_iomap(struct device > > > *dev, > > > struct device_node *node, int index, > > > resource_size_t *size); > > > > > > +/* delayed work which is cancelled when driver exits */ > > > > Not when the "driver exits". > > Right this should be detached not exits. > Thanks guys. I am poor with the terminology so I do appreciate your help in getting this right. I can change this for the v2. > > There is two different lifespans here (well 3). Code and > > data*2. Don't > > confuse them as that will just cause lots of problems. > > > > The move toward more and more "devm" functions is not the way to go > > as > > they just more and more make things easier to get wrong. > > > > APIs should be impossible to get wrong, this one is going to be > > almost > > impossible to get right. > > I have to disagree here devm generally makes it easier to get things > right, > it is when some devm functions are missing and devm and non devm > resources > are mixed that things get tricky. Thanks for the discussion. I hope we can come to some conclusion here. Unsurprisingly I agree with Hans here. I did after all send this patch series :) I guess I am mostly just repeating what he said. As Hans pointed out, when all calls are 'undone' by devm the order of 'undoing' is highly likely to be correct as the unwinding is done in reverse order to initializations. I think it is sane to assume in most case things are initiated in order where operations which depend on something are done last - and when 'unwinding' things those are 'undone' first. My 'gut feeling' for probe / remove related errors is that the most usual errors I've seen have been: a) Tear-down completely forgotten b) Tear-down forgotten at error path c) Wrong order of initiating things (IRQ requested prior resource initialization) d) Wrong order of cleann-up at remove. a) and b) class errors have been the most common ones I've seen. They can be completely avoided when devm is used. c) is there no matter if we use devm or not. d) is mostly avoided when only devm is used - mixing devm and manual operations make this more likely as Hans pointed out. As long as we have some devm operations we should help avoid mixing devm and manual clean-up. Best Regards Matti Vaittinen
On Sat, 2021-02-13 at 16:59 +0100, Hans de Goede wrote: > Hi, > > On 2/13/21 4:27 PM, Guenter Roeck wrote: > > On 2/13/21 7:03 AM, Hans de Goede wrote: > > [ ... ] > > > I think something like this should work: > > > > > > static int devm_delayed_work_autocancel(struct device *dev, > > > struct delayed_work *w, > > > void (*worker)(struct > > > work_struct *work)) { > > > INIT_DELAYED_WORK(w, worker); > > > return devm_add_action(dev, (void (*action)(void > > > *))cancel_delayed_work_sync, w); > > > } > > > > > > I'm not sure about the cast, that may need something like this > > > instead: > > > > > > typedef void (*devm_action_func)(void *); > > > > > > static int devm_delayed_work_autocancel(struct device *dev, > > > struct delayed_work *w, > > > void (*worker)(struct > > > work_struct *work)) { > > > INIT_DELAYED_WORK(w, worker); > > > return devm_add_action(dev, > > > (devm_action_func)cancel_delayed_work_sync, w); > > > > Unfortunately, you can not type cast function pointers in C. It is > > against the C ABI. > > I am sure it is done in a few places in the kernel anyway, but > > those are wrong. > > I see, bummer. I think using devm_add_action() is still a good idea. > > If we add a devm_clk_prepare_enable() helper that should probably be > added > to drivers/clk/clk-devres.c and not to drivers/base/devres.c . > > I also still wonder if we cannot find a better place for this new > devm_delayed_work_autocancel() helper but nothing comes to mind. I don't like the idea of including device.h from workqueue.h - and I think this would be necessary if we added devm_delayed_work_autocancel() as inline in workqueue.h, right? I also see strong objection towards the devm managed clean-ups. How about adding some devm-helpers.c in drivers/base - where we could collect devm-based helpers - and which could be enabled by own CONFIG - and left out by those who dislike it? I know I wrote that the devm_delayed_work_autocancel() does probably not warrant own file - but if you can foresee devm_work_autocancel() and few other generally useful helpers - then we would have a place for those. The devm stuff should in my opinion live under drivers/. Best Regards Matti Vaittinen
Hi, On 2/15/21 8:22 AM, Vaittinen, Matti wrote: > > On Sat, 2021-02-13 at 16:59 +0100, Hans de Goede wrote: >> Hi, >> >> On 2/13/21 4:27 PM, Guenter Roeck wrote: >>> On 2/13/21 7:03 AM, Hans de Goede wrote: >>> [ ... ] >>>> I think something like this should work: >>>> >>>> static int devm_delayed_work_autocancel(struct device *dev, >>>> struct delayed_work *w, >>>> void (*worker)(struct >>>> work_struct *work)) { >>>> INIT_DELAYED_WORK(w, worker); >>>> return devm_add_action(dev, (void (*action)(void >>>> *))cancel_delayed_work_sync, w); >>>> } >>>> >>>> I'm not sure about the cast, that may need something like this >>>> instead: >>>> >>>> typedef void (*devm_action_func)(void *); >>>> >>>> static int devm_delayed_work_autocancel(struct device *dev, >>>> struct delayed_work *w, >>>> void (*worker)(struct >>>> work_struct *work)) { >>>> INIT_DELAYED_WORK(w, worker); >>>> return devm_add_action(dev, >>>> (devm_action_func)cancel_delayed_work_sync, w); >>> >>> Unfortunately, you can not type cast function pointers in C. It is >>> against the C ABI. >>> I am sure it is done in a few places in the kernel anyway, but >>> those are wrong. >> >> I see, bummer. > > I think using devm_add_action() is still a good idea. Yes, we could also just have a 1 line static inline function to do the function-cast. Like this: static inline void devm_delayed_work_autocancel_func(void *work) { cancel_delayed_work_sync(work); } static inline int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, void (*worker)(struct work_struct *work)) { INIT_DELAYED_WORK(w, worker); return devm_add_action(dev, devm_delayed_work_autocancel_func, w); } Both functions will then simply be compiled out in files which do not use them. >> If we add a devm_clk_prepare_enable() helper that should probably be >> added >> to drivers/clk/clk-devres.c and not to drivers/base/devres.c . >> >> I also still wonder if we cannot find a better place for this new >> devm_delayed_work_autocancel() helper but nothing comes to mind. > > I don't like the idea of including device.h from workqueue.h - and I > think this would be necessary if we added > devm_delayed_work_autocancel() as inline in workqueue.h, right? Yes. > I also see strong objection towards the devm managed clean-ups. Yes it seems that there are some people who don't like this, where as others do like them. > How about adding some devm-helpers.c in drivers/base - where we could > collect devm-based helpers - and which could be enabled by own CONFIG - > and left out by those who dislike it? I would make this something configurable through Kconfig, but if go the static inline route, which I'm in favor of then we could just have a: include/linux/devm-cleanup-helpers.h And put everything (including kdoc texts) there. This way the functionality is 100% opt-in (by explicitly including the header if you want the helpers) which hopefully makes this a bit more acceptable to people who don't like this style of cleanups. I would be even happy to act as the upstream maintainer for such a include/linux/devm-cleanup-helpers.h file, I can maintain it as part of the platform-drivers-x86 tree (with its own MAINTAINERS entry). Greg, would this be an acceptable solution to you ? Regards, Hans
On Mon, Feb 15, 2021 at 11:37:26AM +0100, Hans de Goede wrote: > Hi, > > On 2/15/21 8:22 AM, Vaittinen, Matti wrote: > > > > On Sat, 2021-02-13 at 16:59 +0100, Hans de Goede wrote: > >> Hi, > >> > >> On 2/13/21 4:27 PM, Guenter Roeck wrote: > >>> On 2/13/21 7:03 AM, Hans de Goede wrote: > >>> [ ... ] > >>>> I think something like this should work: > >>>> > >>>> static int devm_delayed_work_autocancel(struct device *dev, > >>>> struct delayed_work *w, > >>>> void (*worker)(struct > >>>> work_struct *work)) { > >>>> INIT_DELAYED_WORK(w, worker); > >>>> return devm_add_action(dev, (void (*action)(void > >>>> *))cancel_delayed_work_sync, w); > >>>> } > >>>> > >>>> I'm not sure about the cast, that may need something like this > >>>> instead: > >>>> > >>>> typedef void (*devm_action_func)(void *); > >>>> > >>>> static int devm_delayed_work_autocancel(struct device *dev, > >>>> struct delayed_work *w, > >>>> void (*worker)(struct > >>>> work_struct *work)) { > >>>> INIT_DELAYED_WORK(w, worker); > >>>> return devm_add_action(dev, > >>>> (devm_action_func)cancel_delayed_work_sync, w); > >>> > >>> Unfortunately, you can not type cast function pointers in C. It is > >>> against the C ABI. > >>> I am sure it is done in a few places in the kernel anyway, but > >>> those are wrong. > >> > >> I see, bummer. > > > > I think using devm_add_action() is still a good idea. > > Yes, we could also just have a 1 line static inline function to do > the function-cast. Like this: > > static inline void devm_delayed_work_autocancel_func(void *work) > { > cancel_delayed_work_sync(work); > } > > static inline int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, void (*worker)(struct work_struct *work)) > { > INIT_DELAYED_WORK(w, worker); > return devm_add_action(dev, devm_delayed_work_autocancel_func, w); > } > > Both functions will then simply be compiled out in files which do not > use them. > > >> If we add a devm_clk_prepare_enable() helper that should probably be > >> added > >> to drivers/clk/clk-devres.c and not to drivers/base/devres.c . > >> > >> I also still wonder if we cannot find a better place for this new > >> devm_delayed_work_autocancel() helper but nothing comes to mind. > > > > I don't like the idea of including device.h from workqueue.h - and I > > think this would be necessary if we added > > devm_delayed_work_autocancel() as inline in workqueue.h, right? > > Yes. > > > I also see strong objection towards the devm managed clean-ups. > > Yes it seems that there are some people who don't like this, where as > others do like them. > > > How about adding some devm-helpers.c in drivers/base - where we could > > collect devm-based helpers - and which could be enabled by own CONFIG - > > and left out by those who dislike it? > > I would make this something configurable through Kconfig, but if > go the static inline route, which I'm in favor of then we could just > have a: > > include/linux/devm-cleanup-helpers.h > > And put everything (including kdoc texts) there. > > This way the functionality is 100% opt-in (by explicitly including > the header if you want the helpers) which hopefully makes this a > bit more acceptable to people who don't like this style of cleanups. > > I would be even happy to act as the upstream maintainer for such a > include/linux/devm-cleanup-helpers.h file, I can maintain it as part > of the platform-drivers-x86 tree (with its own MAINTAINERS entry). > > Greg, would this be an acceptable solution to you ? I don't know, sorry, let's revisit this after 5.12-rc1 is out, with a patch set that I can review again, and we can go from there as I can't do anything until then... thanks, greg k-h
Hi, On 2/15/21 12:31 PM, gregkh@linuxfoundation.org wrote: > On Mon, Feb 15, 2021 at 11:37:26AM +0100, Hans de Goede wrote: >> Hi, >> >> On 2/15/21 8:22 AM, Vaittinen, Matti wrote: >>> >>> On Sat, 2021-02-13 at 16:59 +0100, Hans de Goede wrote: >>>> Hi, >>>> >>>> On 2/13/21 4:27 PM, Guenter Roeck wrote: >>>>> On 2/13/21 7:03 AM, Hans de Goede wrote: >>>>> [ ... ] >>>>>> I think something like this should work: >>>>>> >>>>>> static int devm_delayed_work_autocancel(struct device *dev, >>>>>> struct delayed_work *w, >>>>>> void (*worker)(struct >>>>>> work_struct *work)) { >>>>>> INIT_DELAYED_WORK(w, worker); >>>>>> return devm_add_action(dev, (void (*action)(void >>>>>> *))cancel_delayed_work_sync, w); >>>>>> } >>>>>> >>>>>> I'm not sure about the cast, that may need something like this >>>>>> instead: >>>>>> >>>>>> typedef void (*devm_action_func)(void *); >>>>>> >>>>>> static int devm_delayed_work_autocancel(struct device *dev, >>>>>> struct delayed_work *w, >>>>>> void (*worker)(struct >>>>>> work_struct *work)) { >>>>>> INIT_DELAYED_WORK(w, worker); >>>>>> return devm_add_action(dev, >>>>>> (devm_action_func)cancel_delayed_work_sync, w); >>>>> >>>>> Unfortunately, you can not type cast function pointers in C. It is >>>>> against the C ABI. >>>>> I am sure it is done in a few places in the kernel anyway, but >>>>> those are wrong. >>>> >>>> I see, bummer. >>> >>> I think using devm_add_action() is still a good idea. >> >> Yes, we could also just have a 1 line static inline function to do >> the function-cast. Like this: >> >> static inline void devm_delayed_work_autocancel_func(void *work) >> { >> cancel_delayed_work_sync(work); >> } >> >> static inline int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, void (*worker)(struct work_struct *work)) >> { >> INIT_DELAYED_WORK(w, worker); >> return devm_add_action(dev, devm_delayed_work_autocancel_func, w); >> } >> >> Both functions will then simply be compiled out in files which do not >> use them. >> >>>> If we add a devm_clk_prepare_enable() helper that should probably be >>>> added >>>> to drivers/clk/clk-devres.c and not to drivers/base/devres.c . >>>> >>>> I also still wonder if we cannot find a better place for this new >>>> devm_delayed_work_autocancel() helper but nothing comes to mind. >>> >>> I don't like the idea of including device.h from workqueue.h - and I >>> think this would be necessary if we added >>> devm_delayed_work_autocancel() as inline in workqueue.h, right? >> >> Yes. >> >>> I also see strong objection towards the devm managed clean-ups. >> >> Yes it seems that there are some people who don't like this, where as >> others do like them. >> >>> How about adding some devm-helpers.c in drivers/base - where we could >>> collect devm-based helpers - and which could be enabled by own CONFIG - >>> and left out by those who dislike it? >> >> I would make this something configurable through Kconfig, but if Clarification I meant to write: "I would NOT make this something configurable through Kconfig". >> go the static inline route, which I'm in favor of then we could just >> have a: >> >> include/linux/devm-cleanup-helpers.h >> >> And put everything (including kdoc texts) there. >> >> This way the functionality is 100% opt-in (by explicitly including >> the header if you want the helpers) which hopefully makes this a >> bit more acceptable to people who don't like this style of cleanups. >> >> I would be even happy to act as the upstream maintainer for such a >> include/linux/devm-cleanup-helpers.h file, I can maintain it as part >> of the platform-drivers-x86 tree (with its own MAINTAINERS entry). >> >> Greg, would this be an acceptable solution to you ? > > I don't know, sorry, let's revisit this after 5.12-rc1 is out, with a > patch set that I can review again, and we can go from there as I can't > do anything until then... Ok. Regards, Hans
On Mon, 2021-02-15 at 12:43 +0100, Hans de Goede wrote: > Hi, > > On 2/15/21 12:31 PM, gregkh@linuxfoundation.org wrote: > > On Mon, Feb 15, 2021 at 11:37:26AM +0100, Hans de Goede wrote: > > > Hi, > > > > > > On 2/15/21 8:22 AM, Vaittinen, Matti wrote: > > > > On Sat, 2021-02-13 at 16:59 +0100, Hans de Goede wrote: > > > > > Hi, > > > > > > > > > > On 2/13/21 4:27 PM, Guenter Roeck wrote: > > > > > > On 2/13/21 7:03 AM, Hans de Goede wrote: > > > > > > [ ... ] > > > > > > > I think something like this should work: > > > > > > > > > > > > > > static int devm_delayed_work_autocancel(struct device > > > > > > > *dev, > > > > > > > struct delayed_work *w, > > > > > > > void (*worker)(struct > > > > > > > work_struct *work)) { > > > > > > > INIT_DELAYED_WORK(w, worker); > > > > > > > return devm_add_action(dev, (void (*action)(void > > > > > > > *))cancel_delayed_work_sync, w); > > > > > > > } > > > > > > include/linux/devm-cleanup-helpers.h > > > > > > And put everything (including kdoc texts) there. > > > > > > This way the functionality is 100% opt-in (by explicitly > > > including > > > the header if you want the helpers) which hopefully makes this a > > > bit more acceptable to people who don't like this style of > > > cleanups. > > > > > > I would be even happy to act as the upstream maintainer for such > > > a > > > include/linux/devm-cleanup-helpers.h file, I can maintain it as > > > part > > > of the platform-drivers-x86 tree (with its own MAINTAINERS > > > entry). > > > > > > Greg, would this be an acceptable solution to you ? > > > > I don't know, sorry, let's revisit this after 5.12-rc1 is out, with > > a > > patch set that I can review again, and we can go from there as I > > can't > > do anything until then... > > Ok. This is Ok for me too. I am in no hurry with this - I've already a few things to work on. So, I will rework this to be in a single header when v5.12-rc1 is out. Best Regards Matti Vaittinen -- Matti Vaittinen, Linux device drivers ROHM Semiconductors, Finland SWDC Kiviharjunlenkki 1E 90220 OULU FINLAND ~~~ "I don't think so," said Rene Descartes. Just then he vanished ~~~ Simon says - in Latin please. "non cogito me" dixit Rene Descarte, deinde evanescavit (Thanks for the translation Simon)
diff --git a/drivers/base/devres.c b/drivers/base/devres.c index fb9d5289a620..2879595bb5a4 100644 --- a/drivers/base/devres.c +++ b/drivers/base/devres.c @@ -1231,3 +1231,36 @@ void devm_free_percpu(struct device *dev, void __percpu *pdata) (void *)pdata)); } EXPORT_SYMBOL_GPL(devm_free_percpu); + +static void dev_delayed_work_drop(struct device *dev, void *res) +{ + cancel_delayed_work_sync(*(struct delayed_work **)res); +} + +/** + * devm_delayed_work_autocancel - Resource-managed work allocation + * @dev: Device which lifetime work is bound to + * @pdata: work to be cancelled when device exits + * + * Initialize work which is automatically cancelled when device exits. + * A few drivers need delayed work which must be cancelled before driver + * is unload to avoid accessing removed resources. + * devm_delayed_work_autocancel() can be used to omit the explicit + * cancelleation when driver is unload. + */ +int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, + void (*worker)(struct work_struct *work)) +{ + struct delayed_work **ptr; + + ptr = devres_alloc(dev_delayed_work_drop, sizeof(*ptr), GFP_KERNEL); + if (!ptr) + return -ENOMEM; + + INIT_DELAYED_WORK(w, worker); + *ptr = w; + devres_add(dev, ptr); + + return 0; +} +EXPORT_SYMBOL_GPL(devm_delayed_work_autocancel); diff --git a/include/linux/device.h b/include/linux/device.h index 1779f90eeb4c..192456198de7 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -27,6 +27,7 @@ #include <linux/uidgid.h> #include <linux/gfp.h> #include <linux/overflow.h> +#include <linux/workqueue.h> #include <linux/device/bus.h> #include <linux/device/class.h> #include <linux/device/driver.h> @@ -249,6 +250,10 @@ void __iomem *devm_of_iomap(struct device *dev, struct device_node *node, int index, resource_size_t *size); +/* delayed work which is cancelled when driver exits */ +int devm_delayed_work_autocancel(struct device *dev, struct delayed_work *w, + void (*worker)(struct work_struct *work)); + /* allows to add/remove a custom action to devres stack */ int devm_add_action(struct device *dev, void (*action)(void *), void *data); void devm_remove_action(struct device *dev, void (*action)(void *), void *data);
A few drivers which need a delayed work-queue must cancel work at exit. Some of those implement remove solely for this purpose. Help drivers to avoid unnecessary remove and error-branch implementation by adding managed verision of delayed work initialization Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> --- drivers/base/devres.c | 33 +++++++++++++++++++++++++++++++++ include/linux/device.h | 5 +++++ 2 files changed, 38 insertions(+)