Message ID | 87d3hu6oxo.fsf@ti.com (mailing list archive) |
---|---|
State | Rejected, archived |
On Fri, 1 Jul 2011, Kevin Hilman wrote:

> OK, so the ->probe() part has been explained and makes sense, but I
> would expect ->remove() to be similarly protected (as the documentation
> states.)  But that is not the case.  Is that a bug?  If so, patch below
> makes the code match the documentation.

I suspect it is a bug, but it's hard to be sure.  It's so _blatantly_
wrong that it looks like it was done deliberately.

> Kevin
>
> From eef73ab2feb203bacb57dc35862f2a9969b61593 Mon Sep 17 00:00:00 2001
> From: Kevin Hilman <khilman@ti.com>
> Date: Fri, 1 Jul 2011 07:37:47 -0700
> Subject: [PATCH] driver core: prevent runtime PM races with ->remove()
>
> Runtime PM Documentation states that the runtime PM usage count is
> incremented during driver ->probe() and ->remove().  This is designed
> to prevent driver runtime PM races with subsystems which may initiate
> runtime PM transitions before, during, and after drivers are loaded.
>
> Current code increments the usage_count during ->probe() but not
> during ->remove().  This patch fixes the ->remove() part and makes the
> code match the documentation.
>
> Signed-off-by: Kevin Hilman <khilman@ti.com>
> ---
>  drivers/base/dd.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 6658da7..47e079d 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -329,13 +329,13 @@ static void __device_release_driver(struct device *dev)
>  			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>  						     BUS_NOTIFY_UNBIND_DRIVER,
>  						     dev);
> -
> -		pm_runtime_put_sync(dev);
> -
>  		if (dev->bus && dev->bus->remove)
>  			dev->bus->remove(dev);
>  		else if (drv->remove)
>  			drv->remove(dev);
> +
> +		pm_runtime_put_sync(dev);
> +
>  		devres_release_all(dev);
>  		dev->driver = NULL;
>  		klist_remove(&dev->p->knode_driver);

To be safer, the put_sync() call should be moved down here.  Or maybe
even after the blocking_notifier_call_chain() that occurs here.

Alan Stern
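For readers following the thread, the effect of holding the usage count across ->remove() can be sketched in user space. This is only a toy model, not kernel code: struct pm_toy_dev and every pm_toy_* function are hypothetical names invented for illustration, and the "subsystem" is reduced to a single suspend request issued while the remove callback runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a device's runtime PM state (NOT the kernel's struct device). */
struct pm_toy_dev {
	int usage_count;		/* > 0 means suspend requests are refused */
	bool suspended;
	int suspend_result_in_remove;	/* result of the request made during remove */
};

/* Hypothetical analogue of a subsystem-initiated suspend request. */
static int pm_toy_request_suspend(struct pm_toy_dev *d)
{
	if (d->usage_count > 0)
		return -1;		/* refused: a reference is still held */
	d->suspended = true;
	return 0;			/* accepted */
}

/* Hypothetical analogue of pm_runtime_get_sync(): take a reference and resume. */
static void pm_toy_get_sync(struct pm_toy_dev *d)
{
	d->usage_count++;
	d->suspended = false;
}

/* Hypothetical analogue of pm_runtime_put_sync(): drop a reference, suspend at zero. */
static void pm_toy_put_sync(struct pm_toy_dev *d)
{
	assert(d->usage_count > 0);
	if (--d->usage_count == 0)
		pm_toy_request_suspend(d);
}

/* Stand-in for ->remove(); the subsystem asks for a suspend while it runs. */
static void pm_toy_remove(struct pm_toy_dev *d)
{
	d->suspend_result_in_remove = pm_toy_request_suspend(d);
}

/*
 * Model of the release path. put_before_remove == true mimics the code the
 * patch removes; false mimics the ordering the patch proposes.
 * Returns the result of the suspend request made during remove.
 */
static int pm_toy_release(bool put_before_remove)
{
	struct pm_toy_dev d = { 0 };

	pm_toy_get_sync(&d);
	if (put_before_remove)
		pm_toy_put_sync(&d);
	pm_toy_remove(&d);
	if (!put_before_remove)
		pm_toy_put_sync(&d);
	return d.suspend_result_in_remove;
}
```

In this model pm_toy_release(true) returns 0, meaning the suspend request made during remove was accepted, while pm_toy_release(false) returns -1, meaning it was refused because the reference was still held. Whether refusing it is actually desirable in the real driver core is exactly what the thread is debating.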
Alan Stern <stern@rowland.harvard.edu> writes:

> On Fri, 1 Jul 2011, Kevin Hilman wrote:
>
>> OK, so the ->probe() part has been explained and makes sense, but I
>> would expect ->remove() to be similarly protected (as the documentation
>> states.)  But that is not the case.  Is that a bug?  If so, patch below
>> makes the code match the documentation.
>
> I suspect it is a bug, but it's hard to be sure.  It's so _blatantly_
> wrong that it looks like it was done deliberately.

heh

>> Kevin
>>
>> From eef73ab2feb203bacb57dc35862f2a9969b61593 Mon Sep 17 00:00:00 2001
>> From: Kevin Hilman <khilman@ti.com>
>> Date: Fri, 1 Jul 2011 07:37:47 -0700
>> Subject: [PATCH] driver core: prevent runtime PM races with ->remove()
>>
>> Runtime PM Documentation states that the runtime PM usage count is
>> incremented during driver ->probe() and ->remove().  This is designed
>> to prevent driver runtime PM races with subsystems which may initiate
>> runtime PM transitions before, during, and after drivers are loaded.
>>
>> Current code increments the usage_count during ->probe() but not
>> during ->remove().  This patch fixes the ->remove() part and makes the
>> code match the documentation.
>>
>> Signed-off-by: Kevin Hilman <khilman@ti.com>
>> ---
>>  drivers/base/dd.c |    6 +++---
>>  1 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>> index 6658da7..47e079d 100644
>> --- a/drivers/base/dd.c
>> +++ b/drivers/base/dd.c
>> @@ -329,13 +329,13 @@ static void __device_release_driver(struct device *dev)
>>  			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>>  						     BUS_NOTIFY_UNBIND_DRIVER,
>>  						     dev);
>> -
>> -		pm_runtime_put_sync(dev);
>> -
>>  		if (dev->bus && dev->bus->remove)
>>  			dev->bus->remove(dev);
>>  		else if (drv->remove)
>>  			drv->remove(dev);
>> +
>> +		pm_runtime_put_sync(dev);
>> +
>>  		devres_release_all(dev);
>>  		dev->driver = NULL;
>>  		klist_remove(&dev->p->knode_driver);
>
> To be safer, the put_sync() call should be moved down here.  Or maybe
> even after the blocking_notifier_call_chain() that occurs here.

I was actually thinking about the other direction: moving the get_sync
after the first notifier chain.  IOW, the get_sync/put_sync only
protects the ->remove() calls, not the notifiers.

The protection around the notifiers doesn't make sense to me, at least
in the context of driver runtime PM racing with the subsystem.
Especially since these notifiers are likely how the
subsystem/bus/pm_domain code gets notified that there may be a device
to manage in the first place.

Kevin
On Fri, 1 Jul 2011, Kevin Hilman wrote:

> >> --- a/drivers/base/dd.c
> >> +++ b/drivers/base/dd.c
> >> @@ -329,13 +329,13 @@ static void __device_release_driver(struct device *dev)
> >>  			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
> >>  						     BUS_NOTIFY_UNBIND_DRIVER,
> >>  						     dev);
> >> -
> >> -		pm_runtime_put_sync(dev);
> >> -
> >>  		if (dev->bus && dev->bus->remove)
> >>  			dev->bus->remove(dev);
> >>  		else if (drv->remove)
> >>  			drv->remove(dev);
> >> +
> >> +		pm_runtime_put_sync(dev);
> >> +
> >>  		devres_release_all(dev);
> >>  		dev->driver = NULL;
> >>  		klist_remove(&dev->p->knode_driver);
> >
> > To be safer, the put_sync() call should be moved down here.  Or maybe
> > even after the blocking_notifier_call_chain() that occurs here.
>
> I was actually thinking about the other direction: moving the get_sync
> after the first notifier chain.  IOW, the get_sync/put_sync only
> protects the ->remove() calls, not the notifiers.
>
> The protection around the notifiers doesn't make sense to me, at least
> in the context of driver runtime PM racing with the subsystem.
> Especially since these notifiers are likely how the
> subsystem/bus/pm_domain code gets notified that there may be a device
> to manage in the first place.

The get_sync part doesn't matter so much.  Moving it past the notifier
call would probably be okay -- unless one of the listeners on the
notifier chain expects the device to be active.  Changing the get_sync
to get_noresume would probably also be okay -- subject to a similar
reservation.

The problem with the put_sync isn't the notifier.  If you leave it
where you've got it now, you'll end up invoking a callback at a time
when the driver thinks it no longer controls the device but the
driver-model core still thinks it does.  You certainly want to do the

	dev->driver = NULL;

first.

Alan Stern
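The ordering concern in the message above can be illustrated with a small user-space sketch. It is a toy model under stated assumptions, not the driver core: struct unbind_toy_driver, struct unbind_toy_dev and the unbind_toy_* functions are hypothetical, and the assumption that the final put routes a PM callback to whatever driver is still bound is a simplification made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy driver state (hypothetical, not a kernel structure). */
struct unbind_toy_driver {
	bool removed;			/* set once the remove callback has run */
	int callbacks_after_remove;	/* PM callbacks that arrived too late */
};

/* Toy device state (hypothetical, not the kernel's struct device). */
struct unbind_toy_dev {
	struct unbind_toy_driver *driver;	/* NULL once the core has unbound it */
	int usage_count;
};

/*
 * Hypothetical analogue of pm_runtime_put_sync(): when the count reaches
 * zero, the "core" routes a PM callback to the driver it still considers
 * bound. The model counts callbacks that hit an already-removed driver.
 */
static void unbind_toy_put_sync(struct unbind_toy_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count > 0)
		return;
	if (dev->driver && dev->driver->removed)
		dev->driver->callbacks_after_remove++;
}

/*
 * Model of the tail of the release path, after the remove callback.
 * clear_driver_first == true mimics doing "dev->driver = NULL" before the put.
 * Returns how many callbacks reached the driver after it was removed.
 */
static int unbind_toy_release(bool clear_driver_first)
{
	struct unbind_toy_driver drv = { 0 };
	struct unbind_toy_dev dev = { .driver = &drv, .usage_count = 1 };

	drv.removed = true;		/* stands in for drv->remove(dev) */
	if (clear_driver_first) {
		dev.driver = NULL;
		unbind_toy_put_sync(&dev);
	} else {
		unbind_toy_put_sync(&dev);
		dev.driver = NULL;
	}
	return drv.callbacks_after_remove;
}
```

In this model unbind_toy_release(false) returns 1: the put runs while the core still points at a driver that has already been removed. unbind_toy_release(true) returns 0 because the driver pointer is cleared before the final put.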
Alan Stern <stern@rowland.harvard.edu> writes:

> On Fri, 1 Jul 2011, Kevin Hilman wrote:
>
>> >> --- a/drivers/base/dd.c
>> >> +++ b/drivers/base/dd.c
>> >> @@ -329,13 +329,13 @@ static void __device_release_driver(struct device *dev)
>> >>  			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>> >>  						     BUS_NOTIFY_UNBIND_DRIVER,
>> >>  						     dev);
>> >> -
>> >> -		pm_runtime_put_sync(dev);
>> >> -
>> >>  		if (dev->bus && dev->bus->remove)
>> >>  			dev->bus->remove(dev);
>> >>  		else if (drv->remove)
>> >>  			drv->remove(dev);
>> >> +
>> >> +		pm_runtime_put_sync(dev);
>> >> +
>> >>  		devres_release_all(dev);
>> >>  		dev->driver = NULL;
>> >>  		klist_remove(&dev->p->knode_driver);
>> >
>> > To be safer, the put_sync() call should be moved down here.  Or maybe
>> > even after the blocking_notifier_call_chain() that occurs here.
>>
>> I was actually thinking about the other direction: moving the get_sync
>> after the first notifier chain.  IOW, the get_sync/put_sync only
>> protects the ->remove() calls, not the notifiers.
>>
>> The protection around the notifiers doesn't make sense to me, at least
>> in the context of driver runtime PM racing with the subsystem.
>> Especially since these notifiers are likely how the
>> subsystem/bus/pm_domain code gets notified that there may be a device
>> to manage in the first place.
>
> The get_sync part doesn't matter so much.  Moving it past the notifier
> call would probably be okay -- unless one of the listeners on the
> notifier chain expects the device to be active.  Changing the get_sync
> to get_noresume would probably also be okay -- subject to a similar
> reservation.

There are enough "probably"s in the above to make me a bit uncomfortable
making this change.

Maybe you can take this patch forward?

Kevin

> The problem with the put_sync isn't the notifier.  If you leave it
> where you've got it now, you'll end up invoking a callback at a time
> when the driver thinks it no longer controls the device but the
> driver-model core still thinks it does.  You certainly want to do the
>
> 	dev->driver = NULL;
>
> first.
>
> Alan Stern
Hi,

On Friday, July 01, 2011, Kevin Hilman wrote:
> Alan Stern <stern@rowland.harvard.edu> writes:
>
> > On Fri, 1 Jul 2011, Kevin Hilman wrote:
> >
> >> OK, so the ->probe() part has been explained and makes sense, but I
> >> would expect ->remove() to be similarly protected (as the documentation
> >> states.)  But that is not the case.  Is that a bug?  If so, patch below
> >> makes the code match the documentation.
> >
> > I suspect it is a bug, but it's hard to be sure.  It's so _blatantly_
> > wrong that it looks like it was done deliberately.
>
> heh

I seem to remember having a problem with the pm_runtime_put_sync() after
drv->remove(dev) ...

So the code in question was introduced by

commit e1866b33b1e89f077b7132daae3dfd9a594e9a1a
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Fri Apr 29 00:33:45 2011 +0200

    PM / Runtime: Rework runtime PM handling during driver removal

with a long changelog explaining the reason why.  Which seems to make sense. ;-)

So I'm not sure.

Thanks,
Rafael
On Fri, 1 Jul 2011, Rafael J. Wysocki wrote:

> Hi,
>
> On Friday, July 01, 2011, Kevin Hilman wrote:
> > Alan Stern <stern@rowland.harvard.edu> writes:
> >
> > > On Fri, 1 Jul 2011, Kevin Hilman wrote:
> > >
> > >> OK, so the ->probe() part has been explained and makes sense, but I
> > >> would expect ->remove() to be similarly protected (as the documentation
> > >> states.)  But that is not the case.  Is that a bug?  If so, patch below
> > >> makes the code match the documentation.
> > >
> > > I suspect it is a bug, but it's hard to be sure.  It's so _blatantly_
> > > wrong that it looks like it was done deliberately.
> >
> > heh
>
> I seem to remember having a problem with the pm_runtime_put_sync() after
> drv->remove(dev) ...
>
> So the code in question was introduced by
>
> commit e1866b33b1e89f077b7132daae3dfd9a594e9a1a
> Author: Rafael J. Wysocki <rjw@sisk.pl>
> Date:   Fri Apr 29 00:33:45 2011 +0200
>
>     PM / Runtime: Rework runtime PM handling during driver removal
>
> with a long changelog explaining the reason why.  Which seems to make sense. ;-)

Okay, that seems fair enough.  Looks like the documentation needs to be
updated to match, though.

And we probably still want to make sure that access to the
power/control and related attribute files is mutually exclusive with
probe and remove.

Alan Stern
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 6658da7..47e079d 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -329,13 +329,13 @@ static void __device_release_driver(struct device *dev)
 			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 						     BUS_NOTIFY_UNBIND_DRIVER,
 						     dev);
-
-		pm_runtime_put_sync(dev);
-
 		if (dev->bus && dev->bus->remove)
 			dev->bus->remove(dev);
 		else if (drv->remove)
 			drv->remove(dev);
+
+		pm_runtime_put_sync(dev);
+
 		devres_release_all(dev);
 		dev->driver = NULL;
 		klist_remove(&dev->p->knode_driver);