Message ID | 1634058.cXDNg15SOd@aspire.rjw.lan (mailing list archive) |
---|---|
State | Mainlined |
Series | driver core: Two more updates related to device links |
On Thu, 7 Feb 2019 at 19:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > If the target device has any suppliers, as reflected by device links > to them, __pm_runtime_set_status() does not take them into account, > which is not consistent with the other parts of the PM-runtime > framework and may lead to programming mistakes. > > Modify __pm_runtime_set_status() to take suppliers into account by > activating them upfront if the new status is RPM_ACTIVE and > deactivating them on exit if the new status is RPM_SUSPENDED. > > If the activation of one of the suppliers fails, the new status > will be RPM_SUSPENDED and the (remaining) suppliers will be > deactivated on exit (the child count of the device's parent > will be dropped too then). > > Of course, adding device links locking to __pm_runtime_set_status() > means that it cannot be run fron interrupt context, so make it use > spin_lock_irq() and spin_unlock_irq() instead of spin_lock_irqsave() > and spin_unlock_irqrestore(), respectively. > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Rafael, thanks for working on this! I am running some tests at my side, but still not achieving the behavior I expect to. Will let you know when I have more details, but first some comments below. > --- > drivers/base/power/runtime.c | 45 ++++++++++++++++++++++++++++++++++++++----- > 1 file changed, 40 insertions(+), 5 deletions(-) > > Index: linux-pm/drivers/base/power/runtime.c > =================================================================== > --- linux-pm.orig/drivers/base/power/runtime.c > +++ linux-pm/drivers/base/power/runtime.c > @@ -1102,20 +1102,43 @@ EXPORT_SYMBOL_GPL(pm_runtime_get_if_in_u > * and the device parent's counter of unsuspended children is modified to > * reflect the new status. If the new status is RPM_SUSPENDED, an idle > * notification request for the parent is submitted. 
> + * > + * If @dev has any suppliers (as reflected by device links to them), and @status > + * is RPM_ACTIVE, they will be activated upfront and if the activation of one > + * of them fails, the status of @dev will be changed to RPM_SUSPENDED (instead > + * of the @status value) and the suppliers will be deacticated on exit. The > + * error returned by the failing supplier activation will be returned in that > + * case. > */ > int __pm_runtime_set_status(struct device *dev, unsigned int status) > { > struct device *parent = dev->parent; > - unsigned long flags; > bool notify_parent = false; > int error = 0; > > if (status != RPM_ACTIVE && status != RPM_SUSPENDED) > return -EINVAL; > > - spin_lock_irqsave(&dev->power.lock, flags); > + /* > + * If the new status is RPM_ACTIVE, the suppliers can be activated > + * upfront regardless of the current status, because next time > + * rpm_put_suppliers() runs, the rpm_active refcounts of the links > + * involved will be dropped down to one anyway. > + */ > + if (status == RPM_ACTIVE) { > + int idx = device_links_read_lock(); > + > + error = rpm_get_suppliers(dev); > + if (error) > + status = RPM_SUSPENDED; > + > + device_links_read_unlock(idx); > + } This doesn't look right to me, and more importantly, this isn't consistent with how we treat a parent/child. More precisely, I think you need to check "if (!dev->power.runtime_error && !dev->power.disable_depth)" and also whether "dev->power.runtime_status == status", before deciding to call rpm_get_suppliers() above. Otherwise you may end up resuming suppliers and/or increasing the link->rpm_active count, when you shouldn't. In other words, expecting __pm_runtime_set_status() to be called in "balanced" manner isn't correct. 
> + > + spin_lock_irq(&dev->power.lock); > > if (!dev->power.runtime_error && !dev->power.disable_depth) { > + status = dev->power.runtime_status; > error = -EAGAIN; > goto out; > } > @@ -1147,19 +1170,31 @@ int __pm_runtime_set_status(struct devic > > spin_unlock(&parent->power.lock); > > - if (error) > + if (error) { > + status = RPM_SUSPENDED; > goto out; > + } > } > > out_set: > __update_runtime_status(dev, status); > - dev->power.runtime_error = 0; > + if (!error) > + dev->power.runtime_error = 0; > + > out: > - spin_unlock_irqrestore(&dev->power.lock, flags); > + spin_unlock_irq(&dev->power.lock); > > if (notify_parent) > pm_request_idle(parent); > > + if (status == RPM_SUSPENDED) { > + int idx = device_links_read_lock(); > + > + rpm_put_suppliers(dev); > + > + device_links_read_unlock(idx); > + } > + > return error; > } > EXPORT_SYMBOL_GPL(__pm_runtime_set_status); > Kind regards Uffe
On Mon, 11 Feb 2019 at 14:27, Ulf Hansson <ulf.hansson@linaro.org> wrote: > > On Thu, 7 Feb 2019 at 19:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > If the target device has any suppliers, as reflected by device links > > to them, __pm_runtime_set_status() does not take them into account, > > which is not consistent with the other parts of the PM-runtime > > framework and may lead to programming mistakes. > > > > Modify __pm_runtime_set_status() to take suppliers into account by > > activating them upfront if the new status is RPM_ACTIVE and > > deactivating them on exit if the new status is RPM_SUSPENDED. > > > > If the activation of one of the suppliers fails, the new status > > will be RPM_SUSPENDED and the (remaining) suppliers will be > > deactivated on exit (the child count of the device's parent > > will be dropped too then). > > > > Of course, adding device links locking to __pm_runtime_set_status() > > means that it cannot be run fron interrupt context, so make it use > > spin_lock_irq() and spin_unlock_irq() instead of spin_lock_irqsave() > > and spin_unlock_irqrestore(), respectively. > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Rafael, thanks for working on this! > > I am running some tests at my side, but still not achieving the > behavior I expect to. Will let you know when I have more details, but > first some comments below. Alright, this is what I found. When I call pm_runtime_set_suspended(), in the ->probe() error path of my RPM test driver (I am removing the device link afterwards), then my expectation was that this should allow the supplier to become runtime suspended (sooner or later). This isn't the case, as it turns out the runtime PM usage count of the supplier, still remains 1 after the probe failure. My observation is that with $subject patch, the link->rpm_active count is now reaching 1, before it stayed at 2 - so one step forward. 
:-) However, the reason to why the runtime PM usage count never reaches 0, is because of the call to pm_runtime_get_noresume(supplier) in device_link_rpm_prepare(), which is called from device_link_add(). To solve the problem, it seems like we need to call pm_runtime_put(supplier), in case the device link is deleted while the consumer is still probing. > > > --- > > drivers/base/power/runtime.c | 45 ++++++++++++++++++++++++++++++++++++++----- > > 1 file changed, 40 insertions(+), 5 deletions(-) > > > > Index: linux-pm/drivers/base/power/runtime.c > > =================================================================== > > --- linux-pm.orig/drivers/base/power/runtime.c > > +++ linux-pm/drivers/base/power/runtime.c > > @@ -1102,20 +1102,43 @@ EXPORT_SYMBOL_GPL(pm_runtime_get_if_in_u > > * and the device parent's counter of unsuspended children is modified to > > * reflect the new status. If the new status is RPM_SUSPENDED, an idle > > * notification request for the parent is submitted. > > + * > > + * If @dev has any suppliers (as reflected by device links to them), and @status > > + * is RPM_ACTIVE, they will be activated upfront and if the activation of one > > + * of them fails, the status of @dev will be changed to RPM_SUSPENDED (instead > > + * of the @status value) and the suppliers will be deacticated on exit. The > > + * error returned by the failing supplier activation will be returned in that > > + * case. 
> > */ > > int __pm_runtime_set_status(struct device *dev, unsigned int status) > > { > > struct device *parent = dev->parent; > > - unsigned long flags; > > bool notify_parent = false; > > int error = 0; > > > > if (status != RPM_ACTIVE && status != RPM_SUSPENDED) > > return -EINVAL; > > > > - spin_lock_irqsave(&dev->power.lock, flags); > > + /* > > + * If the new status is RPM_ACTIVE, the suppliers can be activated > > + * upfront regardless of the current status, because next time > > + * rpm_put_suppliers() runs, the rpm_active refcounts of the links > > + * involved will be dropped down to one anyway. > > + */ > > + if (status == RPM_ACTIVE) { > > + int idx = device_links_read_lock(); > > + > > + error = rpm_get_suppliers(dev); > > + if (error) > > + status = RPM_SUSPENDED; > > + > > + device_links_read_unlock(idx); > > + } > > This doesn't look right to me, and more importantly, this isn't > consistent with how we treat a parent/child. > > More precisely, I think you need to check "if > (!dev->power.runtime_error && !dev->power.disable_depth)" and also > whether "dev->power.runtime_status == status", before deciding to call > rpm_get_suppliers() above. Otherwise you may end up resuming suppliers > and/or increasing the link->rpm_active count, when you shouldn't. > > In other words, expecting __pm_runtime_set_status() to be called in > "balanced" manner isn't correct. 
> > > + > > + spin_lock_irq(&dev->power.lock); > > > > if (!dev->power.runtime_error && !dev->power.disable_depth) { > > + status = dev->power.runtime_status; > > error = -EAGAIN; > > goto out; > > } > > @@ -1147,19 +1170,31 @@ int __pm_runtime_set_status(struct devic > > > > spin_unlock(&parent->power.lock); > > > > - if (error) > > + if (error) { > > + status = RPM_SUSPENDED; > > goto out; > > + } > > } > > > > out_set: > > __update_runtime_status(dev, status); > > - dev->power.runtime_error = 0; > > + if (!error) > > + dev->power.runtime_error = 0; > > + > > out: > > - spin_unlock_irqrestore(&dev->power.lock, flags); > > + spin_unlock_irq(&dev->power.lock); > > > > if (notify_parent) > > pm_request_idle(parent); > > > > + if (status == RPM_SUSPENDED) { > > + int idx = device_links_read_lock(); > > + > > + rpm_put_suppliers(dev); > > + > > + device_links_read_unlock(idx); > > + } > > + > > return error; > > } > > EXPORT_SYMBOL_GPL(__pm_runtime_set_status); > > > > Kind regards > Uffe
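The usage-count leak described in this message can be illustrated with a small userspace model. This is only a sketch: the names below (model_supplier, model_link_add() and so on) are hypothetical stand-ins, not kernel APIs, and the model assumes that adding a link with DL_FLAG_RPM_ACTIVE while the consumer is probing takes one supplier reference for the flag and one more via pm_runtime_get_noresume().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel objects. */
struct model_supplier {
	int usage_count;	/* models the supplier's runtime PM usage count */
};

struct model_link {
	struct model_supplier *supplier;
	int rpm_active;		/* models link->rpm_active, which starts at 1 */
	bool present;
};

/* Assumed behavior of device_link_add() with DL_FLAG_RPM_ACTIVE set. */
static void model_link_add(struct model_link *link, struct model_supplier *sup,
			   bool consumer_probing)
{
	link->supplier = sup;
	link->rpm_active = 1;
	link->present = true;

	/* DL_FLAG_RPM_ACTIVE: the supplier is resumed and held via the link. */
	link->rpm_active++;
	sup->usage_count++;

	/* Extra reference taken because the consumer is still probing. */
	if (consumer_probing)
		sup->usage_count++;
}

/* Drop supplier references until rpm_active is back at one. */
static void model_rpm_put_suppliers(struct model_link *link)
{
	if (!link->present)
		return;

	while (link->rpm_active > 1) {
		link->rpm_active--;
		link->supplier->usage_count--;
	}
}

/* Deleting the link does not touch the reference taken for probing. */
static void model_link_del(struct model_link *link)
{
	link->present = false;
}

/* Models the ->probe() error path; returns the supplier's final usage count. */
static int model_failed_probe(void)
{
	struct model_supplier sup = { .usage_count = 0 };
	struct model_link link;

	model_link_add(&link, &sup, true);
	assert(link.rpm_active == 2 && sup.usage_count == 2);

	/* pm_runtime_set_suspended() with the patch applied. */
	model_rpm_put_suppliers(&link);
	assert(link.rpm_active == 1);

	model_link_del(&link);
	return sup.usage_count;
}
```

In this model model_failed_probe() returns 1 rather than 0, which matches the observation above: rpm_active drops to one, but the reference taken while probing is never released.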
On Mon, Feb 11, 2019 at 2:28 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > On Thu, 7 Feb 2019 at 19:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > If the target device has any suppliers, as reflected by device links > > to them, __pm_runtime_set_status() does not take them into account, > > which is not consistent with the other parts of the PM-runtime > > framework and may lead to programming mistakes. > > > > Modify __pm_runtime_set_status() to take suppliers into account by > > activating them upfront if the new status is RPM_ACTIVE and > > deactivating them on exit if the new status is RPM_SUSPENDED. > > > > If the activation of one of the suppliers fails, the new status > > will be RPM_SUSPENDED and the (remaining) suppliers will be > > deactivated on exit (the child count of the device's parent > > will be dropped too then). > > > > Of course, adding device links locking to __pm_runtime_set_status() > > means that it cannot be run fron interrupt context, so make it use > > spin_lock_irq() and spin_unlock_irq() instead of spin_lock_irqsave() > > and spin_unlock_irqrestore(), respectively. > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Rafael, thanks for working on this! > > I am running some tests at my side, but still not achieving the > behavior I expect to. Will let you know when I have more details, but > first some comments below. 
> > > --- > > drivers/base/power/runtime.c | 45 ++++++++++++++++++++++++++++++++++++++----- > > 1 file changed, 40 insertions(+), 5 deletions(-) > > > > Index: linux-pm/drivers/base/power/runtime.c > > =================================================================== > > --- linux-pm.orig/drivers/base/power/runtime.c > > +++ linux-pm/drivers/base/power/runtime.c > > @@ -1102,20 +1102,43 @@ EXPORT_SYMBOL_GPL(pm_runtime_get_if_in_u > > * and the device parent's counter of unsuspended children is modified to > > * reflect the new status. If the new status is RPM_SUSPENDED, an idle > > * notification request for the parent is submitted. > > + * > > + * If @dev has any suppliers (as reflected by device links to them), and @status > > + * is RPM_ACTIVE, they will be activated upfront and if the activation of one > > + * of them fails, the status of @dev will be changed to RPM_SUSPENDED (instead > > + * of the @status value) and the suppliers will be deacticated on exit. The > > + * error returned by the failing supplier activation will be returned in that > > + * case. > > */ > > int __pm_runtime_set_status(struct device *dev, unsigned int status) > > { > > struct device *parent = dev->parent; > > - unsigned long flags; > > bool notify_parent = false; > > int error = 0; > > > > if (status != RPM_ACTIVE && status != RPM_SUSPENDED) > > return -EINVAL; > > > > - spin_lock_irqsave(&dev->power.lock, flags); > > + /* > > + * If the new status is RPM_ACTIVE, the suppliers can be activated > > + * upfront regardless of the current status, because next time > > + * rpm_put_suppliers() runs, the rpm_active refcounts of the links > > + * involved will be dropped down to one anyway. 
> > +	 */
> > +	if (status == RPM_ACTIVE) {
> > +		int idx = device_links_read_lock();
> > +
> > +		error = rpm_get_suppliers(dev);
> > +		if (error)
> > +			status = RPM_SUSPENDED;
> > +
> > +		device_links_read_unlock(idx);
> > +	}
>
> This doesn't look right to me, and more importantly, this isn't
> consistent with how we treat a parent/child.

It cannot be entirely consistent with that, because you cannot walk
the suppliers under the device's power.lock.

The idea here is that activating suppliers upfront if the new status
is RPM_ACTIVE shouldn't hurt regardless.

> More precisely, I think you need to check "if
> (!dev->power.runtime_error && !dev->power.disable_depth)" and also
> whether "dev->power.runtime_status == status", before deciding to call
> rpm_get_suppliers() above. Otherwise you may end up resuming suppliers
> and/or increasing the link->rpm_active count, when you shouldn't.

Resuming suppliers unnecessarily is not particularly efficient, but it
is not incorrect. Incrementing their rpm_active temporarily also
isn't incorrect as long as the rpm_active values are correct on exit
(and note that incrementing them if the consumer's status is RPM_ACTIVE
doesn't even matter).

> In other words, expecting __pm_runtime_set_status() to be called in
> "balanced" manner isn't correct.

There is no such expectation here.

There is a possible race between __pm_runtime_set_status() and runtime
suspend or resume of the device in case PM-runtime is enabled for it
when __pm_runtime_set_status() is called, but it shouldn't occur if
__pm_runtime_set_status() is used correctly (that is, when PM-runtime
is disabled for the device).

I think I know how to avoid that race, though, so I'm going to post an
incremental fix if that works out.
On Mon, Feb 11, 2019 at 4:51 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > On Mon, 11 Feb 2019 at 14:27, Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > On Thu, 7 Feb 2019 at 19:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > If the target device has any suppliers, as reflected by device links > > > to them, __pm_runtime_set_status() does not take them into account, > > > which is not consistent with the other parts of the PM-runtime > > > framework and may lead to programming mistakes. > > > > > > Modify __pm_runtime_set_status() to take suppliers into account by > > > activating them upfront if the new status is RPM_ACTIVE and > > > deactivating them on exit if the new status is RPM_SUSPENDED. > > > > > > If the activation of one of the suppliers fails, the new status > > > will be RPM_SUSPENDED and the (remaining) suppliers will be > > > deactivated on exit (the child count of the device's parent > > > will be dropped too then). > > > > > > Of course, adding device links locking to __pm_runtime_set_status() > > > means that it cannot be run fron interrupt context, so make it use > > > spin_lock_irq() and spin_unlock_irq() instead of spin_lock_irqsave() > > > and spin_unlock_irqrestore(), respectively. > > > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > Rafael, thanks for working on this! > > > > I am running some tests at my side, but still not achieving the > > behavior I expect to. Will let you know when I have more details, but > > first some comments below. > > Alright, this is what I found. > > When I call pm_runtime_set_suspended(), in the ->probe() error path of > my RPM test driver (I am removing the device link afterwards), then my > expectation was that this should allow the supplier to become runtime > suspended (sooner or later). 
> This isn't the case, as it turns out the
> runtime PM usage count of the supplier, still remains 1 after the
> probe failure.
>
> My observation is that with $subject patch, the link->rpm_active count
> is now reaching 1, before it stayed at 2 - so one step forward. :-)
>
> However, the reason to why the runtime PM usage count never reaches 0,
> is because of the call to pm_runtime_get_noresume(supplier) in
> device_link_rpm_prepare(), which is called from device_link_add().

That was there previously, I've just moved it to device_link_rpm_prepare().

But good catch!

> To solve the problem, it seems like we need to call
> pm_runtime_put(supplier), in case the device link is deleted while the
> consumer is still probing.

I'd rather change the way pm_runtime_get/put_suppliers() work, so that
they use the rpm_active refcount, but pm_runtime_put_suppliers() only
drops it by one - unless it is one already.

Then, when adding a new link with DL_FLAG_RPM_ACTIVE,
device_link_add() only needs to increment its rpm_active *twice*
(instead of doing that once as it does now), so it will stay above one
after the subsequent pm_runtime_put_suppliers() - and if it goes away
in the meantime, then it will be cleaned up by the removal.

In turn, if a link is created without DL_FLAG_RPM_ACTIVE, its
rpm_active is one and then pm_runtime_put_suppliers() will just skip
it.

A patch will follow. :-)
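The refcounting scheme proposed in this reply can be sketched as a userspace model. Everything here is an assumption made for illustration: the sketch_* names are hypothetical, each link is assumed to have its own supplier, and every rpm_active increment above one is assumed to correspond to exactly one supplier usage-count reference. It is not the follow-up patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one device link and its supplier's usage count. */
struct sketch_link {
	int rpm_active;		/* one means "not holding the supplier" */
	int supplier_usage;
};

/*
 * Proposed device_link_add() behavior for a link added during probe:
 * with DL_FLAG_RPM_ACTIVE, rpm_active is incremented twice.
 */
static void sketch_link_add(struct sketch_link *l, bool rpm_active_flag)
{
	l->rpm_active = 1;
	l->supplier_usage = 0;

	if (rpm_active_flag) {
		l->rpm_active += 2;
		l->supplier_usage += 2;
	}
}

/* Proposed pm_runtime_put_suppliers(): drop by one, unless already one. */
static void sketch_put_suppliers(struct sketch_link *l)
{
	if (l->rpm_active > 1) {
		l->rpm_active--;
		l->supplier_usage--;
	}
}

/* Link removal cleans up whatever is still held through the link. */
static void sketch_link_remove(struct sketch_link *l)
{
	while (l->rpm_active > 1) {
		l->rpm_active--;
		l->supplier_usage--;
	}
}

/* Successful probe: the link survives and put_suppliers runs afterwards. */
static int sketch_usage_after_probe(bool rpm_active_flag)
{
	struct sketch_link l;

	sketch_link_add(&l, rpm_active_flag);
	sketch_put_suppliers(&l);
	return l.supplier_usage;
}

/* Failed probe: the link is removed before put_suppliers can see it. */
static int sketch_usage_after_early_removal(bool rpm_active_flag)
{
	struct sketch_link l;

	sketch_link_add(&l, rpm_active_flag);
	sketch_link_remove(&l);
	return l.supplier_usage;
}
```

Under these assumptions a link added with DL_FLAG_RPM_ACTIVE keeps exactly one supplier reference after a successful probe, a link without the flag is skipped, and an early removal leaves no reference behind in either case.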
On Tue, 12 Feb 2019 at 00:05, Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Mon, Feb 11, 2019 at 4:51 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > On Mon, 11 Feb 2019 at 14:27, Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > On Thu, 7 Feb 2019 at 19:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > > > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > > > If the target device has any suppliers, as reflected by device links > > > > to them, __pm_runtime_set_status() does not take them into account, > > > > which is not consistent with the other parts of the PM-runtime > > > > framework and may lead to programming mistakes. > > > > > > > > Modify __pm_runtime_set_status() to take suppliers into account by > > > > activating them upfront if the new status is RPM_ACTIVE and > > > > deactivating them on exit if the new status is RPM_SUSPENDED. > > > > > > > > If the activation of one of the suppliers fails, the new status > > > > will be RPM_SUSPENDED and the (remaining) suppliers will be > > > > deactivated on exit (the child count of the device's parent > > > > will be dropped too then). > > > > > > > > Of course, adding device links locking to __pm_runtime_set_status() > > > > means that it cannot be run fron interrupt context, so make it use > > > > spin_lock_irq() and spin_unlock_irq() instead of spin_lock_irqsave() > > > > and spin_unlock_irqrestore(), respectively. > > > > > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > Rafael, thanks for working on this! > > > > > > I am running some tests at my side, but still not achieving the > > > behavior I expect to. Will let you know when I have more details, but > > > first some comments below. > > > > Alright, this is what I found. 
> >
> > When I call pm_runtime_set_suspended(), in the ->probe() error path of
> > my RPM test driver (I am removing the device link afterwards), then my
> > expectation was that this should allow the supplier to become runtime
> > suspended (sooner or later). This isn't the case, as it turns out the
> > runtime PM usage count of the supplier, still remains 1 after the
> > probe failure.
> >
> > My observation is that with $subject patch, the link->rpm_active count
> > is now reaching 1, before it stayed at 2 - so one step forward. :-)
> >
> > However, the reason to why the runtime PM usage count never reaches 0,
> > is because of the call to pm_runtime_get_noresume(supplier) in
> > device_link_rpm_prepare(), which is called from device_link_add().
>
> That was there previously, I've just moved it to device_link_rpm_prepare().

Correct. The problem has been there before, even without using
DL_FLAG_RPM_ACTIVE.

>
> But good catch!
>
> > To solve the problem, it seems like we need to call
> > pm_runtime_put(supplier), in case the device link is deleted while the
> > consumer is still probing.
>
> I'd rather change the way pm_runtime_get/put_suppliers() work, so that
> they use the rpm_active refcount, but pm_runtime_put_suppliers() only
> drops it by one - unless it is one already.

That seems like a very reasonable approach! The mix between calling
pm_runtime_get/put*() on the supplier device directly vs using the path
with the rpm_active count is rather confusing to me. Using only the
latter would be a nice cleanup anyway, I think.

>
> Then, when adding a new link with DL_FLAG_RPM_ACTIVE,
> device_link_add() only needs to increment its rpm_active *twice*
> (instead of doing that once as it does now), so it will stay above one
> after the subsequent pm_runtime_put_suppliers() - and if it goes away
> in the meantime, then it will be cleaned up by the removal.

Assuming you will add a check for "consumer->links.status ==
DL_DEV_PROBING" to determine whether rpm_active should be decreased.
Yes, it seems reasonable.

>
> In turn, if a link is created without DL_FLAG_RPM_ACTIVE, its
> rpm_active is one and then pm_runtime_put_suppliers() will just skip
> it.
>
> A patch will follow. :-)

Great, I am here to review it. :-)

Kind regards
Uffe
On Mon, 11 Feb 2019 at 23:41, Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Mon, Feb 11, 2019 at 2:28 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > On Thu, 7 Feb 2019 at 19:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > If the target device has any suppliers, as reflected by device links > > > to them, __pm_runtime_set_status() does not take them into account, > > > which is not consistent with the other parts of the PM-runtime > > > framework and may lead to programming mistakes. > > > > > > Modify __pm_runtime_set_status() to take suppliers into account by > > > activating them upfront if the new status is RPM_ACTIVE and > > > deactivating them on exit if the new status is RPM_SUSPENDED. > > > > > > If the activation of one of the suppliers fails, the new status > > > will be RPM_SUSPENDED and the (remaining) suppliers will be > > > deactivated on exit (the child count of the device's parent > > > will be dropped too then). > > > > > > Of course, adding device links locking to __pm_runtime_set_status() > > > means that it cannot be run fron interrupt context, so make it use > > > spin_lock_irq() and spin_unlock_irq() instead of spin_lock_irqsave() > > > and spin_unlock_irqrestore(), respectively. > > > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > Rafael, thanks for working on this! > > > > I am running some tests at my side, but still not achieving the > > behavior I expect to. Will let you know when I have more details, but > > first some comments below. 
> > > > > --- > > > drivers/base/power/runtime.c | 45 ++++++++++++++++++++++++++++++++++++++----- > > > 1 file changed, 40 insertions(+), 5 deletions(-) > > > > > > Index: linux-pm/drivers/base/power/runtime.c > > > =================================================================== > > > --- linux-pm.orig/drivers/base/power/runtime.c > > > +++ linux-pm/drivers/base/power/runtime.c > > > @@ -1102,20 +1102,43 @@ EXPORT_SYMBOL_GPL(pm_runtime_get_if_in_u > > > * and the device parent's counter of unsuspended children is modified to > > > * reflect the new status. If the new status is RPM_SUSPENDED, an idle > > > * notification request for the parent is submitted. > > > + * > > > + * If @dev has any suppliers (as reflected by device links to them), and @status > > > + * is RPM_ACTIVE, they will be activated upfront and if the activation of one > > > + * of them fails, the status of @dev will be changed to RPM_SUSPENDED (instead > > > + * of the @status value) and the suppliers will be deacticated on exit. The > > > + * error returned by the failing supplier activation will be returned in that > > > + * case. > > > */ > > > int __pm_runtime_set_status(struct device *dev, unsigned int status) > > > { > > > struct device *parent = dev->parent; > > > - unsigned long flags; > > > bool notify_parent = false; > > > int error = 0; > > > > > > if (status != RPM_ACTIVE && status != RPM_SUSPENDED) > > > return -EINVAL; > > > > > > - spin_lock_irqsave(&dev->power.lock, flags); > > > + /* > > > + * If the new status is RPM_ACTIVE, the suppliers can be activated > > > + * upfront regardless of the current status, because next time > > > + * rpm_put_suppliers() runs, the rpm_active refcounts of the links > > > + * involved will be dropped down to one anyway. 
> > > +	 */
> > > +	if (status == RPM_ACTIVE) {
> > > +		int idx = device_links_read_lock();
> > > +
> > > +		error = rpm_get_suppliers(dev);
> > > +		if (error)
> > > +			status = RPM_SUSPENDED;
> > > +
> > > +		device_links_read_unlock(idx);
> > > +	}
> >
> > This doesn't look right to me, and more importantly, this isn't
> > consistent with how we treat a parent/child.
>
> It cannot be entirely consistent with that, because you cannot walk
> the suppliers under the device's power.lock.
>
> The idea here is that activating suppliers upfront if the new status
> is RPM_ACTIVE shouldn't hurt regardless.

I see. However, perhaps we can just read out the needed flags/states
(within device's power.lock) before walking the suppliers. In principle,
those flags/states shouldn't really change, in case runtime PM has been
properly disabled by the caller.

>
> > More precisely, I think you need to check "if
> > (!dev->power.runtime_error && !dev->power.disable_depth)" and also
> > whether "dev->power.runtime_status == status", before deciding to call
> > rpm_get_suppliers() above. Otherwise you may end up resuming suppliers
> > and/or increasing the link->rpm_active count, when you shouldn't.
>
> Resuming suppliers unnecessarily is not particularly efficient, but it
> is not incorrect. Incrementing their rpm_active temporarily also
> isn't incorrect as long as the rpm_active values are correct on exit
> (and note that incrementing them if the consumer's status is RPM_ACTIVE
> doesn't even matter).
>
> > In other words, expecting __pm_runtime_set_status() to be called in
> > "balanced" manner isn't correct.
>
> There is no such expectation here.

You are right! I didn't realize that rpm_put_suppliers() actually
doesn't drop the usage count only by one, but instead as many times as
needed to let rpm_active reach one.

>
> There is a possible race between __pm_runtime_set_status() and runtime
> suspend or resume of the device in case PM-runtime is enabled for it
> when __pm_runtime_set_status() is called, but it shouldn't occur if
> __pm_runtime_set_status() is used correctly (that is, when PM-runtime
> is disabled for the device).
>
> I think I know how to avoid that race, though, so I'm going to post an
> incremental fix if that works out.

Okay, let's see what comes out of this.

Kind regards
Uffe
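The alternative raised in this message, reading the relevant state under power.lock first and only then deciding whether to walk the suppliers, might look roughly like the userspace sketch below. This is not what the posted patch does, and all names (sketch_dev, sketch_should_get_suppliers() and so on) are hypothetical; a C11 atomic_flag spinlock stands in for dev->power.lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum sketch_status { SKETCH_ACTIVE, SKETCH_SUSPENDED };

/* Hypothetical stand-in for the relevant parts of struct device. */
struct sketch_dev {
	atomic_flag lock;	/* stands in for dev->power.lock */
	int runtime_error;
	int disable_depth;	/* greater than zero: PM-runtime is disabled */
	enum sketch_status runtime_status;
	int supplier_gets;	/* counts simulated rpm_get_suppliers() calls */
};

static void sketch_lock(struct sketch_dev *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void sketch_unlock(struct sketch_dev *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/*
 * Snapshot the state under the lock, then drop the lock before anything
 * that may sleep (such as walking the suppliers) is attempted.
 */
static bool sketch_should_get_suppliers(struct sketch_dev *d,
					enum sketch_status new_status)
{
	bool would_fail, unchanged;

	sketch_lock(d);
	would_fail = !d->runtime_error && !d->disable_depth;
	unchanged = d->runtime_status == new_status;
	sketch_unlock(d);

	return !would_fail && !unchanged;
}

/* Returns how many times the suppliers were activated for one request. */
static int sketch_try_activate(int disable_depth, enum sketch_status current)
{
	struct sketch_dev d = {
		.lock = ATOMIC_FLAG_INIT,
		.disable_depth = disable_depth,
		.runtime_status = current,
	};

	if (sketch_should_get_suppliers(&d, SKETCH_ACTIVE))
		d.supplier_gets++;	/* rpm_get_suppliers() would go here */

	return d.supplier_gets;
}
```

The snapshot is only meaningful if the caller has really disabled PM-runtime, since nothing prevents the state from changing once the lock is dropped; that is the race mentioned in the quoted text.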
On Thu, 7 Feb 2019 at 19:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > If the target device has any suppliers, as reflected by device links > to them, __pm_runtime_set_status() does not take them into account, > which is not consistent with the other parts of the PM-runtime > framework and may lead to programming mistakes. > > Modify __pm_runtime_set_status() to take suppliers into account by > activating them upfront if the new status is RPM_ACTIVE and > deactivating them on exit if the new status is RPM_SUSPENDED. > > If the activation of one of the suppliers fails, the new status > will be RPM_SUSPENDED and the (remaining) suppliers will be > deactivated on exit (the child count of the device's parent > will be dropped too then). > > Of course, adding device links locking to __pm_runtime_set_status() > means that it cannot be run fron interrupt context, so make it use > spin_lock_irq() and spin_unlock_irq() instead of spin_lock_irqsave() > and spin_unlock_irqrestore(), respectively. > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Kind regards Uffe > --- > drivers/base/power/runtime.c | 45 ++++++++++++++++++++++++++++++++++++++----- > 1 file changed, 40 insertions(+), 5 deletions(-) > > Index: linux-pm/drivers/base/power/runtime.c > =================================================================== > --- linux-pm.orig/drivers/base/power/runtime.c > +++ linux-pm/drivers/base/power/runtime.c > @@ -1102,20 +1102,43 @@ EXPORT_SYMBOL_GPL(pm_runtime_get_if_in_u > * and the device parent's counter of unsuspended children is modified to > * reflect the new status. If the new status is RPM_SUSPENDED, an idle > * notification request for the parent is submitted. 
> + * > + * If @dev has any suppliers (as reflected by device links to them), and @status > + * is RPM_ACTIVE, they will be activated upfront and if the activation of one > + * of them fails, the status of @dev will be changed to RPM_SUSPENDED (instead > + * of the @status value) and the suppliers will be deacticated on exit. The > + * error returned by the failing supplier activation will be returned in that > + * case. > */ > int __pm_runtime_set_status(struct device *dev, unsigned int status) > { > struct device *parent = dev->parent; > - unsigned long flags; > bool notify_parent = false; > int error = 0; > > if (status != RPM_ACTIVE && status != RPM_SUSPENDED) > return -EINVAL; > > - spin_lock_irqsave(&dev->power.lock, flags); > + /* > + * If the new status is RPM_ACTIVE, the suppliers can be activated > + * upfront regardless of the current status, because next time > + * rpm_put_suppliers() runs, the rpm_active refcounts of the links > + * involved will be dropped down to one anyway. 
> + */ > + if (status == RPM_ACTIVE) { > + int idx = device_links_read_lock(); > + > + error = rpm_get_suppliers(dev); > + if (error) > + status = RPM_SUSPENDED; > + > + device_links_read_unlock(idx); > + } > + > + spin_lock_irq(&dev->power.lock); > > if (!dev->power.runtime_error && !dev->power.disable_depth) { > + status = dev->power.runtime_status; > error = -EAGAIN; > goto out; > } > @@ -1147,19 +1170,31 @@ int __pm_runtime_set_status(struct devic > > spin_unlock(&parent->power.lock); > > - if (error) > + if (error) { > + status = RPM_SUSPENDED; > goto out; > + } > } > > out_set: > __update_runtime_status(dev, status); > - dev->power.runtime_error = 0; > + if (!error) > + dev->power.runtime_error = 0; > + > out: > - spin_unlock_irqrestore(&dev->power.lock, flags); > + spin_unlock_irq(&dev->power.lock); > > if (notify_parent) > pm_request_idle(parent); > > + if (status == RPM_SUSPENDED) { > + int idx = device_links_read_lock(); > + > + rpm_put_suppliers(dev); > + > + device_links_read_unlock(idx); > + } > + > return error; > } > EXPORT_SYMBOL_GPL(__pm_runtime_set_status); >
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1102,20 +1102,43 @@ EXPORT_SYMBOL_GPL(pm_runtime_get_if_in_u
  * and the device parent's counter of unsuspended children is modified to
  * reflect the new status. If the new status is RPM_SUSPENDED, an idle
  * notification request for the parent is submitted.
+ *
+ * If @dev has any suppliers (as reflected by device links to them), and @status
+ * is RPM_ACTIVE, they will be activated upfront and if the activation of one
+ * of them fails, the status of @dev will be changed to RPM_SUSPENDED (instead
+ * of the @status value) and the suppliers will be deacticated on exit. The
+ * error returned by the failing supplier activation will be returned in that
+ * case.
  */
 int __pm_runtime_set_status(struct device *dev, unsigned int status)
 {
 	struct device *parent = dev->parent;
-	unsigned long flags;
 	bool notify_parent = false;
 	int error = 0;
 
 	if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
 		return -EINVAL;
 
-	spin_lock_irqsave(&dev->power.lock, flags);
+	/*
+	 * If the new status is RPM_ACTIVE, the suppliers can be activated
+	 * upfront regardless of the current status, because next time
+	 * rpm_put_suppliers() runs, the rpm_active refcounts of the links
+	 * involved will be dropped down to one anyway.
+	 */
+	if (status == RPM_ACTIVE) {
+		int idx = device_links_read_lock();
+
+		error = rpm_get_suppliers(dev);
+		if (error)
+			status = RPM_SUSPENDED;
+
+		device_links_read_unlock(idx);
+	}
+
+	spin_lock_irq(&dev->power.lock);
 
 	if (!dev->power.runtime_error && !dev->power.disable_depth) {
+		status = dev->power.runtime_status;
 		error = -EAGAIN;
 		goto out;
 	}
@@ -1147,19 +1170,31 @@ int __pm_runtime_set_status(struct devic
 
 		spin_unlock(&parent->power.lock);
 
-		if (error)
+		if (error) {
+			status = RPM_SUSPENDED;
 			goto out;
+		}
 	}
 
  out_set:
 	__update_runtime_status(dev, status);
-	dev->power.runtime_error = 0;
+	if (!error)
+		dev->power.runtime_error = 0;
+
  out:
-	spin_unlock_irqrestore(&dev->power.lock, flags);
+	spin_unlock_irq(&dev->power.lock);
 
 	if (notify_parent)
 		pm_request_idle(parent);
 
+	if (status == RPM_SUSPENDED) {
+		int idx = device_links_read_lock();
+
+		rpm_put_suppliers(dev);
+
+		device_links_read_unlock(idx);
+	}
+
 	return error;
 }
 EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
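
The control flow this patch introduces can be exercised outside the kernel with a reduced model. The sketch below is an approximation only: locking and all parent handling are left out, a single supplier is assumed, and every model_* name, as well as the choice of -EIO for a failing supplier, is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum model_status { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

/* Hypothetical consumer device with exactly one supplier link. */
struct model_dev {
	enum model_status runtime_status;
	int runtime_error;
	int disable_depth;		/* greater than zero: PM-runtime disabled */
	int link_rpm_active;		/* one: supplier not held via the link */
	int supplier_usage;
	bool supplier_resume_fails;
};

static int model_get_suppliers(struct model_dev *dev)
{
	if (dev->supplier_resume_fails)
		return -EIO;		/* assumed error code, for illustration */

	dev->supplier_usage++;
	dev->link_rpm_active++;
	return 0;
}

/* Drops references until rpm_active is back at one, however many there are. */
static void model_put_suppliers(struct model_dev *dev)
{
	while (dev->link_rpm_active > 1) {
		dev->link_rpm_active--;
		dev->supplier_usage--;
	}
}

static int model_set_status(struct model_dev *dev, enum model_status status)
{
	int error = 0;

	/* Suppliers are activated upfront, before the status is inspected. */
	if (status == MODEL_RPM_ACTIVE) {
		error = model_get_suppliers(dev);
		if (error)
			status = MODEL_RPM_SUSPENDED;
	}

	/* The kernel holds dev->power.lock from here to the "out" label. */
	if (!dev->runtime_error && !dev->disable_depth) {
		status = dev->runtime_status;
		error = -EAGAIN;
		goto out;
	}

	dev->runtime_status = status;
	if (!error)
		dev->runtime_error = 0;
out:
	if (status == MODEL_RPM_SUSPENDED)
		model_put_suppliers(dev);

	return error;
}

/* Two "set active" calls followed by a single "set suspended" call. */
static int model_unbalanced_calls(void)
{
	struct model_dev d = { .runtime_status = MODEL_RPM_SUSPENDED,
			       .disable_depth = 1, .link_rpm_active = 1 };

	assert(model_set_status(&d, MODEL_RPM_ACTIVE) == 0);
	assert(model_set_status(&d, MODEL_RPM_ACTIVE) == 0);
	assert(d.link_rpm_active == 3);
	assert(model_set_status(&d, MODEL_RPM_SUSPENDED) == 0);
	return d.supplier_usage;
}

/* Supplier activation fails: the device ends up suspended. */
static int model_supplier_failure(void)
{
	struct model_dev d = { .runtime_status = MODEL_RPM_SUSPENDED,
			       .disable_depth = 1, .link_rpm_active = 1,
			       .supplier_resume_fails = true };
	int error = model_set_status(&d, MODEL_RPM_ACTIVE);

	assert(d.runtime_status == MODEL_RPM_SUSPENDED);
	return error;
}

/* PM-runtime still enabled and no error: the status is left untouched. */
static int model_enabled_device(void)
{
	struct model_dev d = { .runtime_status = MODEL_RPM_ACTIVE,
			       .disable_depth = 0, .link_rpm_active = 1 };
	int error = model_set_status(&d, MODEL_RPM_SUSPENDED);

	assert(d.runtime_status == MODEL_RPM_ACTIVE);
	return error;
}
```

The first scenario shows why calls do not need to be balanced in this model: one transition to suspended releases every reference accumulated by earlier activations.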