| Message ID | 201208062230.58553.rjw@sisk.pl (mailing list archive) |
|---|---|
| State | New, archived |
| Delegated to: | Bjorn Helgaas |
On Monday, August 06, 2012, Rafael J. Wysocki wrote:
> On Monday, August 06, 2012, Alan Stern wrote:
> > On Sun, 5 Aug 2012, Rafael J. Wysocki wrote:
> [...]
> > > What about the appended experimental patch?
> >
> > For now, I think it would be best to start with a single func argument. If it turns out that anyone really needs to have two separate arguments, the single-argument form can be reimplemented on top of the two-argument form easily enough.
>
> All right.
>
> > > @@ -484,6 +484,15 @@ static int rpm_suspend(struct device *de
> > >  	goto out;
> > >  }
> > >
> > > +void rpm_queue_up_resume(struct device *dev)
> > > +{
> > > +	dev->power.request = RPM_REQ_RESUME;
> > > +	if (!dev->power.request_pending) {
> > > +		dev->power.request_pending = true;
> > > +		queue_work(pm_wq, &dev->power.work);
> > > +	}
> > > +}
> > > +
> > >  /**
> > >   * rpm_resume - Carry out runtime resume of given device.
> > >   * @dev: Device to resume.
> > > @@ -524,7 +533,9 @@ static int rpm_resume(struct device *dev
> > >  	 * rather than cancelling it now only to restart it again in the near
> > >  	 * future.
> > >  	 */
> > > -	dev->power.request = RPM_REQ_NONE;
> > > +	if (dev->power.request != RPM_REQ_RESUME || !dev->power.func)
> > > +		dev->power.request = RPM_REQ_NONE;
> > > +
> > >  	if (!dev->power.timer_autosuspends)
> > >  		pm_runtime_deactivate_timer(dev);
> > >
> > > @@ -533,6 +544,12 @@ static int rpm_resume(struct device *dev
> > >  		goto out;
> > >  	}
> > >
> > > +	if (dev->power.func && (rpmflags & RPM_ASYNC)) {
> > > +		rpm_queue_up_resume(dev);
> > > +		retval = 0;
> > > +		goto out;
> > > +	}
> > > +
> > >  	if (dev->power.runtime_status == RPM_RESUMING
> > > 	    || dev->power.runtime_status == RPM_SUSPENDING) {
> > >  		DEFINE_WAIT(wait);
> >
> > All those changes (and some of the following ones) are symptoms of a basic mistake in this approach.
>
> Every time you say something like this (i.e. liks someone who knows better)

s/liks/like/

> I kind of feel like being under attach, which I hope is not your intention.

s/attach/attack/

Two typos in one sentence, I guess it could have been worse ...

> Never mind. :-)
>
> Those changes are there for different reasons rather unrelated to the way func() is being called, so let me explain.
>
> First, rpm_queue_up_resume() is introduced, because the code it contains will have to be called in two different places. At least I don't see how to avoid that without increasing the code complexity too much.
>
> Second, the following change in rpm_resume()
>
> -	dev->power.request = RPM_REQ_NONE;
> +	if (dev->power.request != RPM_REQ_RESUME || !dev->power.func)
> +		dev->power.request = RPM_REQ_NONE;
>
> is made to prevent rpm_resume() from canceling the execution of func() queued up by an earlier pm_runtime_get_and_call(). I believe it is necessary.
>
> Finally, this change in rpm_resume():
>
> +	if (dev->power.func && (rpmflags & RPM_ASYNC)) {
> +		rpm_queue_up_resume(dev);
> +		retval = 0;
> +		goto out;
> +	}
>
> is not strictly necessary if pm_runtime_get_and_call() is modified to run rpm_queue_up_resume() directly, like in the new version of the patch which is appended. This reduces the number of checks overall, so perhaps it's better.
>
> > The idea of this new feature is to call "func" as soon as we know the device is at full power, no matter how it got there.
>
> Yes, it is so.
>
> > That means we should call it near the end of rpm_resume() (just before the rpm_idle() call), not from within pm_runtime_work().
> >
> > Doing it this way will be more efficient and (I think) will remove some races.
>
> Except that func() shouldn't be executed under dev->power.lock, which makes it rather difficult to call it from rpm_resume(). Or at least it seems so.
>
> Moreover, it should be called after we've changed the status to RPM_ACTIVE _and_ dropped the lock.

So we could drop the lock right before returning, execute func() and acquire the lock once again, but then func() might be executed by any thread that happened to resume the device. In that case the caller of pm_runtime_get_and_call() would have to worry about locks that such threads might acquire and it would have to make sure that func() didn't try to acquire them too. That may not be a big deal, but if func() is executed by pm_runtime_work(), that issue simply goes away.

Then, however, there's another issue: what should happen if pm_runtime_get_and_call() finds that it cannot execute func() right away, so it queues up the resume and the execution of func(), and in the meantime some other thread resumes the device synchronously and pm_runtime_get_and_call() is run again. I think in that case func() should be executed synchronously and the one waiting for execution should be canceled. The alternative would be to return -EAGAIN from pm_runtime_get_and_call() and expect the caller to cope with that, which isn't too attractive.

This actually is analogous to the case when pm_runtime_get_and_call() sees that power.func is not NULL. In my experimental patches it returned -EAGAIN in that case, but perhaps it's better to replace the existing power.func with the new one. Then, by doing pm_runtime_get_and_call(dev, NULL) we can ensure that either the previous func() has run already or it will never run, which may be useful.

Well, I guess I need to prepare a new experimental patch. :-) I'll send it in a new thread I think.

Thanks,
Rafael
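For reference, a rough sketch (not from any posted patch) of what the alternative under discussion, calling the function from the tail of rpm_resume() after the status has become RPM_ACTIVE, might look like. rpm_call_func() is a hypothetical helper, power.func is the field added by the experimental patch, and the lock has to be dropped around the callback, which is exactly the part Rafael is uneasy about:

	/*
	 * Hypothetical helper, assuming the experimental power.func field:
	 * called at the end of rpm_resume() with dev->power.lock held and
	 * the status already set to RPM_ACTIVE.
	 */
	static void rpm_call_func(struct device *dev)
		__releases(&dev->power.lock) __acquires(&dev->power.lock)
	{
		void (*func)(struct device *) = dev->power.func;

		if (!func)
			return;

		dev->power.func = NULL;

		spin_unlock_irq(&dev->power.lock);
		func(dev);	/* runs in whatever thread resumed the device */
		spin_lock_irq(&dev->power.lock);
	}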
On Tue, 7 Aug 2012, Rafael J. Wysocki wrote:

> > > All those changes (and some of the following ones) are symptoms of a basic mistake in this approach.
> >
> > Every time you say something like this (i.e. liks someone who knows better)
>
> s/liks/like/
>
> > I kind of feel like being under attach, which I hope is not your intention.
>
> s/attach/attack/

Sorry; you're right. It's all too easy to get very arrogant in email messages. I'll try not to attack so strongly in the future.

> > > The idea of this new feature is to call "func" as soon as we know the device is at full power, no matter how it got there.
> >
> > Yes, it is so.

Incidentally, that sentence is the justification for the invariance condition mentioned later. power.func should be called as soon as we know the device is at full power; therefore when the status changes to RPM_ACTIVE it should be called and then cleared (if it was set), and it should never get set while the status is RPM_ACTIVE. Therefore it should never be true that power.func is set _and_ the status is RPM_ACTIVE.

> > > That means we should call it near the end of rpm_resume() (just before the rpm_idle() call), not from within pm_runtime_work().
> > >
> > > Doing it this way will be more efficient and (I think) will remove some races.
> >
> > Except that func() shouldn't be executed under dev->power.lock, which makes it rather difficult to call it from rpm_resume(). Or at least it seems so.
> >
> > Moreover, it should be called after we've changed the status to RPM_ACTIVE _and_ dropped the lock.
>
> So we could drop the lock right before returning, execute func() and acquire the lock once again,

Yes; that's what I had in mind. We already do something similar when calling pm_runtime_put(parent).

> but then func() might be executed by any thread that happened to resume the device. In that case the caller of pm_runtime_get_and_call() would have to worry about locks that such threads might acquire and it would have to make sure that func() didn't try to acquire them too. That may not be a big deal, but if func() is executed by pm_runtime_work(), that issue simply goes away.

But then you have to worry about races between pm_runtime_resume() and the workqueue. If the device is resumed by some other thread, it could be suspended again before "func" is called.

> Then, however, there's another issue: what should happen if pm_runtime_get_and_call() finds that it cannot execute func() right away, so it queues up the resume and the execution of func(), and in the meantime some other thread resumes the device synchronously and pm_runtime_get_and_call() is run again. I think in that case func() should be executed synchronously and the one waiting for execution should be canceled. The alternative would be to return -EAGAIN from pm_runtime_get_and_call() and expect the caller to cope with that, which isn't too attractive.
>
> This actually is analogous to the case when pm_runtime_get_and_call() sees that power.func is not NULL. In my experimental patches it returned -EAGAIN in that case, but perhaps it's better to replace the existing power.func with the new one. Then, by doing pm_runtime_get_and_call(dev, NULL) we can ensure that either the previous func() has run already or it will never run, which may be useful.

A good point. I agree that pm_runtime_get_and_call() should always overwrite the existing power.func value.

There are a couple of other issues remaining.

What's the best approach when disable_depth > 0? My feeling is that we should still rely on power.runtime_status as the best approximation to the device's state, so we shouldn't call "func" directly unless the status is already RPM_ACTIVE. If the status is something else, we can't queue an async resume request. So we just set power.func and return. Eventually the driver will either call pm_runtime_set_active() or pm_runtime_enable() followed by pm_runtime_resume(), at which time we would call power.func.

Also, what should happen when power.runtime_error is set? The same as when disable_depth > 0?

You mentioned that pm_runtime_disable() does a resume if there's a pending resume request. I had forgotten about this. It worries me, because subsystems use code sequences like this:

	pm_runtime_disable(dev);
	pm_runtime_set_active(dev);
	pm_runtime_enable(dev);

in their system resume routines (in fact, we advise them to do so in the Documentation file). Now, it is unlikely for a resume request to be pending during system sleep, but it doesn't seem to be impossible. When there is such a pending request, the pm_runtime_disable() call will try to do a runtime resume at a time when the device has just been restored to full power. That's not good.

Probably this pattern occurs in few enough places that we could go through and fix them all. But how? Should there be a new function: pm_adjust_runtime_status_after_system_resume()?

Alan Stern
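To make the hazard concrete, here is the kind of system-resume callback Alan has in mind; foo is a made-up driver name. If a runtime-resume request happens to be pending when this runs, the pm_runtime_disable() call will carry it out even though the device is already back at full power:

	static int foo_resume(struct device *dev)
	{
		/* ... the hardware has just been restored to full power ... */

		pm_runtime_disable(dev);	/* may act on a pending RPM_REQ_RESUME */
		pm_runtime_set_active(dev);
		pm_runtime_enable(dev);

		return 0;
	}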
On Tuesday, August 07, 2012, Alan Stern wrote:
> On Tue, 7 Aug 2012, Rafael J. Wysocki wrote:
> > > > All those changes (and some of the following ones) are symptoms of a basic mistake in this approach.
> > >
> > > Every time you say something like this (i.e. liks someone who knows better)
> >
> > s/liks/like/
> >
> > > I kind of feel like being under attach, which I hope is not your intention.
> >
> > s/attach/attack/
>
> Sorry; you're right. It's all too easy to get very arrogant in email messages. I'll try not to attack so strongly in the future.

Thanks!

> > > > The idea of this new feature is to call "func" as soon as we know the device is at full power, no matter how it got there.
> > >
> > > Yes, it is so.
>
> Incidentally, that sentence is the justification for the invariance condition mentioned later.

:-)

> power.func should be called as soon as we know the device is at full power; therefore when the status changes to RPM_ACTIVE it should be called and then cleared (if it was set), and it should never get set while the status is RPM_ACTIVE. Therefore it should never be true that power.func is set _and_ the status is RPM_ACTIVE.

I guess with the patch I've just sent:

http://marc.info/?l=linux-pm&m=134437366811066&w=4

it's almost the case, except when a synchronous resume happens before the work item scheduled by __pm_runtime_get_and_call() is run. However, I don't think it is a problem in that case, because the device won't be suspended before the execution of that work item starts (rpm_check_suspend_allowed() will see that power.request_pending is set and that power.request is RPM_REQ_RESUME, so it will return -EAGAIN).

> > > > That means we should call it near the end of rpm_resume() (just before the rpm_idle() call), not from within pm_runtime_work().
> > > >
> > > > Doing it this way will be more efficient and (I think) will remove some races.
> > >
> > > Except that func() shouldn't be executed under dev->power.lock, which makes it rather difficult to call it from rpm_resume(). Or at least it seems so.
> > >
> > > Moreover, it should be called after we've changed the status to RPM_ACTIVE _and_ dropped the lock.
> >
> > So we could drop the lock right before returning, execute func() and acquire the lock once again,
>
> Yes; that's what I had in mind. We already do something similar when calling pm_runtime_put(parent).

Yes, we do. However, I still don't think it's really safe to call func() from rpm_resume(), because it may be run synchronously from a context quite unrelated to the caller of __pm_runtime_get_and_call() (for example, from the pm_runtime_barrier() in __device_suspend()).

> > but then func() might be executed by any thread that happened to resume the device. In that case the caller of pm_runtime_get_and_call() would have to worry about locks that such threads might acquire and it would have to make sure that func() didn't try to acquire them too. That may not be a big deal, but if func() is executed by pm_runtime_work(), that issue simply goes away.
>
> But then you have to worry about races between pm_runtime_resume() and the workqueue. If the device is resumed by some other thread, it could be suspended again before "func" is called.

No, it can't, if the device's usage count is incremented before dropping power.lock after rpm_resume(dev, 0) has returned.

> > Then, however, there's another issue: what should happen if pm_runtime_get_and_call() finds that it cannot execute func() right away, so it queues up the resume and the execution of func(), and in the meantime some other thread resumes the device synchronously and pm_runtime_get_and_call() is run again. I think in that case func() should be executed synchronously and the one waiting for execution should be canceled. The alternative would be to return -EAGAIN from pm_runtime_get_and_call() and expect the caller to cope with that, which isn't too attractive.
> >
> > This actually is analogous to the case when pm_runtime_get_and_call() sees that power.func is not NULL. In my experimental patches it returned -EAGAIN in that case, but perhaps it's better to replace the existing power.func with the new one. Then, by doing pm_runtime_get_and_call(dev, NULL) we can ensure that either the previous func() has run already or it will never run, which may be useful.
>
> A good point. I agree that pm_runtime_get_and_call() should always overwrite the existing power.func value.
>
> There are a couple of other issues remaining.
>
> What's the best approach when disable_depth > 0? My feeling is that we should still rely on power.runtime_status as the best approximation to the device's state, so we shouldn't call "func" directly unless the status is already RPM_ACTIVE.

Well, that's one possibility. In that case, though, the caller may want to run func() regardless of whether or not runtime PM is enabled for the given device and that would require some serious trickery. For this reason, in the newest patch (http://marc.info/?l=linux-pm&m=134437366811066&w=4) the caller can choose what to do.

> If the status is something else, we can't queue an async resume request. So we just set power.func and return. Eventually the driver will either call pm_runtime_set_active() or pm_runtime_enable() followed by pm_runtime_resume(), at which time we would call power.func.
>
> Also, what should happen when power.runtime_error is set? The same as when disable_depth > 0?

I think so.

> You mentioned that pm_runtime_disable() does a resume if there's a pending resume request. I had forgotten about this. It worries me, because subsystems use code sequences like this:
>
> 	pm_runtime_disable(dev);
> 	pm_runtime_set_active(dev);
> 	pm_runtime_enable(dev);
>
> in their system resume routines (in fact, we advise them to do so in the Documentation file). Now, it is unlikely for a resume request to be pending during system sleep, but it doesn't seem to be impossible. When there is such a pending request, the pm_runtime_disable() call will try to do a runtime resume at a time when the device has just been restored to full power. That's not good.

Well, they should do __pm_runtime_disable(dev, false), then.

> Probably this pattern occurs in few enough places that we could go through and fix them all. But how? Should there be a new function: pm_adjust_runtime_status_after_system_resume()?

I think the above would suffice.

Thanks,
Rafael
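Rafael's suggestion amounts to the following change in such system-resume routines (again with a made-up foo driver): __pm_runtime_disable(dev, false) disables runtime PM without checking for, and acting on, a pending resume request, whereas pm_runtime_disable(dev) is equivalent to __pm_runtime_disable(dev, true):

	static int foo_resume(struct device *dev)
	{
		/* ... the hardware has just been restored to full power ... */

		__pm_runtime_disable(dev, false);	/* ignore a pending resume request */
		pm_runtime_set_active(dev);
		pm_runtime_enable(dev);

		return 0;
	}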
Index: linux/drivers/base/power/runtime.c
===================================================================
--- linux.orig/drivers/base/power/runtime.c
+++ linux/drivers/base/power/runtime.c
@@ -484,6 +484,15 @@ static int rpm_suspend(struct device *de
 	goto out;
 }
 
+void rpm_queue_up_resume(struct device *dev)
+{
+	dev->power.request = RPM_REQ_RESUME;
+	if (!dev->power.request_pending) {
+		dev->power.request_pending = true;
+		queue_work(pm_wq, &dev->power.work);
+	}
+}
+
 /**
  * rpm_resume - Carry out runtime resume of given device.
  * @dev: Device to resume.
@@ -519,12 +528,18 @@ static int rpm_resume(struct device *dev
 		goto out;
 
 	/*
-	 * Other scheduled or pending requests need to be canceled. Small
-	 * optimization: If an autosuspend timer is running, leave it running
-	 * rather than cancelling it now only to restart it again in the near
-	 * future.
+	 * Other scheduled or pending requests need to be canceled. If the
+	 * execution of a function is queued up along with a resume request,
+	 * do not cancel it.
+	 */
+	if (dev->power.request != RPM_REQ_RESUME || !dev->power.func)
+		dev->power.request = RPM_REQ_NONE;
+
+	/*
+	 * Small optimization: If an autosuspend timer is running, leave it
+	 * running rather than cancelling it now only to restart it again in the
+	 * near future.
 	 */
-	dev->power.request = RPM_REQ_NONE;
 	if (!dev->power.timer_autosuspends)
 		pm_runtime_deactivate_timer(dev);
 
@@ -591,11 +606,7 @@ static int rpm_resume(struct device *dev
 
 	/* Carry out an asynchronous or a synchronous resume. */
 	if (rpmflags & RPM_ASYNC) {
-		dev->power.request = RPM_REQ_RESUME;
-		if (!dev->power.request_pending) {
-			dev->power.request_pending = true;
-			queue_work(pm_wq, &dev->power.work);
-		}
+		rpm_queue_up_resume(dev);
 		retval = 0;
 		goto out;
 	}
@@ -691,6 +702,7 @@ static int rpm_resume(struct device *dev
 static void pm_runtime_work(struct work_struct *work)
 {
 	struct device *dev = container_of(work, struct device, power.work);
+	void (*func)(struct device *) = NULL;
 	enum rpm_request req;
 
 	spin_lock_irq(&dev->power.lock);
@@ -715,12 +727,24 @@ static void pm_runtime_work(struct work_
 		rpm_suspend(dev, RPM_NOWAIT | RPM_AUTO);
 		break;
 	case RPM_REQ_RESUME:
-		rpm_resume(dev, RPM_NOWAIT);
+		func = dev->power.func;
+		if (func) {
+			dev->power.func = NULL;
+			pm_runtime_get_noresume(dev);
+			rpm_resume(dev, 0);
+		} else {
+			rpm_resume(dev, RPM_NOWAIT);
+		}
 		break;
 	}
 
  out:
 	spin_unlock_irq(&dev->power.lock);
+
+	if (func) {
+		func(dev);
+		pm_runtime_put(dev);
+	}
 }
 
 /**
@@ -877,6 +901,58 @@ int __pm_runtime_resume(struct device *d
 }
 EXPORT_SYMBOL_GPL(__pm_runtime_resume);
 
+int pm_runtime_get_and_call(struct device *dev, void (*func)(struct device *))
+{
+	unsigned long flags;
+	bool sync = false;
+	int ret;
+
+	atomic_inc(&dev->power.usage_count);
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+
+	ret = dev->power.runtime_error;
+	if (ret)
+		goto out;
+
+	sync = dev->power.runtime_status == RPM_ACTIVE
+		|| dev->power.disable_depth > 0;
+
+	if (!sync) {
+		if (dev->power.func) {
+			ret = -EAGAIN;
+			goto out;
+		} else {
+			dev->power.func = func;
+		}
+	}
+
+	/*
+	 * The approach here is the same as in rpm_suspend(): autosuspend timers
+	 * will be activated shortly anyway, so it is pointless to cancel them
+	 * now.
+	 */
+	if (!dev->power.timer_autosuspends)
+		pm_runtime_deactivate_timer(dev);
+
+	if (sync)
+		dev->power.request = RPM_REQ_NONE;
+	else
+		rpm_queue_up_resume(dev);
+
+ out:
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	if (sync) {
+		if (func)
+			func(dev);
+
+		return 1;
+	}
+
+	return ret;
+}
+
 /**
  * __pm_runtime_set_status - Set runtime PM status of a device.
  * @dev: Device to handle.
Index: linux/include/linux/pm.h
===================================================================
--- linux.orig/include/linux/pm.h
+++ linux/include/linux/pm.h
@@ -547,6 +547,7 @@ struct dev_pm_info {
 	unsigned long		suspended_jiffies;
 	unsigned long		accounting_timestamp;
 	struct dev_pm_qos_request *pq_req;
+	void			(*func)(struct device *);
 #endif
 	struct pm_subsys_data	*subsys_data;  /* Owned by the subsystem. */
 	struct pm_qos_constraints *constraints;
Index: linux/include/linux/pm_runtime.h
===================================================================
--- linux.orig/include/linux/pm_runtime.h
+++ linux/include/linux/pm_runtime.h
@@ -47,6 +47,8 @@ extern void pm_runtime_set_autosuspend_d
 extern unsigned long pm_runtime_autosuspend_expiration(struct device *dev);
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
+extern int pm_runtime_get_and_call(struct device *dev,
+				   void (*func)(struct device *));
 
 static inline bool pm_children_suspended(struct device *dev)
 {
@@ -150,6 +152,15 @@ static inline void pm_runtime_set_autosu
 static inline unsigned long pm_runtime_autosuspend_expiration(
 				struct device *dev) { return 0; }
 
+static inline int pm_runtime_get_and_call(struct device *dev,
+					  void (*func)(struct device *))
+{
+	if (func)
+		func(dev);
+
+	return 1;
+}
+
 #endif /* !CONFIG_PM_RUNTIME */
 
 static inline int pm_runtime_idle(struct device *dev)
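A possible caller of the new interface, assuming the patch above is applied; the foo_* names are made up. pm_runtime_get_and_call() increments the usage count and then either runs the function immediately and returns 1 (device already active, or runtime PM disabled), queues a resume and returns 0 so that the function is run from pm_runtime_work() once the device is at full power, or returns a negative error code without running it (runtime_error set, or, in this version, another function still waiting to run). One way to keep the usage count balanced is to drop it from the function itself and from the error path:

	static void foo_at_full_power(struct device *dev)
	{
		/* Runs with the device at full power. */

		/* ... access the hardware ... */

		pm_runtime_put(dev);	/* balance the get in pm_runtime_get_and_call() */
	}

	static void foo_kick(struct device *dev)
	{
		int ret = pm_runtime_get_and_call(dev, foo_at_full_power);

		if (ret < 0)
			pm_runtime_put(dev);	/* the function won't be run, drop the reference */
	}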