diff mbox

[RFC] PM / Runtime: Make autosuspend_delay == 0 work as intended

Message ID 1401704952-32046-1-git-send-email-bjorn@mork.no (mailing list archive)
State RFC, archived
Headers show

Commit Message

Bjørn Mork June 2, 2014, 10:29 a.m. UTC
OK, I don't like the result I came up with here.  So I am sending this
as an RFC in the hope that smarter people than me can come up with
something nicer. I hope the patch illustrates my problem:  While
trying do debug some runtime suspend issues in a driver, I temporarily
set autosuspend to 0 to deliberately make it aggressive enough for my
driver to fail. I was surprised to learn than this effectively
*disabled* autosuspend if the driver ever return -EBUSY from the
runtime suspend callback.  I believe that is too unexpected to be
acceptable. Either we should not allow 0, or we should make it work
at least as well as 1.


Bjørn
8<--------------
We should always reschedule if the driver runtime suspend callback
returns -EBUSY and the driver updated the last_busy time.  But if
autosuspend_delay is 0, then this cannot be detected because
pm_runtime_autosuspend_expiration() always returns 0.

Fix by rescheduling if autosuspend_delay is 0.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/base/power/runtime.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Rafael J. Wysocki June 2, 2014, 12:14 p.m. UTC | #1
Alan has to see this in the first place.

On Monday, June 02, 2014 12:29:12 PM Bjørn Mork wrote:
> OK, I don't like the result I came up with here.  So I am sending this
> as an RFC in the hope that smarter people than me can come up with
> something nicer. I hope the patch illustrates my problem:  While
> trying do debug some runtime suspend issues in a driver, I temporarily
> set autosuspend to 0 to deliberately make it aggressive enough for my
> driver to fail. I was surprised to learn than this effectively
> *disabled* autosuspend if the driver ever return -EBUSY from the
> runtime suspend callback.

That results from pm_runtime_autosuspend_expiration() always returning 0 in
that case.

> I believe that is too unexpected to be
> acceptable. Either we should not allow 0, or we should make it work
> at least as well as 1.
> 
> 
> Bjørn
> 8<--------------
> We should always reschedule if the driver runtime suspend callback
> returns -EBUSY and the driver updated the last_busy time.  But if
> autosuspend_delay is 0, then this cannot be detected because
> pm_runtime_autosuspend_expiration() always returns 0.
> 
> Fix by rescheduling if autosuspend_delay is 0.
> 
> Signed-off-by: Bjørn Mork <bjorn@mork.no>
> ---
>  drivers/base/power/runtime.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 67c7938e430b..c52e630a2e54 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -421,6 +421,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
>  {
>  	int (*callback)(struct device *);
>  	struct device *parent = NULL;
> +	unsigned long retry_delay = 0;
>  	int retval;
>  
>  	trace_rpm_suspend(dev, rpmflags);
> @@ -441,7 +442,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
>  	/* If the autosuspend_delay time hasn't expired yet, reschedule. */
>  	if ((rpmflags & RPM_AUTO)
>  	    && dev->power.runtime_status != RPM_SUSPENDING) {
> -		unsigned long expires = pm_runtime_autosuspend_expiration(dev);
> +		unsigned long expires = pm_runtime_autosuspend_expiration(dev) + retry_delay;
>  
>  		if (expires != 0) {
>  			/* Pending requests need to be canceled. */
> @@ -571,8 +572,11 @@ static int rpm_suspend(struct device *dev, int rpmflags)
>  		 * reschedule another autosuspend.
>  		 */
>  		if ((rpmflags & RPM_AUTO) &&
> -		    pm_runtime_autosuspend_expiration(dev) != 0)
> +		    (pm_runtime_autosuspend_expiration(dev) != 0 ||
> +		     ACCESS_ONCE(dev->power.autosuspend_delay) == 0)) {
> +			retry_delay++;
>  			goto repeat;
> +		}
>  	} else {
>  		pm_runtime_cancel_pending(dev);
>  	}

Well, as I said.

Perhaps pm_runtime_autosuspend_expiration() could simply treat
power.autosuspend_delay equal to 0 in a special way, but then what about
power.autosuspend_delay equal to 1?

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjørn Mork June 2, 2014, 12:42 p.m. UTC | #2
"Rafael J. Wysocki" <rjw@rjwysocki.net> writes:

> Alan has to see this in the first place.
>
> On Monday, June 02, 2014 12:29:12 PM Bjørn Mork wrote:
>> OK, I don't like the result I came up with here.  So I am sending this
>> as an RFC in the hope that smarter people than me can come up with
>> something nicer. I hope the patch illustrates my problem:  While
>> trying do debug some runtime suspend issues in a driver, I temporarily
>> set autosuspend to 0 to deliberately make it aggressive enough for my
>> driver to fail. I was surprised to learn than this effectively
>> *disabled* autosuspend if the driver ever return -EBUSY from the
>> runtime suspend callback.
>
> That results from pm_runtime_autosuspend_expiration() always returning 0 in
> that case.

Yup.

>> I believe that is too unexpected to be
>> acceptable. Either we should not allow 0, or we should make it work
>> at least as well as 1.
>> 
>> 
>> Bjørn
>> 8<--------------
>> We should always reschedule if the driver runtime suspend callback
>> returns -EBUSY and the driver updated the last_busy time.  But if
>> autosuspend_delay is 0, then this cannot be detected because
>> pm_runtime_autosuspend_expiration() always returns 0.
>> 
>> Fix by rescheduling if autosuspend_delay is 0.
>> 
>> Signed-off-by: Bjørn Mork <bjorn@mork.no>
>> ---
>>  drivers/base/power/runtime.c |    8 ++++++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
>> index 67c7938e430b..c52e630a2e54 100644
>> --- a/drivers/base/power/runtime.c
>> +++ b/drivers/base/power/runtime.c
>> @@ -421,6 +421,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
>>  {
>>  	int (*callback)(struct device *);
>>  	struct device *parent = NULL;
>> +	unsigned long retry_delay = 0;
>>  	int retval;
>>  
>>  	trace_rpm_suspend(dev, rpmflags);
>> @@ -441,7 +442,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
>>  	/* If the autosuspend_delay time hasn't expired yet, reschedule. */
>>  	if ((rpmflags & RPM_AUTO)
>>  	    && dev->power.runtime_status != RPM_SUSPENDING) {
>> -		unsigned long expires = pm_runtime_autosuspend_expiration(dev);
>> +		unsigned long expires = pm_runtime_autosuspend_expiration(dev) + retry_delay;
>>  
>>  		if (expires != 0) {
>>  			/* Pending requests need to be canceled. */
>> @@ -571,8 +572,11 @@ static int rpm_suspend(struct device *dev, int rpmflags)
>>  		 * reschedule another autosuspend.
>>  		 */
>>  		if ((rpmflags & RPM_AUTO) &&
>> -		    pm_runtime_autosuspend_expiration(dev) != 0)
>> +		    (pm_runtime_autosuspend_expiration(dev) != 0 ||
>> +		     ACCESS_ONCE(dev->power.autosuspend_delay) == 0)) {
>> +			retry_delay++;
>>  			goto repeat;
>> +		}
>>  	} else {
>>  		pm_runtime_cancel_pending(dev);
>>  	}
>
> Well, as I said.
>
> Perhaps pm_runtime_autosuspend_expiration() could simply treat
> power.autosuspend_delay equal to 0 in a special way, but then what about
> power.autosuspend_delay equal to 1?

Is that going to cause similar problems? Hmm, I guess it is if the
callback takes so long that the timeout is reached when we get here?

In which cases do you *not* want to retry runtime suspend on -EBUSY from
the driver callback? If you don't retry, then the device will not
runtime suspend at all until the next time the driver updates last_busy.
And if the driver always is busy some time after such updates, then
runtime suspend is effectively disabled.

Should drivers just update last_busy more agressively, and make sure to
update it both when starting some hardware transaction and when
finishing it? That would make the suspend callback run again when the
hardware is not busy anymore, and would probably make stuff work without
any PM core changes.



Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern June 2, 2014, 3:35 p.m. UTC | #3
On Mon, 2 Jun 2014, Bjørn Mork wrote:

> "Rafael J. Wysocki" <rjw@rjwysocki.net> writes:
> 
> > Alan has to see this in the first place.
> >
> > On Monday, June 02, 2014 12:29:12 PM Bjørn Mork wrote:
> >> OK, I don't like the result I came up with here.  So I am sending this
> >> as an RFC in the hope that smarter people than me can come up with
> >> something nicer. I hope the patch illustrates my problem:  While
> >> trying do debug some runtime suspend issues in a driver, I temporarily
> >> set autosuspend to 0 to deliberately make it aggressive enough for my
> >> driver to fail. I was surprised to learn than this effectively
> >> *disabled* autosuspend if the driver ever return -EBUSY from the
> >> runtime suspend callback.
> >
> > That results from pm_runtime_autosuspend_expiration() always returning 0 in
> > that case.
> 
> Yup.

> > Well, as I said.
> >
> > Perhaps pm_runtime_autosuspend_expiration() could simply treat
> > power.autosuspend_delay equal to 0 in a special way, but then what about
> > power.autosuspend_delay equal to 1?
> 
> Is that going to cause similar problems? Hmm, I guess it is if the
> callback takes so long that the timeout is reached when we get here?
> 
> In which cases do you *not* want to retry runtime suspend on -EBUSY from
> the driver callback? If you don't retry, then the device will not
> runtime suspend at all until the next time the driver updates last_busy.
> And if the driver always is busy some time after such updates, then
> runtime suspend is effectively disabled.

We don't want to get into a tight retry loop.  That's what I was trying
to avoid.  Notice that although the comment mentions last_busy being
updated, the code doesn't actually look at last_busy.  Instead, it
looks to see if the new expiration time is in the future (which won't
happen unless last_busy gets updated) -- that's how it avoids going
into a tight loop.

> Should drivers just update last_busy more agressively, and make sure to
> update it both when starting some hardware transaction and when
> finishing it? That would make the suspend callback run again when the
> hardware is not busy anymore, and would probably make stuff work without
> any PM core changes.

We want to minimize the amount of work that drivers need to do.

I like your idea of adding an explicit retry delay.  In fact, I would 
like to go farther and make the delay automatically back off 
exponentially.

How about something like this?  Let t = jiffies - last_busy.  If t <= 1 
ms then set retry_delay to 1 ms, otherwise set retry_delay to min(t*2, 
1000 ms).  Or even use a larger cap, such as 60 s.

Alan Stern

PS: You don't need to use ACCESS_ONCE the way you did.  It's important
only when the same value is used several times and you want the
compiler to store a local copy instead of re-reading the original (in
case the original gets changed by another CPU).

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 67c7938e430b..c52e630a2e54 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -421,6 +421,7 @@  static int rpm_suspend(struct device *dev, int rpmflags)
 {
 	int (*callback)(struct device *);
 	struct device *parent = NULL;
+	unsigned long retry_delay = 0;
 	int retval;
 
 	trace_rpm_suspend(dev, rpmflags);
@@ -441,7 +442,7 @@  static int rpm_suspend(struct device *dev, int rpmflags)
 	/* If the autosuspend_delay time hasn't expired yet, reschedule. */
 	if ((rpmflags & RPM_AUTO)
 	    && dev->power.runtime_status != RPM_SUSPENDING) {
-		unsigned long expires = pm_runtime_autosuspend_expiration(dev);
+		unsigned long expires = pm_runtime_autosuspend_expiration(dev) + retry_delay;
 
 		if (expires != 0) {
 			/* Pending requests need to be canceled. */
@@ -571,8 +572,11 @@  static int rpm_suspend(struct device *dev, int rpmflags)
 		 * reschedule another autosuspend.
 		 */
 		if ((rpmflags & RPM_AUTO) &&
-		    pm_runtime_autosuspend_expiration(dev) != 0)
+		    (pm_runtime_autosuspend_expiration(dev) != 0 ||
+		     ACCESS_ONCE(dev->power.autosuspend_delay) == 0)) {
+			retry_delay++;
 			goto repeat;
+		}
 	} else {
 		pm_runtime_cancel_pending(dev);
 	}