
usb_wwan_write() called while device still being resumed

Message ID 87r4khu9az.fsf@nemi.mork.no (mailing list archive)
State Not Applicable, archived

Commit Message

Bjørn Mork Feb. 15, 2013, 11:05 a.m. UTC
Alex Courbot <acourbot@nvidia.com> writes:

> Unfortunately it does not, and fails the same way. On the other hand,
> I do not see the issue when doing the following:
>
> diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
> index e4fad5e..1490029 100644
> --- a/drivers/usb/serial/usb_wwan.c
> +++ b/drivers/usb/serial/usb_wwan.c
> @@ -238,8 +238,6 @@ int usb_wwan_write(struct tty_struct *tty, struct usb_serial_port *port,
>                     usb_pipeendpoint(this_urb->pipe), i);
>
>                 err = usb_autopm_get_interface_async(port->serial->interface);
> -               if (err < 0)
> -                       break;
>
>                 /* send the data */
>                 memcpy(this_urb->transfer_buffer, buf, todo);
>
> After doing this I don't see this issue anymore. It looks wrong
> though. But it seems to work despite the obvious unbalance in autopm
> calls that results.
>
> If I understand you correctly, usb_wwan_write() failing here is not a
> problem in itself, and the ack should just be sent again later?

That was what I thought looking (obviously too) briefly through this.

Most errors from usb_autopm_get_interface_async will be translated to
EIO before being returned by serial_write.  I believe the userspace
application should deal with that.  But maybe it just gives up?  Should
we return EAGAIN or something instead?
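
As an illustration of what "dealing with that" could look like in an application, here is a minimal sketch of a bounded retry around a failing write. It assumes the EIO (or EAGAIN) is transient and caused by the port still resuming, which is only a guess at this point in the thread. The names write_with_retry, write_fn and the fake_write stub are hypothetical and not part of any real API.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Same signature as write(2), so a stub can be injected for testing. */
typedef ssize_t (*write_fn)(int fd, const void *buf, size_t len);

/*
 * Hypothetical helper: retry a write that fails with EIO or EAGAIN,
 * assuming the device is still resuming. Returns the number of bytes
 * written, or -1 with errno left as set by the last failed attempt.
 */
static ssize_t write_with_retry(write_fn do_write, int fd, const void *buf,
                                size_t len, int max_retries)
{
    int attempt;

    for (attempt = 0; ; attempt++) {
        ssize_t n = do_write(fd, buf, len);

        if (n >= 0)
            return n;
        if (errno != EIO && errno != EAGAIN)
            return -1;      /* not treated as transient: give up */
        if (attempt >= max_retries)
            return -1;      /* still failing after all retries */
    }
}

/* Stand-in for a port that fails with EIO until its resume completes. */
static int fake_failures_left;

static ssize_t fake_write(int fd, const void *buf, size_t len)
{
    (void)fd;
    (void)buf;
    if (fake_failures_left > 0) {
        fake_failures_left--;
        errno = EIO;
        return -1;
    }
    return (ssize_t)len;
}
```

With a real descriptor the caller would pass write from <unistd.h> in place of the stub, and would probably sleep or poll() between attempts instead of retrying back to back.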

I don't know.  I am pretty clueless about these things...

But looking again, I can try to guess why it works fine if you just
ignore the error.  I believe that is because you then end up hitting
this until the interface is fully resumed:

		if (intfdata->suspended) {
			usb_anchor_urb(this_urb, &portdata->delayed);
			spin_unlock_irqrestore(&intfdata->susp_lock, flags);
		}
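
To make the effect of that branch concrete, here is a small userspace model of "queue while suspended, replay on resume". It is only a sketch: struct fake_intf, fake_submit() and fake_resume() are hypothetical stand-ins, plain integers replace URBs, and the real driver's locking is left out entirely.

```c
#include <stddef.h>

#define MAX_REQS 8

/* Userspace stand-in for the interface state; all names are illustrative. */
struct fake_intf {
    int suspended;              /* plays the role of intfdata->suspended */
    int delayed[MAX_REQS];      /* stands in for the "delayed" URB anchor */
    size_t n_delayed;
    int sent[MAX_REQS];         /* requests actually handed to the hardware */
    size_t n_sent;
};

/*
 * Queue the request while suspended, otherwise send it at once.
 * Returns 0 on success, -1 if the relevant array is full.
 */
static int fake_submit(struct fake_intf *intf, int req)
{
    if (intf->suspended) {
        if (intf->n_delayed == MAX_REQS)
            return -1;
        intf->delayed[intf->n_delayed++] = req;
        return 0;
    }
    if (intf->n_sent == MAX_REQS)
        return -1;
    intf->sent[intf->n_sent++] = req;
    return 0;
}

/* On resume, clear the flag and replay queued requests in arrival order. */
static void fake_resume(struct fake_intf *intf)
{
    size_t i;

    intf->suspended = 0;
    for (i = 0; i < intf->n_delayed; i++)
        if (intf->n_sent < MAX_REQS)
            intf->sent[intf->n_sent++] = intf->delayed[i];
    intf->n_delayed = 0;
}
```

In this model nothing is lost as long as the write path gets far enough to reach the suspended check, which would explain why dropping the early break appears to help.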

>> that should not cause the modem to stop working.
>
> Actually it might also be that the network stack ends up in a bad
> state and remains stuck in it. I don't think the modem by itself is
> affected. All I observe is that no network traffic takes place after
> this. I'm not familiar enough with networking to make any stronger
> assumption.

> FWIW, when usb_autopm_get_interface_async() returns -EACCES, the power
> parameters of port->serial->interface->dev are as follows:
>
> disable_depth = 1
> is_suspended = 1
> runtime_status = 2 (RPM_SUSPENDED)

Yes, that makes pm_runtime_get() return -EACCES.
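
For reference, the branches of rpm_resume() visible in the diff context of this thread can be modelled in userspace as below. This is a sketch only: struct fake_pm_info and fake_rpm_resume_precheck() are hypothetical names, any checks that come before these branches in the real function are omitted, and the enum values are chosen so that RPM_SUSPENDED is 2, as in the report above.

```c
#include <errno.h>

/* Numbering chosen so that RPM_SUSPENDED == 2, matching the report. */
enum fake_rpm_status {
    RPM_ACTIVE = 0,
    RPM_RESUMING,
    RPM_SUSPENDED,
    RPM_SUSPENDING
};

/* Only the fields that the quoted checks look at. */
struct fake_pm_info {
    unsigned int disable_depth;
    int is_suspended;
    enum fake_rpm_status runtime_status;
};

/*
 * Model of the two existing branches shown in the diff context:
 *   returns 1       if runtime PM is disabled only because of system
 *                   suspend and the device is already active,
 *   returns -EACCES if runtime PM is disabled for any other state,
 *   returns 0       if these checks pass and a resume could proceed.
 */
static int fake_rpm_resume_precheck(const struct fake_pm_info *p)
{
    if (p->disable_depth == 1 && p->is_suspended &&
        p->runtime_status == RPM_ACTIVE)
        return 1;
    if (p->disable_depth > 0)
        return -EACCES;
    return 0;
}
```

Feeding in the reported values (disable_depth = 1, is_suspended = 1, runtime_status = RPM_SUSPENDED) falls through the first branch and lands on -EACCES, because the status is not RPM_ACTIVE.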

I am way out of my league here, but I wonder if pm_runtime_get()
shouldn't return -EINPROGRESS instead if there is a queued resume
request or an ongoing resume, regardless of disable_depth?

Maybe something like the completely untested:

---
usb_autopm_get_interface_async() will interpret EINPROGRESS as success,
so that would prevent this problem.
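
The way the return value is mapped can be sketched in userspace as follows. This models the behaviour described above under the assumption that a failed get keeps no usage reference; fake_autopm_get_async() and struct fake_usage are hypothetical names, and the real function takes a struct usb_interface rather than a status code.

```c
#include <errno.h>

/* Stand-in for the per-interface usage counter; illustrative only. */
struct fake_usage {
    int pm_usage_cnt;
};

/*
 * Model of the status mapping: pm_status is what the underlying
 * pm_runtime_get() call would have returned.
 *  - a negative value other than -EINPROGRESS is a real failure,
 *    so no reference is kept and the error is passed up;
 *  - 0, a positive value, or -EINPROGRESS all count as success.
 */
static int fake_autopm_get_async(struct fake_usage *u, int pm_status)
{
    if (pm_status < 0 && pm_status != -EINPROGRESS)
        return pm_status;

    u->pm_usage_cnt++;
    return 0;
}
```

Under this model, simply ignoring an -EACCES in the caller leaves the counter one lower than the write path expects, which is the imbalance mentioned earlier in the thread.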


Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Alexandre Courbot Feb. 17, 2013, 11:31 a.m. UTC | #1
On 02/15/2013 08:05 PM, Bjørn Mork wrote:
> Alex Courbot <acourbot@nvidia.com> writes:
>
>> Unfortunately it does not, and fails the same way. On the other hand,
>> I do not see the issue when doing the following:
>>
>> diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
>> index e4fad5e..1490029 100644
>> --- a/drivers/usb/serial/usb_wwan.c
>> +++ b/drivers/usb/serial/usb_wwan.c
>> @@ -238,8 +238,6 @@ int usb_wwan_write(struct tty_struct *tty, struct usb_serial_port *port,
>>                      usb_pipeendpoint(this_urb->pipe), i);
>>
>>                  err = usb_autopm_get_interface_async(port->serial->interface);
>> -               if (err < 0)
>> -                       break;
>>
>>                  /* send the data */
>>                  memcpy(this_urb->transfer_buffer, buf, todo);
>>
>> After doing this I don't see this issue anymore. It looks wrong
>> though. But it seems to work despite the obvious unbalance in autopm
>> calls that results.
>>
>> If I understand you correctly, usb_wwan_write() failing here is not a
>> problem in itself, and the ack should just be sent again later?
>
> That was what I thought looking (obviously too) briefly through this.
>
> Most errors from usb_autopm_get_interface_async will be translated to
> EIO before being returned by serial_write.  I believe the userspace
> application should deal with that.  But maybe it just gives up?  Should
> we return EAGAIN or something instead?
>
> I don't know.  I am pretty clueless about these things...

Obviously not as much as I am. :) Checking what userspace is doing could 
indeed be another trail.

> But looking again, I can try to guess why it works fine if you just
> ignore the error.  I believe that is because you then end up hitting
> this until the interface is fully resumed:
>
> 		if (intfdata->suspended) {
> 			usb_anchor_urb(this_urb, &portdata->delayed);
> 			spin_unlock_irqrestore(&intfdata->susp_lock, flags);
> 		}

Yes, this seems to be exactly what is happening.

> I am way out of my league here, but I wonder if pm_runtime_get()
> shouldn't return -EINPROGRESS instead if there is a queued resume
> request or an ongoing resume, regardless of disable_depth?
>
> Maybe something like the completely untested:
>
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 3148b10..38e19ba 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -512,6 +512,9 @@ static int rpm_resume(struct device *dev, int rpmflags)
>   	else if (dev->power.disable_depth == 1 && dev->power.is_suspended
>   	    && dev->power.runtime_status == RPM_ACTIVE)
>   		retval = 1;
> +	else if (rpmflags & RPM_ASYNC && dev->power.request_pending &&
> +		 dev->power.request == RPM_REQ_RESUME)
> +		retval = -EINPROGRESS;
>   	else if (dev->power.disable_depth > 0)
>   		retval = -EACCES;
>   	if (retval)
> ---
> usb_autopm_get_interface_async() will interpret EINPROGRESS as success,
> so that would prevent this problem.

That sounds sensible indeed. If the interface is soon to be resumed,
there should be no reason for usb_autopm_get_interface_async() to fail. 
Let's give this a try and bring the idea to the PM people if it works.

In any case thanks a lot for the help, it is extremely useful.
Alex.
Alexandre Courbot Feb. 18, 2013, 3:20 a.m. UTC | #2
On 02/15/2013 08:05 PM, Bjørn Mork wrote:
> Maybe something like the completely untested:
>
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 3148b10..38e19ba 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -512,6 +512,9 @@ static int rpm_resume(struct device *dev, int rpmflags)
>   	else if (dev->power.disable_depth == 1 && dev->power.is_suspended
>   	    && dev->power.runtime_status == RPM_ACTIVE)
>   		retval = 1;
> +	else if (rpmflags & RPM_ASYNC && dev->power.request_pending &&
> +		 dev->power.request == RPM_REQ_RESUME)
> +		retval = -EINPROGRESS;
>   	else if (dev->power.disable_depth > 0)
>   		retval = -EACCES;
>   	if (retval)

Second thought: not sure this will work since power.request_pending and 
power.request are set to these values later in the same rpm_resume() 
function. However, the three lines before yours caught my attention. 
They are not in my 3.1 source tree and the conditions are very close
to the ones I am seeing when the issue happens: disable_depth == 1,
is_suspended == 1. Only runtime_status is not equal to RPM_ACTIVE.

Nonetheless, I have looked at the patch that introduced these
(http://pastebin.com/PmHUjiAE) and it details a problem that is very
similar to mine. It also mentions a workaround to be implemented in the
driver by saving the suspend status into a variable that is checked when
pm_runtime_get() returns -EACCES. This variable already exists in
usb_wwan; in fact it is the very variable that is checked a bit later
in that other chunk of code you mentioned:

spin_lock_irqsave(&intfdata->susp_lock, flags);
if (intfdata->suspended) {
	usb_anchor_urb(this_urb, &portdata->delayed);
	spin_unlock_irqrestore(&intfdata->susp_lock, flags);
} else {

So it looks like this code is here for exactly that purpose. However in 
my current condition I do not see how this block could be run since 
runtime PM is disabled when intfdata->suspended is set to true.

I will try implementing the workaround suggested (checking if 
intfdata->suspended is set when -EACCES is returned and go on if it is 
the case) and see if it works (and I bet it will). In the upstream 
kernel the issue seems to have been addressed already, so this might 
just be a source-out-of-date issue.
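
The decision this workaround boils down to can be sketched as a pure function. This is an untested illustration, not the actual driver change: fake_write_decision() and enum write_action are hypothetical names, the susp_lock locking is left out, and the question of how to keep the autopm reference count balanced when continuing after -EACCES is deliberately not addressed.

```c
#include <errno.h>

/* What the write path would do with the current URB; illustrative only. */
enum write_action {
    WRITE_ABORT,            /* break out of the loop, as the code does today */
    WRITE_QUEUE_DELAYED,    /* anchor the URB on the "delayed" list */
    WRITE_SUBMIT_NOW        /* submit the URB immediately */
};

/*
 * autopm_err: return value of the async autopm get.
 * suspended:  the driver's own flag (intfdata->suspended in usb_wwan).
 *
 * -EACCES is tolerated only when the driver already knows the interface
 * is suspended, since the URB is then parked until resume in any case.
 */
static enum write_action fake_write_decision(int autopm_err, int suspended)
{
    if (autopm_err < 0) {
        if (autopm_err == -EACCES && suspended)
            return WRITE_QUEUE_DELAYED;
        return WRITE_ABORT;
    }
    return suspended ? WRITE_QUEUE_DELAYED : WRITE_SUBMIT_NOW;
}
```

Any other error, or -EACCES while the driver does not consider itself suspended, still aborts the write as before.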

Thanks,
Alex.


Patch

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 3148b10..38e19ba 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -512,6 +512,9 @@  static int rpm_resume(struct device *dev, int rpmflags)
 	else if (dev->power.disable_depth == 1 && dev->power.is_suspended
 	    && dev->power.runtime_status == RPM_ACTIVE)
 		retval = 1;
+	else if (rpmflags & RPM_ASYNC && dev->power.request_pending &&
+		 dev->power.request == RPM_REQ_RESUME)
+		retval = -EINPROGRESS;
 	else if (dev->power.disable_depth > 0)
 		retval = -EACCES;
 	if (retval)