diff mbox series

[v2] xhci: fix runtime PM imbalance in USB2 resume

Message ID 20220225071506.22012-1-henryl@nvidia.com (mailing list archive)
State Superseded
Headers show
Series [v2] xhci: fix runtime PM imbalance in USB2 resume | expand

Commit Message

Henry Lin Feb. 25, 2022, 7:15 a.m. UTC
USB2 resume starts with usb_hcd_start_port_resume() in port status
change handling for RESUME link state. usb_hcd_end_port_resume() call is
needed to keep runtime PM balance.

Fixes: a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status")
Signed-off-by: Henry Lin <henryl@nvidia.com>
---
 drivers/usb/host/xhci-hub.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Greg KH Feb. 25, 2022, 9:16 a.m. UTC | #1
On Fri, Feb 25, 2022 at 03:15:06PM +0800, Henry Lin wrote:
> USB2 resume starts with usb_hcd_start_port_resume() in port status
> change handling for RESUME link state. usb_hcd_end_port_resume() call is
> needed to keep runtime PM balance.
> 
> Fixes: a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status")
> Signed-off-by: Henry Lin <henryl@nvidia.com>
> ---
>  drivers/usb/host/xhci-hub.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index df3522dab31b..4a8b07b8ee01 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -1090,6 +1090,8 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status,
>  		if (link_state == XDEV_U0) {
>  			bus_state->resume_done[portnum] = 0;
>  			clear_bit(portnum, &bus_state->resuming_ports);
> +			usb_hcd_end_port_resume(&port->rhub->hcd->self,
> +						portnum);
>  			if (bus_state->suspended_ports & (1 << portnum)) {
>  				bus_state->suspended_ports &= ~(1 << portnum);
>  				bus_state->port_c_suspend |= 1 << portnum;
> -- 
> 2.17.1
> 

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- This looks like a new version of a previously submitted patch, but you
  did not list below the --- line any changes from the previous version.
  Please read the section entitled "The canonical patch format" in the
  kernel file, Documentation/SubmittingPatches for what needs to be done
  here to properly describe this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot
Henry Lin Feb. 26, 2022, 4:06 p.m. UTC | #2
USB2 resume starts with usb_hcd_start_port_resume() in port status
change handling for RESUME link state. usb_hcd_end_port_resume() call is
needed to keep runtime PM balance.

Fixes: a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status")
Signed-off-by: Henry Lin <henryl@nvidia.com>
---
V1 -> V2: Added Fixes tag in changelog

 drivers/usb/host/xhci-hub.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index df3522dab31b..4a8b07b8ee01 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -1090,6 +1090,8 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status,
                 if (link_state == XDEV_U0) {
                         bus_state->resume_done[portnum] = 0;
                         clear_bit(portnum, &bus_state->resuming_ports);
+                       usb_hcd_end_port_resume(&port->rhub->hcd->self,
+                                               portnum);
                         if (bus_state->suspended_ports & (1 << portnum)) {
                                 bus_state->suspended_ports &= ~(1 << portnum);
                                 bus_state->port_c_suspend |= 1 << portnum;
--
2.17.1
Henry Lin Feb. 26, 2022, 4:20 p.m. UTC | #3
> +                       usb_hcd_end_port_resume(&port->rhub->hcd->self,
> +                                               portnum);
Tab indent is broken in this version. Will resend this.
Mathias Nyman Feb. 28, 2022, 10:56 a.m. UTC | #4
On 25.2.2022 9.15, Henry Lin wrote:
> USB2 resume starts with usb_hcd_start_port_resume() in port status
> change handling for RESUME link state. usb_hcd_end_port_resume() call is
> needed to keep runtime PM balance.

For normal usb2 port resume the usb_hcd_end_port_resume() is called when resume
has been signaled for long enough in xhci_handle_usb2_port_link_resume().
 
This is also where driver directs the port to go from Resume state to U0.
Port can't do this without driver directing it.

If there's a failure during resume signaling (disconnect, reset, error) then
stale resume variables are detected in xhci_get_port_status() and 
usb_hcd_end_port_resume() is called.

I do now see a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status") 
does change order of checking and clearing stale resume variables, but this should
only happen if the first port state we read is a fully enabled functional U0 state after
a failed resume.

Could you expand a bit how this was detected? 

> 
> Fixes: a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status")
> Signed-off-by: Henry Lin <henryl@nvidia.com>
> ---
>  drivers/usb/host/xhci-hub.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index df3522dab31b..4a8b07b8ee01 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -1090,6 +1090,8 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status,
>  		if (link_state == XDEV_U0) {
>  			bus_state->resume_done[portnum] = 0;
>  			clear_bit(portnum, &bus_state->resuming_ports);
> +			usb_hcd_end_port_resume(&port->rhub->hcd->self,
> +						portnum);

This will call usb_hcd_end_port_resume() every time port is in normal enabled U0 state
even if resume was never signaled, or port suspended.

-Mathias
Henry Lin March 1, 2022, 10:28 a.m. UTC | #5
>> USB2 resume starts with usb_hcd_start_port_resume() in port status
>> change handling for RESUME link state. usb_hcd_end_port_resume() call is
>> needed to keep runtime PM balance.

> For normal usb2 port resume the usb_hcd_end_port_resume() is called when resume
> has been signaled for long enough in xhci_handle_usb2_port_link_resume().
>
> This is also where driver directs the port to go from Resume state to U0.
> Port can't do this without driver directing it.
>
> If there's a failure during resume signaling (disconnect, reset, error) then
> stale resume variables are detected in xhci_get_port_status() and
> usb_hcd_end_port_resume() is called.

> I do now see a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status")
> does change order of checking and clearing stale resume variables, but this should
> only happen if the first port state we read is a fully enabled functional U0 state after
> a failed resume.

> Could you expand a bit how this was detected?
We observed the racing issue when usb2 device-initiated resume occurs in system resume.
If usb2 host-initiated resume for system resume directs U0 before xhci_get_usb2_port_status()
see RESUME state, xhci_get_usb2_port_status() will not finish resume process in
xhci_handle_usb2_port_link_resume(). Its scenario is as follows:

1. System resume starts. All driver system resume callbacks get called in order. XHCI controller
    is resumed by xhci_resume().
2. USB2 root hub is resuming. hcd_bus_resume() is being executed.
3. Before xhci_bus_resume() is finished, XHCI driver receives a port status change event for 
    an USB2 port with RESUME link state in xhci_irq(). XHCI driver starts the process to resume
    HS port for device-initiated resume.
4. In xhci_bus_resume(), host-initiated resume (direct U0) is performed on the same port that is
    resuming in step 3 in below loop:

                if (bus_state->bus_suspended) {
                        spin_unlock_irqrestore(&xhci->lock, flags);
                        msleep(USB_RESUME_TIMEOUT);
                        spin_lock_irqsave(&xhci->lock, flags);
                }
                for_each_set_bit(port_index, &bus_state->bus_suspended,
                                 BITS_PER_LONG) {
                        /* Clear PLC to poll it later for U0 transition */
                        xhci_test_and_clear_bit(xhci, ports[port_index],
                                                PORT_PLC);
                        xhci_set_link_state(xhci, ports[port_index], XDEV_U0);
                }
5. Then, link state of the resuming port is observed as U0 in following
    xhci_get_usb2_port_status(). xhci_handle_usb2_port_link_resume() has
    no chance to get called on the resuming port.

Thanks,
Henry
Mathias Nyman March 1, 2022, 3:49 p.m. UTC | #6
On 1.3.2022 12.28, Henry Lin wrote:
>>> USB2 resume starts with usb_hcd_start_port_resume() in port status
>>> change handling for RESUME link state. usb_hcd_end_port_resume() call is
>>> needed to keep runtime PM balance.
> 
>> For normal usb2 port resume the usb_hcd_end_port_resume() is called when resume
>> has been signaled for long enough in xhci_handle_usb2_port_link_resume().
>>
>> This is also where driver directs the port to go from Resume state to U0.
>> Port can't do this without driver directing it.
>>
>> If there's a failure during resume signaling (disconnect, reset, error) then
>> stale resume variables are detected in xhci_get_port_status() and
>> usb_hcd_end_port_resume() is called.
> 
>> I do now see a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status")
>> does change order of checking and clearing stale resume variables, but this should
>> only happen if the first port state we read is a fully enabled functional U0 state after
>> a failed resume.
> 
>> Could you expand a bit how this was detected?
> We observed the racing issue when usb2 device-initiated resume occurs in system resume.
> If usb2 host-initiated resume for system resume directs U0 before xhci_get_usb2_port_status()
> see RESUME state, xhci_get_usb2_port_status() will not finish resume process in
> xhci_handle_usb2_port_link_resume(). Its scenario is as follows:
> 
> 1. System resume starts. All driver system resume callbacks get called in order. XHCI controller
>     is resumed by xhci_resume().
> 2. USB2 root hub is resuming. hcd_bus_resume() is being executed.
> 3. Before xhci_bus_resume() is finished, XHCI driver receives a port status change event for 
>     an USB2 port with RESUME link state in xhci_irq(). XHCI driver starts the process to resume
>     HS port for device-initiated resume.
> 4. In xhci_bus_resume(), host-initiated resume (direct U0) is performed on the same port that is
>     resuming in step 3 in below loop:
> 
>                 if (bus_state->bus_suspended) {
>                         spin_unlock_irqrestore(&xhci->lock, flags);
>                         msleep(USB_RESUME_TIMEOUT);
>                         spin_lock_irqsave(&xhci->lock, flags);
>                 }
>                 for_each_set_bit(port_index, &bus_state->bus_suspended,
>                                  BITS_PER_LONG) {
>                         /* Clear PLC to poll it later for U0 transition */
>                         xhci_test_and_clear_bit(xhci, ports[port_index],
>                                                 PORT_PLC);
>                         xhci_set_link_state(xhci, ports[port_index], XDEV_U0);
>                 }
> 5. Then, link state of the resuming port is observed as U0 in following
>     xhci_get_usb2_port_status(). xhci_handle_usb2_port_link_resume() has
>     no chance to get called on the resuming port.
> 

True, thanks for the explanation.
If there's a race between system resume and device-initiated resume, and port is resumed
in xhci_bus_resume() then yes I see how this could happen.

Maybe only call usb_hcd_end_port_resume() if xhci_irq() detected the device-initiated
resume. i.e. set a value to resume_done[portnum] and called usb_hcd_start_port_resume()
something like: 

@@ -1088,6 +1088,8 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status,
                if (link_state == XDEV_U2)
                        *status |= USB_PORT_STAT_L1;
                if (link_state == XDEV_U0) {
+                       if (bus_state->resume_done[portnum])
+                               usb_hcd_end_port_resume(&port->rhub->hcd->self, portnum);
                        bus_state->resume_done[portnum] = 0;
                        clear_bit(portnum, &bus_state->resuming_ports);

Also xhci_bus_resume() only resumes ports that were forcefully suspended to U3 in xhci_bus_suspend().
Could be worth checking why that device wasn't already in U3 when suspend reached xhci_bus_suspend().

Thanks
Mathias
Henry Lin March 1, 2022, 7:18 p.m. UTC | #7
> Maybe only call usb_hcd_end_port_resume() if xhci_irq() detected the device-initiated
> resume. i.e. set a value to resume_done[portnum] and called usb_hcd_start_port_resume()
> something like:
>
> @@ -1088,6 +1088,8 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status,
>                 if (link_state == XDEV_U2)
>                         *status |= USB_PORT_STAT_L1;
>                 if (link_state == XDEV_U0) {
> +                       if (bus_state->resume_done[portnum])
> +                               usb_hcd_end_port_resume(&port->rhub->hcd->self, portnum);
>                         bus_state->resume_done[portnum] = 0;
>                         clear_bit(portnum, &bus_state->resuming_ports);
This looks good. I can submit a new version based on this.

> Also xhci_bus_resume() only resumes ports that were forcefully suspended to U3 in xhci_bus_suspend().
> Could be worth checking why that device wasn't already in U3 when suspend reached xhci_bus_suspend().
It happens when we trigger USB device-initiated resume to bring system out of system suspend state.
For example, we can connect an USB2 external hub on root port and put system into suspend state. The USB2
external hub (the port it connects to) is in U3 after xhci_bus_suspend(). Once we connect a USB device
to downstream port of the USB2 external hub, the hub will trigger device initiated resume to wake up the
system. During system resume, XHCI controller will report the USB2 external hub is in RESUME state.

Thanks,
Henry
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index df3522dab31b..4a8b07b8ee01 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -1090,6 +1090,8 @@  static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status,
 		if (link_state == XDEV_U0) {
 			bus_state->resume_done[portnum] = 0;
 			clear_bit(portnum, &bus_state->resuming_ports);
+			usb_hcd_end_port_resume(&port->rhub->hcd->self,
+						portnum);
 			if (bus_state->suspended_ports & (1 << portnum)) {
 				bus_state->suspended_ports &= ~(1 << portnum);
 				bus_state->port_c_suspend |= 1 << portnum;