diff mbox series

[v4] USB: Fix the issue of task recovery failure caused by USB status when S4 wakes up

Message ID 20241024024038.26157-1-duanchenghao@kylinos.cn (mailing list archive)
State New
Headers show
Series [v4] USB: Fix the issue of task recovery failure caused by USB status when S4 wakes up | expand

Commit Message

Duan Chenghao Oct. 24, 2024, 2:40 a.m. UTC
When a device is inserted into the USB port and an S4 wakeup is initiated,
after the USB-hub initialization is completed, it will automatically enter
suspend mode. Upon detecting a device on the USB port, it will proceed with
resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state. During the S4
wakeup process, peripherals are put into suspend mode, followed by task
recovery. However, upon detecting that the hcd is in the
HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status, causing the
S4 suspend to fail and subsequent task recovery to not proceed.
-
[   27.594598][ 1]  PM: pci_pm_freeze(): hcd_pci_suspend+0x0/0x28 returns -16
[   27.594601][ 1]  PM: dpm_run_callback(): pci_pm_freeze+0x0/0x100 returns -16
[   27.603420][ 1]  ehci-pci 0000:00:04.1: pci_pm_freeze+0x0/0x100 returned 0 after 3 usecs
[   27.612233][ 1]  ehci-pci 0000:00:05.1: pci_pm_freeze+0x0/0x100 returned -16 after 17223 usecs
[   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce async: error -16
[   27.816988][ 1]  PM: quiesce of devices aborted after 1833.282 msecs
[   27.823302][ 1]  PM: start quiesce of devices aborted after 1839.975 msecs
......
[   31.303172][ 1]  PM: recover of devices complete after 3473.039 msecs
[   31.309818][ 1]  PM: Failed to load hibernation image, recovering.
[   31.348188][ 1]  PM: Basic memory bitmaps freed
[   31.352686][ 1]  OOM killer enabled.
[   31.356232][ 1]  Restarting tasks ... done.
[   31.360609][ 1]  PM: resume from hibernation failed (0)
[   31.365800][ 1]  PM: Hibernation image not present or could not be loaded.

The "do_wakeup" is determined based on whether the controller's
power/wakeup attribute is set. The current issue necessitates considering
the type of suspend that is occurring. If the suspend type is either
PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be set to
false.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>
---
 drivers/usb/core/hcd-pci.c | 15 +++++++++++++--
 include/linux/pm.h         |  3 ++-
 2 files changed, 15 insertions(+), 3 deletions(-)

Comments

Duan Chenghao Oct. 24, 2024, 2:53 a.m. UTC | #1
Hi Alan,

The current patch has been tested on 6 machines and has passed the
tests, with each machine undergoing 500 cycles of testing.

Saranya also replied in the email that she conducted 1,500 tests and
all were successful.


Thanks
Duan Chenghao 

在 2024-10-24星期四的 10:40 +0800,Duan Chenghao写道:
> When a device is inserted into the USB port and an S4 wakeup is
> initiated,
> after the USB-hub initialization is completed, it will automatically
> enter
> suspend mode. Upon detecting a device on the USB port, it will
> proceed with
> resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state. During
> the S4
> wakeup process, peripherals are put into suspend mode, followed by
> task
> recovery. However, upon detecting that the hcd is in the
> HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status,
> causing the
> S4 suspend to fail and subsequent task recovery to not proceed.
> -
> [   27.594598][ 1]  PM: pci_pm_freeze(): hcd_pci_suspend+0x0/0x28
> returns -16
> [   27.594601][ 1]  PM: dpm_run_callback(): pci_pm_freeze+0x0/0x100
> returns -16
> [   27.603420][ 1]  ehci-pci 0000:00:04.1: pci_pm_freeze+0x0/0x100
> returned 0 after 3 usecs
> [   27.612233][ 1]  ehci-pci 0000:00:05.1: pci_pm_freeze+0x0/0x100
> returned -16 after 17223 usecs
> [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce async:
> error -16
> [   27.816988][ 1]  PM: quiesce of devices aborted after 1833.282
> msecs
> [   27.823302][ 1]  PM: start quiesce of devices aborted after
> 1839.975 msecs
> ......
> [   31.303172][ 1]  PM: recover of devices complete after 3473.039
> msecs
> [   31.309818][ 1]  PM: Failed to load hibernation image, recovering.
> [   31.348188][ 1]  PM: Basic memory bitmaps freed
> [   31.352686][ 1]  OOM killer enabled.
> [   31.356232][ 1]  Restarting tasks ... done.
> [   31.360609][ 1]  PM: resume from hibernation failed (0)
> [   31.365800][ 1]  PM: Hibernation image not present or could not be
> loaded.
> 
> The "do_wakeup" is determined based on whether the controller's
> power/wakeup attribute is set. The current issue necessitates
> considering
> the type of suspend that is occurring. If the suspend type is either
> PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be set
> to
> false.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Closes:
> https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>
> ---
>  drivers/usb/core/hcd-pci.c | 15 +++++++++++++--
>  include/linux/pm.h         |  3 ++-
>  2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
> index a08f3f228e6d..390b1225f3cc 100644
> --- a/drivers/usb/core/hcd-pci.c
> +++ b/drivers/usb/core/hcd-pci.c
> @@ -422,7 +422,12 @@ static int suspend_common(struct device *dev,
> pm_message_t msg)
>         bool                    do_wakeup;
>         int                     retval;
>  
> -       do_wakeup = PMSG_IS_AUTO(msg) ? true :
> device_may_wakeup(dev);
> +       if (PMSG_IS_AUTO(msg))
> +               do_wakeup = true;
> +       else if (PMSG_NO_WAKEUP(msg))
> +               do_wakeup = false;
> +       else
> +               do_wakeup = device_may_wakeup(dev);
>  
>         /* Root hub suspend should have stopped all downstream
> traffic,
>          * and all bus master traffic.  And done so for both the
> interface
> @@ -521,6 +526,11 @@ static int hcd_pci_suspend(struct device *dev)
>         return suspend_common(dev, PMSG_SUSPEND);
>  }
>  
> +static int hcd_pci_freeze(struct device *dev)
> +{
> +       return suspend_common(dev, PMSG_FREEZE);
> +}
> +
>  static int hcd_pci_suspend_noirq(struct device *dev)
>  {
>         struct pci_dev          *pci_dev = to_pci_dev(dev);
> @@ -590,6 +600,7 @@ static int hcd_pci_restore(struct device *dev)
>  #else
>  
>  #define hcd_pci_suspend                NULL
> +#define hcd_pci_freeze                 NULL
>  #define hcd_pci_suspend_noirq  NULL
>  #define hcd_pci_poweroff_late  NULL
>  #define hcd_pci_resume_noirq   NULL
> @@ -624,7 +635,7 @@ const struct dev_pm_ops usb_hcd_pci_pm_ops = {
>         .suspend_noirq  = hcd_pci_suspend_noirq,
>         .resume_noirq   = hcd_pci_resume_noirq,
>         .resume         = hcd_pci_resume,
> -       .freeze         = hcd_pci_suspend,
> +       .freeze         = hcd_pci_freeze,
>         .freeze_noirq   = check_root_hub_suspended,
>         .thaw_noirq     = NULL,
>         .thaw           = hcd_pci_resume,
> diff --git a/include/linux/pm.h b/include/linux/pm.h
> index 97b0e23363c8..2a676b2cb699 100644
> --- a/include/linux/pm.h
> +++ b/include/linux/pm.h
> @@ -570,7 +570,8 @@ const struct dev_pm_ops name = { \
>                                         { .event =
> PM_EVENT_AUTO_RESUME, })
>  
>  #define PMSG_IS_AUTO(msg)      (((msg).event & PM_EVENT_AUTO) != 0)
> -
> +#define PMSG_NO_WAKEUP(msg)    (((msg).event & \
> +                               (PM_EVENT_FREEZE | PM_EVENT_QUIESCE))
> != 0)
>  /*
>   * Device run-time power management status.
>   *
Greg Kroah-Hartman Oct. 24, 2024, 7:05 a.m. UTC | #2
On Thu, Oct 24, 2024 at 10:40:38AM +0800, Duan Chenghao wrote:
> When a device is inserted into the USB port and an S4 wakeup is initiated,
> after the USB-hub initialization is completed, it will automatically enter
> suspend mode. Upon detecting a device on the USB port, it will proceed with
> resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state. During the S4
> wakeup process, peripherals are put into suspend mode, followed by task
> recovery. However, upon detecting that the hcd is in the
> HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status, causing the
> S4 suspend to fail and subsequent task recovery to not proceed.
> -
> [   27.594598][ 1]  PM: pci_pm_freeze(): hcd_pci_suspend+0x0/0x28 returns -16
> [   27.594601][ 1]  PM: dpm_run_callback(): pci_pm_freeze+0x0/0x100 returns -16
> [   27.603420][ 1]  ehci-pci 0000:00:04.1: pci_pm_freeze+0x0/0x100 returned 0 after 3 usecs
> [   27.612233][ 1]  ehci-pci 0000:00:05.1: pci_pm_freeze+0x0/0x100 returned -16 after 17223 usecs
> [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce async: error -16
> [   27.816988][ 1]  PM: quiesce of devices aborted after 1833.282 msecs
> [   27.823302][ 1]  PM: start quiesce of devices aborted after 1839.975 msecs
> ......
> [   31.303172][ 1]  PM: recover of devices complete after 3473.039 msecs
> [   31.309818][ 1]  PM: Failed to load hibernation image, recovering.
> [   31.348188][ 1]  PM: Basic memory bitmaps freed
> [   31.352686][ 1]  OOM killer enabled.
> [   31.356232][ 1]  Restarting tasks ... done.
> [   31.360609][ 1]  PM: resume from hibernation failed (0)
> [   31.365800][ 1]  PM: Hibernation image not present or could not be loaded.
> 
> The "do_wakeup" is determined based on whether the controller's
> power/wakeup attribute is set. The current issue necessitates considering
> the type of suspend that is occurring. If the suspend type is either
> PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be set to
> false.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>

What commit id does this fix?

And I missed where Alan provided a signed-off-by, where was that?

thanks,

greg k-h
Duan Chenghao Oct. 24, 2024, 8:46 a.m. UTC | #3
hi greg k-h,

在 2024-10-24星期四的 09:05 +0200,Greg KH写道:
> On Thu, Oct 24, 2024 at 10:40:38AM +0800, Duan Chenghao wrote:
> > When a device is inserted into the USB port and an S4 wakeup is
> > initiated,
> > after the USB-hub initialization is completed, it will
> > automatically enter
> > suspend mode. Upon detecting a device on the USB port, it will
> > proceed with
> > resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state. During
> > the S4
> > wakeup process, peripherals are put into suspend mode, followed by
> > task
> > recovery. However, upon detecting that the hcd is in the
> > HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status,
> > causing the
> > S4 suspend to fail and subsequent task recovery to not proceed.
> > -
> > [   27.594598][ 1]  PM: pci_pm_freeze(): hcd_pci_suspend+0x0/0x28
> > returns -16
> > [   27.594601][ 1]  PM: dpm_run_callback(): pci_pm_freeze+0x0/0x100
> > returns -16
> > [   27.603420][ 1]  ehci-pci 0000:00:04.1: pci_pm_freeze+0x0/0x100
> > returned 0 after 3 usecs
> > [   27.612233][ 1]  ehci-pci 0000:00:05.1: pci_pm_freeze+0x0/0x100
> > returned -16 after 17223 usecs
> > [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce
> > async: error -16
> > [   27.816988][ 1]  PM: quiesce of devices aborted after 1833.282
> > msecs
> > [   27.823302][ 1]  PM: start quiesce of devices aborted after
> > 1839.975 msecs
> > ......
> > [   31.303172][ 1]  PM: recover of devices complete after 3473.039
> > msecs
> > [   31.309818][ 1]  PM: Failed to load hibernation image,
> > recovering.
> > [   31.348188][ 1]  PM: Basic memory bitmaps freed
> > [   31.352686][ 1]  OOM killer enabled.
> > [   31.356232][ 1]  Restarting tasks ... done.
> > [   31.360609][ 1]  PM: resume from hibernation failed (0)
> > [   31.365800][ 1]  PM: Hibernation image not present or could not
> > be loaded.
> > 
> > The "do_wakeup" is determined based on whether the controller's
> > power/wakeup attribute is set. The current issue necessitates
> > considering
> > the type of suspend that is occurring. If the suspend type is
> > either
> > PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be set
> > to
> > false.
> > 
> > Reported-by: kernel test robot <lkp@intel.com>
> > Closes:
> > https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>
> 
> What commit id does this fix?

The current patch is not intended to fix an issue with a specific
commit, but rather to address a long-standing problem in the USB core.

> 
> And I missed where Alan provided a signed-off-by, where was that?

In the following email, Alan proposed using "Signed-off-by" for
signing.
https://lore.kernel.org/all/489805e7-c19c-4b57-9cd7-713e075261cd@rowland.harvard.edu/

Thanks
Duan Chenghao

> 
> thanks,
> 
> greg k-h
Greg Kroah-Hartman Oct. 29, 2024, 3:27 a.m. UTC | #4
On Thu, Oct 24, 2024 at 04:46:48PM +0800, duanchenghao wrote:
> hi greg k-h,
> 
> 在 2024-10-24星期四的 09:05 +0200,Greg KH写道:
> > On Thu, Oct 24, 2024 at 10:40:38AM +0800, Duan Chenghao wrote:
> > > When a device is inserted into the USB port and an S4 wakeup is
> > > initiated,
> > > after the USB-hub initialization is completed, it will
> > > automatically enter
> > > suspend mode. Upon detecting a device on the USB port, it will
> > > proceed with
> > > resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state. During
> > > the S4
> > > wakeup process, peripherals are put into suspend mode, followed by
> > > task
> > > recovery. However, upon detecting that the hcd is in the
> > > HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status,
> > > causing the
> > > S4 suspend to fail and subsequent task recovery to not proceed.
> > > -
> > > [   27.594598][ 1]  PM: pci_pm_freeze(): hcd_pci_suspend+0x0/0x28
> > > returns -16
> > > [   27.594601][ 1]  PM: dpm_run_callback(): pci_pm_freeze+0x0/0x100
> > > returns -16
> > > [   27.603420][ 1]  ehci-pci 0000:00:04.1: pci_pm_freeze+0x0/0x100
> > > returned 0 after 3 usecs
> > > [   27.612233][ 1]  ehci-pci 0000:00:05.1: pci_pm_freeze+0x0/0x100
> > > returned -16 after 17223 usecs
> > > [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce
> > > async: error -16
> > > [   27.816988][ 1]  PM: quiesce of devices aborted after 1833.282
> > > msecs
> > > [   27.823302][ 1]  PM: start quiesce of devices aborted after
> > > 1839.975 msecs
> > > ......
> > > [   31.303172][ 1]  PM: recover of devices complete after 3473.039
> > > msecs
> > > [   31.309818][ 1]  PM: Failed to load hibernation image,
> > > recovering.
> > > [   31.348188][ 1]  PM: Basic memory bitmaps freed
> > > [   31.352686][ 1]  OOM killer enabled.
> > > [   31.356232][ 1]  Restarting tasks ... done.
> > > [   31.360609][ 1]  PM: resume from hibernation failed (0)
> > > [   31.365800][ 1]  PM: Hibernation image not present or could not
> > > be loaded.
> > > 
> > > The "do_wakeup" is determined based on whether the controller's
> > > power/wakeup attribute is set. The current issue necessitates
> > > considering
> > > the type of suspend that is occurring. If the suspend type is
> > > either
> > > PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be set
> > > to
> > > false.
> > > 
> > > Reported-by: kernel test robot <lkp@intel.com>
> > > Closes:
> > > https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > > Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>
> > 
> > What commit id does this fix?
> 
> The current patch is not intended to fix an issue with a specific
> commit, but rather to address a long-standing problem in the USB core.

So should it be backported to older stable kernels?  If so, how far
back?

> > And I missed where Alan provided a signed-off-by, where was that?
> 
> In the following email, Alan proposed using "Signed-off-by" for
> signing.
> https://lore.kernel.org/all/489805e7-c19c-4b57-9cd7-713e075261cd@rowland.harvard.edu/

Ah, missed that, sorry.

thanks,

greg k-h
Duan Chenghao Oct. 29, 2024, 8:01 a.m. UTC | #5
hi greg k-h,

在 2024-10-29星期二的 04:27 +0100,Greg KH写道:
> On Thu, Oct 24, 2024 at 04:46:48PM +0800, duanchenghao wrote:
> > hi greg k-h,
> > 
> > 在 2024-10-24星期四的 09:05 +0200,Greg KH写道:
> > > On Thu, Oct 24, 2024 at 10:40:38AM +0800, Duan Chenghao wrote:
> > > > When a device is inserted into the USB port and an S4 wakeup is
> > > > initiated,
> > > > after the USB-hub initialization is completed, it will
> > > > automatically enter
> > > > suspend mode. Upon detecting a device on the USB port, it will
> > > > proceed with
> > > > resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state.
> > > > During
> > > > the S4
> > > > wakeup process, peripherals are put into suspend mode, followed
> > > > by
> > > > task
> > > > recovery. However, upon detecting that the hcd is in the
> > > > HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status,
> > > > causing the
> > > > S4 suspend to fail and subsequent task recovery to not proceed.
> > > > -
> > > > [   27.594598][ 1]  PM: pci_pm_freeze():
> > > > hcd_pci_suspend+0x0/0x28
> > > > returns -16
> > > > [   27.594601][ 1]  PM: dpm_run_callback():
> > > > pci_pm_freeze+0x0/0x100
> > > > returns -16
> > > > [   27.603420][ 1]  ehci-pci 0000:00:04.1:
> > > > pci_pm_freeze+0x0/0x100
> > > > returned 0 after 3 usecs
> > > > [   27.612233][ 1]  ehci-pci 0000:00:05.1:
> > > > pci_pm_freeze+0x0/0x100
> > > > returned -16 after 17223 usecs
> > > > [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce
> > > > async: error -16
> > > > [   27.816988][ 1]  PM: quiesce of devices aborted after
> > > > 1833.282
> > > > msecs
> > > > [   27.823302][ 1]  PM: start quiesce of devices aborted after
> > > > 1839.975 msecs
> > > > ......
> > > > [   31.303172][ 1]  PM: recover of devices complete after
> > > > 3473.039
> > > > msecs
> > > > [   31.309818][ 1]  PM: Failed to load hibernation image,
> > > > recovering.
> > > > [   31.348188][ 1]  PM: Basic memory bitmaps freed
> > > > [   31.352686][ 1]  OOM killer enabled.
> > > > [   31.356232][ 1]  Restarting tasks ... done.
> > > > [   31.360609][ 1]  PM: resume from hibernation failed (0)
> > > > [   31.365800][ 1]  PM: Hibernation image not present or could
> > > > not
> > > > be loaded.
> > > > 
> > > > The "do_wakeup" is determined based on whether the controller's
> > > > power/wakeup attribute is set. The current issue necessitates
> > > > considering
> > > > the type of suspend that is occurring. If the suspend type is
> > > > either
> > > > PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be
> > > > set
> > > > to
> > > > false.
> > > > 
> > > > Reported-by: kernel test robot <lkp@intel.com>
> > > > Closes:
> > > > https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> > > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > > > Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>
> > > 
> > > What commit id does this fix?
> > 
> > The current patch is not intended to fix an issue with a specific
> > commit, but rather to address a long-standing problem in the USB
> > core.
> 
> So should it be backported to older stable kernels?  If so, how far
> back?

yes, It needs to be backported. The stable branches such as 6.6.y,
6.10.y, and 6.11.y can be considered for the backport.

Should we backport to these versions?

Thanks 

Duan Chenghao
> 
> > > And I missed where Alan provided a signed-off-by, where was that?
> > 
> > In the following email, Alan proposed using "Signed-off-by" for
> > signing.
> > https://lore.kernel.org/all/489805e7-c19c-4b57-9cd7-713e075261cd@rowland.harvard.edu/
> 
> Ah, missed that, sorry.
> 
> thanks,
> 
> greg k-h
Duan Chenghao Nov. 6, 2024, 1:29 a.m. UTC | #6
Hi Greg k-h & Alan,

Excuse me, both of you. I've noticed that you haven't replied to the
emails for quite some time, which prompted me to send this one. I'd
like to inquire if the proposal in the current email is feasible and if
it can be integrated into the community.

Thanks,

Duan Chenghao

在 2024-10-29星期二的 16:01 +0800,duanchenghao写道:
> hi greg k-h,
> 
> 在 2024-10-29星期二的 04:27 +0100,Greg KH写道:
> > On Thu, Oct 24, 2024 at 04:46:48PM +0800, duanchenghao wrote:
> > > hi greg k-h,
> > > 
> > > 在 2024-10-24星期四的 09:05 +0200,Greg KH写道:
> > > > On Thu, Oct 24, 2024 at 10:40:38AM +0800, Duan Chenghao wrote:
> > > > > When a device is inserted into the USB port and an S4 wakeup
> > > > > is
> > > > > initiated,
> > > > > after the USB-hub initialization is completed, it will
> > > > > automatically enter
> > > > > suspend mode. Upon detecting a device on the USB port, it
> > > > > will
> > > > > proceed with
> > > > > resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state.
> > > > > During
> > > > > the S4
> > > > > wakeup process, peripherals are put into suspend mode,
> > > > > followed
> > > > > by
> > > > > task
> > > > > recovery. However, upon detecting that the hcd is in the
> > > > > HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY
> > > > > status,
> > > > > causing the
> > > > > S4 suspend to fail and subsequent task recovery to not
> > > > > proceed.
> > > > > -
> > > > > [   27.594598][ 1]  PM: pci_pm_freeze():
> > > > > hcd_pci_suspend+0x0/0x28
> > > > > returns -16
> > > > > [   27.594601][ 1]  PM: dpm_run_callback():
> > > > > pci_pm_freeze+0x0/0x100
> > > > > returns -16
> > > > > [   27.603420][ 1]  ehci-pci 0000:00:04.1:
> > > > > pci_pm_freeze+0x0/0x100
> > > > > returned 0 after 3 usecs
> > > > > [   27.612233][ 1]  ehci-pci 0000:00:05.1:
> > > > > pci_pm_freeze+0x0/0x100
> > > > > returned -16 after 17223 usecs
> > > > > [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce
> > > > > async: error -16
> > > > > [   27.816988][ 1]  PM: quiesce of devices aborted after
> > > > > 1833.282
> > > > > msecs
> > > > > [   27.823302][ 1]  PM: start quiesce of devices aborted
> > > > > after
> > > > > 1839.975 msecs
> > > > > ......
> > > > > [   31.303172][ 1]  PM: recover of devices complete after
> > > > > 3473.039
> > > > > msecs
> > > > > [   31.309818][ 1]  PM: Failed to load hibernation image,
> > > > > recovering.
> > > > > [   31.348188][ 1]  PM: Basic memory bitmaps freed
> > > > > [   31.352686][ 1]  OOM killer enabled.
> > > > > [   31.356232][ 1]  Restarting tasks ... done.
> > > > > [   31.360609][ 1]  PM: resume from hibernation failed (0)
> > > > > [   31.365800][ 1]  PM: Hibernation image not present or
> > > > > could
> > > > > not
> > > > > be loaded.
> > > > > 
> > > > > The "do_wakeup" is determined based on whether the
> > > > > controller's
> > > > > power/wakeup attribute is set. The current issue necessitates
> > > > > considering
> > > > > the type of suspend that is occurring. If the suspend type is
> > > > > either
> > > > > PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should
> > > > > be
> > > > > set
> > > > > to
> > > > > false.
> > > > > 
> > > > > Reported-by: kernel test robot <lkp@intel.com>
> > > > > Closes:
> > > > > https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> > > > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > > > > Signed-off-by: Duan Chenghao <duanchenghao@kylinos.cn>
> > > > 
> > > > What commit id does this fix?
> > > 
> > > The current patch is not intended to fix an issue with a specific
> > > commit, but rather to address a long-standing problem in the USB
> > > core.
> > 
> > So should it be backported to older stable kernels?  If so, how far
> > back?
> 
> yes, It needs to be backported. The stable branches such as 6.6.y,
> 6.10.y, and 6.11.y can be considered for the backport.
> 
> Should we backport to these versions?
> 
> Thanks 
> 
> Duan Chenghao
> > 
> > > > And I missed where Alan provided a signed-off-by, where was
> > > > that?
> > > 
> > > In the following email, Alan proposed using "Signed-off-by" for
> > > signing.
> > > https://lore.kernel.org/all/489805e7-c19c-4b57-9cd7-713e075261cd@rowland.harvard.edu/
> > 
> > Ah, missed that, sorry.
> > 
> > thanks,
> > 
> > greg k-h
>
Duan Chenghao Nov. 20, 2024, 2:14 a.m. UTC | #7
Hi Greg KH/Alan,

I haven't received a reply from both of you for a long time. Sorry for
the disturbance. May I ask if the current patch can be merged into the
mainline or stable branch? I need this conclusion as it is very
important to me. Thank you very much.

Thanks
Duan Chenghao 

在 2024-10-29星期二的 04:27 +0100,Greg KH写道:
> On Thu, Oct 24, 2024 at 04:46:48PM +0800, duanchenghao wrote:
> > hi greg k-h,
> > 
> > 在 2024-10-24星期四的 09:05 +0200,Greg KH写道:
> > > On Thu, Oct 24, 2024 at 10:40:38AM +0800, Duan Chenghao wrote:
> > > > When a device is inserted into the USB port and an S4 wakeup is
> > > > initiated,
> > > > after the USB-hub initialization is completed, it will
> > > > automatically enter
> > > > suspend mode. Upon detecting a device on the USB port, it will
> > > > proceed with
> > > > resume and set the hcd to the HCD_FLAG_WAKEUP_PENDING state.
> > > > During
> > > > the S4
> > > > wakeup process, peripherals are put into suspend mode, followed
> > > > by
> > > > task
> > > > recovery. However, upon detecting that the hcd is in the
> > > > HCD_FLAG_WAKEUP_PENDING state, it will return an EBUSY status,
> > > > causing the
> > > > S4 suspend to fail and subsequent task recovery to not proceed.
> > > > -
> > > > [   27.594598][ 1]  PM: pci_pm_freeze():
> > > > hcd_pci_suspend+0x0/0x28
> > > > returns -16
> > > > [   27.594601][ 1]  PM: dpm_run_callback():
> > > > pci_pm_freeze+0x0/0x100
> > > > returns -16
> > > > [   27.603420][ 1]  ehci-pci 0000:00:04.1:
> > > > pci_pm_freeze+0x0/0x100
> > > > returned 0 after 3 usecs
> > > > [   27.612233][ 1]  ehci-pci 0000:00:05.1:
> > > > pci_pm_freeze+0x0/0x100
> > > > returned -16 after 17223 usecs
> > > > [   27.810067][ 1]  PM: Device 0000:00:05.1 failed to quiesce
> > > > async: error -16
> > > > [   27.816988][ 1]  PM: quiesce of devices aborted after
> > > > 1833.282
> > > > msecs
> > > > [   27.823302][ 1]  PM: start quiesce of devices aborted after
> > > > 1839.975 msecs
> > > > ......
> > > > [   31.303172][ 1]  PM: recover of devices complete after
> > > > 3473.039
> > > > msecs
> > > > [   31.309818][ 1]  PM: Failed to load hibernation image,
> > > > recovering.
> > > > [   31.348188][ 1]  PM: Basic memory bitmaps freed
> > > > [   31.352686][ 1]  OOM killer enabled.
> > > > [   31.356232][ 1]  Restarting tasks ... done.
> > > > [   31.360609][ 1]  PM: resume from hibernation failed (0)
> > > > [   31.365800][ 1]  PM: Hibernation image not present or could
> > > > not
> > > > be loaded.
> > > > 
> > > > The "do_wakeup" is determined based on whether the controller's
> > > > power/wakeup attribute is set. The current issue necessitates
> > > > considering
> > > > the type of suspend that is occurring. If the suspend type is
> > > > either
> > > > PM_EVENT_FREEZE or PM_EVENT_QUIESCE, then "do_wakeup" should be
> > > > set
> > > > to
> > > > false.
> > > > 
> > > > Reported-by: kernel test robot <
> > > > lkp@intel.com
> > > > >
> > > > Closes:
> > > > https://lore.kernel.org/oe-kbuild-all/202410151722.rfjtknRz-lkp@intel.com/
> > > > 
> > > > Signed-off-by: Alan Stern <
> > > > stern@rowland.harvard.edu
> > > > >
> > > > Signed-off-by: Duan Chenghao <
> > > > duanchenghao@kylinos.cn
> > > > >
> > > 
> > > What commit id does this fix?
> > 
> > The current patch is not intended to fix an issue with a specific
> > commit, but rather to address a long-standing problem in the USB
> > core.
> 
> So should it be backported to older stable kernels?  If so, how far
> back?
> 
> > > And I missed where Alan provided a signed-off-by, where was that?
> > 
> > In the following email, Alan proposed using "Signed-off-by" for
> > signing.
> > https://lore.kernel.org/all/489805e7-c19c-4b57-9cd7-713e075261cd@rowland.harvard.edu/
> > 
> 
> Ah, missed that, sorry.
> 
> thanks,
> 
> greg k-h
Greg Kroah-Hartman Nov. 21, 2024, 7:09 p.m. UTC | #8
On Wed, Nov 06, 2024 at 09:29:15AM +0800, duanchenghao wrote:
> Hi Greg k-h & Alan,
> 
> Excuse me, both of you. I've noticed that you haven't replied to the
> emails for quite some time, which prompted me to send this one. I'd
> like to inquire if the proposal in the current email is feasible and if
> it can be integrated into the community.

Sorry, I missed this in the last round of patch reviews.  I'll queue it
up after 6.13-rc1 is out as the merge window is closed for adding new
stuff to my trees.

thanks,

greg k-h
diff mbox series

Patch

diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
index a08f3f228e6d..390b1225f3cc 100644
--- a/drivers/usb/core/hcd-pci.c
+++ b/drivers/usb/core/hcd-pci.c
@@ -422,7 +422,12 @@  static int suspend_common(struct device *dev, pm_message_t msg)
 	bool			do_wakeup;
 	int			retval;
 
-	do_wakeup = PMSG_IS_AUTO(msg) ? true : device_may_wakeup(dev);
+	if (PMSG_IS_AUTO(msg))
+		do_wakeup = true;
+	else if (PMSG_NO_WAKEUP(msg))
+		do_wakeup = false;
+	else
+		do_wakeup = device_may_wakeup(dev);
 
 	/* Root hub suspend should have stopped all downstream traffic,
 	 * and all bus master traffic.  And done so for both the interface
@@ -521,6 +526,11 @@  static int hcd_pci_suspend(struct device *dev)
 	return suspend_common(dev, PMSG_SUSPEND);
 }
 
+static int hcd_pci_freeze(struct device *dev)
+{
+	return suspend_common(dev, PMSG_FREEZE);
+}
+
 static int hcd_pci_suspend_noirq(struct device *dev)
 {
 	struct pci_dev		*pci_dev = to_pci_dev(dev);
@@ -590,6 +600,7 @@  static int hcd_pci_restore(struct device *dev)
 #else
 
 #define hcd_pci_suspend		NULL
+#define hcd_pci_freeze			NULL
 #define hcd_pci_suspend_noirq	NULL
 #define hcd_pci_poweroff_late	NULL
 #define hcd_pci_resume_noirq	NULL
@@ -624,7 +635,7 @@  const struct dev_pm_ops usb_hcd_pci_pm_ops = {
 	.suspend_noirq	= hcd_pci_suspend_noirq,
 	.resume_noirq	= hcd_pci_resume_noirq,
 	.resume		= hcd_pci_resume,
-	.freeze		= hcd_pci_suspend,
+	.freeze		= hcd_pci_freeze,
 	.freeze_noirq	= check_root_hub_suspended,
 	.thaw_noirq	= NULL,
 	.thaw		= hcd_pci_resume,
diff --git a/include/linux/pm.h b/include/linux/pm.h
index 97b0e23363c8..2a676b2cb699 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -570,7 +570,8 @@  const struct dev_pm_ops name = { \
 					{ .event = PM_EVENT_AUTO_RESUME, })
 
 #define PMSG_IS_AUTO(msg)	(((msg).event & PM_EVENT_AUTO) != 0)
-
+#define PMSG_NO_WAKEUP(msg)	(((msg).event & \
+				(PM_EVENT_FREEZE | PM_EVENT_QUIESCE)) != 0)
 /*
  * Device run-time power management status.
  *