Message ID | Pine.LNX.4.44L0.1805021334100.1505-100000@iolanthe.rowland.org (mailing list archive) |
---|---|
State | New, archived |
On 02.05.2018 20:52, Alan Stern wrote:
> On Wed, 2 May 2018, Mathias Nyman wrote:
>
>> On 24.04.2018 16:50, Alan Stern wrote:
>>> On Tue, 24 Apr 2018, Mathias Nyman wrote:
>>>
>>>>>>> In this situation, the HCD_WAKEUP_PENDING(hcd) test in
>>>>>>> hcd-pci.c:suspend_common() should prevent the controller from going
>>>>>>> back into D3. The WAKEUP_PENDING bit gets set in
>>>>>>> usb_hcd_resume_root_hub() and it doesn't get cleared until
>>>>>>> hcd_bus_resume() runs.
>>>>>>>
>>>>>>
>>>>>> I think xhci never calls usb_hcd_resume_root_hub() in xhci_resume() in this
>>>>>> specific failing case
>>>>>>
>>>>>> xhci_resume() has a check:
>>>>>>     /* Resume root hubs only when have pending events. */
>>>>>>     status = readl(&xhci->op_regs->status);
>>>>>>     if (status & STS_EINT) {
>>>>>>         usb_hcd_resume_root_hub(xhci->shared_hcd);
>>>>>>         usb_hcd_resume_root_hub(hcd);
>>>>>>     }
>>>>>>
>>>>>> If the check fails, then WAKEUP_PENDING bit is not set, and runtime PM
>>>>>> can suspend host controller again. when xhci driver finally gets to handle the interrupt
>>>>>> the controller may be in D3 already
>>>>>>
>>>>>> This should only happen if xhci_resume() is called before xhci driver sees a pending interrupt,
>>>>>> could be possible as xhci has interrupt moderation enabled.
>>>>>
>>>>> Then maybe that test should be removed. Calling
>>>>> usb_hcd_resume_root_hub() for every wakeup shouldn't be too bad,
>>>>> because there probably are not very many times when the controller gets
>>>>> resumed without the root hub also being resumed.
>>>>>
>>>>
>>>> The check was added to fix system suspend issue on a runtime suspended host:
>>>>
>>>> commit d6236f6d1d885aa19d1cd7317346fe795227a3cc
>>>>
>>>>     xhci: Fix runtime suspended xhci from blocking system suspend.
>>>>
>>>>     The system suspend flow as following:
>>>>     1, Freeze all user processes and kenrel threads.
>>>>
>>>>     2, Try to suspend all devices.
>>>>
>>>>     2.1, If pci device is in RPM suspended state, then pci driver will try
>>>>     to resume it to RPM active state in the prepare stage.
>>>>
>>>>     2.2, xhci_resume function calls usb_hcd_resume_root_hub to queue two
>>>>     workqueue items to resume usb2&usb3 roothub devices.
>>>>
>>>>     2.3, Call suspend callbacks of devices.
>>>>
>>>>     2.3.1, All suspend callbacks of all hcd's children, including
>>>>     roothub devices are called.
>>>>
>>>>     2.3.2, Finally, hcd_pci_suspend callback is called.
>>>>
>>>>     Due to workqueue threads were already frozen in step 1, the workqueue
>>>>     items can't be scheduled, and the roothub devices can't be resumed in
>>>>     this flow. The HCD_FLAG_WAKEUP_PENDING flag which is set in
>>>>     usb_hcd_resume_root_hub won't be cleared. Finally,
>>>>     hcd_pci_suspend will return -EBUSY, and system suspend fails.
>>>
>>> Hmmm. I don't recall seeing this problem occur with ehci-hcd. But
>>> then, I haven't tested it very much recently.
>>>
>>> We could change to a different work queue, one that doesn't get
>>> frozen. But there's no guarantee that the work items would run before
>>> your step 2.3.2.
>>>
>>> Maybe we can avoid step 2.1. I think there have been some recent
>>> changes to the PM code in this area. There may be a flag you can set
>>> that will prevent the PCI core from resuming the host controller.
>>>
>>> Or maybe we can change step 2.3.1, so that the root hub's suspend
>>> callback will first do a resume if the WAKEUP_PENDING flag is set.
>>> That might be the most reliable approach.
>>>
>>
>> I'm not sure I understand the last suggestion, could you open up how it
>> would work?
>
> Here's what I had in mind. See if you think this would work.
>
> Consider choose_wakeup() in core/driver.c. That subroutine gets called
> by usb_suspend() when step 2.3.1 wants to suspend a USB device. We
> could patch it as follows:

Thanks, now I see.

I was able to reproduce system suspend failure of a runtime
suspended host by removing the event check in xhci_ring, and making sure
pm_runtime_resume(&udev->dev) wasn't called in choose_wakeup().

>
> --- usb-4.x.orig/drivers/usb/core/driver.c
> +++ usb-4.x/drivers/usb/core/driver.c
> @@ -1449,11 +1449,21 @@ static void choose_wakeup(struct usb_dev
>  	 */
>  	w = device_may_wakeup(&udev->dev);
>
> -	/* If the device is autosuspended with the wrong wakeup setting,
> +	/*
> +	 * If the device is autosuspended with the wrong wakeup setting,
>  	 * autoresume now so the setting can be changed.
> +	 *
> +	 * Likewise, if the device is an autosuspended root hub and the
> +	 * hcd needs to wake it up before the controller can be suspended,
> +	 * resume it now to clear the WAKEUP_PENDING flag.
>  	 */
> -	if (udev->state == USB_STATE_SUSPENDED && w != udev->do_remote_wakeup)
> -		pm_runtime_resume(&udev->dev);
> +	if (udev->state == USB_STATE_SUSPENDED) {
> +		struct usb_hcd *hcd = bus_to_hcd(udev->bus);
> +
> +		if (w != udev->do_remote_wakeup ||
> +				(!udev->parent && HCD_WAKEUP_PENDING(hcd)))
> +			pm_runtime_resume(&udev->dev);
> +	}
>  	udev->do_remote_wakeup = w;
>  }
>
>

If I only add the:
	if (!udev->parent && HCD_WAKEUP_PENDING(hcd)))
		pm_runtime_resume(&udev->dev);
to choose_wakeup() It still doesn't work.
The HCD_WAKEUP_PENDING(hcd) check is false.

Turns out that the xhci_resume() that ends up setting HCD_FLAG_WAKEUP_PENDING is called
after choose_wakeup() for the roothub.

There's something in the pm functions order that I don't follow here

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03.05.2018 14:37, Mathias Nyman wrote:
> On 02.05.2018 20:52, Alan Stern wrote:
>> On Wed, 2 May 2018, Mathias Nyman wrote:
>>
>>> On 24.04.2018 16:50, Alan Stern wrote:
>>>> On Tue, 24 Apr 2018, Mathias Nyman wrote:
>>>>
>>>>>>>> In this situation, the HCD_WAKEUP_PENDING(hcd) test in
>>>>>>>> hcd-pci.c:suspend_common() should prevent the controller from going
>>>>>>>> back into D3. The WAKEUP_PENDING bit gets set in
>>>>>>>> usb_hcd_resume_root_hub() and it doesn't get cleared until
>>>>>>>> hcd_bus_resume() runs.
>>>>>>>>
>>>>>>>
>>>>>>> I think xhci never calls usb_hcd_resume_root_hub() in xhci_resume() in this
>>>>>>> specific failing case
>>>>>>>
>>>>>>> xhci_resume() has a check:
>>>>>>>     /* Resume root hubs only when have pending events. */
>>>>>>>     status = readl(&xhci->op_regs->status);
>>>>>>>     if (status & STS_EINT) {
>>>>>>>         usb_hcd_resume_root_hub(xhci->shared_hcd);
>>>>>>>         usb_hcd_resume_root_hub(hcd);
>>>>>>>     }
>>>>>>>
>>>>>>> If the check fails, then WAKEUP_PENDING bit is not set, and runtime PM
>>>>>>> can suspend host controller again. when xhci driver finally gets to handle the interrupt
>>>>>>> the controller may be in D3 already
>>>>>>>
>>>>>>> This should only happen if xhci_resume() is called before xhci driver sees a pending interrupt,
>>>>>>> could be possible as xhci has interrupt moderation enabled.
>>>>>>
>>>>>> Then maybe that test should be removed. Calling
>>>>>> usb_hcd_resume_root_hub() for every wakeup shouldn't be too bad,
>>>>>> because there probably are not very many times when the controller gets
>>>>>> resumed without the root hub also being resumed.
>>>>>>
>>>>>
>>>>> The check was added to fix system suspend issue on a runtime suspended host:
>>>>>
>>>>> commit d6236f6d1d885aa19d1cd7317346fe795227a3cc
>>>>>
>>>>>     xhci: Fix runtime suspended xhci from blocking system suspend.
>>>>>     The system suspend flow as following:
>>>>>     1, Freeze all user processes and kenrel threads.
>>>>>     2, Try to suspend all devices.
>>>>>     2.1, If pci device is in RPM suspended state, then pci driver will try
>>>>>     to resume it to RPM active state in the prepare stage.
>>>>>     2.2, xhci_resume function calls usb_hcd_resume_root_hub to queue two
>>>>>     workqueue items to resume usb2&usb3 roothub devices.
>>>>>     2.3, Call suspend callbacks of devices.
>>>>>     2.3.1, All suspend callbacks of all hcd's children, including
>>>>>     roothub devices are called.
>>>>>     2.3.2, Finally, hcd_pci_suspend callback is called.
>>>>>     Due to workqueue threads were already frozen in step 1, the workqueue
>>>>>     items can't be scheduled, and the roothub devices can't be resumed in
>>>>>     this flow. The HCD_FLAG_WAKEUP_PENDING flag which is set in
>>>>>     usb_hcd_resume_root_hub won't be cleared. Finally,
>>>>>     hcd_pci_suspend will return -EBUSY, and system suspend fails.
>>>>
>>>> Hmmm. I don't recall seeing this problem occur with ehci-hcd. But
>>>> then, I haven't tested it very much recently.
>>>>
>>>> We could change to a different work queue, one that doesn't get
>>>> frozen. But there's no guarantee that the work items would run before
>>>> your step 2.3.2.
>>>>
>>>> Maybe we can avoid step 2.1. I think there have been some recent
>>>> changes to the PM code in this area. There may be a flag you can set
>>>> that will prevent the PCI core from resuming the host controller.
>>>>
>>>> Or maybe we can change step 2.3.1, so that the root hub's suspend
>>>> callback will first do a resume if the WAKEUP_PENDING flag is set.
>>>> That might be the most reliable approach.
>>>>
>>>
>>> I'm not sure I understand the last suggestion, could you open up how it
>>> would work?
>>
>> Here's what I had in mind. See if you think this would work.
>>
>> Consider choose_wakeup() in core/driver.c. That subroutine gets called
>> by usb_suspend() when step 2.3.1 wants to suspend a USB device. We
>> could patch it as follows:
>
> Thanks, now I see.
>
> I was able to reproduce system suspend failure of a runtime
> suspended host by removing the event check in xhci_ring, and making sure
> pm_runtime_resume(&udev->dev) wasn't called in choose_wakeup().
>
>>
>> --- usb-4.x.orig/drivers/usb/core/driver.c
>> +++ usb-4.x/drivers/usb/core/driver.c
>> @@ -1449,11 +1449,21 @@ static void choose_wakeup(struct usb_dev
>>  	 */
>>  	w = device_may_wakeup(&udev->dev);
>> -	/* If the device is autosuspended with the wrong wakeup setting,
>> +	/*
>> +	 * If the device is autosuspended with the wrong wakeup setting,
>>  	 * autoresume now so the setting can be changed.
>> +	 *
>> +	 * Likewise, if the device is an autosuspended root hub and the
>> +	 * hcd needs to wake it up before the controller can be suspended,
>> +	 * resume it now to clear the WAKEUP_PENDING flag.
>>  	 */
>> -	if (udev->state == USB_STATE_SUSPENDED && w != udev->do_remote_wakeup)
>> -		pm_runtime_resume(&udev->dev);
>> +	if (udev->state == USB_STATE_SUSPENDED) {
>> +		struct usb_hcd *hcd = bus_to_hcd(udev->bus);
>> +
>> +		if (w != udev->do_remote_wakeup ||
>> +				(!udev->parent && HCD_WAKEUP_PENDING(hcd)))
>> +			pm_runtime_resume(&udev->dev);
>> +	}
>>  	udev->do_remote_wakeup = w;
>>  }
>>
>
> If I only add the:
> 	if (!udev->parent && HCD_WAKEUP_PENDING(hcd)))
> 		pm_runtime_resume(&udev->dev);
> to choose_wakeup() It still doesn't work.
> The HCD_WAKEUP_PENDING(hcd) check is false.
>
> Turns out that the xhci_resume() that ends up setting HCD_FLAG_WAKEUP_PENDING is called
> after choose_wakeup() for the roothub.
>
> There's something in the pm functions order that I don't follow here

Actually I do get the ordering.

When everything is runtime suspended and start system suspend, we first suspend all
the usb devices, including the roothubs calling choose_wakeup() for the roothubs.
No flags are set yet. When pm continues suspending, and tries to suspend the xhci PCI
controller the PCI suspend code notices the device is runtime suspended,
it resumes it -> xhci_resume() -> usb_hcd_resume_root_hub() and WAKEUP_PENDING flag is set.
When PCI code then continues and tries to suspend the pci device it fails because the flag is set.

So checking HCD_WAKEUP_PENDING(hcd) for roothub in choose_wakeup() won't help, it's too early

-Mathias
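The ordering described above can be replayed with a small user-space model. This is only a sketch to make the sequence concrete, not kernel code: the `model_*` names, the `struct model_hcd` fields and `MODEL_EBUSY` are all hypothetical stand-ins, and the model assumes (as in the quoted commit message) that the queued root-hub resume work never runs because the workqueue is frozen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EBUSY 16	/* hypothetical stand-in; the model returns -MODEL_EBUSY */

struct model_hcd {
	bool rpm_suspended;	/* controller is runtime suspended */
	bool wakeup_pending;	/* models HCD_FLAG_WAKEUP_PENDING */
};

/* Step 1: USB devices, root hubs included, are suspended first. */
static void model_choose_wakeup(struct model_hcd *hcd)
{
	assert(hcd != NULL);
	/* The proposed extra check: resume the root hub if a wakeup is pending. */
	if (hcd->wakeup_pending)
		hcd->wakeup_pending = false;	/* a root hub resume would clear it */
}

/* Step 2: the PCI code resumes a runtime-suspended controller. */
static void model_xhci_resume(struct model_hcd *hcd, bool always_resume_roothub)
{
	assert(hcd != NULL);
	hcd->rpm_suspended = false;
	/* Models calling usb_hcd_resume_root_hub() with the event check removed. */
	if (always_resume_roothub)
		hcd->wakeup_pending = true;	/* work is queued but never runs */
}

/* Step 3: suspending the controller fails while the flag is set. */
static int model_hcd_pci_suspend(const struct model_hcd *hcd)
{
	assert(hcd != NULL);
	return hcd->wakeup_pending ? -MODEL_EBUSY : 0;
}

/* Replays the three steps in the order system suspend runs them. */
static int model_system_suspend(bool always_resume_roothub)
{
	struct model_hcd hcd = { .rpm_suspended = true, .wakeup_pending = false };

	model_choose_wakeup(&hcd);	/* flag not set yet, so nothing to do */
	if (hcd.rpm_suspended)
		model_xhci_resume(&hcd, always_resume_roothub);
	return model_hcd_pci_suspend(&hcd);
}
```

In this model, `model_system_suspend(true)` fails because the flag is set only after `model_choose_wakeup()` has already run, while `model_system_suspend(false)` (no root hub resume without a pending event) succeeds.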
On Thu, 3 May 2018, Mathias Nyman wrote:

> When everything is runtime suspended and start system suspend, we first suspend all
> the usb devices, including the roothubs calling choose_wakeup() for the roothubs.
> No flags are set yet. When pm continues suspending, and tries to suspend the xhci PCI
> controller the PCI suspend code notices the device is runtime suspended,
> it resumes it -> xhci_resume() -> usb_hcd_resume_root_hub() and WAKEUP_PENDING flag is set.
> When PCI code then continues and tries to suspend the pci device it fails because the flag is set.

Okay, I get the picture. And I just spent some time going over the
core code and some of the other drivers.

So yes, what I said earlier was wrong. The existing code in
xhci_resume() is more or less correct; it should call
usb_hcd_resume_root_hub() _only_ when there is a pending wakeup request
from the root hub or a downstream device.

Earlier you wrote:

> If the check fails, then WAKEUP_PENDING bit is not set, and runtime PM
> can suspend host controller again. when xhci driver finally gets to handle the interrupt
> the controller may be in D3 already
>
> This should only happen if xhci_resume() is called before xhci driver sees a pending interrupt,
> could be possible as xhci has interrupt moderation enabled.

This is the real problem. You need to make sure that even with
interrupt moderation, if there is a pending wakeup request then you can
detect it properly. In other words, xhci_resume() may need to
explicitly check the root-hub port statuses, because it can't rely on
the interrupt handler to inform it that a wakeup request has been
received.

Does that make sense?

Alan Stern
On 03.05.2018 21:56, Alan Stern wrote:
> On Thu, 3 May 2018, Mathias Nyman wrote:
>
>> When everything is runtime suspended and start system suspend, we first suspend all
>> the usb devices, including the roothubs calling choose_wakeup() for the roothubs.
>> No flags are set yet. When pm continues suspending, and tries to suspend the xhci PCI
>> controller the PCI suspend code notices the device is runtime suspended,
>> it resumes it -> xhci_resume() -> usb_hcd_resume_root_hub() and WAKEUP_PENDING flag is set.
>> When PCI code then continues and tries to suspend the pci device it fails because the flag is set.
>
> Okay, I get the picture. And I just spent some time going over the
> core code and some of the other drivers.
>
> So yes, what I said earlier was wrong. The existing code in
> xhci_resume() is more or less correct; it should call
> usb_hcd_resume_root_hub() _only_ when there is a pending wakeup request
> from the root hub or a downstream device.
>
> Earlier you wrote:
>
>> If the check fails, then WAKEUP_PENDING bit is not set, and runtime PM
>> can suspend host controller again. when xhci driver finally gets to handle the interrupt
>> the controller may be in D3 already
>>
>> This should only happen if xhci_resume() is called before xhci driver sees a pending interrupt,
>> could be possible as xhci has interrupt moderation enabled.
>
> This is the real problem. You need to make sure that even with
> interrupt moderation, if there is a pending wakeup request then you can
> detect it properly. In other words, xhci_resume() may need to
> explicitly check the root-hub port statuses, because it can't rely on
> the interrupt handler to inform it that a wakeup request has been
> received.
>
> Does that make sense?
>

It does, thanks

I'll write a testpatch that checks changes for ports in xhci_resume()

-Mathias
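For illustration only, the kind of check discussed above (look at the root-hub port statuses in addition to the event interrupt bit) might be modelled in user space as follows. This is a sketch under stated assumptions, not the test patch mentioned in the thread: the function name and `MODEL_*` macros are hypothetical, the register values are passed in as plain integers instead of being read with readl(), and the bit positions reflect my reading of the xHCI register layout and should be verified against the specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit layout; verify against the xHCI spec before relying on it. */
#define MODEL_STS_EINT		(1u << 3)	/* USBSTS: event interrupt pending */
#define MODEL_PORT_CHANGE_MASK	(0x7fu << 17)	/* PORTSC change bits, 17..23 */
#define MODEL_PORT_PLS_MASK	(0xfu << 5)	/* PORTSC port link state field */
#define MODEL_PLS_RESUME	(0xfu << 5)	/* link state value for Resume */

/*
 * Return true if a wakeup looks pending: either the event interrupt bit
 * is set, or some root-hub port reports a status change or is signalling
 * resume even though no event has been seen yet.
 */
static bool model_pending_wakeup(uint32_t usbsts, const uint32_t *portsc,
				 size_t num_ports)
{
	size_t i;

	assert(portsc != NULL || num_ports == 0);

	if (usbsts & MODEL_STS_EINT)
		return true;

	for (i = 0; i < num_ports; i++) {
		if (portsc[i] & MODEL_PORT_CHANGE_MASK)
			return true;
		if ((portsc[i] & MODEL_PORT_PLS_MASK) == MODEL_PLS_RESUME)
			return true;
	}
	return false;
}
```

The point of the port loop is that it does not depend on the interrupt having been delivered, so a wakeup request delayed by interrupt moderation would still be noticed at resume time.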
--- usb-4.x.orig/drivers/usb/core/driver.c
+++ usb-4.x/drivers/usb/core/driver.c
@@ -1449,11 +1449,21 @@ static void choose_wakeup(struct usb_dev
 	 */
 	w = device_may_wakeup(&udev->dev);
 
-	/* If the device is autosuspended with the wrong wakeup setting,
+	/*
+	 * If the device is autosuspended with the wrong wakeup setting,
 	 * autoresume now so the setting can be changed.
+	 *
+	 * Likewise, if the device is an autosuspended root hub and the
+	 * hcd needs to wake it up before the controller can be suspended,
+	 * resume it now to clear the WAKEUP_PENDING flag.
 	 */
-	if (udev->state == USB_STATE_SUSPENDED && w != udev->do_remote_wakeup)
-		pm_runtime_resume(&udev->dev);
+	if (udev->state == USB_STATE_SUSPENDED) {
+		struct usb_hcd *hcd = bus_to_hcd(udev->bus);
+
+		if (w != udev->do_remote_wakeup ||
+				(!udev->parent && HCD_WAKEUP_PENDING(hcd)))
+			pm_runtime_resume(&udev->dev);
+	}
 	udev->do_remote_wakeup = w;
 }
 