Message ID | Pine.LNX.4.44L0.1106161656190.1697-100000@iolanthe.rowland.org (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Thu, Jun 16, 2011 at 04:57:59PM -0400, Alan Stern wrote: > > > > When I try to suspend my ASUS K53E laptop, the process hangs after > > > > spinning down the disk unless I unload ehci_hcd. This happens on 2.6.38, > > > > 2.6.39 and 3.0-rc3+. Some details are at > > > > https://bugzilla.kernel.org/show_bug.cgi?id=37632 > > > > What additional info can I provide to help fix this if this is possible? > > > Set CONFIG_USB_DEBUG=y and boot with "no_console_suspend" on the > > > command line. What appears on the console when the system is hung? > > PM: Entering mem sleep > > usb usb2: usb auto-resume > > sd blah-blah syncing cache > > ehci_hcd 0000:00:1d.0: resume root hub > > usb 1-1.1: usb suspend > > sd blah-blah stopping disk > > hub 1-1:1.0: hub_suspend > > usb 1-1: unlink qh256-0001/ffff88013b411000 start 1 [1/0 us] > > usb 1-1: usb suspend > > > > > > and that's all. > > Okay, the next step is to try and find out exactly where the ehci > suspend procedure is hanging. The patch below adds a bunch of > printouts at various spots throughout the procedure. Let's see what > shows up. I don't see any of these new messages.
On Fri, 17 Jun 2011, Andrey Rahmatullin wrote: > > Okay, the next step is to try and find out exactly where the ehci > > suspend procedure is hanging. The patch below adds a bunch of > > printouts at various spots throughout the procedure. Let's see what > > shows up. > I don't see any of these new messages. Are you sure the new ehci-hcd.ko module was loaded when you ran the test? Another thing you should do when testing: echo 9 >/proc/sys/kernel/printk (Or equivalently, Alt-SysRq-9 on the keyboard.) I should have mentioned that before sending the test patch. Oh well.... Alan Stern
On Fri, Jun 17, 2011 at 10:14:16AM -0400, Alan Stern wrote: > > > Okay, the next step is to try and find out exactly where the ehci > > > suspend procedure is hanging. The patch below adds a bunch of > > > printouts at various spots throughout the procedure. Let's see what > > > shows up. > > I don't see any of these new messages. > Are you sure the new ehci-hcd.ko module was loaded when you ran the > test? Yes. I'll triple-check though. > Another thing you should do when testing: > > echo 9 >/proc/sys/kernel/printk > > (Or equivalently, Alt-SysRq-9 on the keyboard.) I should have > mentioned that before sending the test patch. Oh well.... I used dmesg -n8.
On Fri, 17 Jun 2011, Andrey Rahmatullin wrote: > On Fri, Jun 17, 2011 at 10:14:16AM -0400, Alan Stern wrote: > > > > Okay, the next step is to try and find out exactly where the ehci > > > > suspend procedure is hanging. The patch below adds a bunch of > > > > printouts at various spots throughout the procedure. Let's see what > > > > shows up. > > > I don't see any of these new messages. > > Are you sure the new ehci-hcd.ko module was loaded when you ran the > > test? > Yes. I'll triple-check though. > > > Another thing you should do when testing: > > > > echo 9 >/proc/sys/kernel/printk > > > > (Or equivalently, Alt-SysRq-9 on the keyboard.) I should have > > mentioned that before sending the test patch. Oh well.... > I used dmesg -n8. Okay, then you must have seen all the verbose PM debugging messages. Which means the last suspend callback to run was for the 1-1 device rather than the usb1 device, right? If so, it means I was looking in the wrong place. 1-1 is the internal rate-matching hub, whereas usb1 is the EHCI controller's root hub. Can you add your own dev_info() statements to the hub_quiesce() and hub_suspend() routines in drivers/usb/core/hub.c, or should I send a patch? Alan Stern
On Fri, Jun 17, 2011 at 11:17:01AM -0400, Alan Stern wrote: > Okay, then you must have seen all the verbose PM debugging messages. > Which means the last suspend callback to run was for the 1-1 device > rather than the usb1 device, right? > > If so, it means I was looking in the wrong place. 1-1 is the internal > rate-matching hub, whereas usb1 is the EHCI controller's root hub. > > Can you add your own dev_info() statements to the hub_quiesce() and > hub_suspend() routines in drivers/usb/core/hub.c, or should I send a > patch? After "usb 1-1.1: usb suspend" I see a full "start hub_suspend, hub_quiesce, rest of hub_suspend" path for 1-1:1.0 (I've used &intf->dev in suspend and hub->intfdev in quiesce as a 1st arg for dev_info()), then "usb 1-1: usb suspend" and after that it hangs.
On Fri, 17 Jun 2011, Andrey Rahmatullin wrote: > On Fri, Jun 17, 2011 at 11:17:01AM -0400, Alan Stern wrote: > > Okay, then you must have seen all the verbose PM debugging messages. > > Which means the last suspend callback to run was for the 1-1 device > > rather than the usb1 device, right? > > > > If so, it means I was looking in the wrong place. 1-1 is the internal > > rate-matching hub, whereas usb1 is the EHCI controller's root hub. > > > > Can you add your own dev_info() statements to the hub_quiesce() and > > hub_suspend() routines in drivers/usb/core/hub.c, or should I send a > > patch? > After "usb 1-1.1: usb suspend" I see a full "start hub_suspend, > hub_quiesce, rest of hub_suspend" path for 1-1:1.0 (I've used &intf->dev > in suspend and hub->intfdev in quiesce as a 1st arg for dev_info()), then > "usb 1-1: usb suspend" and after that it hangs. Ah, okay. I feel stupid; I should have realized that would happen. Anyway, the next place to look is in usb_port_suspend(), also in hub.c. If you want to go a few levels higher as well, usb_port_suspend() is called from generic_suspend() in generic.c. That routine, in turn, is called from usb_suspend_device() in driver.c -- udriver->suspend should always be equal to generic_suspend. Alan Stern
On Fri, Jun 17, 2011 at 12:11:52PM -0400, Alan Stern wrote: > > After "usb 1-1.1: usb suspend" I see a full "start hub_suspend, > > hub_quiesce, rest of hub_suspend" path for 1-1:1.0 (I've used &intf->dev > > in suspend and hub->intfdev in quiesce as a 1st arg for dev_info()), then > > "usb 1-1: usb suspend" and after that it hangs. > Ah, okay. I feel stupid; I should have realized that would happen. > Anyway, the next place to look is in usb_port_suspend(), also in hub.c. After "usb 1-1: usb suspend" I have a message after usb_set_device_state(udev, USB_STATE_SUSPENDED); but not the message after the next line (msleep(10);). This is at the end of usb_port_suspend();
On Sat, 18 Jun 2011, Andrey Rahmatullin wrote: > After "usb 1-1: usb suspend" I have a message after > usb_set_device_state(udev, USB_STATE_SUSPENDED); but not the message after > the next line (msleep(10);). This is at the end of usb_port_suspend(); Heh -- you've just reached the limit of my expertise. I have no idea why msleep(10) would hang during suspend. It sounds like some sort of problem with the clock/timer subsystem, but whatever it is, I won't be able to help you fix it. Maybe Thomas can do more. Alan Stern
On Fri, Jun 17, 2011 at 02:54:41PM -0400, Alan Stern wrote: > > After "usb 1-1: usb suspend" I have a message after > > usb_set_device_state(udev, USB_STATE_SUSPENDED); but not the message after > > the next line (msleep(10);). This is at the end of usb_port_suspend(); > > Heh -- you've just reached the limit of my expertise. I have no idea > why msleep(10) would hang during suspend. It sounds like some sort of > problem with the clock/timer subsystem, but whatever it is, I won't be > able to help you fix it. Maybe Thomas can do more. I've removed that msleep, now I have all proper messages up to the end of generic_suspend() for usb1 (all in a 0.01 second) but the system still hangs after them.
On Sat, 18 Jun 2011, Andrey Rahmatullin wrote: > On Fri, Jun 17, 2011 at 02:54:41PM -0400, Alan Stern wrote: > > > After "usb 1-1: usb suspend" I have a message after > > > usb_set_device_state(udev, USB_STATE_SUSPENDED); but not the message after > > > the next line (msleep(10);). This is at the end of usb_port_suspend(); > > > > Heh -- you've just reached the limit of my expertise. I have no idea > > why msleep(10) would hang during suspend. It sounds like some sort of > > problem with the clock/timer subsystem, but whatever it is, I won't be > > able to help you fix it. Maybe Thomas can do more. > I've removed that msleep, now I have all proper messages up to the end of > generic_suspend() for usb1 (all in a 0.01 second) but the system still > hangs after them. Maybe another msleep() somewhere else in the suspend path is hanging. Alan Stern
Index: usb-3.0/drivers/usb/host/ehci-hub.c =================================================================== --- usb-3.0.orig/drivers/usb/host/ehci-hub.c +++ usb-3.0/drivers/usb/host/ehci-hub.c @@ -208,7 +208,7 @@ static int ehci_bus_suspend (struct usb_ int mask; int changed; - ehci_dbg(ehci, "suspend root hub\n"); + ehci_info(ehci, "suspend root hub\n"); if (time_before (jiffies, ehci->next_statechange)) msleep(5); @@ -216,6 +216,7 @@ static int ehci_bus_suspend (struct usb_ del_timer_sync(&ehci->iaa_watchdog); spin_lock_irq (&ehci->lock); +ehci_info(ehci, "A\n"); /* Once the controller is stopped, port resumes that are already * in progress won't complete. Hence if remote wakeup is enabled @@ -242,6 +243,7 @@ static int ehci_bus_suspend (struct usb_ } ehci->command = ehci_readl(ehci, &ehci->regs->command); ehci_work(ehci); +ehci_info(ehci, "B\n"); /* Unlike other USB host controller types, EHCI doesn't have * any notion of "global" or bus-wide suspend. The driver has @@ -285,6 +287,7 @@ static int ehci_bus_suspend (struct usb_ changed = 1; } } +ehci_info(ehci, "C\n"); if (changed && ehci->has_hostpc) { spin_unlock_irq(&ehci->lock); @@ -306,6 +309,7 @@ static int ehci_bus_suspend (struct usb_ "succeeded" : "failed"); } } +ehci_info(ehci, "D\n"); /* Apparently some devices need a >= 1-uframe delay here */ if (ehci->bus_suspended) @@ -314,6 +318,7 @@ static int ehci_bus_suspend (struct usb_ /* turn off now-idle HC */ ehci_halt (ehci); hcd->state = HC_STATE_SUSPENDED; +ehci_info(ehci, "E\n"); if (ehci->reclaim) end_unlink_async(ehci); @@ -324,6 +329,7 @@ static int ehci_bus_suspend (struct usb_ mask &= ~STS_PCD; ehci_writel(ehci, mask, &ehci->regs->intr_enable); ehci_readl(ehci, &ehci->regs->intr_enable); +ehci_info(ehci, "F\n"); ehci->next_statechange = jiffies + msecs_to_jiffies(10); spin_unlock_irq (&ehci->lock); @@ -332,6 +338,7 @@ static int ehci_bus_suspend (struct usb_ * want, and so we must delete any pending watchdog timer events. */ del_timer_sync(&ehci->watchdog); +ehci_info(ehci, "G\n"); return 0; }