Message ID | Pine.LNX.4.44L0.2001311037460.1577-100000@iolanthe.rowland.org (mailing list archive) |
---|---|
State | Mainlined |
Commit | 8099f58f1ecddf4f374f4828a3dff8397c7cbd74 |
Headers | show |
Series | USB: hub: Don't record a connect-change event during reset-resume | expand |
On Fri, 31 Jan 2020 10:39:26 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > Paul Zimmerman reports that his USB Bluetooth adapter sometimes > crashes following system resume, when it receives a > Get-Device-Descriptor request while it is busy doing something else. > > Such a request was added by commit a4f55d8b8c14 ("usb: hub: Check > device descriptor before resusciation"). It gets sent when the hub > driver's work thread checks whether a connect-change event on an > enabled port really indicates a new device has been connected, as > opposed to an old device momentarily disconnecting and then > reconnecting (which can happen with xHCI host controllers, since they > automatically enable connected ports). < snip > > Note that performing the unnecessary check is not actually a bug. > Devices are supposed to be able to send descriptors back to the host > even when they are busy doing something else. The underlying cause of > Paul's problem lies in his Bluetooth adapter. Nevertheless, we > shouldn't perform the same check twice in a row -- and as a nice side > benefit, removing the extra check allows the Bluetooth adapter to work > more reliably. Actually, at the time the failure happens, the bluetooth driver is putting the device into a "manufacturer mode" and downloading a firmware patch to the device. So I don't think we can fault the device for not responding to a get-descriptor request at that point. Probably there should be some kind of locking in the driver while that is being done. Nevertheless, your patch makes everything work again, so I think it's "good enough" :) Thanks, Paul
On Fri, 31 Jan 2020, Paul Zimmerman wrote: > > Note that performing the unnecessary check is not actually a bug. > > Devices are supposed to be able to send descriptors back to the host > > even when they are busy doing something else. The underlying cause of > > Paul's problem lies in his Bluetooth adapter. Nevertheless, we > > shouldn't perform the same check twice in a row -- and as a nice side > > benefit, removing the extra check allows the Bluetooth adapter to work > > more reliably. > > Actually, at the time the failure happens, the bluetooth driver is putting > the device into a "manufacturer mode" and downloading a firmware patch to > the device. So I don't think we can fault the device for not responding to > a get-descriptor request at that point. Probably there should be some kind > of locking in the driver while that is being done. Heh. We don't have any locking of that sort in the kernel. Any user at any time can run "lsusb -v" and that program will try to communicate with every attached USB device. There's no way to claim exclusive rights to a device. The fact that firmware loading works at all is more or less a matter of luck (although the odds are with us). > Nevertheless, your patch makes everything work again, so I think it's > "good enough" :) > > Thanks, > Paul You're welcome. Alan Stern
Index: usb-devel/drivers/usb/core/hub.c =================================================================== --- usb-devel.orig/drivers/usb/core/hub.c +++ usb-devel/drivers/usb/core/hub.c @@ -1216,11 +1216,6 @@ static void hub_activate(struct usb_hub #ifdef CONFIG_PM udev->reset_resume = 1; #endif - /* Don't set the change_bits when the device - * was powered off. - */ - if (test_bit(port1, hub->power_bits)) - set_bit(port1, hub->change_bits); } else { /* The power session is gone; tell hub_wq */
Paul Zimmerman reports that his USB Bluetooth adapter sometimes crashes following system resume, when it receives a Get-Device-Descriptor request while it is busy doing something else. Such a request was added by commit a4f55d8b8c14 ("usb: hub: Check device descriptor before resusciation"). It gets sent when the hub driver's work thread checks whether a connect-change event on an enabled port really indicates a new device has been connected, as opposed to an old device momentarily disconnecting and then reconnecting (which can happen with xHCI host controllers, since they automatically enable connected ports). The same kind of thing occurs when a port's power session is lost during system suspend. When the system wakes up it sees a connect-change event on the port, and if the child device's persist_enabled flag was set then hub_activate() sets the device's reset_resume flag as well as the port's bit in hub->change_bits. The reset-resume code then takes responsibility for checking that the same device is still attached to the port, and it does this as part of the device's resume pathway. By the time the hub driver's work thread starts up again, the device has already been fully reinitialized and is busy doing its own thing. There's no need for the work thread to do the same check a second time, and in fact this unnecessary check is what caused the problem that Paul observed. Note that performing the unnecessary check is not actually a bug. Devices are supposed to be able to send descriptors back to the host even when they are busy doing something else. The underlying cause of Paul's problem lies in his Bluetooth adapter. Nevertheless, we shouldn't perform the same check twice in a row -- and as a nice side benefit, removing the extra check allows the Bluetooth adapter to work more reliably. The work thread performs its check when it sees that the port's bit is set in hub->change_bits. In this situation that bit is interpreted as though a connect-change event had occurred on the port _after_ the reset-resume, which is not what actually happened. One possible fix would be to make the reset-resume code clear the port's bit in hub->change_bits. But it seems simpler to just avoid setting the bit during hub_activate() in the first place. That's what this patch does. (Proving that the patch is correct when CONFIG_PM is disabled requires a little thought. In that setting hub_activate() will be called only for initialization and resets, since there won't be any resumes or reset-resumes. During initialization and hub resets the hub doesn't have any child devices, and so this code path never gets executed.) Reported-and-tested-by: Paul Zimmerman <pauldzim@gmail.com> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://marc.info/?t=157949360700001&r=1&w=2 CC: David Heinzelmann <heinzelmann.david@gmail.com> CC: <stable@vger.kernel.org> --- [as1930] drivers/usb/core/hub.c | 5 ----- 1 file changed, 5 deletions(-)