diff mbox series

[v3] USB: HID: random timeout failures tackle try.

Message ID 20200204131555.25534-1-lja@iki.fi (mailing list archive)
State Superseded
Headers show
Series [v3] USB: HID: random timeout failures tackle try. | expand

Commit Message

Lauri Jakku Feb. 4, 2020, 1:15 p.m. UTC
There is multiple reports of random behaviour of USB HID devices.

I have mouse that acts sometimes quite randomly, I debugged with
logs others have published: that there is HW timeouts that leave
device in state that it is errorneus.

To fix this, I introduce retry mechanism in root of USB HID drivers.

Fix does not slow down operations at all if there is no -ETIMEDOUT
got from control message sending.

If there is one, then sleep 20ms and try again. Retry count is 20
witch translates maximium of 400ms before giving up. If the 400ms
boundary is reached the HW is really bad.

JUST to be clear:
    This does not make USB HID devices to sleep anymore than
    before, if all is golden and no timeouts are got.

Why modify usb-hid-core: No need to modify driver by driver.
At this time i do not know all the USB HID devices that timeouts,
but what i've researched, there are issues.

Timeout given is divided by 100, but taken care that it is always
at least 10ms.

so total time in common worst-case-scenario is:

 sleep of 20ms + common timeout divided by 100 (50ms) makes
 70ms per loop, 20 loops => 1.4sec .

Signed-off-by: Lauri Jakku <lja@iki.fi>
---
 drivers/usb/core/message.c | 55 ++++++++++++++++++++++++++++++++++----
 1 file changed, 50 insertions(+), 5 deletions(-)

Comments

Greg Kroah-Hartman Feb. 4, 2020, 1:53 p.m. UTC | #1
On Tue, Feb 04, 2020 at 03:15:56PM +0200, Lauri Jakku wrote:
> There is multiple reports of random behaviour of USB HID devices.
> 
> I have mouse that acts sometimes quite randomly, I debugged with
> logs others have published: that there is HW timeouts that leave
> device in state that it is errorneus.
> 
> To fix this, I introduce retry mechanism in root of USB HID drivers.
> 
> Fix does not slow down operations at all if there is no -ETIMEDOUT
> got from control message sending.
> 
> If there is one, then sleep 20ms and try again. Retry count is 20
> witch translates maximium of 400ms before giving up. If the 400ms
> boundary is reached the HW is really bad.
> 
> JUST to be clear:
>     This does not make USB HID devices to sleep anymore than
>     before, if all is golden and no timeouts are got.
> 
> Why modify usb-hid-core: No need to modify driver by driver.
> At this time i do not know all the USB HID devices that timeouts,
> but what i've researched, there are issues.
> 
> Timeout given is divided by 100, but taken care that it is always
> at least 10ms.
> 
> so total time in common worst-case-scenario is:
> 
>  sleep of 20ms + common timeout divided by 100 (50ms) makes
>  70ms per loop, 20 loops => 1.4sec .
> 
> Signed-off-by: Lauri Jakku <lja@iki.fi>
> ---
>  drivers/usb/core/message.c | 55 ++++++++++++++++++++++++++++++++++----
>  1 file changed, 50 insertions(+), 5 deletions(-)

What changed from v1 and v2?

That always has to be described below the --- line, as documented, to
give us a chance to understand what is happening here and why this is a
new version.

Please fix up and send a v4.

thanks,

greg k-h
kernel test robot Feb. 4, 2020, 4:52 p.m. UTC | #2
Hi Lauri,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on usb/usb-testing]
[also build test ERROR on v5.5 next-20200204]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Lauri-Jakku/USB-HID-random-timeout-failures-tackle-try/20200204-232348
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing
config: x86_64-defconfig (attached as .config)
compiler: gcc-7 (Debian 7.5.0-3) 7.5.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/usb/core/message.c: In function 'usb_control_msg':
>> drivers/usb/core/message.c:173:11: error: type defaults to 'int' in declaration of 'timeout_happened' [-Werror=implicit-int]
       static timeout_happened = 0;
              ^~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +173 drivers/usb/core/message.c

   108	
   109	/**
   110	 * usb_control_msg - Builds a control urb, sends it off and waits for completion
   111	 * @dev: pointer to the usb device to send the message to
   112	 * @pipe: endpoint "pipe" to send the message to
   113	 * @request: USB message request value
   114	 * @requesttype: USB message request type value
   115	 * @value: USB message value
   116	 * @index: USB message index value
   117	 * @data: pointer to the data to send
   118	 * @size: length in bytes of the data to send
   119	 * @timeout: time in msecs to wait for the message to complete before timing
   120	 *	out (if 0 the wait is forever)
   121	 *
   122	 * Context: !in_interrupt ()
   123	 *
   124	 * This function sends a simple control message to a specified endpoint and
   125	 * waits for the message to complete, or timeout.
   126	 *
   127	 * Don't use this function from within an interrupt context. If you need
   128	 * an asynchronous message, or need to send a message from within interrupt
   129	 * context, use usb_submit_urb(). If a thread in your driver uses this call,
   130	 * make sure your disconnect() method can wait for it to complete. Since you
   131	 * don't have a handle on the URB used, you can't cancel the request.
   132	 *
   133	 * Return: If successful, the number of bytes transferred. Otherwise, a negative
   134	 * error number.
   135	 */
   136	int usb_control_msg(struct usb_device *dev, unsigned int pipe, __u8 request,
   137			    __u8 requesttype, __u16 value, __u16 index, void *data,
   138			    __u16 size, int timeout)
   139	{
   140		struct usb_ctrlrequest *dr;
   141		int ret = -ETIMEDOUT;
   142	
   143		/* retry_cnt * 20ms, max retry time set to 400ms */
   144		int retry_cnt = 20;
   145	
   146		dr = kmalloc(sizeof(struct usb_ctrlrequest), GFP_NOIO);
   147		if (!dr)
   148			return -ENOMEM;
   149	
   150		dr->bRequestType = requesttype;
   151		dr->bRequest = request;
   152		dr->wValue = cpu_to_le16(value);
   153		dr->wIndex = cpu_to_le16(index);
   154		dr->wLength = cpu_to_le16(size);
   155	
   156		do {
   157			ret = usb_internal_control_msg(dev,
   158						pipe,
   159						dr,
   160						data,
   161						size,
   162						timeout);
   163	
   164			/*
   165			 * Linger a bit, prior to the next control message
   166			 * or if return value is timeout, but do try few
   167			 * times (max 400ms) before quitting. Adapt timeout
   168			 * to be smaller when we have timeout'd first time.
   169			 */
   170			if (dev->quirks & USB_QUIRK_DELAY_CTRL_MSG)
   171				msleep(200);
   172			else if (ret == -ETIMEDOUT) {
 > 173				static timeout_happened = 0;
   174	
   175				if ( ! timeout_happened ) {
   176					timeout_happened = 1;
   177	
   178					/* 
   179					 * If timeout is given, divide it
   180					 * by 100, if not, put 10ms timeout.
   181					 * 
   182					 * Then safeguard: if timeout is under
   183					 * 10ms, make timeout to be 10ms.
   184					 */
   185	
   186					if (timeout > 0)
   187						timeout /= 100;
   188					else
   189						timeout = 10;
   190	
   191					if (timeout < 10)
   192						timeout = 10;
   193	
   194				}
   195	
   196				msleep(20);
   197			}
   198	
   199			/* Loop while timeout, max loops: retry_cnt times. */
   200		} while ((retry_cnt-- > 0) && (ret == -ETIMEDOUT));
   201	
   202	
   203		kfree(dr);
   204	
   205		return ret;
   206	}
   207	EXPORT_SYMBOL_GPL(usb_control_msg);
   208	

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation
diff mbox series

Patch

diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 5adf489428aa..614c762989ab 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -20,6 +20,7 @@ 
 #include <linux/usb/hcd.h>	/* for usbcore internals */
 #include <linux/usb/of.h>
 #include <asm/byteorder.h>
+#include <linux/errno.h>
 
 #include "usb.h"
 
@@ -137,7 +138,10 @@  int usb_control_msg(struct usb_device *dev, unsigned int pipe, __u8 request,
 		    __u16 size, int timeout)
 {
 	struct usb_ctrlrequest *dr;
-	int ret;
+	int ret = -ETIMEDOUT;
+
+	/* retry_cnt * 20ms, max retry time set to 400ms */
+	int retry_cnt = 20;
 
 	dr = kmalloc(sizeof(struct usb_ctrlrequest), GFP_NOIO);
 	if (!dr)
@@ -149,11 +153,52 @@  int usb_control_msg(struct usb_device *dev, unsigned int pipe, __u8 request,
 	dr->wIndex = cpu_to_le16(index);
 	dr->wLength = cpu_to_le16(size);
 
-	ret = usb_internal_control_msg(dev, pipe, dr, data, size, timeout);
+	do {
+		ret = usb_internal_control_msg(dev,
+					pipe,
+					dr,
+					data,
+					size,
+					timeout);
+
+		/*
+		 * Linger a bit, prior to the next control message
+		 * or if return value is timeout, but do try few
+		 * times (max 400ms) before quitting. Adapt timeout
+		 * to be smaller when we have timeout'd first time.
+		 */
+		if (dev->quirks & USB_QUIRK_DELAY_CTRL_MSG)
+			msleep(200);
+		else if (ret == -ETIMEDOUT) {
+			static timeout_happened = 0;
+
+			if ( ! timeout_happened ) {
+				timeout_happened = 1;
+
+				/* 
+				 * If timeout is given, divide it
+				 * by 100, if not, put 10ms timeout.
+				 * 
+				 * Then safeguard: if timeout is under
+				 * 10ms, make timeout to be 10ms.
+				 */
+
+				if (timeout > 0)
+					timeout /= 100;
+				else
+					timeout = 10;
+
+				if (timeout < 10)
+					timeout = 10;
+
+			}
+
+			msleep(20);
+		}
+
+		/* Loop while timeout, max loops: retry_cnt times. */
+	} while ((retry_cnt-- > 0) && (ret == -ETIMEDOUT));
 
-	/* Linger a bit, prior to the next control message. */
-	if (dev->quirks & USB_QUIRK_DELAY_CTRL_MSG)
-		msleep(200);
 
 	kfree(dr);