[1/2] Revert "Revert "HID: Fix logitech-dj: missing Unifying device issue""

Message ID 1374153691-25100-1-git-send-email-nlopezcasad@logitech.com (mailing list archive)
State New, archived
Delegated to: Jiri Kosina
Commit Message

Nestor Lopez Casado July 18, 2013, 1:21 p.m. UTC
This reverts commit 8af6c08830b1ae114d1a8b548b1f8b056e068887.

This patch re-adds the workaround introduced by 596264082f10dd4
which was reverted by 8af6c08830b1ae114.

The original patch 596264 was needed to overcome a situation where
the hid-core would drop incoming reports while probe() was being
executed.

This issue was solved by c849a6143bec520af which added
hid_device_io_start() and hid_device_io_stop() that enable a specific
hid driver to opt-in for input reports while its probe() is being
executed.

Commit a9dd22b730857347 modified hid-logitech-dj so as to use the
functionality added to hid-core. Having done that, workaround 596264
was no longer necessary and was reverted by 8af6c08.

We now encounter a different problem that once again thwarts
the Unifying receiver enumeration. The problem is timing and usb controller
dependent. Occasionally the reports sent to the usb receiver to start
the paired devices enumeration fail with -EPIPE and the receiver never
gets to enumerate the paired devices.

With dcd9006b1b053c7b1c the problem was "hidden" as the call to the usb
driver became asynchronous and no one was catching the error from the
failing URB.

As the root cause for this failing SET_REPORT is not understood yet
(possibly a race in the usb controller drivers or a problem with the
Unifying receiver), reintroducing this workaround solves the problem.

Overall what this workaround does is: If an input report from an
unknown device is received, then a (re)enumeration is performed.
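
The behaviour summarised above can be sketched as a small user-space
model. This is only an illustration, not the driver code: the model_*
names, the slot count and the queries_sent counter are hypothetical,
and setting querying_devices to true when a query is issued is an
assumption of the sketch (the diff in this thread only shows the flag
being checked and cleared).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DEVICES 6 /* hypothetical slot count, for this sketch only */

enum model_result {
	MODEL_FORWARDED,     /* report came from a known device */
	MODEL_QUERY_SENT,    /* unknown device: (re)enumeration requested */
	MODEL_QUERY_PENDING  /* unknown device: a query is already in flight */
};

struct model_receiver {
	bool paired[MODEL_MAX_DEVICES + 1]; /* indexed by device_index, 0 unused */
	bool querying_devices;              /* same name as the field in the patch */
	int queries_sent;                   /* bookkeeping for the sketch only */
};

/* An input report arrives for device_index. */
static enum model_result model_input_report(struct model_receiver *rx,
					    size_t device_index)
{
	assert(rx != NULL);
	assert(device_index >= 1 && device_index <= MODEL_MAX_DEVICES);

	if (rx->paired[device_index])
		return MODEL_FORWARDED;
	if (rx->querying_devices)
		return MODEL_QUERY_PENDING;

	/* Assumption of this sketch: mark the query as outstanding here. */
	rx->querying_devices = true;
	rx->queries_sent++;
	return MODEL_QUERY_SENT;
}

/* A "device paired" notification arrives; known devices are left alone. */
static void model_device_paired(struct model_receiver *rx, size_t device_index)
{
	assert(rx != NULL);
	assert(device_index >= 1 && device_index <= MODEL_MAX_DEVICES);

	if (rx->paired[device_index])
		return; /* already known, nothing to reallocate */
	rx->paired[device_index] = true;
}

/* The "device list is empty" notification ends the enumeration. */
static void model_device_list_empty(struct model_receiver *rx)
{
	assert(rx != NULL);
	rx->querying_devices = false;
}
```

In this model a burst of reports from an unknown device triggers a
single enumeration request, and later reports from the same device are
forwarded once its "device paired" notification has been processed.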

related bug:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1194649

Signed-off-by: Nestor Lopez Casado <nlopezcasad@logitech.com>
---
 drivers/hid/hid-logitech-dj.c |   45 +++++++++++++++++++++++++++++++++++++++++
 drivers/hid/hid-logitech-dj.h |    1 +
 2 files changed, 46 insertions(+)

Comments

Peter Hurley July 18, 2013, 8:28 p.m. UTC | #1
[ +cc Sarah Sharp, linux-usb ]

On 07/18/2013 09:21 AM, Nestor Lopez Casado wrote:
> This reverts commit 8af6c08830b1ae114d1a8b548b1f8b056e068887.
>
> This patch re-adds the workaround introduced by 596264082f10dd4
> which was reverted by 8af6c08830b1ae114.
>
> The original patch 596264 was needed to overcome a situation where
> the hid-core would drop incoming reports while probe() was being
> executed.
>
> This issue was solved by c849a6143bec520af which added
> hid_device_io_start() and hid_device_io_stop() that enable a specific
> hid driver to opt-in for input reports while its probe() is being
> executed.
>
> Commit a9dd22b730857347 modified hid-logitech-dj so as to use the
> functionality added to hid-core. Having done that, workaround 596264
> was no longer necessary and was reverted by 8af6c08.
>
> We now encounter a different problem that ends up 'again' thwarting
> the Unifying receiver enumeration. The problem is time and usb controller
> dependent. Ocasionally the reports sent to the usb receiver to start
> the paired devices enumeration fail with -EPIPE and the receiver never
> gets to enumerate the paired devices.
>
> With dcd9006b1b053c7b1c the problem was "hidden" as the call to the usb
> driver became asynchronous and none was catching the error from the
> failing URB.
>
> As the root cause for this failing SET_REPORT is not understood yet,
> -possibly a race on the usb controller drivers or a problem with the
> Unifying receiver- reintroducing this workaround solves the problem.


Before we revert to using the workaround, I'd like to suggest that
this new "hidden" problem may be an interaction with the xhci_hcd host
controller driver only.

Looking at the related bug, the OP indicates the machine only has
USB3 ports. Additionally, comments #7, #100, and #104 of the original
bug report [1] add additional information that would seem to confirm
this suspicion.

Let me add I have this USB device running on the uhci_hcd driver
with or without this workaround on v3.10.


> Overall what this workaround does is: If an input report from an
> unknown device is received, then a (re)enumeration is performed.
>
> related bug:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1194649

I thought I saw someone reporting this problem recently on LKML;
where is the Reported-by so that Sarah can follow-up with the
reporter directly, if desired?

Regards,
Peter Hurley

[1]
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1039143


> Signed-off-by: Nestor Lopez Casado <nlopezcasad@logitech.com>
> ---
>   drivers/hid/hid-logitech-dj.c |   45 +++++++++++++++++++++++++++++++++++++++++
>   drivers/hid/hid-logitech-dj.h |    1 +
>   2 files changed, 46 insertions(+)
>
> diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c
> index db3192b..0d13389 100644
> --- a/drivers/hid/hid-logitech-dj.c
> +++ b/drivers/hid/hid-logitech-dj.c
> @@ -192,6 +192,7 @@ static struct hid_ll_driver logi_dj_ll_driver;
>   static int logi_dj_output_hidraw_report(struct hid_device *hid, u8 * buf,
>   					size_t count,
>   					unsigned char report_type);
> +static int logi_dj_recv_query_paired_devices(struct dj_receiver_dev *djrcv_dev);
>
>   static void logi_dj_recv_destroy_djhid_device(struct dj_receiver_dev *djrcv_dev,
>   						struct dj_report *dj_report)
> @@ -232,6 +233,7 @@ static void logi_dj_recv_add_djhid_device(struct dj_receiver_dev *djrcv_dev,
>   	if (dj_report->report_params[DEVICE_PAIRED_PARAM_SPFUNCTION] &
>   	    SPFUNCTION_DEVICE_LIST_EMPTY) {
>   		dbg_hid("%s: device list is empty\n", __func__);
> +		djrcv_dev->querying_devices = false;
>   		return;
>   	}
>
> @@ -242,6 +244,12 @@ static void logi_dj_recv_add_djhid_device(struct dj_receiver_dev *djrcv_dev,
>   		return;
>   	}
>
> +	if (djrcv_dev->paired_dj_devices[dj_report->device_index]) {
> +		/* The device is already known. No need to reallocate it. */
> +		dbg_hid("%s: device is already known\n", __func__);
> +		return;
> +	}
> +
>   	dj_hiddev = hid_allocate_device();
>   	if (IS_ERR(dj_hiddev)) {
>   		dev_err(&djrcv_hdev->dev, "%s: hid_allocate_device failed\n",
> @@ -305,6 +313,7 @@ static void delayedwork_callback(struct work_struct *work)
>   	struct dj_report dj_report;
>   	unsigned long flags;
>   	int count;
> +	int retval;
>
>   	dbg_hid("%s\n", __func__);
>
> @@ -337,6 +346,25 @@ static void delayedwork_callback(struct work_struct *work)
>   		logi_dj_recv_destroy_djhid_device(djrcv_dev, &dj_report);
>   		break;
>   	default:
> +	/* A normal report (i. e. not belonging to a pair/unpair notification)
> +	 * arriving here, means that the report arrived but we did not have a
> +	 * paired dj_device associated to the report's device_index, this
> +	 * means that the original "device paired" notification corresponding
> +	 * to this dj_device never arrived to this driver. The reason is that
> +	 * hid-core discards all packets coming from a device while probe() is
> +	 * executing. */
> +	if (!djrcv_dev->paired_dj_devices[dj_report.device_index]) {
> +		/* ok, we don't know the device, just re-ask the
> +		 * receiver for the list of connected devices. */
> +		retval = logi_dj_recv_query_paired_devices(djrcv_dev);
> +		if (!retval) {
> +			/* everything went fine, so just leave */
> +			break;
> +		}
> +		dev_err(&djrcv_dev->hdev->dev,
> +			"%s:logi_dj_recv_query_paired_devices "
> +			"error:%d\n", __func__, retval);
> +		}
>   		dbg_hid("%s: unexpected report type\n", __func__);
>   	}
>   }
> @@ -367,6 +395,12 @@ static void logi_dj_recv_forward_null_report(struct dj_receiver_dev *djrcv_dev,
>   	if (!djdev) {
>   		dbg_hid("djrcv_dev->paired_dj_devices[dj_report->device_index]"
>   			" is NULL, index %d\n", dj_report->device_index);
> +		kfifo_in(&djrcv_dev->notif_fifo, dj_report, sizeof(struct dj_report));
> +
> +		if (schedule_work(&djrcv_dev->work) == 0) {
> +			dbg_hid("%s: did not schedule the work item, was already "
> +			"queued\n", __func__);
> +		}
>   		return;
>   	}
>
> @@ -397,6 +431,12 @@ static void logi_dj_recv_forward_report(struct dj_receiver_dev *djrcv_dev,
>   	if (dj_device == NULL) {
>   		dbg_hid("djrcv_dev->paired_dj_devices[dj_report->device_index]"
>   			" is NULL, index %d\n", dj_report->device_index);
> +		kfifo_in(&djrcv_dev->notif_fifo, dj_report, sizeof(struct dj_report));
> +
> +		if (schedule_work(&djrcv_dev->work) == 0) {
> +			dbg_hid("%s: did not schedule the work item, was already "
> +			"queued\n", __func__);
> +		}
>   		return;
>   	}
>
> @@ -444,6 +484,10 @@ static int logi_dj_recv_query_paired_devices(struct dj_receiver_dev *djrcv_dev)
>   	struct dj_report *dj_report;
>   	int retval;
>
> +	/* no need to protect djrcv_dev->querying_devices */
> +	if (djrcv_dev->querying_devices)
> +		return 0;
> +
>   	dj_report = kzalloc(sizeof(struct dj_report), GFP_KERNEL);
>   	if (!dj_report)
>   		return -ENOMEM;
> @@ -455,6 +499,7 @@ static int logi_dj_recv_query_paired_devices(struct dj_receiver_dev *djrcv_dev)
>   	return retval;
>   }
>
> +
>   static int logi_dj_recv_switch_to_dj_mode(struct dj_receiver_dev *djrcv_dev,
>   					  unsigned timeout)
>   {
> diff --git a/drivers/hid/hid-logitech-dj.h b/drivers/hid/hid-logitech-dj.h
> index fd28a5e..4a40003 100644
> --- a/drivers/hid/hid-logitech-dj.h
> +++ b/drivers/hid/hid-logitech-dj.h
> @@ -101,6 +101,7 @@ struct dj_receiver_dev {
>   	struct work_struct work;
>   	struct kfifo notif_fifo;
>   	spinlock_t lock;
> +	bool querying_devices;
>   };
>
>   struct dj_device {
>

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sarah Sharp July 18, 2013, 10:09 p.m. UTC | #2
On Thu, Jul 18, 2013 at 04:28:01PM -0400, Peter Hurley wrote:
> [ +cc Sarah Sharp, linux-usb ]
> 
> On 07/18/2013 09:21 AM, Nestor Lopez Casado wrote:
> >This reverts commit 8af6c08830b1ae114d1a8b548b1f8b056e068887.
> >
> >This patch re-adds the workaround introduced by 596264082f10dd4
> >which was reverted by 8af6c08830b1ae114.
> >
> >The original patch 596264 was needed to overcome a situation where
> >the hid-core would drop incoming reports while probe() was being
> >executed.
> >
> >This issue was solved by c849a6143bec520af which added
> >hid_device_io_start() and hid_device_io_stop() that enable a specific
> >hid driver to opt-in for input reports while its probe() is being
> >executed.
> >
> >Commit a9dd22b730857347 modified hid-logitech-dj so as to use the
> >functionality added to hid-core. Having done that, workaround 596264
> >was no longer necessary and was reverted by 8af6c08.
> >
> >We now encounter a different problem that ends up 'again' thwarting
> >the Unifying receiver enumeration. The problem is time and usb controller
> >dependent. Ocasionally the reports sent to the usb receiver to start
> >the paired devices enumeration fail with -EPIPE and the receiver never
> >gets to enumerate the paired devices.
> >
> >With dcd9006b1b053c7b1c the problem was "hidden" as the call to the usb
> >driver became asynchronous and none was catching the error from the
> >failing URB.
> >
> >As the root cause for this failing SET_REPORT is not understood yet,
> >-possibly a race on the usb controller drivers or a problem with the
> >Unifying receiver- reintroducing this workaround solves the problem.
> 
> 
> Before we revert to using the workaround, I'd like to suggest that
> this new "hidden" problem may be an interaction with the xhci_hcd host
> controller driver only.
> 
> Looking at the related bug, the OP indicates the machine only has
> USB3 ports. Additionally, comments #7, #100, and #104 of the original
> bug report [1] add additional information that would seem to confirm
> this suspicion.

Question: does this USB device need a control transfer to reset its
endpoints when the endpoints are not actually halted?  If so, yes, that
is a known xHCI driver bug that needs to be fixed.  The xHCI host will
not accept a Reset Endpoint command when the endpoints are not actually
halted, but the USB core will send the control transfer to reset the
endpoint.  That means the device and host toggles will be out of sync,
and all messages will start to fail with -EPIPE.

Can the OP capture a usbmon trace when the device starts failing?  That
will reveal whether this actually is the issue.  dmesg output with
CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING turned on would also
be helpful.

Sarah Sharp
Peter Hurley July 18, 2013, 11:37 p.m. UTC | #3
On 07/18/2013 06:09 PM, Sarah Sharp wrote:
> On Thu, Jul 18, 2013 at 04:28:01PM -0400, Peter Hurley wrote:
>> [ +cc Sarah Sharp, linux-usb ]
>>
>> On 07/18/2013 09:21 AM, Nestor Lopez Casado wrote:
>>> This reverts commit 8af6c08830b1ae114d1a8b548b1f8b056e068887.
>>>
>>> This patch re-adds the workaround introduced by 596264082f10dd4
>>> which was reverted by 8af6c08830b1ae114.
>>>
>>> The original patch 596264 was needed to overcome a situation where
>>> the hid-core would drop incoming reports while probe() was being
>>> executed.
>>>
>>> This issue was solved by c849a6143bec520af which added
>>> hid_device_io_start() and hid_device_io_stop() that enable a specific
>>> hid driver to opt-in for input reports while its probe() is being
>>> executed.
>>>
>>> Commit a9dd22b730857347 modified hid-logitech-dj so as to use the
>>> functionality added to hid-core. Having done that, workaround 596264
>>> was no longer necessary and was reverted by 8af6c08.
>>>
>>> We now encounter a different problem that ends up 'again' thwarting
>>> the Unifying receiver enumeration. The problem is time and usb controller
>>> dependent. Ocasionally the reports sent to the usb receiver to start
>>> the paired devices enumeration fail with -EPIPE and the receiver never
>>> gets to enumerate the paired devices.
>>>
>>> With dcd9006b1b053c7b1c the problem was "hidden" as the call to the usb
>>> driver became asynchronous and none was catching the error from the
>>> failing URB.
>>>
>>> As the root cause for this failing SET_REPORT is not understood yet,
>>> -possibly a race on the usb controller drivers or a problem with the
>>> Unifying receiver- reintroducing this workaround solves the problem.
>>
>>
>> Before we revert to using the workaround, I'd like to suggest that
>> this new "hidden" problem may be an interaction with the xhci_hcd host
>> controller driver only.
>>
>> Looking at the related bug, the OP indicates the machine only has
>> USB3 ports. Additionally, comments #7, #100, and #104 of the original
>> bug report [1] add additional information that would seem to confirm
>> this suspicion.
>
> Question: does this USB device need a control transfer to reset its
> endpoints when the endpoints are not actually halted?  If so, yes, that
> is a known xHCI driver bug that needs to be fixed.  The xHCI host will
> not accept a Reset Endpoint command when the endpoints are not actually
> halted, but the USB core will send the control transfer to reset the
> endpoint.  That means the device and host toggles will be out of sync,
> and all messages will start to fail with -EPIPE.
>
> Can the OP capture a usbmon trace when the device starts failing?  That
> will reveal whether this actually is the issue.  dmesg output with
> CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING turned on would also
> be helpful.

Sarah,

I forwarded your usbmon capture request to the OP in the bug report
(I don't have an email address for the reporter).

As far as getting printk output from a custom kernel, I think that may
be beyond the reporter's capability. Perhaps one of the Ubuntu devs
triaging this bug could provide a test kernel for the OP with those
options on.

Joseph, would you be willing to do that?

Regards,
Peter Hurley
Benjamin Tissoires July 19, 2013, 8:35 a.m. UTC | #4
Hi Peter,

thanks for forwarding this to the appropriate people & mailing list.

Hi Sarah,

thanks for starting to investigate this :)

On Fri, Jul 19, 2013 at 1:37 AM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>
>>>
>>>
>>> Before we revert to using the workaround, I'd like to suggest that
>>> this new "hidden" problem may be an interaction with the xhci_hcd host
>>> controller driver only.
>>>
>>> Looking at the related bug, the OP indicates the machine only has
>>> USB3 ports. Additionally, comments #7, #100, and #104 of the original
>>> bug report [1] add additional information that would seem to confirm
>>> this suspicion.

Definitely, this is a USB3 problem. However, it is not generic (I
cannot reproduce it with my USB3 boards.)

>>
>>
>> Question: does this USB device need a control transfer to reset its
>> endpoints when the endpoints are not actually halted?  If so, yes, that
>> is a known xHCI driver bug that needs to be fixed.  The xHCI host will
>> not accept a Reset Endpoint command when the endpoints are not actually
>> halted, but the USB core will send the control transfer to reset the
>> endpoint.  That means the device and host toggles will be out of sync,
>> and all messages will start to fail with -EPIPE.
>>
>> Can the OP capture a usbmon trace when the device starts failing?  That
>> will reveal whether this actually is the issue.  dmesg output with
>> CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING turned on would also
>> be helpful.
>

Here is another linux-input thread where you have the usbmon traces:
http://www.spinics.net/lists/linux-input/msg26542.html
Wujun Zhou already did one test of a kernel patch for me (which did
not solve the problem, because I was not at the USB level), so I bet
he will be able to do some testing for you.

In the logs he posted (logitech_work.pcapng.gz), the interesting part
is starting from the capture #45:

#45: SET_REPORT request to switch the receiver to the "DJ" mode (the
receiver stops sending regular HID events, but goes into its
proprietary protocol)
#47: SET_REPORT response -> all good
#48: SET_REPORT request to ask the receiver to enumerate all of its
devices (it is called right after we received the previous response)
#49: SET_REPORT response -> -EPIPE
#50: URB_INTERRUPT_IN (~3 seconds later) -> the device is working normally

The weird thing is that only the first enumeration message failed with
-EPIPE: the device answers later control transfers correctly (#54 /
#55).

>
> Sarah,
>
> I forwarded your usbmon capture request to the OP in the bug report
> (I don't have an email address for the reporter).
>

Here is some other helpful information:
the first "fix" we made is dcd9006b1b053c7b1c. It is linked to
several bugs:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1072082
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1039143
https://bugzilla.redhat.com/show_bug.cgi?id=840391
https://bugzilla.kernel.org/show_bug.cgi?id=49781

Most of them are people complaining, but in one of the comments,
adding a 500ms wait between the two control transfers (switch to DJ +
enumerate) fixed the -EPIPE problem. I interpreted it as a scheduling
problem (using a direct call to usb_control_msg() vs. using the scheduled
one, usbhid_submit_message()), but it was just delaying the problem out
of the probe. Unfortunately, I missed that as I did not ask for the
usbmon traces at that time.
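
To illustrate why moving from a direct call to a queued one can hide
the failure rather than fix it, here is a minimal user-space sketch.
It is not the usbhid API: model_transfer(), send_sync(), send_async()
and complete_async() are hypothetical names, and the "first enumerate
request fails once with -EPIPE" behaviour is simply modelled on the
trace described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum { REQ_SWITCH_TO_DJ = 1, REQ_ENUMERATE = 2 }; /* labels for this sketch */

struct model_receiver {
	bool first_enumerate_done; /* the modelled failure happens only once */
};

/* Stand-in for the control transfer: the first enumerate request fails. */
static int model_transfer(struct model_receiver *rx, int request)
{
	assert(rx != NULL);
	if (request == REQ_ENUMERATE && !rx->first_enumerate_done) {
		rx->first_enumerate_done = true;
		return -EPIPE;
	}
	return 0;
}

/* Synchronous path: the status is returned to the caller, who can react. */
static int send_sync(struct model_receiver *rx, int request)
{
	return model_transfer(rx, request);
}

/* Asynchronous path: the request is only queued, so the caller sees 0. */
struct model_queue {
	int request;
	bool pending;
	int completion_status; /* only visible if someone looks at it later */
};

static int send_async(struct model_queue *q, int request)
{
	assert(q != NULL);
	q->request = request;
	q->pending = true;
	return 0;
}

/* Runs later, outside the caller's context; the error surfaces here. */
static void complete_async(struct model_queue *q, struct model_receiver *rx)
{
	assert(q != NULL);
	if (!q->pending)
		return;
	q->completion_status = model_transfer(rx, q->request);
	q->pending = false;
}
```

In the synchronous case the caller gets -EPIPE back and can retry or
report it; in the queued case the submit call succeeds and the failure
is only recorded at completion time, where it goes unnoticed unless
the completion status is checked.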

One last thing, I understood that Linus is also experiencing this
problem... Adding him in CC to let him know of the progress.

Cheers,
Benjamin
Joseph Salisbury July 19, 2013, 2:38 p.m. UTC | #5
On 07/18/2013 07:37 PM, Peter Hurley wrote:
> On 07/18/2013 06:09 PM, Sarah Sharp wrote:
>> On Thu, Jul 18, 2013 at 04:28:01PM -0400, Peter Hurley wrote:
>>> [ +cc Sarah Sharp, linux-usb ]
>>>
>>> On 07/18/2013 09:21 AM, Nestor Lopez Casado wrote:
>>>> This reverts commit 8af6c08830b1ae114d1a8b548b1f8b056e068887.
>>>>
>>>> This patch re-adds the workaround introduced by 596264082f10dd4
>>>> which was reverted by 8af6c08830b1ae114.
>>>>
>>>> The original patch 596264 was needed to overcome a situation where
>>>> the hid-core would drop incoming reports while probe() was being
>>>> executed.
>>>>
>>>> This issue was solved by c849a6143bec520af which added
>>>> hid_device_io_start() and hid_device_io_stop() that enable a specific
>>>> hid driver to opt-in for input reports while its probe() is being
>>>> executed.
>>>>
>>>> Commit a9dd22b730857347 modified hid-logitech-dj so as to use the
>>>> functionality added to hid-core. Having done that, workaround 596264
>>>> was no longer necessary and was reverted by 8af6c08.
>>>>
>>>> We now encounter a different problem that ends up 'again' thwarting
>>>> the Unifying receiver enumeration. The problem is time and usb
>>>> controller
>>>> dependent. Ocasionally the reports sent to the usb receiver to start
>>>> the paired devices enumeration fail with -EPIPE and the receiver never
>>>> gets to enumerate the paired devices.
>>>>
>>>> With dcd9006b1b053c7b1c the problem was "hidden" as the call to the
>>>> usb
>>>> driver became asynchronous and none was catching the error from the
>>>> failing URB.
>>>>
>>>> As the root cause for this failing SET_REPORT is not understood yet,
>>>> -possibly a race on the usb controller drivers or a problem with the
>>>> Unifying receiver- reintroducing this workaround solves the problem.
>>>
>>>
>>> Before we revert to using the workaround, I'd like to suggest that
>>> this new "hidden" problem may be an interaction with the xhci_hcd host
>>> controller driver only.
>>>
>>> Looking at the related bug, the OP indicates the machine only has
>>> USB3 ports. Additionally, comments #7, #100, and #104 of the original
>>> bug report [1] add additional information that would seem to confirm
>>> this suspicion.
>>
>> Question: does this USB device need a control transfer to reset its
>> endpoints when the endpoints are not actually halted?  If so, yes, that
>> is a known xHCI driver bug that needs to be fixed.  The xHCI host will
>> not accept a Reset Endpoint command when the endpoints are not actually
>> halted, but the USB core will send the control transfer to reset the
>> endpoint.  That means the device and host toggles will be out of sync,
>> and all messages will start to fail with -EPIPE.
>>
>> Can the OP capture a usbmon trace when the device starts failing?  That
>> will reveal whether this actually is the issue.  dmesg output with
>> CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING turned on would also
>> be helpful.
>
> Sarah,
>
> I forwarded your usbmon capture request to the OP in the bug report
> (I don't have an email address for the reporter).
>
> As far as getting printk output from a custom kernel, I think that may
> be beyond the reporter's capability. Perhaps one of the Ubuntu devs
> triaging this bug could provide a test kernel for the OP with those
> options on.
>
> Joseph, would you be willing to do that?

Sure thing.  I'll build a kernel and request that the bug reporter
collect usbmon data.

Thanks everyone for the feedback!

>
> Regards,
> Peter Hurley

Alan Stern July 19, 2013, 3:14 p.m. UTC | #6
On Thu, 18 Jul 2013, Sarah Sharp wrote:

> Question: does this USB device need a control transfer to reset its
> endpoints when the endpoints are not actually halted?  If so, yes, that
> is a known xHCI driver bug that needs to be fixed.  The xHCI host will
> not accept a Reset Endpoint command when the endpoints are not actually
> halted, but the USB core will send the control transfer to reset the
> endpoint.  That means the device and host toggles will be out of sync,
> and all messages will start to fail with -EPIPE.

Why -EPIPE?  Isn't that code reserved to indicate a STALL?

In fact, there's no way to detect a toggle mismatch problem with a 
USB-2 device.  Packets with the wrong toggle value are simply ignored 
(or acknowledged and ignored).  The problem ends up appearing as an 
indefinite delay (for an IN transfer) or as data lost (for an OUT 
transfer).

I don't know what happens with USB-3 devices when the sequence numbers 
get out of alignment.
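
The USB-2 behaviour described above for an OUT transfer can be
sketched as a toy model. This is an illustration under stated
assumptions, not controller or driver code: the struct and function
names are hypothetical, and the model only covers the case where the
device resets its toggle to DATA0 (as it does on
ClearFeature(ENDPOINT_HALT)) while the host keeps its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toggle_model {
	unsigned host;   /* toggle the host sends next (0 = DATA0, 1 = DATA1) */
	unsigned device; /* toggle the device expects next */
	int delivered;   /* payloads accepted by the device */
	int lost;        /* payloads acknowledged but discarded */
};

/* Device side of ClearFeature(ENDPOINT_HALT): back to DATA0. */
static void device_clear_halt(struct toggle_model *t)
{
	assert(t != NULL);
	t->device = 0;
}

/* What the host side needs to do as well to stay in sync. */
static void host_reset_toggle(struct toggle_model *t)
{
	assert(t != NULL);
	t->host = 0;
}

/* One OUT transaction; returns true if the payload was accepted. */
static bool out_transaction(struct toggle_model *t)
{
	bool accepted;

	assert(t != NULL);
	accepted = (t->host == t->device);
	if (accepted) {
		t->device ^= 1u;
		t->delivered++;
	} else {
		/* Looks like a retransmission: acknowledged, then ignored. */
		t->lost++;
	}
	/* The host sees an ACK in both cases and advances its toggle. */
	t->host ^= 1u;
	return accepted;
}
```

In this model a one-sided reset costs exactly one silently dropped
payload and no error status is reported to the sender, which matches
the point that a toggle mismatch by itself does not explain -EPIPE.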

Alan Stern

Nestor Lopez Casado July 19, 2013, 4:43 p.m. UTC | #7
Hi Sarah,

On Fri, Jul 19, 2013 at 12:09 AM, Sarah Sharp
<sarah.a.sharp@linux.intel.com> wrote:
> On Thu, Jul 18, 2013 at 04:28:01PM -0400, Peter Hurley wrote:
>> [ +cc Sarah Sharp, linux-usb ]
>>
>> On 07/18/2013 09:21 AM, Nestor Lopez Casado wrote:
>> >This reverts commit 8af6c08830b1ae114d1a8b548b1f8b056e068887.
>> >
>> >This patch re-adds the workaround introduced by 596264082f10dd4
>> >which was reverted by 8af6c08830b1ae114.
>> >
>> >The original patch 596264 was needed to overcome a situation where
>> >the hid-core would drop incoming reports while probe() was being
>> >executed.
>> >
>> >This issue was solved by c849a6143bec520af which added
>> >hid_device_io_start() and hid_device_io_stop() that enable a specific
>> >hid driver to opt-in for input reports while its probe() is being
>> >executed.
>> >
>> >Commit a9dd22b730857347 modified hid-logitech-dj so as to use the
>> >functionality added to hid-core. Having done that, workaround 596264
>> >was no longer necessary and was reverted by 8af6c08.
>> >
>> >We now encounter a different problem that ends up 'again' thwarting
>> >the Unifying receiver enumeration. The problem is time and usb controller
>> >dependent. Ocasionally the reports sent to the usb receiver to start
>> >the paired devices enumeration fail with -EPIPE and the receiver never
>> >gets to enumerate the paired devices.
>> >
>> >With dcd9006b1b053c7b1c the problem was "hidden" as the call to the usb
>> >driver became asynchronous and none was catching the error from the
>> >failing URB.
>> >
>> >As the root cause for this failing SET_REPORT is not understood yet,
>> >-possibly a race on the usb controller drivers or a problem with the
>> >Unifying receiver- reintroducing this workaround solves the problem.
>>
>>
>> Before we revert to using the workaround, I'd like to suggest that
>> this new "hidden" problem may be an interaction with the xhci_hcd host
>> controller driver only.
>>
>> Looking at the related bug, the OP indicates the machine only has
>> USB3 ports. Additionally, comments #7, #100, and #104 of the original
>> bug report [1] add additional information that would seem to confirm
>> this suspicion.
>
> Question: does this USB device need a control transfer to reset its
> endpoints when the endpoints are not actually halted?  If so, yes, that
> is a known xHCI driver bug that needs to be fixed.  The xHCI host will
> not accept a Reset Endpoint command when the endpoints are not actually
> halted, but the USB core will send the control transfer to reset the
> endpoint.  That means the device and host toggles will be out of sync,
> and all messages will start to fail with -EPIPE.

Could you re-phrase your question providing a bit more detail? I don't
quite get the idea.

>
> Can the OP capture a usbmon trace when the device starts failing?  That
> will reveal whether this actually is the issue.  dmesg output with
> CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING turned on would also
> be helpful.
>
> Sarah Sharp

Cheers,
Nestor
Jiri Kosina July 22, 2013, 11:44 a.m. UTC | #8
On Fri, 19 Jul 2013, Peter Hurley wrote:

> > > As far as getting printk output from a custom kernel, I think that may
> > > be beyond the reporter's capability. Perhaps one of the Ubuntu devs
> > > triaging this bug could provide a test kernel for the OP with those
> > > options on.
> > > 
> > > Joseph, would you be willing to do that?
> > 
> > Sure thing.  I'll build a kernel and request that the bug reporter
> > collect usbmon data.
> 
> Thanks Joseph for building the test kernel and getting it to the reporter!
> 
> Sarah,
> 
> I've attached the dmesg capture supplied by the original reporter on
> a 3.10 custom kernel w/ the kbuild options you requested.
> 
> It seems as if your initial suspicion is correct:
> 
> [   46.785490] xhci_hcd 0000:00:14.0: Endpoint 0x81 not halted, refusing to
> reset.
> [   46.785493] xhci_hcd 0000:00:14.0: Endpoint 0x82 not halted, refusing to
> reset.
> [   46.785496] xhci_hcd 0000:00:14.0: Endpoint 0x83 not halted, refusing to
> reset.
> [   46.785952] xhci_hcd 0000:00:14.0: Waiting for status stage event
> 
> At this point, would you recommend proceeding with the workaround or
> waiting for an xHCI bug fix?

Thanks for your efforts.

Seems like this might be a part of the picture, but not a complete one. 
Linus claims to have a similar problem, but his receiver is not connected 
through xHCI (I got this as an off-list report, so can't really provide a 
pointer to ML archives).

Thanks,
Peter Hurley July 22, 2013, 2:03 p.m. UTC | #9
On 07/22/2013 07:44 AM, Jiri Kosina wrote:
> On Fri, 19 Jul 2013, Peter Hurley wrote:
>
>>>> As far as getting printk output from a custom kernel, I think that may
>>>> be beyond the reporter's capability. Perhaps one of the Ubuntu devs
>>>> triaging this bug could provide a test kernel for the OP with those
>>>> options on.
>>>>
>>>> Joseph, would you be willing to do that?
>>>
>>> Sure thing.  I'll build a kernel and request that the bug reporter
>>> collect usbmon data.
>>
>> Thanks Joseph for building the test kernel and getting it to the reporter!
>>
>> Sarah,
>>
>> I've attached the dmesg capture supplied by the original reporter on
>> a 3.10 custom kernel w/ the kbuild options you requested.
>>
>> It seems as if your initial suspicion is correct:
>>
>> [   46.785490] xhci_hcd 0000:00:14.0: Endpoint 0x81 not halted, refusing to
>> reset.
>> [   46.785493] xhci_hcd 0000:00:14.0: Endpoint 0x82 not halted, refusing to
>> reset.
>> [   46.785496] xhci_hcd 0000:00:14.0: Endpoint 0x83 not halted, refusing to
>> reset.
>> [   46.785952] xhci_hcd 0000:00:14.0: Waiting for status stage event
>>
>> At this point, would you recommend proceeding with the workaround or
>> waiting for an xHCI bug fix?
>
> Thanks for your efforts.
>
> Seems like this might be a part of the picture, but not a complete one.
> Linus claims to have similar problem, but his receiver is not connected
> through xHCI (I got this as an off-list report, so can't really provide a
> pointer to ML archives).

Ah, ok. I wasn't aware of that. I'll assume then that the necessary
people have the complete picture.

Regards,
Peter Hurley


Jiri Kosina July 22, 2013, 2:35 p.m. UTC | #10
On Thu, 18 Jul 2013, Nestor Lopez Casado wrote:

> This reverts commit 8af6c08830b1ae114d1a8b548b1f8b056e068887.
> 
> This patch re-adds the workaround introduced by 596264082f10dd4
> which was reverted by 8af6c08830b1ae114.
> 
> The original patch 596264 was needed to overcome a situation where
> the hid-core would drop incoming reports while probe() was being
> executed.
> 
> This issue was solved by c849a6143bec520af which added
> hid_device_io_start() and hid_device_io_stop() that enable a specific
> hid driver to opt-in for input reports while its probe() is being
> executed.
> 
> Commit a9dd22b730857347 modified hid-logitech-dj so as to use the
> functionality added to hid-core. Having done that, workaround 596264
> was no longer necessary and was reverted by 8af6c08.
> 
> We now encounter a different problem that ends up 'again' thwarting
> the Unifying receiver enumeration. The problem is timing and usb controller
> dependent. Occasionally the reports sent to the usb receiver to start
> the paired devices enumeration fail with -EPIPE and the receiver never
> gets to enumerate the paired devices.
[ ... snip ... ]

Ok, this is now in

	git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid.git for-3.11/logitech-enumeration-fix

Linus added to CC.

Linus -- could you please by any chance test whether the two patches in 
that branch make the problem you are observing any better? (and no, this 
is not a pull request yet).

It's still not clear whether we are chasing two different issues here, or 
not.

Thanks,
Alan Stern July 22, 2013, 3:27 p.m. UTC | #11
On Mon, 22 Jul 2013, Peter Hurley wrote:

> On 07/22/2013 07:44 AM, Jiri Kosina wrote:
...
> > Seems like this might be a part of the picture, but not a complete one.
> > Linus claims to have a similar problem, but his receiver is not connected
> > through xHCI (I got this as an off-list report, so can't really provide a
> > pointer to ML archives).
> 
> Ah, ok. I wasn't aware of that. I'll assume then that the necessary
> people have the complete picture.

It might help to get a usbmon trace and dmesg log from Linus, if he
still has this problem.

Alan Stern

Linus Torvalds July 22, 2013, 7:21 p.m. UTC | #12
On Mon, Jul 22, 2013 at 7:35 AM, Jiri Kosina <jkosina@suse.cz> wrote:
>
> Linus -- could you please by any chance test whether the two patches in
> that branch make the problem you are observing any better? (and no, this
> is not a pull request yet).
>
> It's still not clear whether we are chasing two different issues here, or
> not.

I think it's two different issues. It sounds like the USB3 one is
fairly repeatable?

Mine is quite rare. I think it's so far happened just once during the
3.11 merge window+, and that's despite me rebooting usually a few
times a day (less now that things are calming down).

On Mon, Jul 22, 2013 at 8:27 AM, Alan Stern wrote:
>
> It might help to get a usbmon trace and dmesg log from Linus, if he
> still has this problem.

So see above. I've sent the dmesg - not that it's useful, since it
just lacks the lines showing the actual wireless device after plugging
it in (but the receiver is recognized). No error messages, just not
any working keyboard.

And it's not common. I think it has happened a handful of times over
the last four months or so.

                  Linus
Peter Wu Aug. 12, 2013, 9:54 p.m. UTC | #13
On Thursday 18 July 2013 16:28:01 Peter Hurley wrote:
> Before we revert to using the workaround, I'd like to suggest that
> this new "hidden" problem may be an interaction with the xhci_hcd host
> controller driver only.
> 
> Looking at the related bug, the OP indicates the machine only has
> USB3 ports. Additionally, comments #7, #100, and #104 of the original
> bug report add additional information that would seem to confirm
> this suspicion.
> 
> Let me add I have this USB device running on the uhci_hcd driver
> with or without this workaround on v3.10.

This problem does not seem specific to xhci; uhci also seems affected. Today I
upgraded a system (running Arch Linux) from kernel 3.9.9 to 3.10.5. After a 
reboot to 3.10.5, things broke. The setup:

- There are two USB receivers plugged into USB 1.1 ports (different buses 
according to lsusb, uhci), each receiver is paired to a K360 keyboard.
- One of the receivers is passed to a QEMU guest with -usbdevice
host:$busid.$devid. This keyboard is working (probably because QEMU performed
a reset).
- Since 3.10.5, the keyboard that is *not* passed to the QEMU guest is not 
functioning on reboot.

After closing the QEMU guest, the USB bus gets reset(?), after which the other
keyboard suddenly gets detected. I had only booted 3.10.5 twice before rolling
back to 3.9.9; both boots triggered the issue. Do I need to provide a usbmon,
lsusb, dmesg and/or other details from 3.10.5?


Note that there are other Arch Linux users who have reported issues[1][2] 
since upgrading to 3.10.z. Triggering a re-enumeration by writing the magic 
HID++ message[3] makes the paired devices appear again (as reported in 
forums[1], I haven't tried this on the affected UHCI machine).

While the underlying bug is fixed, can this patch be forwarded to stable? I see 
that 3.10.6 has been released, but still without this patch.

Regards,
Peter

 [1]: https://bbs.archlinux.org/viewtopic.php?id=167210
 [2]: https://bugs.archlinux.org/task/35991
 [3]: https://bbs.archlinux.org/viewtopic.php?pid=1309535#p1309535
Peter Hurley Aug. 13, 2013, 12:13 p.m. UTC | #14
On 08/12/2013 05:54 PM, Peter Wu wrote:
> On Thursday 18 July 2013 16:28:01 Peter Hurley wrote:
>> Before we revert to using the workaround, I'd like to suggest that
>> this new "hidden" problem may be an interaction with the xhci_hcd host
>> controller driver only.
>>
>> Looking at the related bug, the OP indicates the machine only has
>> USB3 ports. Additionally, comments #7, #100, and #104 of the original
>> bug report add additional information that would seem to confirm
>> this suspicion.
>>
>> Let me add I have this USB device running on the uhci_hcd driver
>> with or without this workaround on v3.10.
>
> This problem does not seem specific to xhci; uhci also seems affected.

If true, it would certainly help to have a bug report confirming uhci
failure from a bare-metal system which contained:
1) kernel version
2) complete dmesg output
3) lsusb -v output
4) lsmod output
5) usbmon capture from a plug attempt

> Today I
> upgraded a system (running Arch Linux) from kernel 3.9.9 to 3.10.5. After a
> reboot to 3.10.5, things broke. The setup:
>
> - There are two USB receivers plugged into USB 1.1 ports (different buses
> according to lsusb, uhci), each receiver is paired to a K360 keyboard.
> - One of the receivers is passed to a QEMU guest with -usbdevice
> host:$busid.$devid. This keyboard is working (probably because QEMU performed
> a reset).
> - Since 3.10.5, the keyboard that is *not* passed to the QEMU guest is not
> functioning on reboot.
>
> After closing the QEMU guest, the USB bus gets reset(?) after which the other
> keyboard suddenly gets detected. I had only booted 3.10.5 twice before rolling
> back to 3.9.9, both boots triggered the issue. Do I need to provide a usbmon,
> lsusb, dmesg and/ or other details from 3.10.5?

Do both keyboards work on bare metal? Seems like this problem might be
specific to qemu (or kvm) and you may get more insight on those lists.

> Note that there are other Arch Linux users who have reported issues[1][2]

Unfortunately, not even one user in the referenced reports identified
the usb hub the receiver was plugged into.

> since upgrading to 3.10.z. Triggering a re-enumeration by writing the magic
> HID++ message[3] makes the paired devices appear again (as reported in
> forums[1], I haven't tried this on the affected UHCI machine).
>
> While the underlying bug is fixed, can this patch be forwarded to stable? I see
> that 3.10.6 has been released, but still without this patch.

This is still a workaround and not really a fix for the underlying bug.

> Regards,
> Peter
>
>   [1]: https://bbs.archlinux.org/viewtopic.php?id=167210
>   [2]: https://bugs.archlinux.org/task/35991
>   [3]: https://bbs.archlinux.org/viewtopic.php?pid=1309535#p1309535


Peter Wu Aug. 13, 2013, 3:42 p.m. UTC | #15
On Tuesday 13 August 2013 08:13:17 Peter Hurley wrote:
> On 08/12/2013 05:54 PM, Peter Wu wrote:
> > On Thursday 18 July 2013 16:28:01 Peter Hurley wrote:
> >> Before we revert to using the workaround, I'd like to suggest that
> >> this new "hidden" problem may be an interaction with the xhci_hcd host
> >> controller driver only.
> >> 
> >> Looking at the related bug, the OP indicates the machine only has
> >> USB3 ports. Additionally, comments #7, #100, and #104 of the original
> >> bug report add additional information that would seem to confirm
> >> this suspicion.
> >> 
> >> Let me add I have this USB device running on the uhci_hcd driver
> >> with or without this workaround on v3.10.
> > 
> > This problem does not seem specific to xhci; uhci also seems affected.
> 
> If true, it would certainly help to have a bug report confirming uhci
> failure from a bare-metal system which contained:
> 1) kernel version
> 2) complete dmesg output
> 3) lsusb -v output
> 4) lsmod output
> 5) usbmon capture from a plug attempt

I was too fast in drawing a conclusion: besides the kernel, I also upgraded
some other packages. Today the issue also showed up in 3.9.9 + updated
packages.

When checking the dmesg, the issue solved by this patch did not occur (the 
enumeration was successful).

> > Today I upgraded a system (running Arch Linux) from kernel 3.9.9 to 3.10.5.
> > After a reboot to 3.10.5, things broke. The setup:
> > 
> > - There are two USB receivers plugged into USB 1.1 ports (different buses
> > according to lsusb, uhci), each receiver is paired to a K360 keyboard.
> > - One of the receivers is passed to a QEMU guest with -usbdevice
> > host:$busid.$devid. This keyboard is working (probably because QEMU
> > performed a reset).
> > - Since 3.10.5, the keyboard that is *not* passed to the QEMU guest is not
> > functioning on reboot.
> > 
> > After closing the QEMU guest, the USB bus gets reset(?) after which the
> > other keyboard suddenly gets detected. I had only booted 3.10.5 twice
> > before rolling back to 3.9.9, both boots triggered the issue. Do I need
> > to provide a usbmon, lsusb, dmesg and/ or other details from 3.10.5?
> 
> Do both keyboards work on bare metal? Seems like this problem might be
> specific to qemu (or kvm) and you may get more insight on those lists.

I haven't tested that; the system automatically boots into openbox + QEMU.
Previously, both keyboards worked on bare metal, so I think they still work.

> > Note that there are other Arch Linux users who have reported issues[1][2]
> 
> Unfortunately, not even one user in the referenced reports identified
> the usb hub the receiver was plugged into.

I've asked them now.

> > since upgrading to 3.10.z. Triggering a re-enumeration by writing the magic
> > HID++ message[3] makes the paired devices appear again (as reported in
> > forums[1], I haven't tried this on the affected UHCI machine).
> > 
> > While the underlying bug is fixed, can this patch be forwarded to stable?
> > I see that 3.10.6 has been released, but still without this patch.
> 
> This is still a workaround and not really a fix for the underlying bug.

I meant to say, "while the underlying bug is *being* fixed". Anyway, can this 
patch be applied to 3.10?

Sorry for the confusion with uhci; looking further, it seems that the wrong USB
receiver is being passed to QEMU. It's not a kernel issue, perhaps I can blame
libusbx.

Regards,
Peter
Peter Hurley Aug. 13, 2013, 4:34 p.m. UTC | #16
On 08/13/2013 11:42 AM, Peter Wu wrote:
> On Tuesday 13 August 2013 08:13:17 Peter Hurley wrote:
>> On 08/12/2013 05:54 PM, Peter Wu wrote:
>>> On Thursday 18 July 2013 16:28:01 Peter Hurley wrote:
>>>> Before we revert to using the workaround, I'd like to suggest that
>>>> this new "hidden" problem may be an interaction with the xhci_hcd host
>>>> controller driver only.
>>>>
>>>> Looking at the related bug, the OP indicates the machine only has
>>>> USB3 ports. Additionally, comments #7, #100, and #104 of the original
>>>> bug report add additional information that would seem to confirm
>>>> this suspicion.
>>>>
>>>> Let me add I have this USB device running on the uhci_hcd driver
>>>> with or without this workaround on v3.10.
>>>
>>> This problem does not seem specific to xhci; uhci also seems affected.
>>
>> If true, it would certainly help to have a bug report confirming uhci
>> failure from a bare-metal system which contained:
>> 1) kernel version
>> 2) complete dmesg output
>> 3) lsusb -v output
>> 4) lsmod output
>> 5) usbmon capture from a plug attempt
>
> I was too fast in drawing a conclusion, besides the kernel I also upgraded
> some other packages. Today the issue also showed up in 3.9.9 + updated
> packages.
>
> When checking the dmesg, the issue solved by this patch did not occur (the
> enumeration was successful).

Thanks for double-checking.

>>> Today I upgraded a system (running Arch Linux) from kernel 3.9.9 to 3.10.5.
>>> After a reboot to 3.10.5, things broke. The setup:
>>>
>>> - There are two USB receivers plugged into USB 1.1 ports (different buses
>>> according to lsusb, uhci), each receiver is paired to a K360 keyboard.
>>> - One of the receivers is passed to a QEMU guest with -usbdevice
>>> host:$busid.$devid. This keyboard is working (probably because QEMU
>>> performed a reset).
>>> - Since 3.10.5, the keyboard that is *not* passed to the QEMU guest is not
>>> functioning on reboot.
>>>
>>> After closing the QEMU guest, the USB bus gets reset(?) after which the
>>> other keyboard suddenly gets detected. I had only booted 3.10.5 twice
>>> before rolling back to 3.9.9, both boots triggered the issue. Do I need
>>> to provide a usbmon, lsusb, dmesg and/ or other details from 3.10.5?
>>
>> Do both keyboards work on bare metal? Seems like this problem might be
>> specific to qemu (or kvm) and you may get more insight on those lists.
>
> I haven't tested that, the system automatically boots into openbox + QEMU.
> Previously, both keyboards worked on bare metal, so I think it still works.
>
>>> Note that there are other Arch Linux users who have reported issues[1][2]
>>
>> Unfortunately, not even one user in the referenced reports identified
>> the usb hub the receiver was plugged into.
>
> I've asked it now.

Thanks. And if someone has a uhci failure, filing a new bug on
bugzilla.kernel.org _and_ posting to linux-usb@ and
linux-input@vger.kernel.org improves the chances of the right
people seeing the problem. [xhci already has several reports plus I
subscribed to the kernel bug linked from the ArchLinux bug report.]

>>> since upgrading to 3.10.z. Triggering a re-enumeration by writing the magic
>>> HID++ message[3] makes the paired devices appear again (as reported in
>>> forums[1], I haven't tried this on the affected UHCI machine).
>>>
>>> While the underlying bug is fixed, can this patch be forwarded to stable?
>>> I see that 3.10.6 has been released, but still without this patch.
>>
>> This is still a workaround and not really a fix for the underlying bug.
>
> I meant to say, "while the underlying bug is *being* fixed". Anyway, can this
> patch be applied to 3.10?

That's really Jiri's call. TBH, I can't blame him for wanting to shake this
out in 3.11 before pushing to stable.

> Sorry for the confusion with uhci, looking further it seems that the wrong USB
> receiver is being passed to QEMU. It's not a kernel issue, perhaps I can blame
> libusbx.

No apologies necessary.

Regards,
Peter Hurley

Patch

diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c
index db3192b..0d13389 100644
--- a/drivers/hid/hid-logitech-dj.c
+++ b/drivers/hid/hid-logitech-dj.c
@@ -192,6 +192,7 @@  static struct hid_ll_driver logi_dj_ll_driver;
 static int logi_dj_output_hidraw_report(struct hid_device *hid, u8 * buf,
 					size_t count,
 					unsigned char report_type);
+static int logi_dj_recv_query_paired_devices(struct dj_receiver_dev *djrcv_dev);
 
 static void logi_dj_recv_destroy_djhid_device(struct dj_receiver_dev *djrcv_dev,
 						struct dj_report *dj_report)
@@ -232,6 +233,7 @@  static void logi_dj_recv_add_djhid_device(struct dj_receiver_dev *djrcv_dev,
 	if (dj_report->report_params[DEVICE_PAIRED_PARAM_SPFUNCTION] &
 	    SPFUNCTION_DEVICE_LIST_EMPTY) {
 		dbg_hid("%s: device list is empty\n", __func__);
+		djrcv_dev->querying_devices = false;
 		return;
 	}
 
@@ -242,6 +244,12 @@  static void logi_dj_recv_add_djhid_device(struct dj_receiver_dev *djrcv_dev,
 		return;
 	}
 
+	if (djrcv_dev->paired_dj_devices[dj_report->device_index]) {
+		/* The device is already known. No need to reallocate it. */
+		dbg_hid("%s: device is already known\n", __func__);
+		return;
+	}
+
 	dj_hiddev = hid_allocate_device();
 	if (IS_ERR(dj_hiddev)) {
 		dev_err(&djrcv_hdev->dev, "%s: hid_allocate_device failed\n",
@@ -305,6 +313,7 @@  static void delayedwork_callback(struct work_struct *work)
 	struct dj_report dj_report;
 	unsigned long flags;
 	int count;
+	int retval;
 
 	dbg_hid("%s\n", __func__);
 
@@ -337,6 +346,25 @@  static void delayedwork_callback(struct work_struct *work)
 		logi_dj_recv_destroy_djhid_device(djrcv_dev, &dj_report);
 		break;
 	default:
+	/* A normal report (i. e. not belonging to a pair/unpair notification)
+	 * arriving here, means that the report arrived but we did not have a
+	 * paired dj_device associated to the report's device_index, this
+	 * means that the original "device paired" notification corresponding
+	 * to this dj_device never arrived to this driver. The reason is that
+	 * hid-core discards all packets coming from a device while probe() is
+	 * executing. */
+	if (!djrcv_dev->paired_dj_devices[dj_report.device_index]) {
+		/* ok, we don't know the device, just re-ask the
+		 * receiver for the list of connected devices. */
+		retval = logi_dj_recv_query_paired_devices(djrcv_dev);
+		if (!retval) {
+			/* everything went fine, so just leave */
+			break;
+		}
+		dev_err(&djrcv_dev->hdev->dev,
+			"%s:logi_dj_recv_query_paired_devices "
+			"error:%d\n", __func__, retval);
+		}
 		dbg_hid("%s: unexpected report type\n", __func__);
 	}
 }
@@ -367,6 +395,12 @@  static void logi_dj_recv_forward_null_report(struct dj_receiver_dev *djrcv_dev,
 	if (!djdev) {
 		dbg_hid("djrcv_dev->paired_dj_devices[dj_report->device_index]"
 			" is NULL, index %d\n", dj_report->device_index);
+		kfifo_in(&djrcv_dev->notif_fifo, dj_report, sizeof(struct dj_report));
+
+		if (schedule_work(&djrcv_dev->work) == 0) {
+			dbg_hid("%s: did not schedule the work item, was already "
+			"queued\n", __func__);
+		}
 		return;
 	}
 
@@ -397,6 +431,12 @@  static void logi_dj_recv_forward_report(struct dj_receiver_dev *djrcv_dev,
 	if (dj_device == NULL) {
 		dbg_hid("djrcv_dev->paired_dj_devices[dj_report->device_index]"
 			" is NULL, index %d\n", dj_report->device_index);
+		kfifo_in(&djrcv_dev->notif_fifo, dj_report, sizeof(struct dj_report));
+
+		if (schedule_work(&djrcv_dev->work) == 0) {
+			dbg_hid("%s: did not schedule the work item, was already "
+			"queued\n", __func__);
+		}
 		return;
 	}
 
@@ -444,6 +484,10 @@  static int logi_dj_recv_query_paired_devices(struct dj_receiver_dev *djrcv_dev)
 	struct dj_report *dj_report;
 	int retval;
 
+	/* no need to protect djrcv_dev->querying_devices */
+	if (djrcv_dev->querying_devices)
+		return 0;
+
 	dj_report = kzalloc(sizeof(struct dj_report), GFP_KERNEL);
 	if (!dj_report)
 		return -ENOMEM;
@@ -455,6 +499,7 @@  static int logi_dj_recv_query_paired_devices(struct dj_receiver_dev *djrcv_dev)
 	return retval;
 }
 
+
 static int logi_dj_recv_switch_to_dj_mode(struct dj_receiver_dev *djrcv_dev,
 					  unsigned timeout)
 {
diff --git a/drivers/hid/hid-logitech-dj.h b/drivers/hid/hid-logitech-dj.h
index fd28a5e..4a40003 100644
--- a/drivers/hid/hid-logitech-dj.h
+++ b/drivers/hid/hid-logitech-dj.h
@@ -101,6 +101,7 @@  struct dj_receiver_dev {
 	struct work_struct work;
 	struct kfifo notif_fifo;
 	spinlock_t lock;
+	bool querying_devices;
 };
 
 struct dj_device {
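
Editorial sketch (not part of the patch): the control flow of the workaround
above can be modelled in plain userspace C. This is only a toy under stated
assumptions. Every toy_* name and TOY_MAX_DEVICES are hypothetical, the
"query" just bumps a counter instead of sending a report to the receiver, and
the querying_devices flag from the patch is deliberately left out. It shows
the two rules the patch adds: an input report from an unknown device index
triggers a (re)enumeration request, and a "device paired" notification for an
already-known index does not reallocate anything.

```c
#include <stdbool.h>

#define TOY_MAX_DEVICES 6	/* hypothetical slot count, not taken from the driver */

struct toy_receiver {
	bool paired[TOY_MAX_DEVICES + 1];	/* slot 0 is unused */
	int queries_sent;			/* device-list requests issued */
	int reports_forwarded;			/* reports passed on to a known device */
};

/* Stand-in for logi_dj_recv_query_paired_devices(): only counts the request. */
static void toy_query_paired_devices(struct toy_receiver *r)
{
	r->queries_sent++;
}

/* "Device paired" notification: fill the slot only if it is still empty. */
static void toy_device_paired(struct toy_receiver *r, int index)
{
	if (index < 1 || index > TOY_MAX_DEVICES)
		return;
	if (r->paired[index])
		return;	/* already known, nothing to reallocate */
	r->paired[index] = true;
}

/* Input report: forward it if the device is known, otherwise re-enumerate. */
static void toy_input_report(struct toy_receiver *r, int index)
{
	if (index < 1 || index > TOY_MAX_DEVICES)
		return;
	if (!r->paired[index]) {
		toy_query_paired_devices(r);
		return;
	}
	r->reports_forwarded++;
}

/* One pass through the failure case the thread describes. */
static void toy_run_scenario(struct toy_receiver *r)
{
	toy_input_report(r, 1);		/* unknown device: triggers a query */
	toy_device_paired(r, 1);	/* receiver answers the query */
	toy_device_paired(r, 1);	/* duplicate notification is ignored */
	toy_input_report(r, 1);		/* device is now known: forwarded */
}
```

Calling toy_run_scenario() on a zero-initialised struct toy_receiver leaves
queries_sent at 1 and reports_forwarded at 1: the first report is spent on
triggering the re-enumeration, and later reports go through once the paired
notification has been seen.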