diff mbox

uas: Limit qdepth at the scsi-host level

Message ID 1458377952-12567-1-git-send-email-hdegoede@redhat.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Hans de Goede March 19, 2016, 8:59 a.m. UTC
Commit 64d513ac31bd ("scsi: use host wide tags by default") causes
the scsi-core to queue more cmnds then we can handle on devices with
multiple LUNs, limit the qdepth at the scsi-host level instead of
per slave to fix this.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1315013
Cc: stable@vger.kernel.org # 4.4.x and 4.5.x
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 drivers/usb/storage/uas.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Sergei Shtylyov March 19, 2016, 3:52 p.m. UTC | #1
Hello.

On 3/19/2016 11:59 AM, Hans de Goede wrote:

> Commit 64d513ac31bd ("scsi: use host wide tags by default") causes
> the scsi-core to queue more cmnds then we can handle on devices with

    SCSI core? Commands?

> multiple LUNs, limit the qdepth at the scsi-host level instead of

    Queue depth?

> per slave to fix this.
>
> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1315013
> Cc: stable@vger.kernel.org # 4.4.x and 4.5.x
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>   drivers/usb/storage/uas.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
> index c90a7e4..b5cb7ab 100644
> --- a/drivers/usb/storage/uas.c
> +++ b/drivers/usb/storage/uas.c
[...]
> @@ -932,6 +931,12 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
>   	if (result)
>   		goto set_alt0;
>
> +	/*
> +	 * 1 tag is reserved for untagged commands +
> +	 * 1 tag to avoid of by one errors in some bridge firmwares

    Off by one.

[...]

MBR, Sergei

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley March 19, 2016, 5:06 p.m. UTC | #2
On Sat, 2016-03-19 at 09:59 +0100, Hans de Goede wrote:
> Commit 64d513ac31bd ("scsi: use host wide tags by default") causes
> the scsi-core to queue more cmnds then we can handle on devices with
> multiple LUNs, limit the qdepth at the scsi-host level instead of
> per slave to fix this.

Help me understand this bug a bit more.  Are you saying that the commit
you identify is causing the block layer to queue more commands than
you've set the per-lun limit to?  In which case we have a serious
problem for more than just UAS.  Or are you saying that UAS always had
a global command limit, but it just didn't get set correctly; however,
it mostly worked until the above commit exposed the problem?


James

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hans de Goede March 20, 2016, 1:28 p.m. UTC | #3
Hi,

On 19-03-16 18:06, James Bottomley wrote:
> On Sat, 2016-03-19 at 09:59 +0100, Hans de Goede wrote:
>> Commit 64d513ac31bd ("scsi: use host wide tags by default") causes
>> the scsi-core to queue more cmnds then we can handle on devices with
>> multiple LUNs, limit the qdepth at the scsi-host level instead of
>> per slave to fix this.
>
> Help me understand this bug a bit more.  Are you saying that the commit
> you identify is causing the block layer to queue more commands than
> you've set the per-lun limit to?  In which case we have a serious
> problem for more than just UAS.  Or are you saying that UAS always had
> a global command limit, but it just didn't get set correctly; however,
> it mostly worked until the above commit exposed the problem?

The latter. UAS has always had a global command limit, which so far
was enforced via a shared tag map rather then setting can_queue
correctly. The identified commit removed the usage of a shared
tag map which causes the core to queue more commands *in total*
then the global limit. I've no reason to believe that the core was
exceeding the per-lun limit on a single lun, but for uas exceeding
the limit when counting all queued commands per lun combined is just
as bad, since it really always was a global limit.

This theory is supported by the only bug report for 4.4 being a
(rare) multi-lun uas disk enclosure.

Regards,

Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig March 21, 2016, 2:36 p.m. UTC | #4
On Sat, Mar 19, 2016 at 10:06:16AM -0700, James Bottomley wrote:
> Help me understand this bug a bit more.  Are you saying that the commit
> you identify is causing the block layer to queue more commands than
> you've set the per-lun limit to?  In which case we have a serious
> problem for more than just UAS.  Or are you saying that UAS always had
> a global command limit, but it just didn't get set correctly; however,
> it mostly worked until the above commit exposed the problem?

I did mishandle uas in my patch to make the host wide tags the default
and didn't notice that uas set a different limit for the host wide
tag limit and ->can_queue.  With the old code we'd apply the lower
of the two limits, which was the host wide tag limit, but my patch
removed that one and now purely relies on can_queue.

Now most uas devices only have a single lun, and the per-lun limit
was set to what the global limit should be as well, so most users
didn't really notice this.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig March 21, 2016, 2:36 p.m. UTC | #5
On Sat, Mar 19, 2016 at 09:59:12AM +0100, Hans de Goede wrote:
> Commit 64d513ac31bd ("scsi: use host wide tags by default") causes
> the scsi-core to queue more cmnds then we can handle on devices with
> multiple LUNs, limit the qdepth at the scsi-host level instead of
> per slave to fix this.
> 
> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1315013
> Cc: stable@vger.kernel.org # 4.4.x and 4.5.x
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index c90a7e4..b5cb7ab 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -800,7 +800,6 @@  static int uas_slave_configure(struct scsi_device *sdev)
 	if (devinfo->flags & US_FL_BROKEN_FUA)
 		sdev->broken_fua = 1;
 
-	scsi_change_queue_depth(sdev, devinfo->qdepth - 2);
 	return 0;
 }
 
@@ -932,6 +931,12 @@  static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
 	if (result)
 		goto set_alt0;
 
+	/*
+	 * 1 tag is reserved for untagged commands +
+	 * 1 tag to avoid of by one errors in some bridge firmwares
+	 */
+	shost->can_queue = devinfo->qdepth - 2;
+
 	usb_set_intfdata(intf, shost);
 	result = scsi_add_host(shost, &intf->dev);
 	if (result)