diff mbox

scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS

Message ID 20180115201852.15152-1-mwilck@suse.com (mailing list archive)
State Changes Requested
Headers show

Commit Message

Martin Wilck Jan. 15, 2018, 8:18 p.m. UTC
On Fujitsu ETERNUS systems, sense code ABORTED COMMAND with ASC/Q C1/01
is used to indicate temporary condition where the storage-internal path
to a target is switched from one controller to another. SCSI commands
that return with this error code must be retried unconditionally (i.e. without
the "maybe_retry" logic in scsi_decide_disposition); otherwise dm-multipath
might initiate a failover from a healthy path e.g. for REQ_FAILFAST_DEV
commands.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 drivers/scsi/scsi_error.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Martin K. Petersen Jan. 17, 2018, 6:15 a.m. UTC | #1
Martin,

> +		if (!strncmp(scmd->device->vendor, "FUJITSU", 7) &&
> +			   !strncmp(scmd->device->model, "ETERNUS_DXM", 11) &&

Blacklist, please.
Martin Wilck Jan. 17, 2018, 7:32 a.m. UTC | #2
On Wed, 2018-01-17 at 01:15 -0500, Martin K. Petersen wrote:
> Martin,
> 
> > +		if (!strncmp(scmd->device->vendor, "FUJITSU", 7)
> > &&
> > +			   !strncmp(scmd->device->model,
> > "ETERNUS_DXM", 11) &&
> 
> Blacklist, please.
> 

To make sure I understand correctly:

You'd like to spend a precious BLIST bit for this single device which
uses vendor-specific ASC/Q?

Regards,
Martin
Martin K. Petersen Jan. 18, 2018, 2:43 a.m. UTC | #3
Martin,

> You'd like to spend a precious BLIST bit for this single device which
> uses vendor-specific ASC/Q?

I really don't want string comparisons in the regular code paths. Also
not a fan of vendor-specific ASCs. But if you must use them, please add
a flag and trigger on that.

Since this is a bit of an unusual check condition combo, we could
entertain whether we should simply always ADD_TO_MLQUEUE on
0xb/0xc1/0x1. I wonder what would break?
Hannes Reinecke Jan. 18, 2018, 10:17 a.m. UTC | #4
On 01/18/2018 03:43 AM, Martin K. Petersen wrote:
> 
> Martin,
> 
>> You'd like to spend a precious BLIST bit for this single device which
>> uses vendor-specific ASC/Q?
> 
> I really don't want string comparisons in the regular code paths. Also
> not a fan of vendor-specific ASCs. But if you must use them, please add
> a flag and trigger on that.
> 
> Since this is a bit of an unusual check condition combo, we could
> entertain whether we should simply always ADD_TO_MLQUEUE on
> 0xb/0xc1/0x1. I wonder what would break?
> 
That's the usual problem I have with vendor-specific sense codes.
In general the risk is quite low of different vendors actually using the
same code; I guess we can get away with just adding a debug message here
and enable it without any vendor check.
Then we can reconsider the whole thing once we notice several vendors
using the same sense codes.

Cheers,

Hannes
James Bottomley Jan. 18, 2018, 2:24 p.m. UTC | #5
On Thu, 2018-01-18 at 11:17 +0100, Hannes Reinecke wrote:
> On 01/18/2018 03:43 AM, Martin K. Petersen wrote:
> > 
> > 
> > Martin,
> > 
> > > 
> > > You'd like to spend a precious BLIST bit for this single device
> > > which uses vendor-specific ASC/Q?
> > 
> > I really don't want string comparisons in the regular code paths.
> > Also not a fan of vendor-specific ASCs. But if you must use them,
> > please add a flag and trigger on that.
> > 
> > Since this is a bit of an unusual check condition combo, we could
> > entertain whether we should simply always ADD_TO_MLQUEUE on
> > 0xb/0xc1/0x1. I wonder what would break?
> > 
> That's the usual problem I have with vendor-specific sense codes.
> In general the risk is quite low of different vendors actually using
> the same code; I guess we can get away with just adding a debug
> message here and enable it without any vendor check.
> Then we can reconsider the whole thing once we notice several vendors
> using the same sense codes.

Murphy's law says that if we rely on Vendors to chose non-overlapping
numbers we'll end up with a clash fairly quickly ...

Can't we find a way of doing this in the device_handler?  That way we
can use vendor specific codes only when we know who the vendor is?

James
diff mbox

Patch

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index d042915ce895..598111a741d4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -518,6 +518,16 @@  int scsi_check_sense(struct scsi_cmnd *scmd)
 		if (sshdr.asc == 0x10) /* DIF */
 			return SUCCESS;
 
+		if (!strncmp(scmd->device->vendor, "FUJITSU", 7) &&
+			   !strncmp(scmd->device->model, "ETERNUS_DXM", 11) &&
+			   (sshdr.asc == 0xc1) && (sshdr.ascq == 0x1)) {
+			/*
+			 * Fujitsu Eternus uses this vendor specific code
+			 * to indicate an internal reconfiguration status
+			 * which can be recovered with a retry.
+			 */
+			return ADD_TO_MLQUEUE;
+		}
 		return NEEDS_RETRY;
 	case NOT_READY:
 	case UNIT_ATTENTION: