Message ID | 1508224263-130302-1-git-send-email-hare@suse.de (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Hannes, > + if (!strncmp(scmd->device->vendor, "EMC", 3) && > + !strncmp(scmd->device->model, "SYMMETRIX", 9) && > + (sshdr.asc == 0x44) && (sshdr.ascq == 0x0)) { > + /* > + * EMC Symmetrix returns 'Internal target failure' > + * for a variety of internal issues, all of which > + * can be recovered by retry. > + */ > + return ADD_TO_MLQUEUE; > + } It's decidedly awful to have vendor/model-specific triggers in scsi_error. What are the drawbacks of just always refiring on AC/0x44/ITF?
On 10/18/2017 04:58 AM, Martin K. Petersen wrote: > > Hannes, > >> + if (!strncmp(scmd->device->vendor, "EMC", 3) && >> + !strncmp(scmd->device->model, "SYMMETRIX", 9) && >> + (sshdr.asc == 0x44) && (sshdr.ascq == 0x0)) { >> + /* >> + * EMC Symmetrix returns 'Internal target failure' >> + * for a variety of internal issues, all of which >> + * can be recovered by retry. >> + */ >> + return ADD_TO_MLQUEUE; >> + } > > It's decidedly awful to have vendor/model-specific triggers in > scsi_error. > > What are the drawbacks of just always refiring on AC/0x44/ITF? > Hmm. 'Internal target failure' is not very descriptive, so it could mean anything. Hence the rather awkward approach. But I just checked with the qemu code, and that returns 0x44/0x00 only if some (internal) call returned with -ENOMEM. So I guess we're safe to always retry here. Cheers, Hannes
On Wed, Oct 18, 2017 at 08:15:46AM +0200, Hannes Reinecke wrote: > > It's decidedly awful to have vendor/model-specific triggers in > > scsi_error. > > > > What are the drawbacks of just always refiring on AC/0x44/ITF? > > > Hmm. 'Internal target failure' is not very descriptive, so it could mean > anything. Hence the rather awkward approach. > But I just checked with the qemu code, and that returns 0x44/0x00 only > if some (internal) call returned with -ENOMEM. > So I guess we're safe to always retry here. And even if not we should add a quirk so that we can just check a bit instead of doing two string compares in the I/O completion path..
On 10/18/2017 08:34 AM, Christoph Hellwig wrote: > On Wed, Oct 18, 2017 at 08:15:46AM +0200, Hannes Reinecke wrote: >>> It's decidedly awful to have vendor/model-specific triggers in >>> scsi_error. >>> >>> What are the drawbacks of just always refiring on AC/0x44/ITF? >>> >> Hmm. 'Internal target failure' is not very descriptive, so it could mean >> anything. Hence the rather awkward approach. >> But I just checked with the qemu code, and that returns 0x44/0x00 only >> if some (internal) call returned with -ENOMEM. >> So I guess we're safe to always retry here. > > And even if not we should add a quirk so that we can just check a > bit instead of doing two string compares in the I/O completion path.. > Yeah, you are right. Will be adding a blacklist entry for this. Cheers, Hannes
> On 10/17/2017 09:11 AM, Hannes Reinecke wrote: >> EMC Symmetrix is returning 'internal target error' for a variety >> of conditions, most of which will be transient. >> So we should always retry it, even with failfast set. >> Otherwise we'd get spurious path flaps with multipath. >> >> Signed-off-by: Hannes Reinecke <hare@suse.de> >> --- >> drivers/scsi/scsi_error.c | 13 +++++++++++++ >> 1 file changed, 13 insertions(+) >> >> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c >> index 5086489..5722c2e 100644 >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -497,6 +497,16 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >> if (sshdr.asc == 0x10) /* DIF */ >> return SUCCESS; >> >> + if (!strncmp(scmd->device->vendor, "EMC", 3) && >> + !strncmp(scmd->device->model, "SYMMETRIX", 9) && >> + (sshdr.asc == 0x44) && (sshdr.ascq == 0x0)) { >> + /* >> + * EMC Symmetrix returns 'Internal target failure' >> + * for a variety of internal issues, all of which >> + * can be recovered by retry. >> + */ >> + return ADD_TO_MLQUEUE; >> + } >> return NEEDS_RETRY; >> case NOT_READY: >> case UNIT_ATTENTION: >> @@ -1846,6 +1856,9 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd) >> rtn = scsi_check_sense(scmd); >> if (rtn == NEEDS_RETRY) >> goto maybe_retry; >> + else if (rtn == ADD_TO_MLQUEUE) >> + /* Always enforce a retry for ADD_TO_MLQUEUE */ >> + rtn = NEEDS_RETRY; >> /* if rtn == FAILED, we have no sense information; >> * returning FAILED will wake the error handler thread >> * to collect the sense and redo the decide >> On 10/18/2017 07:27 AM, Hannes Reinecke wrote: >> Yeah, you are right. >> Will be adding a blacklist entry for this. Is there a new patch ongoing?
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 5086489..5722c2e 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -497,6 +497,16 @@ int scsi_check_sense(struct scsi_cmnd *scmd) if (sshdr.asc == 0x10) /* DIF */ return SUCCESS; + if (!strncmp(scmd->device->vendor, "EMC", 3) && + !strncmp(scmd->device->model, "SYMMETRIX", 9) && + (sshdr.asc == 0x44) && (sshdr.ascq == 0x0)) { + /* + * EMC Symmetrix returns 'Internal target failure' + * for a variety of internal issues, all of which + * can be recovered by retry. + */ + return ADD_TO_MLQUEUE; + } return NEEDS_RETRY; case NOT_READY: case UNIT_ATTENTION: @@ -1846,6 +1856,9 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd) rtn = scsi_check_sense(scmd); if (rtn == NEEDS_RETRY) goto maybe_retry; + else if (rtn == ADD_TO_MLQUEUE) + /* Always enforce a retry for ADD_TO_MLQUEUE */ + rtn = NEEDS_RETRY; /* if rtn == FAILED, we have no sense information; * returning FAILED will wake the error handler thread * to collect the sense and redo the decide
EMC Symmetrix is returning 'internal target error' for a variety of conditions, most of which will be transient. So we should always retry it, even with failfast set. Otherwise we'd get spurious path flaps with multipath. Signed-off-by: Hannes Reinecke <hare@suse.de> --- drivers/scsi/scsi_error.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)