Message ID | 20220809000419.10674-15-michael.christie@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Use block pr_ops in LIO | expand |
On Mon, Aug 08, 2022 at 07:04:13PM -0500, Mike Christie wrote: > It's common to get a UA when doing PR commands. It could be due to a > target restarting, transport level relogin or other PR commands like a > release causing it. The upper layers don't get the sense and in some cases > have no idea if it's a SCSI device, so this has the sd layer retry. This seems like another case for the generic in-kernel passthrugh command retry discussed in the other thread. Can you split out two series with just bug fixes for nvme and scsi as I think we should probably get those into 6.0, and then we can do the actual feature on top of those?
On 8/9/22 2:16 AM, Christoph Hellwig wrote: > On Mon, Aug 08, 2022 at 07:04:13PM -0500, Mike Christie wrote: >> It's common to get a UA when doing PR commands. It could be due to a >> target restarting, transport level relogin or other PR commands like a >> release causing it. The upper layers don't get the sense and in some cases >> have no idea if it's a SCSI device, so this has the sd layer retry. > > This seems like another case for the generic in-kernel passthrugh > command retry discussed in the other thread. It is. > > Can you split out two series with just bug fixes for nvme and scsi > as I think we should probably get those into 6.0, and then we can > do the actual feature on top of those? Ok. Martin W, I'll submit a patch with a new SCMD flag that will fix both our problems.
On 8/9/22 09:24, Mike Christie wrote: > On 8/9/22 2:16 AM, Christoph Hellwig wrote: >> On Mon, Aug 08, 2022 at 07:04:13PM -0500, Mike Christie wrote: >>> It's common to get a UA when doing PR commands. It could be due to a >>> target restarting, transport level relogin or other PR commands like a >>> release causing it. The upper layers don't get the sense and in some cases >>> have no idea if it's a SCSI device, so this has the sd layer retry. >> >> This seems like another case for the generic in-kernel passthrugh >> command retry discussed in the other thread. > > It is. Has it been considered to introduce a flag that makes scsi_noretry_cmd() retry passthrough commands? Thanks, Bart.
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index bf080de9866d..61e88c7ffa44 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -1683,6 +1683,8 @@ static int sd_get_unique_id(struct gendisk *disk, u8 id[16], return ret; } +#define SCSI_PR_UA_RETRIES 5 + static int sd_pr_in_command(struct block_device *bdev, u8 sa, unsigned char *data, int data_len) { @@ -1690,8 +1692,9 @@ static int sd_pr_in_command(struct block_device *bdev, u8 sa, struct scsi_device *sdev = sdkp->device; struct scsi_sense_hdr sshdr; u8 cmd[10] = { 0, }; - int result; + int result, ua_retries = SCSI_PR_UA_RETRIES; +retry: cmd[0] = PERSISTENT_RESERVE_IN; cmd[1] = sa; put_unaligned_be16(data_len, &cmd[7]); @@ -1700,6 +1703,9 @@ static int sd_pr_in_command(struct block_device *bdev, u8 sa, &sshdr, SD_TIMEOUT, sdkp->max_retries, NULL); if (scsi_status_is_check_condition(result) && scsi_sense_valid(&sshdr)) { + if (sshdr.sense_key == UNIT_ATTENTION && ua_retries-- > 0) + goto retry; + sdev_printk(KERN_INFO, sdev, "PR command failed: %d\n", result); scsi_print_sense_hdr(sdev, NULL, &sshdr); } @@ -1776,10 +1782,11 @@ static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key, struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk); struct scsi_device *sdev = sdkp->device; struct scsi_sense_hdr sshdr; - int result; + int result, ua_retries = SCSI_PR_UA_RETRIES; u8 cmd[16] = { 0, }; u8 data[24] = { 0, }; +retry: cmd[0] = PERSISTENT_RESERVE_OUT; cmd[1] = sa; cmd[2] = type; @@ -1794,6 +1801,9 @@ static int sd_pr_out_command(struct block_device *bdev, u8 sa, u64 key, if (scsi_status_is_check_condition(result) && scsi_sense_valid(&sshdr)) { + if (sshdr.sense_key == UNIT_ATTENTION && ua_retries-- > 0) + goto retry; + sdev_printk(KERN_INFO, sdev, "PR command failed: %d\n", result); scsi_print_sense_hdr(sdev, NULL, &sshdr); }
It's common to get a UA when doing PR commands. It could be due to a target restarting, transport level relogin or other PR commands like a release causing it. The upper layers don't get the sense and in some cases have no idea if it's a SCSI device, so this has the sd layer retry. Signed-off-by: Mike Christie <michael.christie@oracle.com> --- drivers/scsi/sd.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)