[1/1] multipath-tools: Change path checker for IBM IPR devices

Message ID 5432B3A5.4080001@linux.vnet.ibm.com (mailing list archive)
State Not Applicable, archived
Delegated to: Mike Snitzer

Commit Message

Brian King Oct. 6, 2014, 3:22 p.m. UTC
On 10/01/2014 07:51 AM, Christoph Hellwig wrote:
> Unfortunately the patch wasn't quite correct - all TEST_UNIT_READY
> commands are sent as BLOCK_PC, so this would basically revert James'
> original fix for the SATL case.
> 
> Am I right to assume you only need the call to scsi_dh->check_sense and
> not the rest of the handling for the multipath path checker?  If that's
> the case something like the patch below should work:

This would work if we also duplicated the 02/04/02 K/C/Q check in the
alua_check_sense handler.
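
As a rough illustration of what the 02/04/02 K/C/Q check tests, here is a minimal standalone sketch. It is not kernel code: struct sense_triple, SK_NOT_READY, wants_start_unit() and run_examples() are hypothetical names made up for this example. The triple is sense key 0x02 (NOT READY), ASC 0x04, ASCQ 0x02, and the check only applies when the device allows a restart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's sense header; illustration only. */
struct sense_triple {
	uint8_t key;   /* sense key (K) */
	uint8_t asc;   /* additional sense code (C) */
	uint8_t ascq;  /* additional sense code qualifier (Q) */
};

#define SK_NOT_READY 0x02 /* SCSI sense key NOT READY */

/* True when the sense data is 02/04/02 and a restart is permitted. */
bool wants_start_unit(const struct sense_triple *s, bool allow_restart)
{
	return allow_restart &&
	       s->key == SK_NOT_READY &&
	       s->asc == 0x04 && s->ascq == 0x02;
}

/* Small self-check covering the cases discussed in the thread. */
int run_examples(void)
{
	const struct sense_triple not_started = { 0x02, 0x04, 0x02 };
	const struct sense_triple becoming_ready = { 0x02, 0x04, 0x01 };

	assert(wants_start_unit(&not_started, true));     /* needs a start */
	assert(!wants_start_unit(&not_started, false));   /* restart not allowed */
	assert(!wants_start_unit(&becoming_ready, true)); /* 04/01 is a different case */
	return 0;
}
```

In the patch itself the equivalent test sits inside the NOT_READY case of alua_check_sense, so the sense key comparison is implied by the switch statement there.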

Wendy - can you try my patch below, along with Christoph's latest patch here
and see if that resolves the issue?

Thanks,

Brian

> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 5db8454..399c1c8 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -459,14 +459,6 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>  	if (! scsi_command_normalize_sense(scmd, &sshdr))
>  		return FAILED;	/* no valid sense data */
> 
> -	if (scmd->cmnd[0] == TEST_UNIT_READY && scmd->scsi_done != scsi_eh_done)
> -		/*
> -		 * nasty: for mid-layer issued TURs, we need to return the
> -		 * actual sense data without any recovery attempt.  For eh
> -		 * issued ones, we need to try to recover and interpret
> -		 */
> -		return SUCCESS;
> -
>  	scsi_report_sense(sdev, &sshdr);
> 
>  	if (scsi_sense_is_deferred(&sshdr))
> @@ -482,6 +474,14 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>  		/* handler does not care. Drop down to default handling */
>  	}
> 
> +	if (scmd->cmnd[0] == TEST_UNIT_READY && scmd->scsi_done != scsi_eh_done)
> +		/*
> +		 * nasty: for mid-layer issued TURs, we need to return the
> +		 * actual sense data without any recovery attempt.  For eh
> +		 * issued ones, we need to try to recover and interpret
> +		 */
> +		return SUCCESS;
> +
>  	/*
>  	 * Previous logic looked for FILEMARK, EOM or ILI which are
>  	 * mainly associated with tapes and returned SUCCESS.
> 



Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---

 drivers/scsi/device_handler/scsi_dh_alua.c |    7 +++++++
 1 file changed, 7 insertions(+)

Comments

wenxiong@linux.vnet.ibm.com Oct. 6, 2014, 9:50 p.m. UTC | #1
Quoting Brian King <brking@linux.vnet.ibm.com>:

> On 10/01/2014 07:51 AM, Christoph Hellwig wrote:
>> Unfortunately the patch wasn't quite correct - all TEST_UNIT_READY
>> commands are sent as BLOCK_PC, so this would basically revert James'
>> original fix for the SATL case.
>>
>> Am I right to assume you only need the call to scsi_dh->check_sense and
>> not the rest of the handling for the multipath path checker?  If that's
>> the case something like the patch below should work:
>
> This would work if we also duplicated the 02/04/02 K/C/Q check in
> alua_check_sense handler.
>
> Wendy - can you try my patch below, along with Christoph's latest patch here
> and see if that resolves the issue?
>
> Thanks,
>
> Brian
>
>>
>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>> index 5db8454..399c1c8 100644
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -459,14 +459,6 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>>  	if (! scsi_command_normalize_sense(scmd, &sshdr))
>>  		return FAILED;	/* no valid sense data */
>>
>> -	if (scmd->cmnd[0] == TEST_UNIT_READY && scmd->scsi_done != scsi_eh_done)
>> -		/*
>> -		 * nasty: for mid-layer issued TURs, we need to return the
>> -		 * actual sense data without any recovery attempt.  For eh
>> -		 * issued ones, we need to try to recover and interpret
>> -		 */
>> -		return SUCCESS;
>> -
>>  	scsi_report_sense(sdev, &sshdr);
>>
>>  	if (scsi_sense_is_deferred(&sshdr))
>> @@ -482,6 +474,14 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>>  		/* handler does not care. Drop down to default handling */
>>  	}
>>
>> +	if (scmd->cmnd[0] == TEST_UNIT_READY && scmd->scsi_done != scsi_eh_done)
>> +		/*
>> +		 * nasty: for mid-layer issued TURs, we need to return the
>> +		 * actual sense data without any recovery attempt.  For eh
>> +		 * issued ones, we need to try to recover and interpret
>> +		 */
>> +		return SUCCESS;
>> +
>>  	/*
>>  	 * Previous logic looked for FILEMARK, EOM or ILI which are
>>  	 * mainly associated with tapes and returned SUCCESS.
>>
>
>
>
> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
> ---
>
>  drivers/scsi/device_handler/scsi_dh_alua.c |    7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff -puN drivers/scsi/device_handler/scsi_dh_alua.c~alua_allow_restart drivers/scsi/device_handler/scsi_dh_alua.c
> --- linux/drivers/scsi/device_handler/scsi_dh_alua.c~alua_allow_restart	2014-10-06 10:19:16.184798305 -0500
> +++ linux-bjking1/drivers/scsi/device_handler/scsi_dh_alua.c	2014-10-06 10:20:35.743165951 -0500
> @@ -474,6 +474,13 @@ static int alua_check_sense(struct scsi_
>  			 * LUN Not Ready -- Offline
>  			 */
>  			return SUCCESS;
> +		if (sdev->allow_restart &&
> +		    (sense_hdr->asc == 0x04) && (sense_hdr->ascq == 0x02))
> +			/*
> +			 * if the device is not started, we need to wake
> +			 * the error handler to start the motor
> +			 */
> +			return FAILED;
>  		break;
>  	case UNIT_ATTENTION:
>  		if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00)
> _
>
> --

Sorry this took some time; we needed to reconfigure the systems for this test.

With only Christoph's new patch applied, we still saw the failure.
With Christoph's new patch plus Brian's patch, it works fine and we did
not see the failure.


Thanks,
Wendy


> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel


Christoph Hellwig Oct. 21, 2014, 11:03 a.m. UTC | #2
On Mon, Oct 06, 2014 at 05:50:32PM -0400, wenxiong@linux.vnet.ibm.com wrote:
> Sorry it took some time since we need to re-config the systems for this test.
> 
> With Christoph's new patch only, still saw the failure.
> With Christoph's new patch + Brian's patch, works fine, didn't see the
> failure.

Can one of you send me a tested series with both patches?

Thanks!

Patch

diff -puN drivers/scsi/device_handler/scsi_dh_alua.c~alua_allow_restart drivers/scsi/device_handler/scsi_dh_alua.c
--- linux/drivers/scsi/device_handler/scsi_dh_alua.c~alua_allow_restart	2014-10-06 10:19:16.184798305 -0500
+++ linux-bjking1/drivers/scsi/device_handler/scsi_dh_alua.c	2014-10-06 10:20:35.743165951 -0500
@@ -474,6 +474,13 @@  static int alua_check_sense(struct scsi_
 			 * LUN Not Ready -- Offline
 			 */
 			return SUCCESS;
+		if (sdev->allow_restart &&
+		    (sense_hdr->asc == 0x04) && (sense_hdr->ascq == 0x02))
+			/*
+			 * if the device is not started, we need to wake
+			 * the error handler to start the motor
+			 */
+			return FAILED;
 		break;
 	case UNIT_ATTENTION:
 		if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00)
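
The interaction between the two patches in this thread can be modelled with a small userspace sketch. This is a simplified model under stated assumptions, not the kernel implementation: enum verdict, struct fake_cmd, the two handler functions and both check_* functions are hypothetical names, and the verdict values are placeholders rather than the kernel's constants. It compares consulting the device handler after the mid-layer TEST UNIT READY short-circuit (the old order) with consulting it before (the order in Christoph's quoted patch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder verdicts; not the kernel's numeric values. */
enum verdict { V_SUCCESS, V_FAILED, V_NOT_HANDLED, V_DEFAULT };

/* Hypothetical, flattened view of a command plus its sense data. */
struct fake_cmd {
	uint8_t opcode;
	bool eh_issued;      /* true if the error handler sent the command */
	bool allow_restart;
	uint8_t key, asc, ascq;
};

#define OP_TEST_UNIT_READY 0x00
#define KEY_NOT_READY 0x02

typedef enum verdict (*handler_fn)(const struct fake_cmd *);

/* Models a handler that lacks the 04/02 check. */
enum verdict handler_without_check(const struct fake_cmd *c)
{
	(void)c;
	return V_NOT_HANDLED;
}

/* Models a handler with the 04/02 check added. */
enum verdict handler_with_check(const struct fake_cmd *c)
{
	if (c->key == KEY_NOT_READY && c->allow_restart &&
	    c->asc == 0x04 && c->ascq == 0x02)
		return V_FAILED; /* wake the error handler to start the unit */
	return V_NOT_HANDLED;
}

static bool is_midlayer_tur(const struct fake_cmd *c)
{
	return c->opcode == OP_TEST_UNIT_READY && !c->eh_issued;
}

/* Old order: the TUR short-circuit runs before the handler is consulted. */
enum verdict check_tur_first(const struct fake_cmd *c, handler_fn h)
{
	if (is_midlayer_tur(c))
		return V_SUCCESS;
	if (h) {
		enum verdict v = h(c);
		if (v != V_NOT_HANDLED)
			return v;
	}
	return V_DEFAULT;
}

/* New order: the handler is consulted first, then the TUR short-circuit. */
enum verdict check_handler_first(const struct fake_cmd *c, handler_fn h)
{
	if (h) {
		enum verdict v = h(c);
		if (v != V_NOT_HANDLED)
			return v;
	}
	if (is_midlayer_tur(c))
		return V_SUCCESS;
	return V_DEFAULT;
}

int run_ordering_examples(void)
{
	/* Mid-layer TUR that came back with 02/04/02 on a restartable device. */
	const struct fake_cmd tur = {
		OP_TEST_UNIT_READY, false, true, KEY_NOT_READY, 0x04, 0x02
	};

	assert(check_tur_first(&tur, handler_with_check) == V_SUCCESS);
	assert(check_handler_first(&tur, handler_without_check) == V_SUCCESS);
	assert(check_handler_first(&tur, handler_with_check) == V_FAILED);
	return 0;
}
```

In this model only the combination of the new ordering and the added handler check yields a FAILED verdict for the TUR, which appears consistent with the test result reported in the thread, where the reordering alone was not enough.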