
[v2,0/6] scsi:scsi_debug: Add error injection for single device

Message ID 20230428013320.347050-1-haowenchao2@huawei.com (mailing list archive)

Message

Wenchao Hao April 28, 2023, 1:33 a.m. UTC
The original error injection mechanism was based on scsi_host, so it
could not inject faults for a single SCSI device.

This patchset provides the ability to inject errors for a single
SCSI device. It now supports injecting timeout errors, queuecommand
errors, and hostbyte, driverbyte, statusbyte, and sense data for a
specific SCSI command.

The first two patches add a debugfs interface to add and query a single
device's error injection info; the third patch defines how to remove
an injection which has been added. The following three patches use the
injection info to generate the related error type.

V2:
  - Use debugfs rather than the sysfs attribute interface to manage error
    injection

Wenchao Hao (6):
  scsi:scsi_debug: create scsi_debug directory in the debugfs filesystem
  scsi:scsi_debug: Add interface to manage single device's error inject
  scsi:scsi_debug: Define grammar to remove added error injection
  scsi:scsi_debug: timeout command if the error is injected
  scsi:scsi_debug: Return failed value if the error is injected
  scsi:scsi_debug: set command's result and sense data if the error is
    injected

 drivers/scsi/scsi_debug.c | 318 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 318 insertions(+)

Comments

Douglas Gilbert May 2, 2023, 11:52 p.m. UTC | #1
Been playing around with this patchset and it seems to work as expected. Took me
a while to work my way through the interface description at the beginning of
   [PATCH v2 2/6] scsi:scsi_debug: Add interface to manage single device's error inject

so I cut and pasted it into my scsi_debug.html page and did some work on it, see:
    https://doug-gilbert.github.io/scsi_debug.html

There is a new chapter titled: Per device error injection
Kept the ASCII art so it could be ported back to [PATCH v2 2/6]'s description
if Wenchao is agreeable.

So for the whole series:
   Acked-by: Douglas Gilbert <dgilbert@interlog.com>


One suggestion for later work: perhaps the Command opcode field could be
expanded to: x8[,x16] so optionally a Service Action (in hex) could be
given (e.g. '9e,10' for the READ CAPACITY (16) command).

Doug Gilbert
Wenchao Hao May 4, 2023, 12:56 p.m. UTC | #2
On 2023/5/3 7:52, Douglas Gilbert wrote:
> Been playing around with this patchset and it seems to work as expected. Took me
> a while to work my way through interface description at the beginning of
>    [PATCH v2 2/6] scsi:scsi_debug: Add interface to manage single device's error inject
> 
> so I cut and paste it into my scsi_debug.html page and did some work on it, see:
>     https://doug-gilbert.github.io/scsi_debug.html
> 
> There is a new chapter titled: Per device error injection
> Kept the ASCII art so it could be ported back to [PATCH v2 2/6]'s description
> if Wenchao is agreeable.
> 

Thanks a lot. I will update the patch's description in the next version.

> So for the whole series:
>    Acked-by: Douglas Gilbert <dgilbert@interlog.com>
> 
> 
> One suggestion for later work: perhaps the Command opcode field could be
> expanded to: x8[,x16] so optionally a Service Action (in hex) could be
> given (e.g. '9e,10' for the READ CAPACITY (16) command).
> 
> Doug Gilbert
> 
> 

Would you help me check whether my understanding is correct?

Define the Command opcode as x16 and split these 16 bits into two parts: one for
the actual SCSI Command opcode, the other for the Service Action. If so, it would
make this interface complex to use. I want to keep it easy to use, so that users
do not need to calculate the value themselves.
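
To illustrate the concern, here is a minimal sketch of what a packed 16-bit
field would ask of the user. It assumes, purely for illustration, that the
opcode sits in the high byte and the Service Action in the low byte; the helper
names are hypothetical and are not part of the patchset.

```python
def pack_opcode_sa(opcode: int, service_action: int) -> int:
    """Pack an 8-bit opcode (high byte) and an 8-bit Service Action (low byte).

    The byte order is an assumption made for this sketch only.
    """
    if not 0 <= opcode <= 0xFF or not 0 <= service_action <= 0xFF:
        raise ValueError("both fields must fit in 8 bits")
    return (opcode << 8) | service_action


def unpack_opcode_sa(packed: int) -> tuple:
    """Split a packed 16-bit value back into (opcode, service_action)."""
    if not 0 <= packed <= 0xFFFF:
        raise ValueError("packed value must fit in 16 bits")
    return (packed >> 8) & 0xFF, packed & 0xFF


# READ CAPACITY (16): opcode 0x9e, Service Action 0x10
packed = pack_opcode_sa(0x9E, 0x10)
print(hex(packed))  # 0x9e10
```

With separate fields the user writes 0x9e and 0x10 directly; with a packed
field the user has to combine them by hand.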

I think there are other ways to support specifying a Service Action:

method1. Redefine the General rule format and append the Service Action after
    the SCSI command opcode, as follows:

   +--------+------+-------------------------------------------------------+
   | Column | Type | Description                                           |
   +--------+------+-------------------------------------------------------+
   |   1    |  u8  | Error code                                            |
   |        |      |  0: timeout SCSI command                              |
   |        |      |  1: fail queuecommand, make queuecommand return       |
   |        |      |     given value                                       |
   |        |      |  2: fail command, finish command with SCSI status,    |
   |        |      |     sense key and ASC/ASCQ values                     |
   +--------+------+-------------------------------------------------------+
   |   2    |  s32 | Error count                                           |
   |        |      |  0: this rule will be ignored                         |
   |        |      |  positive: the rule will always take effect           |
   |        |      |  negative: the rule takes effect n times where -n is  |
   |        |      |            the value given. Ignored after n times     |
   +--------+------+-------------------------------------------------------+
   |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
   +--------+------+-------------------------------------------------------+
   |   4    |  x8  | specify a Service Action, 0xff for all commands       |
   +--------+------+-------------------------------------------------------+
   |  ...   |  xxx | Error type specific fields                            |
   +--------+------+-------------------------------------------------------+
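
A rough sketch of how a rule line in the method1 layout could be parsed. It
assumes whitespace-separated columns, decimal for the u8/s32 columns and hex
for the x8 columns; `InjectRule` and `parse_rule` are hypothetical names, and
the error-type-specific fields are kept as raw strings.

```python
from dataclasses import dataclass, field


@dataclass
class InjectRule:
    error_code: int       # column 1, u8
    count: int            # column 2, s32
    opcode: int           # column 3, x8, 0xff matches all commands
    service_action: int   # column 4, x8, 0xff is the wildcard value
    extra: list = field(default_factory=list)  # error-type-specific fields, unparsed


def parse_rule(line: str) -> InjectRule:
    """Parse one rule line; raise ValueError on a malformed line."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 columns")
    error_code = int(fields[0], 10)
    count = int(fields[1], 10)
    opcode = int(fields[2], 16)          # accepts "9e" and "0x9e"
    service_action = int(fields[3], 16)
    if not 0 <= error_code <= 0xFF:
        raise ValueError("error code does not fit in u8")
    if not -2**31 <= count < 2**31:
        raise ValueError("error count does not fit in s32")
    if not 0 <= opcode <= 0xFF or not 0 <= service_action <= 0xFF:
        raise ValueError("opcode and Service Action must fit in x8")
    return InjectRule(error_code, count, opcode, service_action, fields[4:])


rule = parse_rule("0 -10 0x9e 0x10")
print(rule)
```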

method2. Define a new Error code for commands which need a Service Action.
    For example, define Error code 3 with the following format to time out
    commands which need a Service Action:

   +--------+------+-------------------------------------------------------+
   | Column | Type | Description                                           |
   +--------+------+-------------------------------------------------------+
   |   1    |  u8  | Fixed to 3                                            |
   +--------+------+-------------------------------------------------------+
   |   2    |  s32 | Error count                                           |
   |        |      |  0: this rule will be ignored                         |
   |        |      |  positive: the rule will always take effect           |
   |        |      |  negative: the rule takes effect n times where -n is  |
   |        |      |            the value given. Ignored after n times     |
   +--------+------+-------------------------------------------------------+
   |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
   +--------+------+-------------------------------------------------------+
   |   4    |  x8  | specify a Service Action, 0xff for all commands       |
   +--------+------+-------------------------------------------------------+

   We can inject a timeout error for the READ CAPACITY (16) command as follows:
   echo "3 -10 0x9e 0x10" > ${error}
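
The matching and error-count semantics of such a rule can be sketched as below.
This is a model of the table's description, not the driver's code: `CountedRule`
and `service_action` are hypothetical names, and the Service Action is read
from the low 5 bits of CDB byte 1, which is where a command such as
READ CAPACITY (16) carries it.

```python
def service_action(cdb: bytes) -> int:
    """Return the Service Action from the low 5 bits of CDB byte 1."""
    return cdb[1] & 0x1F


class CountedRule:
    """Model of a rule with an error count, opcode and Service Action."""

    def __init__(self, count: int, opcode: int, sa: int):
        self.count = count
        self.opcode = opcode
        self.sa = sa

    def fires(self, cdb: bytes) -> bool:
        if self.count == 0:
            return False                     # 0: rule is ignored
        if self.opcode != 0xFF and cdb[0] != self.opcode:
            return False                     # opcode does not match
        if self.sa != 0xFF and service_action(cdb) != self.sa:
            return False                     # Service Action does not match
        if self.count < 0:
            self.count += 1                  # negative: one of the n shots is used up
        return True                          # positive: always takes effect


# Rule from the echo example above: "3 -10 0x9e 0x10"
rule = CountedRule(count=-10, opcode=0x9E, sa=0x10)
read_cap16 = bytes([0x9E, 0x10]) + bytes(14)  # 16-byte CDB, remaining bytes zero
hits = sum(rule.fires(read_cap16) for _ in range(15))
print(hits)  # 10
```

After ten matching commands the count reaches 0 and the rule is ignored.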