mbox series

[v12,00/20] scsi: Allow scsi_execute users to request retries

Message ID 20231114013750.76609-1-michael.christie@oracle.com (mailing list archive)
Headers show
Series scsi: Allow scsi_execute users to request retries | expand

Message

Mike Christie Nov. 14, 2023, 1:37 a.m. UTC
The following patches were made over Martin's 6.7 staging branch
which contains a sd change that this patch requires.

The patches allow scsi_execute_cmd users to have scsi-ml retry the
cmd for it instead of the caller having to parse the error and loop
itself.

Note that I dropped most reviewed-by tags, because the structs changed
where before we only had struct scsi_failure we now have the struct
scsi_failure that is in:

struct scsi_failures {
        /*
         * If the failure does not have a specific limit in the scsi_failure
         * then this limit is followed.
         */
        int total_allowed;
        int total_retries;
        struct scsi_failure *failure_definitions;
};

so we can limit the total number of retries. The setup for this is
then different and I'm not sure if we everyone will like it. The
other parts of the conversions patches have not changed.


 drivers/scsi/Kconfig                        |   9 +
 drivers/scsi/ch.c                           |  27 ++-
 drivers/scsi/device_handler/scsi_dh_hp_sw.c |  49 +++--
 drivers/scsi/device_handler/scsi_dh_rdac.c  |  84 +++----
 drivers/scsi/scsi_lib.c                     |  26 ++-
 drivers/scsi/scsi_lib_test.c                | 330 ++++++++++++++++++++++++++++
 drivers/scsi/scsi_scan.c                    | 107 +++++----
 drivers/scsi/scsi_transport_spi.c           |  35 +--
 drivers/scsi/sd.c                           | 326 +++++++++++++++++----------
 drivers/scsi/ses.c                          |  66 ++++--
 drivers/scsi/sr.c                           |  38 ++--
 drivers/ufs/core/ufshcd.c                   |  22 +-
 12 files changed, 822 insertions(+), 297 deletions(-)

v12:
- Fix bug where a user could request no retries and skip going
through the scsi-eh.
- Drop support for allowing caller to have scsi-ml not retry
failures (we only allow caller to request retries).
- Fix formatting.
- Add support to control total number of retries.
- Fix kunit tests to add a missing test and comments.
- Fix missing SCMD_FAILURE_ASCQ_ANY in sd_spinup_disk.

v11:
- Document scsi_failure.result special values
- Fix sshdr fix git commit message where there was a missing word
- Use designated initializers for cdb setup
- Fix up various coding style comments from John like redoing if/else
error/success checks.
- Add patch to fix rdac issue where stale SCSH_DH values were returned
- Remove old comment from:
"[PATCH v10 16/33] scsi: spi: Have scsi-ml retry spi_execute errors"
- Drop EOPNOTSUPP use from:
"[PATCH v10 17/33] scsi: sd: Fix sshdr use in sd_suspend_common"
- Init errno to 0 when declared in:
"[PATCH v10 20/33] scsi: ch: Have scsi-ml retry ch_do_scsi errors"
- Add diffstat below

v10:
- Drop "," after {}.
- Hopefully fix outlook issues.

v9:
- Drop spi_execute changes from [PATCH] scsi: spi: Fix sshdr use
- Change git commit message for sshdr use fixes.

v8:
- Rebase.
- Saw the discussion about the possible bug where callers are
accessing the sshdr when it's not setup, so I added some patches
for that since I was going over the same code.

v7:
- Rebase against scsi_execute_cmd patchset.

v6:
- Fix kunit build when it's built as a module but scsi is not.
- Drop [PATCH 17/35] scsi: ufshcd: Convert to scsi_exec_req because
  that driver no longer uses scsi_execute.
- Convert ufshcd to use the scsi_failures struct because it now just does
  direct retries and does not do it's own deadline timing.
- Add back memset in read_capacity_16.
- Remove memset in get_sectorsize and replace with { } to init buf.

v5:
- Fix spelling (made sure I ran checkpatch strict)
- Drop SCMD_FAILURE_NONE
- Rename SCMD_FAILURE_ANY
- Fix media_not_present handling where it was being retried instead of
failed.
- Fix ILLEGAL_REQUEST handling in read_capacity_16 so it was not retried.
- Fix coding style, spelling and and naming convention in kunit and added
  more tests to handle cases like the media_not_present one where we want
  to force failures instead of retries.
- Drop cxlflash patch because it actually checked it's internal state before
  performing a retry which we currently do not support.

v4:
- Redefine cmd definitions if the cmd is touched.
- Fix up coding style issues.
- Use sam_status enum.
- Move failures initialization to scsi_initialize_rq
(also fixes KASAN error).
- Add kunit test.
- Add function comments.

v3:
- Use a for loop in scsi_check_passthrough
- Fix result handling/testing.
- Fix scsi_status_is_good handling.
- make __scsi_exec_req take a const arg
- Fix formatting in patch 24

v2:
- Rename scsi_prep_sense
- Change scsi_check_passthrough's loop and added some fixes
- Modified scsi_execute* so it uses a struct to pass in args

Comments

John Garry Nov. 16, 2023, 11:02 a.m. UTC | #1
On 14/11/2023 01:37, Mike Christie wrote:
> The following patches were made over Martin's 6.7 staging branch
> which contains a sd change that this patch requires.

Hi Mike,

It would be nice if we could apply to 6.8 staging - was the sd.c change 
not in v6.7-rc1?

Cheers,
John
Mike Christie Nov. 16, 2023, 4:29 p.m. UTC | #2
On 11/16/23 5:02 AM, John Garry wrote:
> On 14/11/2023 01:37, Mike Christie wrote:
>> The following patches were made over Martin's 6.7 staging branch
>> which contains a sd change that this patch requires.
> 
> Hi Mike,
> 
> It would be nice if we could apply to 6.8 staging - was the sd.c change not in v6.7-rc1?
> 

Staging didn't exist when I posted this.

It exists now, but is missing:

https://lore.kernel.org/linux-scsi/yq1zfzn4p8a.fsf@ca-mkp.ca.oracle.com/T/#u

so will not apply to 6.8.

I was actually shooting for 6.9 so I'll figure it out by then.