[v7,0/8] Fix up and simplify error recovery mechanism

Message ID 1595912460-8860-1-git-send-email-cang@codeaurora.org (mailing list archive)

Can Guo July 28, 2020, 5 a.m. UTC
The changes have been tested with error injections of multiple error types (and
all kinds of mixtures of them) during runtime, e.g. hibern8 enter/exit errors,
power mode change errors and fatal/non-fatal errors from IRQ context. During the
tests, errors were injected randomly across all contexts, e.g. clk scaling,
clk gate/ungate, runtime suspend/resume and IRQ.

There are a few more fixes that resolve other minor problems on top of the main
change, such as LINERESET handling and races between the error handler and system
suspend/resume/shutdown, but they will be pushed after this series is taken,
because this series is already large.

Changes since v6:
- Modified change "scsi: ufs-qcom: Fix schedule while atomic error in ufs_qcom_dump_dbg_regs" to "scsi: ufs-qcom: Remove testbus dump in ufs_qcom_dump_dbg_regs"

Changes since v5:
- Dropped change "scsi: ufs: Fix imbalanced scsi_block_reqs_cnt caused by ufshcd_hold()" as it is not closely related to this series
- Refined func ufshcd_err_handling_prepare in change "scsi: ufs: Recover hba runtime PM error in error handler"

Changes since v4:
- Split the original change "ufs: ufs-qcom: Fix a few BUGs in func ufs_qcom_dump_dbg_regs()" into 2 smaller changes

Changes since v3:
- Split the original change "scsi: ufs: Fix up and simplify error recovery mechanism" into 5 changes

Changes since v2:
- Incorporated Bart's comment on change "scsi: ufs: Add checks before setting clk-gating states"
- Revised the commit msg of change "scsi: ufs: Fix up and simplify error recovery mechanism"

Changes since v1:
- Fixed a compilation error when CONFIG_PM is not enabled

Can Guo (8):
  scsi: ufs: Add checks before setting clk-gating states
  ufs: ufs-qcom: Fix race conditions caused by func
    ufs_qcom_testbus_config
  scsi: ufs-qcom: Remove testbus dump in ufs_qcom_dump_dbg_regs
  scsi: ufs: Add some debug infos to ufshcd_print_host_state
  scsi: ufs: Fix concurrency of error handler and other error recovery
    paths
  scsi: ufs: Recover hba runtime PM error in error handler
  scsi: ufs: Move dumps in IRQ handler to error handler
  scsi: ufs: Fix a racing problem btw error handler and runtime PM ops

 drivers/scsi/ufs/ufs-qcom.c  |  37 ----
 drivers/scsi/ufs/ufs-sysfs.c |   1 +
 drivers/scsi/ufs/ufshcd.c    | 494 +++++++++++++++++++++++++++----------------
 drivers/scsi/ufs/ufshcd.h    |  15 ++
 4 files changed, 324 insertions(+), 223 deletions(-)

Bean Huo July 28, 2020, 9:44 p.m. UTC | #1
On Mon, 2020-07-27 at 22:00 -0700, Can Guo wrote:
> [...]
This series looks good to me.
Reviewed-by: Bean Huo <beanhuo@micron.com>

Thanks,
Bean