Message ID | 1623300218-9454-5-git-send-email-cang@codeaurora.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Complementary changes for error handling | expand |
On 6/9/21 9:43 PM, Can Guo wrote: > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index 0c9d2ee..7dc0fda 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -2758,6 +2758,16 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) > goto out; > } > > + if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { > + if (hba->wl_pm_op_in_progress) { > + set_host_byte(cmd, DID_BAD_TARGET); > + cmd->scsi_done(cmd); > + } else { > + err = SCSI_MLQUEUE_HOST_BUSY; > + } > + goto out; > + } > + > hba->req_abort_count = 0; > > err = ufshcd_hold(hba, true); > @@ -2768,15 +2778,6 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) > WARN_ON(ufshcd_is_clkgating_allowed(hba) && > (hba->clk_gating.state != CLKS_ON)); > > - if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { > - if (hba->wl_pm_op_in_progress) > - set_host_byte(cmd, DID_BAD_TARGET); > - else > - err = SCSI_MLQUEUE_HOST_BUSY; > - ufshcd_release(hba); > - goto out; > - } > - > lrbp = &hba->lrb[tag]; > WARN_ON(lrbp->cmd); > lrbp->cmd = cmd; Can the code under "if (unlikely(test_bit(tag, &hba->outstanding_reqs)))" be deleted instead of moving it? I don't think that it is useful to verify whether the block layer tag allocator works correctly. Additionally, I'm not aware of any similar code in any other SCSI LLD. Thanks, Bart.
On 2021-06-12 04:52, Bart Van Assche wrote: > On 6/9/21 9:43 PM, Can Guo wrote: >> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >> index 0c9d2ee..7dc0fda 100644 >> --- a/drivers/scsi/ufs/ufshcd.c >> +++ b/drivers/scsi/ufs/ufshcd.c >> @@ -2758,6 +2758,16 @@ static int ufshcd_queuecommand(struct Scsi_Host >> *host, struct scsi_cmnd *cmd) >> goto out; >> } >> >> + if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { >> + if (hba->wl_pm_op_in_progress) { >> + set_host_byte(cmd, DID_BAD_TARGET); >> + cmd->scsi_done(cmd); >> + } else { >> + err = SCSI_MLQUEUE_HOST_BUSY; >> + } >> + goto out; >> + } >> + >> hba->req_abort_count = 0; >> >> err = ufshcd_hold(hba, true); >> @@ -2768,15 +2778,6 @@ static int ufshcd_queuecommand(struct Scsi_Host >> *host, struct scsi_cmnd *cmd) >> WARN_ON(ufshcd_is_clkgating_allowed(hba) && >> (hba->clk_gating.state != CLKS_ON)); >> >> - if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { >> - if (hba->wl_pm_op_in_progress) >> - set_host_byte(cmd, DID_BAD_TARGET); >> - else >> - err = SCSI_MLQUEUE_HOST_BUSY; >> - ufshcd_release(hba); >> - goto out; >> - } >> - >> lrbp = &hba->lrb[tag]; >> WARN_ON(lrbp->cmd); >> lrbp->cmd = cmd; > > Can the code under "if (unlikely(test_bit(tag, > &hba->outstanding_reqs)))" be deleted instead of moving it? I don't > think that it is useful to verify whether the block layer tag allocator > works correctly. Additionally, I'm not aware of any similar code in any > other SCSI LLD. > ufshcd_abort() aborts PM requests differently from other requests - it simply evicts the cmd from lrbp [1], schedules error handler and returns SUCCESS (the reason why I am doing it this way is in patch #8). After ufshcd_abort() returns, the tag shall be released, the logic here is to prevent subsequent cmds re-use the lrbp [1] before error handler recovers the device and host. Thanks, Can Guo. > Thanks, > > Bart.
On 6/12/21 12:38 AM, Can Guo wrote: > On 2021-06-12 04:52, Bart Van Assche wrote: >> On 6/9/21 9:43 PM, Can Guo wrote: >>> @@ -2768,15 +2778,6 @@ static int ufshcd_queuecommand(struct >>> Scsi_Host *host, struct scsi_cmnd *cmd) >>> WARN_ON(ufshcd_is_clkgating_allowed(hba) && >>> (hba->clk_gating.state != CLKS_ON)); >>> >>> - if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { >>> - if (hba->wl_pm_op_in_progress) >>> - set_host_byte(cmd, DID_BAD_TARGET); >>> - else >>> - err = SCSI_MLQUEUE_HOST_BUSY; >>> - ufshcd_release(hba); >>> - goto out; >>> - } >>> - >>> lrbp = &hba->lrb[tag]; >>> WARN_ON(lrbp->cmd); >>> lrbp->cmd = cmd; >> >> Can the code under "if (unlikely(test_bit(tag, >> &hba->outstanding_reqs)))" be deleted instead of moving it? I don't >> think that it is useful to verify whether the block layer tag allocator >> works correctly. Additionally, I'm not aware of any similar code in any >> other SCSI LLD. > > ufshcd_abort() aborts PM requests differently from other requests - > it simply evicts the cmd from lrbp [1], schedules error handler and > returns SUCCESS (the reason why I am doing it this way is in patch #8). > > After ufshcd_abort() returns, the tag shall be released, the logic > here is to prevent subsequent cmds re-use the lrbp [1] before error > handler recovers the device and host. Thanks for the background information. However, this approach sounds cumbersome to me. For PM requests, please change the UFS driver such that calling scsi_done() for aborted requests is postponed until error handling has finished and delete the code shown above from ufshcd_queuecommand(). Thanks, Bart.
On 2021-06-12 23:50, Bart Van Assche wrote: > On 6/12/21 12:38 AM, Can Guo wrote: >> On 2021-06-12 04:52, Bart Van Assche wrote: >>> On 6/9/21 9:43 PM, Can Guo wrote: >>>> @@ -2768,15 +2778,6 @@ static int ufshcd_queuecommand(struct >>>> Scsi_Host *host, struct scsi_cmnd *cmd) >>>> WARN_ON(ufshcd_is_clkgating_allowed(hba) && >>>> (hba->clk_gating.state != CLKS_ON)); >>>> >>>> - if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { >>>> - if (hba->wl_pm_op_in_progress) >>>> - set_host_byte(cmd, DID_BAD_TARGET); >>>> - else >>>> - err = SCSI_MLQUEUE_HOST_BUSY; >>>> - ufshcd_release(hba); >>>> - goto out; >>>> - } >>>> - >>>> lrbp = &hba->lrb[tag]; >>>> WARN_ON(lrbp->cmd); >>>> lrbp->cmd = cmd; >>> >>> Can the code under "if (unlikely(test_bit(tag, >>> &hba->outstanding_reqs)))" be deleted instead of moving it? I don't >>> think that it is useful to verify whether the block layer tag >>> allocator >>> works correctly. Additionally, I'm not aware of any similar code in >>> any >>> other SCSI LLD. >> >> ufshcd_abort() aborts PM requests differently from other requests - >> it simply evicts the cmd from lrbp [1], schedules error handler and >> returns SUCCESS (the reason why I am doing it this way is in patch >> #8). >> >> After ufshcd_abort() returns, the tag shall be released, the logic >> here is to prevent subsequent cmds re-use the lrbp [1] before error >> handler recovers the device and host. > > Thanks for the background information. However, this approach sounds > cumbersome to me. For PM requests, please change the UFS driver such > that calling scsi_done() for aborted requests is postponed until error > handling has finished and delete the code shown above from > ufshcd_queuecommand(). I will delete the code in next version, since I believe the hba_state checks before the code is enough to achieve the same purpose, so this code becomes redundant. Thanks, Can Guo. > > Thanks, > > Bart.
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 0c9d2ee..7dc0fda 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -2758,6 +2758,16 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) goto out; } + if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { + if (hba->wl_pm_op_in_progress) { + set_host_byte(cmd, DID_BAD_TARGET); + cmd->scsi_done(cmd); + } else { + err = SCSI_MLQUEUE_HOST_BUSY; + } + goto out; + } + hba->req_abort_count = 0; err = ufshcd_hold(hba, true); @@ -2768,15 +2778,6 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) WARN_ON(ufshcd_is_clkgating_allowed(hba) && (hba->clk_gating.state != CLKS_ON)); - if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { - if (hba->wl_pm_op_in_progress) - set_host_byte(cmd, DID_BAD_TARGET); - else - err = SCSI_MLQUEUE_HOST_BUSY; - ufshcd_release(hba); - goto out; - } - lrbp = &hba->lrb[tag]; WARN_ON(lrbp->cmd); lrbp->cmd = cmd;
Commit 7a7e66c65d4148fc3f23b058405bc9f102414fcb ("scsi: ufs: Fix a race condition between ufshcd_abort() and eh_work()") forgot to complete the cmd, which takes an occupied lrb, before returning in queuecommand. This change adds the missing codes. Fixes: 7a7e66c65d414 ("scsi: ufs: Fix a race condition between ufshcd_abort() and eh_work()") Signed-off-by: Can Guo <cang@codeaurora.org> --- drivers/scsi/ufs/ufshcd.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-)