Message ID | 20210107185316.788815-3-jaegeuk@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | Two UFS fixes | expand |
On 2021-01-08 02:53, Jaegeuk Kim wrote: > From: Jaegeuk Kim <jaegeuk@google.com> > > When non-fatal error like line-reset happens, ufshcd_err_handler() > starts to > abort tasks by ufshcd_try_to_abort_task(). When it tries to issue tm > request, > we've hit two warnings. > > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 > blk_get_request+0x68/0x70 > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 > blk_mq_get_tag+0x438/0x46c > > After fixing the above warnings, I've hit another tm_cmd timeout, which > may be > caused by unstable controller state. > > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out > > Then, ufshcd_err_handler() enters full reset, and I hit kernel stuck. > It turned > out ufshcd_print_trs() printed too many messages in console which > requires CPU > locks. Likewise hba->silence_err_logs, we need to avoid too verbose > messages. > Actually it came from ufshcd_transfer_rsp_status() when requeuing > commands back. > Indeed, this is actually not an error case, so let's fix it. > > Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to > allocate and free TMFs") > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> It is really good to find out the root cause! Thanks for the fix. Reviewed-by: Can Guo <cang@codeaurora.org> > --- > drivers/scsi/ufs/ufshcd.c | 18 +++++++++++++----- > 1 file changed, 13 insertions(+), 5 deletions(-) > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index e6e7bdf99cd7..2a715f13fe1d 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -4996,7 +4996,8 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, > struct ufshcd_lrb *lrbp) > break; > } /* end of switch */ > > - if ((host_byte(result) != DID_OK) && !hba->silence_err_logs) > + if ((host_byte(result) != DID_OK) && > + (host_byte(result) != DID_REQUEUE) && !hba->silence_err_logs) > ufshcd_print_trs(hba, 1 << lrbp->task_tag, true); > return result; > } > @@ -6302,9 +6303,13 @@ static irqreturn_t ufshcd_intr(int irq, void > *__hba) > intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); > } > > - if (enabled_intr_status && retval == IRQ_NONE) { > - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", > - __func__, intr_status); > + if (enabled_intr_status && retval == IRQ_NONE && > + !ufshcd_eh_in_progress(hba)) { > + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, > 0x%08x)\n", > + __func__, > + intr_status, > + hba->ufs_stats.last_intr_status, > + enabled_intr_status); > ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); > } > > @@ -6348,7 +6353,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba > *hba, > * Even though we use wait_event() which sleeps indefinitely, > * the maximum wait time is bounded by %TM_CMD_TIMEOUT. > */ > - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); > + req = blk_get_request(q, REQ_OP_DRV_OUT, 0); > + if (IS_ERR(req)) > + return PTR_ERR(req); > + > req->end_io_data = &wait; > free_slot = req->tag; > WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index e6e7bdf99cd7..2a715f13fe1d 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -4996,7 +4996,8 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) break; } /* end of switch */ - if ((host_byte(result) != DID_OK) && !hba->silence_err_logs) + if ((host_byte(result) != DID_OK) && + (host_byte(result) != DID_REQUEUE) && !hba->silence_err_logs) ufshcd_print_trs(hba, 1 << lrbp->task_tag, true); return result; } @@ -6302,9 +6303,13 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba) intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); } - if (enabled_intr_status && retval == IRQ_NONE) { - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", - __func__, intr_status); + if (enabled_intr_status && retval == IRQ_NONE && + !ufshcd_eh_in_progress(hba)) { + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, 0x%08x)\n", + __func__, + intr_status, + hba->ufs_stats.last_intr_status, + enabled_intr_status); ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); } @@ -6348,7 +6353,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba, * Even though we use wait_event() which sleeps indefinitely, * the maximum wait time is bounded by %TM_CMD_TIMEOUT. */ - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); + req = blk_get_request(q, REQ_OP_DRV_OUT, 0); + if (IS_ERR(req)) + return PTR_ERR(req); + req->end_io_data = &wait; free_slot = req->tag; WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);