[v2] scsi: ufs: core: fix racing issue during ufshcd_mcq_abort

Message ID 20231114043704.52525-1-hy50.seo@samsung.com (mailing list archive)
State Superseded
Series [v2] scsi: ufs: core: fix racing issue during ufshcd_mcq_abort

Commit Message

SEO HOYOUNG Nov. 14, 2023, 4:37 a.m. UTC
If a CQ completion IRQ is raised during abort processing, the command
being aborted has already been completed by the time the abort path
tries to clear it. The utag needed to clean up the command then cannot
be obtained, as the log below shows:

ufshcd_try_to_abort_task: cmd pending in the device. tag = 25
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000194
Mem abort info:
ESR = 0x0000000096000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x06: level 2 translation fault
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0

pc : blk_mq_unique_tag+0x8/0x14
lr : ufshcd_mcq_sq_cleanup+0x6c/0x1b8
sp : ffffffc03e3b3b10
x29: ffffffc03e3b3b10 x28: 0000000000000001 x27: ffffff8830b34f68
x26: ffffff8830b34f6c x25: ffffff8830b34040 x24: 0000000000000000
x23: 0000000000000f18 x22: ffffffc03e3b3bb8 x21: 0000000000000019
x20: 0000000000000019 x19: ffffff8830b309b0 x18: ffffffc00a1b5380
x17: 00000000529c6ef0 x16: 00000000529c6ef0 x15: 0000000000000000
x14: 0000000000000010 x13: 0000000000000032 x12: 0000001169e8a5bc
x11: 0000000000000001 x10: ffffff885dfc1588 x9 : 0000000000000019
x8 : 0000000000000000 x7 : 0000000000000001 x6 : fffffffdef706f28
x5 : 000000000000283d x4 : 0000000000000001 x3 : 0000000000000000
x2 : 0000000000000003 x1 : 0000000000000019 x0 : ffffff8855781200
Call trace:
blk_mq_unique_tag+0x8/0x14
ufshcd_clear_cmd+0x34/0x118
ufshcd_try_to_abort_task+0x1c4/0x4b0
ufshcd_err_handler+0x8d0/0xd24
process_one_work+0x1e4/0x43c
worker_thread+0x25c/0x430
kthread+0x104/0x1d4
ret_from_fork+0x10/0x20
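
A fault address as small as 0x194 usually means that a member was read through a NULL base pointer, so the address is simply the member's offset. The sketch below shows that arithmetic in user space. The struct and its field names are hypothetical and are not the kernel's real layout; which kernel structure was actually NULL here is not stated in the log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only, NOT a real kernel struct. */
struct ctx_like {
	unsigned char pad[0x194];	/* everything that precedes the field */
	unsigned int field;		/* member that the faulting load reads */
};

/* Address the CPU would touch when reading base->field. */
static uintptr_t field_address(uintptr_t base)
{
	return base + offsetof(struct ctx_like, field);
}
```

With a base of 0 the computed address equals the member offset, which is why such crashes are reported as NULL pointer dereferences even though the address is not exactly zero.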

v1 -> v2: fix build error

Signed-off-by: SEO HOYOUNG <hy50.seo@samsung.com>
---
 drivers/ufs/core/ufshcd.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Bart Van Assche Nov. 14, 2023, 5:54 p.m. UTC | #1
On 11/13/23 20:37, SEO HOYOUNG wrote:
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 9a730a794b66..55f4c0eeda7b 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -7574,6 +7574,10 @@ int ufshcd_try_to_abort_task(struct ufs_hba *hba, int tag)
>   		goto out;
>   	}
>   
> +	if (!ufshcd_cmd_inflight(lrbp->cmd) ||
> +	    test_bit(SCMD_STATE_COMPLETE, &lrbp->cmd->state))
> +		goto out;
> +
>   	err = ufshcd_clear_cmd(hba, tag);
>   	if (err)
>   		dev_err(hba->dev, "%s: Failed clearing cmd at tag %d, err %d\n",

The UFS driver must not set SCMD_STATE_COMPLETE itself. Only the SCSI core is
allowed to set that bit.

Thanks,

Bart.
SEO HOYOUNG Nov. 15, 2023, 5:27 a.m. UTC | #2
> -----Original Message-----
> From: Bart Van Assche <bvanassche@acm.org>
> Sent: Wednesday, November 15, 2023 2:55 AM
> To: SEO HOYOUNG <hy50.seo@samsung.com>; linux-scsi@vger.kernel.org;
> linux-kernel@vger.kernel.org; alim.akhtar@samsung.com; avri.altman@wdc.com;
> jejb@linux.ibm.com; martin.petersen@oracle.com; beanhuo@micron.com;
> kwangwon.min@samsung.com; kwmad.kim@samsung.com; sh425.lee@samsung.com;
> sc.suh@samsung.com; quic_nguyenb@quicinc.com; cpgs@samsung.com;
> grant.jung@samsung.com; junwoo80.lee@samsung.com
> Subject: Re: [PATCH v2] scsi: ufs: core: fix racing issue during
> ufshcd_mcq_abort
> 
> On 11/13/23 20:37, SEO HOYOUNG wrote:
> > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> > index 9a730a794b66..55f4c0eeda7b 100644
> > --- a/drivers/ufs/core/ufshcd.c
> > +++ b/drivers/ufs/core/ufshcd.c
> > @@ -7574,6 +7574,10 @@ int ufshcd_try_to_abort_task(struct ufs_hba *hba, int tag)
> >   		goto out;
> >   	}
> >
> > +	if (!ufshcd_cmd_inflight(lrbp->cmd) ||
> > +	    test_bit(SCMD_STATE_COMPLETE, &lrbp->cmd->state))
> > +		goto out;
> > +
> >   	err = ufshcd_clear_cmd(hba, tag);
> >   	if (err)
> >   		dev_err(hba->dev, "%s: Failed clearing cmd at tag %d,
> err %d\n",
> 
> The UFS driver must not set SCMD_STATE_COMPLETE itself. Only the SCSI core
> is allowed to set that bit.
> 
> Thanks,
> 
> Bart.

Hi,

The test_bit() call only checks the state of the SCMD_STATE_COMPLETE bit.
It does not set SCMD_STATE_COMPLETE.
The same check is also used in ufshcd_mcq_compl_pending_transfer()
to check the status of a cmd.

Thanks.
BRs SEO.
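
The distinction drawn in this reply, reading a state bit versus setting it, can be sketched in user space with C11 atomics. This is a minimal analog under stated assumptions: `state_test_bit()`, `state_set_bit()` and `STATE_COMPLETE_BIT` are hypothetical names that stand in for the kernel's `test_bit()`, `set_bit()` and `SCMD_STATE_COMPLETE`. It is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit index standing in for SCMD_STATE_COMPLETE. */
enum { STATE_COMPLETE_BIT = 0 };

/* Read-only check, analogous to test_bit(): never modifies *state. */
static bool state_test_bit(unsigned int nr, atomic_ulong *state)
{
	return (atomic_load(state) >> nr) & 1UL;
}

/* Writer, analogous to set_bit(): atomically ORs the bit into *state. */
static void state_set_bit(unsigned int nr, atomic_ulong *state)
{
	atomic_fetch_or(state, 1UL << nr);
}
```

Calling the read-only helper any number of times leaves the state word unchanged; only the writer changes it.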
Bart Van Assche Nov. 15, 2023, 6:22 p.m. UTC | #3
On 11/14/23 21:27, hoyoung seo wrote:
> The test_bit() call only checks the state of the SCMD_STATE_COMPLETE bit.
> It does not set SCMD_STATE_COMPLETE.
> The same check is also used in ufshcd_mcq_compl_pending_transfer()
> to check the status of a cmd.

Right, I misread your patch. Can you please take a look at the following
patch and verify whether it perhaps addresses the same issue that you want
to address: https://lore.kernel.org/linux-scsi/20231115131024.15829-1-peter.wang@mediatek.com/

Thanks,

Bart.
SEO HOYOUNG Nov. 16, 2023, 1:56 a.m. UTC | #4
> -----Original Message-----
> From: Bart Van Assche <bvanassche@acm.org>
> Sent: Thursday, November 16, 2023 3:22 AM
> To: hoyoung seo <hy50.seo@samsung.com>; linux-scsi@vger.kernel.org;
> linux-kernel@vger.kernel.org; alim.akhtar@samsung.com; avri.altman@wdc.com;
> jejb@linux.ibm.com; martin.petersen@oracle.com; beanhuo@micron.com;
> kwangwon.min@samsung.com; kwmad.kim@samsung.com; sh425.lee@samsung.com;
> sc.suh@samsung.com; quic_nguyenb@quicinc.com; cpgs@samsung.com;
> grant.jung@samsung.com; junwoo80.lee@samsung.com
> Subject: Re: [PATCH v2] scsi: ufs: core: fix racing issue during
> ufshcd_mcq_abort
> 
> On 11/14/23 21:27, hoyoung seo wrote:
> > The test_bit() call only checks the state of the SCMD_STATE_COMPLETE bit.
> > It does not set SCMD_STATE_COMPLETE.
> > The same check is also used in ufshcd_mcq_compl_pending_transfer()
> > to check the status of a cmd.
> 
> Right, I misread your patch. Can you please take a look at the following
> patch and verify whether it perhaps addresses the same issue that you want
> to address: https://lore.kernel.org/linux-scsi/20231115131024.15829-1-peter.wang@mediatek.com/
> 
> Thanks,
> 
> Bart.

Hi, Bart.

It does not seem to be the same issue.
In that problem, the try_to_abort_task() function has already been
processed, so clear_cmd() has been processed as well.

In this issue, however, the IO request had already completed before
clear_cmd() was executed, so the utag needed to clean up the SQ cannot
be obtained.
This patch prevents access to the utag of a cmd that has already been completed.

Thanks
Seo.
Bart Van Assche Nov. 16, 2023, 9:34 p.m. UTC | #5
On 11/13/23 20:37, SEO HOYOUNG wrote:
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 9a730a794b66..55f4c0eeda7b 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -7574,6 +7574,10 @@ int ufshcd_try_to_abort_task(struct ufs_hba *hba, int tag)
>   		goto out;
>   	}
>   
> +	if (!ufshcd_cmd_inflight(lrbp->cmd) ||
> +	    test_bit(SCMD_STATE_COMPLETE, &lrbp->cmd->state))
> +		goto out;
> +
>   	err = ufshcd_clear_cmd(hba, tag);
>   	if (err)
>   		dev_err(hba->dev, "%s: Failed clearing cmd at tag %d, err %d\n",

I think that the above approach is racy: the state of lrbp->cmd can
change after it has been checked and before ufshcd_clear_cmd() is called.

Thanks,

Bart.
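
The objection above is the classic check-then-act problem: a state read followed by a separate action leaves a window in which another context can change the state. One generic way to close such a window is to make the check and the claim a single atomic step. The sketch below illustrates that pattern in user space with C11 atomics. `struct slot`, `slot_claim()` and `slot_cleanup_once()` are hypothetical names; this is not the UFS driver's code and not a fix proposed in this thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical command slot; not the kernel's struct ufshcd_lrb. */
struct slot {
	atomic_bool owned;	/* true once one path has claimed teardown */
	int cleanups;		/* how many times cleanup actually ran */
};

/*
 * Atomically claim the slot. The exchange returns the previous value,
 * so only the first caller sees "false" and wins the claim.
 */
static bool slot_claim(struct slot *s)
{
	return !atomic_exchange(&s->owned, true);
}

/* Run cleanup only if this caller won the claim; report whether it ran. */
static bool slot_cleanup_once(struct slot *s)
{
	if (!slot_claim(s))
		return false;	/* another path (e.g. completion) got there first */
	s->cleanups++;		/* safe: exactly one caller reaches this line */
	return true;
}
```

Because the test and the state change happen in one atomic exchange, a completion path and an abort path that both call `slot_cleanup_once()` cannot both proceed, whichever order they arrive in.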
Patch

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 9a730a794b66..55f4c0eeda7b 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -7574,6 +7574,10 @@  int ufshcd_try_to_abort_task(struct ufs_hba *hba, int tag)
 		goto out;
 	}
 
+	if (!ufshcd_cmd_inflight(lrbp->cmd) ||
+	    test_bit(SCMD_STATE_COMPLETE, &lrbp->cmd->state))
+		goto out;
+
 	err = ufshcd_clear_cmd(hba, tag);
 	if (err)
 		dev_err(hba->dev, "%s: Failed clearing cmd at tag %d, err %d\n",