diff mbox series

[v1] scsi: ufs: core: Process abort completed command in MCQ mode

Message ID 20231101084504.79087-1-hy50.seo@samsung.com (mailing list archive)
State Superseded
Headers show
Series [v1] scsi: ufs: core: Process abort completed command in MCQ mode | expand

Commit Message

SEO HOYOUNG Nov. 1, 2023, 8:45 a.m. UTC
In MCQ mode, the case where OCS is updated to aborted is as follows
 1. when abort processing is completed
 2. When a duplicate command occurs

In case of 1 situation, cmd should be re-request.
So in the case of cmd whose abort processing is completed in MCQ mode,
the ufs driver needs to update to scsi with DID_REQUEUE.

Signed-off-by: SEO HOYOUNG <hy50.seo@samsung.com>
---
 drivers/ufs/core/ufshcd.c | 9 ++++++++-
 include/ufs/ufshci.h      | 1 +
 2 files changed, 9 insertions(+), 1 deletion(-)

Comments

Bart Van Assche Nov. 1, 2023, 4:39 p.m. UTC | #1
On 11/1/23 01:45, SEO HOYOUNG wrote:
> In MCQ mode, the case where OCS is updated to aborted is as follows
>   1. when abort processing is completed
>   2. When a duplicate command occurs

What is a "duplicate command"? The UFSHCI driver guarantees that each
SCSI command has a unique tag.

> In case of 1 situation, cmd should be re-request.

It should be resubmitted by the SCSI error handler. The UFSHCI driver
does not have to request this explicitly. See also the code at the end
of scmd_eh_abort_handler().

>   	case OCS_ABORTED:
> -		result |= DID_ABORT << 16;
> +		if (cqe)
> +			eec = le32_to_cpu(cqe->status) & MASK_EEC;
> +
> +		if (is_mcq_enabled(hba) && !eec)
> +			result |= DID_REQUEUE << 16;
> +		else
> +			result |= DID_ABORT << 16;
>   		break;

I don't think this change is necessary. Additionally, introducing
different behavior for MCQ compared to legacy mode in this code path is
suspicious. Why should how commands are queued affect how aborts are
processed?

Thanks,

Bart.
SEO HOYOUNG Nov. 2, 2023, 4:07 a.m. UTC | #2
> -----Original Message-----
> From: Bart Van Assche <bvanassche@acm.org>
> Sent: Thursday, November 2, 2023 1:39 AM
> To: SEO HOYOUNG <hy50.seo@samsung.com>; linux-scsi@vger.kernel.org; linux-
> kernel@vger.kernel.org; alim.akhtar@samsung.com; avri.altman@wdc.com;
> jejb@linux.ibm.com; martin.petersen@oracle.com; beanhuo@micron.com;
> kwangwon.min@samsung.com; kwmad.kim@samsung.com; sh425.lee@samsung.com;
> sc.suh@samsung.com; quic_nguyenb@quicinc.com; cpgs@samsung.com
> Subject: Re: [PATCH v1] scsi: ufs: core: Process abort completed command
> in MCQ mode
> 
> On 11/1/23 01:45, SEO HOYOUNG wrote:
> > In MCQ mode, the case where OCS is updated to aborted is as follows
> >   1. when abort processing is completed
> >   2. When a duplicate command occurs
> 
> What is a "duplicate command"? The UFSHCI driver guarantees that each SCSI
> command has a unique tag.
> 
> > In case of 1 situation, cmd should be re-request.
> 
> It should be resubmitted by the SCSI error handler. The UFSHCI driver does
> not have to request this explicitly. See also the code at the end of
> scmd_eh_abort_handler().
> 
> >   	case OCS_ABORTED:
> > -		result |= DID_ABORT << 16;
> > +		if (cqe)
> > +			eec = le32_to_cpu(cqe->status) & MASK_EEC;
> > +
> > +		if (is_mcq_enabled(hba) && !eec)
> > +			result |= DID_REQUEUE << 16;
> > +		else
> > +			result |= DID_ABORT << 16;
> >   		break;
> 
> I don't think this change is necessary. Additionally, introducing
> different behavior for MCQ compared to legacy mode in this code path is
> suspicious. Why should how commands are queued affect how aborts are
> processed?
> 
> Thanks,
> 
> Bart.

when the ufs host receives any error, the ufs driver executes the error-hander.
If the error-hendler attempts re-init, it must abort and organize unprocessed
 requests.
The above operation is the same for both MCQ/legacy mode.
However, in the MCQ mode, if b or c is included in the following specs, 
the OCS is updated to aborted, which is different from the legacy mode.

B. If the command is in the Submission Queue and not issued to the device yet, 
the host controller will mark the command to be skipped in the Submission Queue.
The host controller will post to the Completion Queue to update the OCS field 
with ‘ABORTED’.
C. If the command is issued to the device already but there is no response yet 
from the device, the host software driver issue the Abort task management function
 to the device for that command.
Then the host driver set SQRTCy.ICU as ‘1’ to initiate the clean up the hardware 
resources. The host controller will post to the Completion Queue to update the OCS
 field with ‘ABORTED’.

Unlike legacy mode, this phenomenon causes unintended behavior. (As shown in the log below)

[1:  kworker/u20:2:23157] ufshcd_try_to_abort_task: cmd pending in the device. tag = 9
[3:  kworker/u20:2:23157] Aborting tag 9 / CDB 0x2a succeeded
[4:      swapper/4:    0] sd 0:0:0:0: [sda] tag#9 UNKNOWN(0x2003) Result: hostbyte=0x05 driverbyte=DRIVER_OK cmd_age=0s // DID_ABORT
[4:      swapper/4:    0] sd 0:0:0:0: [sda] tag#9 CDB: opcode=0x2a 2a 00 00 d3 02 00 00 01 00 00
[4:      swapper/4:    0] I/O error, dev sda, sector 110628864 op 0x1:(WRITE) flags 0x800 phys_seg 256 prio class 2


For commands that have completed the abort operation in MCQ mode,
since OCS has been updated to aborted, it seems that it will be retransmitted only
 when it is made to REQUEUE.

Thanks.

SEO.
Bart Van Assche Nov. 2, 2023, 7:36 p.m. UTC | #3
On 11/1/23 21:07, hoyoung seo wrote:
> when the ufs host receives any error, the ufs driver executes the error-hander.
> If the error-hendler attempts re-init, it must abort and organize unprocessed
>   requests.
> The above operation is the same for both MCQ/legacy mode.
> However, in the MCQ mode, if b or c is included in the following specs,
> the OCS is updated to aborted, which is different from the legacy mode.
> 
> B. If the command is in the Submission Queue and not issued to the device yet,
> the host controller will mark the command to be skipped in the Submission Queue.
> The host controller will post to the Completion Queue to update the OCS field
> with ‘ABORTED’.
> C. If the command is issued to the device already but there is no response yet
> from the device, the host software driver issue the Abort task management function
>   to the device for that command.
> Then the host driver set SQRTCy.ICU as ‘1’ to initiate the clean up the hardware
> resources. The host controller will post to the Completion Queue to update the OCS
>   field with ‘ABORTED’.
> 
> Unlike legacy mode, this phenomenon causes unintended behavior. (As shown in the log below)
> 
> [1:  kworker/u20:2:23157] ufshcd_try_to_abort_task: cmd pending in the device. tag = 9
> [3:  kworker/u20:2:23157] Aborting tag 9 / CDB 0x2a succeeded
> [4:      swapper/4:    0] sd 0:0:0:0: [sda] tag#9 UNKNOWN(0x2003) Result: hostbyte=0x05 driverbyte=DRIVER_OK cmd_age=0s // DID_ABORT
> [4:      swapper/4:    0] sd 0:0:0:0: [sda] tag#9 CDB: opcode=0x2a 2a 00 00 d3 02 00 00 01 00 00
> [4:      swapper/4:    0] I/O error, dev sda, sector 110628864 op 0x1:(WRITE) flags 0x800 phys_seg 256 prio class 2
> 
> 
> For commands that have completed the abort operation in MCQ mode,
> since OCS has been updated to aborted, it seems that it will be retransmitted only
>   when it is made to REQUEUE.

Hi Hoyoung,

Thank you for having provided this clarification - this really helps.

Regarding (B): I would appreciate it if this patch would be reworked
such that no new 'if (is_mcq_enabled(hba))' statements are introduced.
Has it been considered to modify ufshcd_mcq_sqe_search() such that it
sets the SCSI result to DID_REQUEUE << 16 instead of modifying
ufshcd_transfer_rsp_status()?

Regarding (C): SQRTCy.ICU is only set by ufshcd_mcq_sq_cleanup() and the 
only caller of that function is ufshcd_clear_cmd().
There is only one function that calls ufshcd_clear_cmd() for SCSI
commands, namely ufshcd_eh_device_reset_handler(). The latter function
should not set the SCSI result code. All it should do is to abort all
pending commands. The SCSI error handler will resubmit all aborted commands.

Thanks,

Bart.
SEO HOYOUNG Nov. 6, 2023, 4:56 a.m. UTC | #4
> -----Original Message-----
> From: Bart Van Assche <bvanassche@acm.org>
> Sent: Friday, November 3, 2023 4:37 AM
> To: hoyoung seo <hy50.seo@samsung.com>; linux-scsi@vger.kernel.org; linux-
> kernel@vger.kernel.org; alim.akhtar@samsung.com; avri.altman@wdc.com;
> jejb@linux.ibm.com; martin.petersen@oracle.com; beanhuo@micron.com;
> kwangwon.min@samsung.com; kwmad.kim@samsung.com; sh425.lee@samsung.com;
> sc.suh@samsung.com; quic_nguyenb@quicinc.com; cpgs@samsung.com
> Subject: Re: [PATCH v1] scsi: ufs: core: Process abort completed command
> in MCQ mode
> 
> 
> On 11/1/23 21:07, hoyoung seo wrote:
> > when the ufs host receives any error, the ufs driver executes the error-
> hander.
> > If the error-hendler attempts re-init, it must abort and organize
> unprocessed
> >   requests.
> > The above operation is the same for both MCQ/legacy mode.
> > However, in the MCQ mode, if b or c is included in the following
> > specs, the OCS is updated to aborted, which is different from the legacy
> mode.
> >
> > B. If the command is in the Submission Queue and not issued to the
> > device yet, the host controller will mark the command to be skipped in
> the Submission Queue.
> > The host controller will post to the Completion Queue to update the
> > OCS field with ‘ABORTED’.
> > C. If the command is issued to the device already but there is no
> > response yet from the device, the host software driver issue the Abort
> task management function
> >   to the device for that command.
> > Then the host driver set SQRTCy.ICU as ‘1’ to initiate the clean up
> > the hardware resources. The host controller will post to the Completion
> Queue to update the OCS
> >   field with ‘ABORTED’.
> >
> > Unlike legacy mode, this phenomenon causes unintended behavior. (As
> > shown in the log below)
> >
> > [1:  kworker/u20:2:23157] ufshcd_try_to_abort_task: cmd pending in the
> > device. tag = 9
> > [3:  kworker/u20:2:23157] Aborting tag 9 / CDB 0x2a succeeded
> > [4:      swapper/4:    0] sd 0:0:0:0: [sda] tag#9 UNKNOWN(0x2003) Result:
> hostbyte=0x05 driverbyte=DRIVER_OK cmd_age=0s // DID_ABORT
> > [4:      swapper/4:    0] sd 0:0:0:0: [sda] tag#9 CDB: opcode=0x2a 2a 00
> 00 d3 02 00 00 01 00 00
> > [4:      swapper/4:    0] I/O error, dev sda, sector 110628864 op
> 0x1:(WRITE) flags 0x800 phys_seg 256 prio class 2
> >
> >
> > For commands that have completed the abort operation in MCQ mode,
> > since OCS has been updated to aborted, it seems that it will be
> retransmitted only
> >   when it is made to REQUEUE.
> 
> Hi Hoyoung,
> 
> Thank you for having provided this clarification - this really helps.
> 
> Regarding (B): I would appreciate it if this patch would be reworked such
> that no new 'if (is_mcq_enabled(hba))' statements are introduced.
> Has it been considered to modify ufshcd_mcq_sqe_search() such that it sets
> the SCSI result to DID_REQUEUE << 16 instead of modifying
> ufshcd_transfer_rsp_status()?
> 
> Regarding (C): SQRTCy.ICU is only set by ufshcd_mcq_sq_cleanup() and the
> only caller of that function is ufshcd_clear_cmd().
> There is only one function that calls ufshcd_clear_cmd() for SCSI commands,
> namely ufshcd_eh_device_reset_handler(). The latter function should not
> set the SCSI result code. All it should do is to abort all pending
> commands. The SCSI error handler will resubmit all aborted commands.
> 
> Thanks,
> 
> Bart.

Hi,

I do not understand what do you want.

Regarding (B)-> In the ufshcd_mcq_sqe_search(), just find desc_addr of the tag 
for setting nullify in SQD. 
After that the host h/w checks the nullify during processes the SQD,  
and sets the OCS to 'aborted'.

Regarding (C)-> Also, when the ufs driver sets the SQRTCy.ICU to 1, 
the host h/w updates the OCS to the Aborted.

So the ufs driver s/w can't do anything until the OCS update.
the host h/w must update the OCS and then process the Aborted.

What you want is not to arbitrarily modify the results for OCS?
Therefore, if we use the current MCQ, we will do something that 
we do not intend to do.
Therefore, the OCS processing code for mcq and legacy mode needs 
to be disadvantageous.

If you do not want to use 'if (is_mcq_enabled (hba)', 
I will upload the patch again

Thanks.

BRs SEO.
diff mbox series

Patch

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 68d7da02944f..234b7d7ec037 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5307,6 +5307,7 @@  ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
 	enum utp_ocs ocs;
 	u8 upiu_flags;
 	u32 resid;
+	u8 eec;
 
 	upiu_flags = lrbp->ucd_rsp_ptr->header.flags;
 	resid = be32_to_cpu(lrbp->ucd_rsp_ptr->sr.residual_transfer_count);
@@ -5371,7 +5372,13 @@  ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
 		}
 		break;
 	case OCS_ABORTED:
-		result |= DID_ABORT << 16;
+		if (cqe)
+			eec = le32_to_cpu(cqe->status) & MASK_EEC;
+
+		if (is_mcq_enabled(hba) && !eec)
+			result |= DID_REQUEUE << 16;
+		else
+			result |= DID_ABORT << 16;
 		break;
 	case OCS_INVALID_COMMAND_STATUS:
 		result |= DID_REQUEUE << 16;
diff --git a/include/ufs/ufshci.h b/include/ufs/ufshci.h
index d5accacae6bc..9aefc7e6d0fc 100644
--- a/include/ufs/ufshci.h
+++ b/include/ufs/ufshci.h
@@ -465,6 +465,7 @@  enum utp_ocs {
 
 enum {
 	MASK_OCS			= 0x0F,
+	MASK_EEC			= 0xF0,
 };
 
 /* The maximum length of the data byte count field in the PRDT is 256KB */