
[v1,5/5] scsi: ufs: Complete pending requests in host reset and restore path

Message ID 1573200932-384-6-git-send-email-cang@codeaurora.org (mailing list archive)
State Superseded
Series UFS driver general fixes bundle 5

Commit Message

Can Guo Nov. 8, 2019, 8:15 a.m. UTC
In the UFS host reset and restore path, before probe, we stop and start
the host controller once. After the host controller is stopped, the
pending requests, if any, are cleared from the doorbell, but no
completion IRQ is raised because the hba is stopped.
With the current code, these pending requests are completed only along
with the first NOP_OUT command sent during probe, as it is the first
command which can raise a transfer completion IRQ.
Since the OCSs of these pending requests are not SUCCESS (because they
have not actually finished), their UPIUs are dumped. When there are
multiple pending requests, the UPIU dump can be overwhelming and may
lead to stability issues because it runs in atomic context.
Therefore, before probe, complete these pending requests right after
the host controller is stopped.

Signed-off-by: Can Guo <cang@codeaurora.org>
---
 drivers/scsi/ufs/ufshcd.c | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)
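
As a rough illustration of the bookkeeping described in the commit message, the sketch below models which tags become completable once the controller is stopped. It is not the driver code: `struct model_hba`, `model_hba_stop()` and `model_complete_requests()` are hypothetical names, and the model assumes that stopping the controller clears every doorbell bit without raising an interrupt.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the driver's host state. */
struct model_hba {
	uint32_t outstanding_reqs; /* tags issued by software, not yet completed */
	uint32_t doorbell;         /* tags the controller still owns */
};

/* Assumption: stopping the controller clears all doorbell bits, no IRQ. */
static void model_hba_stop(struct model_hba *hba)
{
	hba->doorbell = 0;
}

/*
 * Complete every tag that software still tracks but whose doorbell bit
 * is clear. Returns the bitmask of tags completed by this call.
 */
static uint32_t model_complete_requests(struct model_hba *hba)
{
	uint32_t completed = hba->outstanding_reqs & ~hba->doorbell;

	hba->outstanding_reqs &= ~completed;
	return completed;
}
```

With tags 0, 2 and 4 in flight, calling `model_complete_requests()` before the stop completes nothing, while calling it right after `model_hba_stop()` completes all three, which is the ordering the patch introduces under the host lock.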

Comments

Alim Akhtar Nov. 13, 2019, 2:24 a.m. UTC | #1
Hi Can,

On Fri, Nov 8, 2019 at 1:50 PM Can Guo <cang@codeaurora.org> wrote:
>
> In UFS host reset and restore path, before probe, we stop and start the
> host controller once. After host controller is stopped, the pending
> requests, if any, are cleared from the doorbell, but no completion IRQ
> would be raised due to the hba is stopped.
> These pending requests shall be completed along with the first NOP_OUT
> command(as it is the first command which can raise a transfer completion
> IRQ) sent during probe.
> Since the OCSs of these pending requests are not SUCCESS(because they are
> not yet literally finished), their UPIUs shall be dumped. When there are
> multiple pending requests, the UPIU dump can be overwhelming and may lead
> to stability issues because it is in atomic context.
> Therefore, before probe, complete these pending requests right after host
> controller is stopped.
>
> Signed-off-by: Can Guo <cang@codeaurora.org>
> ---
Looks good, I hope this is tested on your platform.
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>

>  drivers/scsi/ufs/ufshcd.c | 20 +++++++-------------
>  1 file changed, 7 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 5950a7c..4df4136 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -5404,8 +5404,8 @@ static void ufshcd_err_handler(struct work_struct *work)
>
>         /*
>          * if host reset is required then skip clearing the pending
> -        * transfers forcefully because they will automatically get
> -        * cleared after link startup.
> +        * transfers forcefully because they will get cleared during
> +        * host reset and restore
>          */
>         if (needs_reset)
>                 goto skip_pending_xfer_clear;
> @@ -6333,9 +6333,13 @@ static int ufshcd_host_reset_and_restore(struct ufs_hba *hba)
>         int err;
>         unsigned long flags;
>
> -       /* Reset the host controller */
> +       /*
> +        * Stop the host controller and complete the requests
> +        * cleared by h/w
> +        */
>         spin_lock_irqsave(hba->host->host_lock, flags);
>         ufshcd_hba_stop(hba, false);
> +       ufshcd_complete_requests(hba);
>         spin_unlock_irqrestore(hba->host->host_lock, flags);
>
>         /* scale up clocks to max frequency before full reinitialization */
> @@ -6369,7 +6373,6 @@ static int ufshcd_host_reset_and_restore(struct ufs_hba *hba)
>  static int ufshcd_reset_and_restore(struct ufs_hba *hba)
>  {
>         int err = 0;
> -       unsigned long flags;
>         int retries = MAX_HOST_RESET_RETRIES;
>
>         do {
> @@ -6379,15 +6382,6 @@ static int ufshcd_reset_and_restore(struct ufs_hba *hba)
>                 err = ufshcd_host_reset_and_restore(hba);
>         } while (err && --retries);
>
> -       /*
> -        * After reset the door-bell might be cleared, complete
> -        * outstanding requests in s/w here.
> -        */
> -       spin_lock_irqsave(hba->host->host_lock, flags);
> -       ufshcd_transfer_req_compl(hba);
> -       ufshcd_tmc_handler(hba);
> -       spin_unlock_irqrestore(hba->host->host_lock, flags);
> -
>         return err;
>  }
>
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
Can Guo Nov. 13, 2019, 3:01 a.m. UTC | #2
On 2019-11-13 10:24, Alim Akhtar wrote:
> Hi Can,
> 
> On Fri, Nov 8, 2019 at 1:50 PM Can Guo <cang@codeaurora.org> wrote:
>> 
>> In UFS host reset and restore path, before probe, we stop and start 
>> the
>> host controller once. After host controller is stopped, the pending
>> requests, if any, are cleared from the doorbell, but no completion IRQ
>> would be raised due to the hba is stopped.
>> These pending requests shall be completed along with the first NOP_OUT
>> command(as it is the first command which can raise a transfer 
>> completion
>> IRQ) sent during probe.
>> Since the OCSs of these pending requests are not SUCCESS(because they 
>> are
>> not yet literally finished), their UPIUs shall be dumped. When there 
>> are
>> multiple pending requests, the UPIU dump can be overwhelming and may 
>> lead
>> to stability issues because it is in atomic context.
>> Therefore, before probe, complete these pending requests right after 
>> host
>> controller is stopped.
>> 
>> Signed-off-by: Can Guo <cang@codeaurora.org>
>> ---
> Looks good, I hope this is tested on your platform.
> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
> 

Hi Alim,

Thanks for the review. We've tested it out on our platforms.

Best regards,
Can Guo.

>>  drivers/scsi/ufs/ufshcd.c | 20 +++++++-------------
>>  1 file changed, 7 insertions(+), 13 deletions(-)
>> 
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index 5950a7c..4df4136 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -5404,8 +5404,8 @@ static void ufshcd_err_handler(struct 
>> work_struct *work)
>> 
>>         /*
>>          * if host reset is required then skip clearing the pending
>> -        * transfers forcefully because they will automatically get
>> -        * cleared after link startup.
>> +        * transfers forcefully because they will get cleared during
>> +        * host reset and restore
>>          */
>>         if (needs_reset)
>>                 goto skip_pending_xfer_clear;
>> @@ -6333,9 +6333,13 @@ static int ufshcd_host_reset_and_restore(struct 
>> ufs_hba *hba)
>>         int err;
>>         unsigned long flags;
>> 
>> -       /* Reset the host controller */
>> +       /*
>> +        * Stop the host controller and complete the requests
>> +        * cleared by h/w
>> +        */
>>         spin_lock_irqsave(hba->host->host_lock, flags);
>>         ufshcd_hba_stop(hba, false);
>> +       ufshcd_complete_requests(hba);
>>         spin_unlock_irqrestore(hba->host->host_lock, flags);
>> 
>>         /* scale up clocks to max frequency before full 
>> reinitialization */
>> @@ -6369,7 +6373,6 @@ static int ufshcd_host_reset_and_restore(struct 
>> ufs_hba *hba)
>>  static int ufshcd_reset_and_restore(struct ufs_hba *hba)
>>  {
>>         int err = 0;
>> -       unsigned long flags;
>>         int retries = MAX_HOST_RESET_RETRIES;
>> 
>>         do {
>> @@ -6379,15 +6382,6 @@ static int ufshcd_reset_and_restore(struct 
>> ufs_hba *hba)
>>                 err = ufshcd_host_reset_and_restore(hba);
>>         } while (err && --retries);
>> 
>> -       /*
>> -        * After reset the door-bell might be cleared, complete
>> -        * outstanding requests in s/w here.
>> -        */
>> -       spin_lock_irqsave(hba->host->host_lock, flags);
>> -       ufshcd_transfer_req_compl(hba);
>> -       ufshcd_tmc_handler(hba);
>> -       spin_unlock_irqrestore(hba->host->host_lock, flags);
>> -
>>         return err;
>>  }
>> 
>> --
>> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
>> Forum,
>> a Linux Foundation Collaborative Project
>>
Bean Huo Nov. 13, 2019, 10:04 p.m. UTC | #3
> 
> In UFS host reset and restore path, before probe, we stop and start the host
> controller once. After host controller is stopped, the pending requests, if any,
> are cleared from the doorbell, but no completion IRQ would be raised due to the
> hba is stopped.
> These pending requests shall be completed along with the first NOP_OUT
> command(as it is the first command which can raise a transfer completion
> IRQ) sent during probe.

Hi, Can
I am not sure about this point, because there is a HW/SW device reset before or after host reset/restore.
A device HW/SW reset will also clear the pending tasks on the device side, which would be better.
I think the Qcom platform has already enabled HW reset.

//Bean

> Since the OCSs of these pending requests are not SUCCESS(because they are not
> yet literally finished), their UPIUs shall be dumped. When there are multiple
> pending requests, the UPIU dump can be overwhelming and may lead to stability
> issues because it is in atomic context.
> Therefore, before probe, complete these pending requests right after host
> controller is stopped.
Can Guo Nov. 14, 2019, 1:03 a.m. UTC | #4
On 2019-11-14 06:04, Bean Huo (beanhuo) wrote:
>> 
>> In UFS host reset and restore path, before probe, we stop and start 
>> the host
>> controller once. After host controller is stopped, the pending 
>> requests, if any,
>> are cleared from the doorbell, but no completion IRQ would be raised 
>> due to the
>> hba is stopped.
>> These pending requests shall be completed along with the first NOP_OUT
>> command(as it is the first command which can raise a transfer 
>> completion
>> IRQ) sent during probe.
> 
> Hi, Can
> I am not sure for this point, because there is HW/SW device reset
> before or after host reset/restore.
> Device HW/SW reset also will clear the pended tasks in device side.
> That will be better.
> I think Qcom platform already enabled HW reset.
> 
> //Bean
> 

Hi Bean,

By pending tasks here, I mean the requests sent down from the scsi/block
layer that have not yet been handled by the ufs driver (cmd->scsi_done()
has not been called yet for these requests).
Although these requests have been removed by the host and the UFS device
from their HW queues (doorbell), the UFS driver still needs to complete
them from the SW side (call cmd->scsi_done() for each one of them) to let
the upper layer know that they are finished (although not successfully),
so that these pending tasks do not hit their timeouts. I hope this makes
my explanation clear.

Best Regards,
Can Guo.
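
The software-side completion described above can be sketched roughly as follows. This is a simplified model, not the ufs driver: `struct model_cmd`, `model_done()`, `model_fail_pending()` and the `MODEL_*` constants are hypothetical, with the `done` callback standing in for cmd->scsi_done().

```c
#include <stddef.h>

#define MODEL_NUTRS        32 /* assumed number of transfer request slots */
#define MODEL_RESULT_OK    0
#define MODEL_RESULT_RESET 1  /* hypothetical "failed because of reset" result */

struct model_cmd {
	int result;
	int done_called;
	void (*done)(struct model_cmd *cmd); /* stand-in for cmd->scsi_done() */
};

/* Example callback: records that the upper layer was notified. */
static void model_done(struct model_cmd *cmd)
{
	cmd->done_called = 1;
}

/*
 * Fail every command still tracked in slots[] and notify its owner, so
 * that the upper layer does not wait for a timeout.
 * Returns the number of commands completed.
 */
static int model_fail_pending(struct model_cmd *slots[MODEL_NUTRS])
{
	int n = 0;

	for (size_t tag = 0; tag < MODEL_NUTRS; tag++) {
		struct model_cmd *cmd = slots[tag];

		if (!cmd)
			continue;
		cmd->result = MODEL_RESULT_RESET;
		slots[tag] = NULL; /* free the slot before the callback runs */
		cmd->done(cmd);
		n++;
	}
	return n;
}
```

Each pending command gets a non-success result and exactly one callback; a second call finds no tracked commands and completes nothing.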

>> Since the OCSs of these pending requests are not SUCCESS(because they 
>> are not
>> yet literally finished), their UPIUs shall be dumped. When there are 
>> multiple
>> pending requests, the UPIU dump can be overwhelming and may lead to 
>> stability
>> issues because it is in atomic context.
>> Therefore, before probe, complete these pending requests right after 
>> host
>> controller is stopped.
Can Guo Nov. 14, 2019, 1:18 a.m. UTC | #5
On 2019-11-14 09:03, cang@codeaurora.org wrote:
> On 2019-11-14 06:04, Bean Huo (beanhuo) wrote:
>>> 
>>> In UFS host reset and restore path, before probe, we stop and start 
>>> the host
>>> controller once. After host controller is stopped, the pending 
>>> requests, if any,
>>> are cleared from the doorbell, but no completion IRQ would be raised 
>>> due to the
>>> hba is stopped.
>>> These pending requests shall be completed along with the first 
>>> NOP_OUT
>>> command(as it is the first command which can raise a transfer 
>>> completion
>>> IRQ) sent during probe.
>> 
>> Hi, Can
>> I am not sure for this point, because there is HW/SW device reset
>> before or after host reset/restore.
>> Device HW/SW reset also will clear the pended tasks in device side.
>> That will be better.
>> I think Qcom platform already enabled HW reset.
>> 
>> //Bean
>> 
> 
> Hi Bean,
> 
> By pending tasks here, it means the requests sent down from scsi/block 
> layer,
> but have not yet been handled by ufs driver(cmd->scsi_done() have not
> been called yet for these requests).
> For these requests, although removed by host and UFS device in their
> HW queues(doorbell),
> UFS driver still needs to complete them from SW side(call
> cmd->scsi_done() for each one of them) to
> let upper layer know that they are finished(although not successfully)
> to avoid hitting
> timeout of these pending tasks. I hope I make my explanation clearly.
> 
> Best Regards,
> Can Guo.
> 

Hi Bean,

Just want to add a few more words. We do have HW/SW reset.
Sorry about the lines below, which confused you. There I am just
describing how the previous code behaves. Since these pending requests
have no chance to be handled in their IRQ handler after the hba is
stopped, and as they have already been cleared from the doorbell, they
will be handled in the IRQ handler once a transfer completion IRQ becomes
available, no matter what that transfer completion IRQ was fired for.
And NOP_OUT is just the first command that can fire a transfer
completion IRQ.

Can Guo.

>>> These pending requests shall be completed along with the first NOP_OUT
>>> command(as it is the first command which can raise a transfer completion
>>> IRQ) sent during probe.

>>> Since the OCSs of these pending requests are not SUCCESS(because they 
>>> are not
>>> yet literally finished), their UPIUs shall be dumped. When there are 
>>> multiple
>>> pending requests, the UPIU dump can be overwhelming and may lead to 
>>> stability
>>> issues because it is in atomic context.
>>> Therefore, before probe, complete these pending requests right after 
>>> host
>>> controller is stopped.
Bean Huo Nov. 14, 2019, 1:03 p.m. UTC | #6
> 
> Signed-off-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Tested-by: Bean Huo <beanhuo@micron.com>

Patch

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5950a7c..4df4136 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5404,8 +5404,8 @@  static void ufshcd_err_handler(struct work_struct *work)
 
 	/*
 	 * if host reset is required then skip clearing the pending
-	 * transfers forcefully because they will automatically get
-	 * cleared after link startup.
+	 * transfers forcefully because they will get cleared during
+	 * host reset and restore
 	 */
 	if (needs_reset)
 		goto skip_pending_xfer_clear;
@@ -6333,9 +6333,13 @@  static int ufshcd_host_reset_and_restore(struct ufs_hba *hba)
 	int err;
 	unsigned long flags;
 
-	/* Reset the host controller */
+	/*
+	 * Stop the host controller and complete the requests
+	 * cleared by h/w
+	 */
 	spin_lock_irqsave(hba->host->host_lock, flags);
 	ufshcd_hba_stop(hba, false);
+	ufshcd_complete_requests(hba);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 
 	/* scale up clocks to max frequency before full reinitialization */
@@ -6369,7 +6373,6 @@  static int ufshcd_host_reset_and_restore(struct ufs_hba *hba)
 static int ufshcd_reset_and_restore(struct ufs_hba *hba)
 {
 	int err = 0;
-	unsigned long flags;
 	int retries = MAX_HOST_RESET_RETRIES;
 
 	do {
@@ -6379,15 +6382,6 @@  static int ufshcd_reset_and_restore(struct ufs_hba *hba)
 		err = ufshcd_host_reset_and_restore(hba);
 	} while (err && --retries);
 
-	/*
-	 * After reset the door-bell might be cleared, complete
-	 * outstanding requests in s/w here.
-	 */
-	spin_lock_irqsave(hba->host->host_lock, flags);
-	ufshcd_transfer_req_compl(hba);
-	ufshcd_tmc_handler(hba);
-	spin_unlock_irqrestore(hba->host->host_lock, flags);
-
 	return err;
 }