diff mbox series

[RFT] mmc: tmio: reset device on timeout, too

Message ID 20200821081654.28280-1-wsa+renesas@sang-engineering.com (mailing list archive)
State New, archived
Headers show
Series [RFT] mmc: tmio: reset device on timeout, too | expand

Commit Message

Wolfram Sang Aug. 21, 2020, 8:16 a.m. UTC
When a command response times out, the TMIO driver has been resetting
the controller ever since. However, this means some initialization like
bus width or tuning settings will be forgotten. To ensure proper working
in all code paths, we will enforce a reset of the remote device, too.
Many thanks to the Renesas BSP team for the detailed description of the
problem.

Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---

This patch depends on the TMIO reset refactorization:

[RFT 0/6] mmc: refactor reset callbacks

Looking also for tests here. Thanks!

 drivers/mmc/host/tmio_mmc_core.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Yoshihiro Shimoda Aug. 28, 2020, 12:18 p.m. UTC | #1
Hi Wolfram-san,

> From: Wolfram Sang, Sent: Friday, August 21, 2020 5:17 PM
> 
> When a command response times out, the TMIO driver has been resetting
> the controller ever since. However, this means some initialization like
> bus width or tuning settings will be forgotten. To ensure proper working
> in all code paths, we will enforce a reset of the remote device, too.
> Many thanks to the Renesas BSP team for the detailed description of the
> problem.
> 
> Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> ---
> 
> This patch depends on the TMIO reset refactorization:
> 
> [RFT 0/6] mmc: refactor reset callbacks
> 
> Looking also for tests here. Thanks!

Thank you for the patch!

Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

Also, I tested on R-Car H3 ES3.0 and confirmed that this patch resolved
an issue which any commands of eMMC could not work after
tmio_mmc_reset_work() was called. So,

Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

Best regards,
Yoshihiro Shimoda
Ulf Hansson Aug. 28, 2020, 12:35 p.m. UTC | #2
On Fri, 21 Aug 2020 at 10:17, Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:
>
> When a command response times out, the TMIO driver has been resetting
> the controller ever since. However, this means some initialization like
> bus width or tuning settings will be forgotten. To ensure proper working
> in all code paths, we will enforce a reset of the remote device, too.
> Many thanks to the Renesas BSP team for the detailed description of the
> problem.
>
> Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> ---
>
> This patch depends on the TMIO reset refactorization:
>
> [RFT 0/6] mmc: refactor reset callbacks
>
> Looking also for tests here. Thanks!
>
>  drivers/mmc/host/tmio_mmc_core.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
> index ab910043808f..0d64308c619f 100644
> --- a/drivers/mmc/host/tmio_mmc_core.c
> +++ b/drivers/mmc/host/tmio_mmc_core.c
> @@ -220,6 +220,7 @@ static void tmio_mmc_reset_work(struct work_struct *work)
>         spin_unlock_irqrestore(&host->lock, flags);
>
>         tmio_mmc_reset(host);
> +       mmc_hw_reset(host->mmc);

This isn't how mmc_hw_reset() is intended to be used. Instead, the
idea is that it should be called by upper layer code, when some error
path is triggered for an I/O request.

However, let me think a bit about this.

>
>         /* Ready for new calls */
>         host->mrq = NULL;
> --
> 2.20.1
>

Kind regards
Uffe
Wolfram Sang Aug. 30, 2020, 1:03 p.m. UTC | #3
> This isn't how mmc_hw_reset() is intended to be used. Instead, the
> idea is that it should be called by upper layer code, when some error
> path is triggered for an I/O request.

Hmm, there are some wireless drivers using it as well. I am confused, is
this considered "upper layer"?

drivers/net/wireless/ath/ath10k/sdio.c: ret = mmc_hw_reset(ar_sdio->func->card->host);
drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c:        mmc_hw_reset(sdiodev->func1->card->host);
drivers/net/wireless/marvell/mwifiex/sdio.c:    ret = mmc_hw_reset(func->card->host);
drivers/net/wireless/ti/wlcore/sdio.c:  mmc_hw_reset(card->host);

I'd like to understand, so I can add some docs. Because the intended use
is nowhere documented to the best of my knowledge.

> However, let me think a bit about this.

Sure, thanks for the help!
Ulf Hansson Sept. 9, 2020, 11:24 a.m. UTC | #4
On Sun, 30 Aug 2020 at 15:04, Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:
>
>
> > This isn't how mmc_hw_reset() is intended to be used. Instead, the
> > idea is that it should be called by upper layer code, when some error
> > path is triggered for an I/O request.
>
> Hmm, there are some wireless drivers using it as well. I am confused, is
> this considered "upper layer"?
>
> drivers/net/wireless/ath/ath10k/sdio.c: ret = mmc_hw_reset(ar_sdio->func->card->host);
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c:        mmc_hw_reset(sdiodev->func1->card->host);
> drivers/net/wireless/marvell/mwifiex/sdio.c:    ret = mmc_hw_reset(func->card->host);
> drivers/net/wireless/ti/wlcore/sdio.c:  mmc_hw_reset(card->host);

Correct, these are "upper layers". The same applies for the mmc block
device driver.

In this way there is a guarantee that the struct mmc_card is still present.

>
> I'd like to understand, so I can add some docs. Because the intended use
> is nowhere documented to the best of my knowledge.

That would be great. I appreciate all kinds of improvements on the doc parts.

>
> > However, let me think a bit about this.
>
> Sure, thanks for the help!

Thinking more about this.

Perhaps a better option is to return a specific error code for the
last request, that makes the core run mmc_hw_reset(). Or potentially,
add a host cap and let the core treat some error code, specifically
for hosts like tmio.

Kind regards
Uffe
Wolfram Sang Sept. 9, 2020, 11:37 a.m. UTC | #5
Hi Ulf,

> > Hmm, there are some wireless drivers using it as well. I am confused, is
> > this considered "upper layer"?
> >
> > drivers/net/wireless/ath/ath10k/sdio.c: ret = mmc_hw_reset(ar_sdio->func->card->host);
> > drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c:        mmc_hw_reset(sdiodev->func1->card->host);
> > drivers/net/wireless/marvell/mwifiex/sdio.c:    ret = mmc_hw_reset(func->card->host);
> > drivers/net/wireless/ti/wlcore/sdio.c:  mmc_hw_reset(card->host);
> 
> Correct, these are "upper layers". The same applies for the mmc block
> device driver.
> 
> In this way there is a guarantee that the struct mmc_card is still present.

Ah, now I get it. "upper layers" as in consumers. And because consumers
sit on a card, this guarantees that mmc_card is still there. Correct?

> That would be great. I appreciate all kinds of improvements on the doc parts.

You are welcome!

> Perhaps a better option is to return a specific error code for the
> last request, that makes the core run mmc_hw_reset(). Or potentially,
> add a host cap and let the core treat some error code, specifically
> for hosts like tmio.

A specific errno could work. I don't see the advantage of a CAP (besides
it is rather a quirk than a cap). We could also have
'mmc_controller_card_reset()' or something which ensures mmc_card is
present and let that controllers call when they see fit. Or?

Thanks for your help,

   Wolfram
Ulf Hansson Sept. 9, 2020, 12:45 p.m. UTC | #6
On Wed, 9 Sep 2020 at 13:37, Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:
>
> Hi Ulf,
>
> > > Hmm, there are some wireless drivers using it as well. I am confused, is
> > > this considered "upper layer"?
> > >
> > > drivers/net/wireless/ath/ath10k/sdio.c: ret = mmc_hw_reset(ar_sdio->func->card->host);
> > > drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c:        mmc_hw_reset(sdiodev->func1->card->host);
> > > drivers/net/wireless/marvell/mwifiex/sdio.c:    ret = mmc_hw_reset(func->card->host);
> > > drivers/net/wireless/ti/wlcore/sdio.c:  mmc_hw_reset(card->host);
> >
> > Correct, these are "upper layers". The same applies for the mmc block
> > device driver.
> >
> > In this way there is a guarantee that the struct mmc_card is still present.
>
> Ah, now I get it. "upper layers" as in consumers. And because consumers
> sit on a card, this guarantees that mmc_card is still there. Correct?

Yes.

>
> > That would be great. I appreciate all kinds of improvements on the doc parts.
>
> You are welcome!
>
> > Perhaps a better option is to return a specific error code for the
> > last request, that makes the core run mmc_hw_reset(). Or potentially,
> > add a host cap and let the core treat some error code, specifically
> > for hosts like tmio.
>
> A specific errno could work. I don't see the advantage of a CAP (besides
> it is rather a quirk than a cap). We could also have
> 'mmc_controller_card_reset()' or something which ensures mmc_card is
> present and let that controllers call when they see fit. Or?

Maybe something like "mmc_controller_card_reset" could work, but it's
not going to be that straight forward. In the end, we depend on the
context for when the host driver would call such a function. In some
cases it must call mmc_claim_host() while in others it shouldn't.

BTW, I see that tmio_mmc_reset() is called at
tmio_mmc_host_runtime_resume(). This seems to work fine without having
to make a full reset of the card. Why can't you do something similar
to that instead?

Kind regards
Uffe
Wolfram Sang Sept. 15, 2020, 10:05 a.m. UTC | #7
> > Ah, now I get it. "upper layers" as in consumers. And because consumers
> > sit on a card, this guarantees that mmc_card is still there. Correct?
> 
> Yes.

Good, I'll prepare a patch, hopefully in the next days.

> Maybe something like "mmc_controller_card_reset" could work, but it's
> not going to be that straight forward. In the end, we depend on the
> context for when the host driver would call such a function. In some
> cases it must call mmc_claim_host() while in others it shouldn't.

I see. It seems we should try to handle it locally in the driver then.

> BTW, I see that tmio_mmc_reset() is called at
> tmio_mmc_host_runtime_resume(). This seems to work fine without having
> to make a full reset of the card. Why can't you do something similar
> to that instead?

Good question. I'll investigate that. I am a bit afraid that it neither
works and only RPM never kicked in because of a workaround. But I need
to prove that, maybe it is something else...

Thanks for the help, Ulf!
diff mbox series

Patch

diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index ab910043808f..0d64308c619f 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -220,6 +220,7 @@  static void tmio_mmc_reset_work(struct work_struct *work)
 	spin_unlock_irqrestore(&host->lock, flags);
 
 	tmio_mmc_reset(host);
+	mmc_hw_reset(host->mmc);
 
 	/* Ready for new calls */
 	host->mrq = NULL;