Message ID | 20210621070009.13655-1-wsa+renesas@sang-engineering.com (mailing list archive) |
---|---|
State | New, archived |
Series | mmc: renesas_sdhi: sys_dmac: abort DMA synced to avoid timeouts |
Hi Wolfram,

Thanks for your work.

On 2021-06-21 09:00:09 +0200, Wolfram Sang wrote:
> When aborting DMA, we terminate the transfer without waiting for it to
> succeed. This may lead to races which can e.g. lead to timeout problems
> when tuning. Remove the deprecated dmaengine_terminate_all() function
> and use the explicit dmaengine_terminate_sync().
>
> Fixes: e3de2be7368d ("mmc: tmio_mmc: fix card eject during IO with DMA")
> Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> ---
>
> Geert, this fixes the issue you have seen on your Koelsch board on my
> Lager board. Can you test again with this patch please?

I'm not exactly sure what problem Geert is experiencing, but I unfortunately
have tuning problems on Koelsch. The problem is the same with and without
this patch, however.

I'm testing on top of v5.13-rc7 with and without this patch, and this is what
I experience.

# Insert card in SD0
[ 57.794238] mmc0: new ultra high speed SDR104 SDHC card at address aaaa
[ 57.801363] mmcblk0: mmc0:aaaa SL32G 29.7 GiB (ro)
[ 57.820427] GPT:partition_entry_array_crc32 values don't match: 0x9ad84b1 != 0xb110df4b
[ 57.828456] GPT:Primary header thinks Alt. header is not at the end of the disk.
[ 57.835901] GPT:11526300 != 62333951
[ 57.839484] GPT:Alternate GPT header not at the end of the disk.
[ 57.845514] GPT:11526300 != 62333951
[ 57.849093] GPT: Use GNU Parted to correct GPT errors.
[ 57.854306] mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12

# Eject card and insert it in SD1
[ 70.261657] mmc0: tuning execution failed: -5
[ 70.266377] mmc0: card aaaa removed
[ 77.769959] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 82.889951] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 88.009948] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD0)
[ 93.129966] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD8)
[ 98.249955] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 103.369944] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 108.489946] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 113.609921] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 118.729885] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 123.849848] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 128.969823] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 134.089817] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 139.209774] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD1)
[ 144.409755] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 149.529735] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 154.649720] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD0)
[ 159.769709] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD8)
[ 164.889693] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 170.009685] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 175.129729] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 180.249673] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 185.369656] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 190.489650] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 195.609654] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 200.729631] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 205.849630] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD1)
[ 211.049615] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 216.169621] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 221.289616] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD0)
[ 226.409611] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD8)
[ 231.529605] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 236.649600] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 241.769580] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 246.889543] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 252.009503] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
... timeout messages continue ...

The experience is the same if I directly insert the card in SD1 after a
reset; I only included the SD0 cycle to verify that the card itself is good.

I tested on v5.12, and there the card works but is identified as SDR50:

# Insert into SD1
[ 102.667405] mmc0: new ultra high speed SDR50 SDHC card at address aaaa
[ 102.676211] mmcblk0: mmc0:aaaa SL32G 29.7 GiB (ro)
[ 102.695241] GPT:partition_entry_array_crc32 values don't match: 0x9ad84b1 != 0xb110df4b
[ 102.703312] GPT:Primary header thinks Alt. header is not at the end of the disk.
[ 102.710754] GPT:11526300 != 62333951
[ 102.714335] GPT:Alternate GPT header not at the end of the disk.
[ 102.720360] GPT:11526300 != 62333951
[ 102.723937] GPT: Use GNU Parted to correct GPT errors.
[ 102.729158] mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12

Maybe more than one patch is needed to fix this? Should I rerun my test
on a different base?

>
> I noticed that Renesas driver are quite an active user of this
> deprecated dmaengine function. I will audit and improve the other
> drivers meanwhile.
>
>  drivers/mmc/host/renesas_sdhi_sys_dmac.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/mmc/host/renesas_sdhi_sys_dmac.c b/drivers/mmc/host/renesas_sdhi_sys_dmac.c
> index ffa64211f4de..6956b83469c8 100644
> --- a/drivers/mmc/host/renesas_sdhi_sys_dmac.c
> +++ b/drivers/mmc/host/renesas_sdhi_sys_dmac.c
> @@ -108,9 +108,9 @@ static void renesas_sdhi_sys_dmac_abort_dma(struct tmio_mmc_host *host)
>  	renesas_sdhi_sys_dmac_enable_dma(host, false);
>
>  	if (host->chan_rx)
> -		dmaengine_terminate_all(host->chan_rx);
> +		dmaengine_terminate_sync(host->chan_rx);
>  	if (host->chan_tx)
> -		dmaengine_terminate_all(host->chan_tx);
> +		dmaengine_terminate_sync(host->chan_tx);
>
>  	renesas_sdhi_sys_dmac_enable_dma(host, true);
>  }
> --
> 2.30.2
>

Hi Wolfram,

On Mon, Jun 21, 2021 at 9:00 AM Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:
> When aborting DMA, we terminate the transfer without waiting for it to
> succeed. This may lead to races which can e.g. lead to timeout problems
> when tuning. Remove the deprecated dmaengine_terminate_all() function
> and use the explicit dmaengine_terminate_sync().
>
> Fixes: e3de2be7368d ("mmc: tmio_mmc: fix card eject during IO with DMA")
> Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Thanks for your patch!

> Geert, this fixes the issue you have seen on your Koelsch board on my
> Lager board. Can you test again with this patch please?

Unfortunately it does not fix the issue with my SanDisk Extreme 32GB
A1 microSD card in the microSD adapter that came with the card
(Conrad 1553726):

[ 201.590669] sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
[ 206.710441] sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
[ 211.830215] sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
[ 216.949986] sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
[ 222.069747] sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)
[ 227.189509] sh_mobile_sdhi ee100000.mmc: timeout waiting for hardware interrupt (CMD19)

(switching slot)

[ 227.230383] mmc0: tuning execution failed: -5
[ 227.234812] mmc0: error -5 whilst initialising SD card
[ 238.628970] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 243.748755] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 248.868475] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD0)
[ 253.988234] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD8)
[ 259.107995] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 264.227746] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 269.347563] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 274.468165] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD5)
[ 278.558584] sh_mobile_sdhi ee160000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 279.588686] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 283.679075] sh_mobile_sdhi ee160000.mmc: timeout waiting for hardware interrupt (CMD52)
[ 284.709164] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 288.789522] sh_mobile_sdhi ee160000.mmc: timeout waiting for hardware interrupt (CMD0)
[ 289.829596] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)
[ 293.909915] sh_mobile_sdhi ee160000.mmc: timeout waiting for hardware interrupt (CMD8)
[ 294.949984] sh_mobile_sdhi ee140000.mmc: timeout waiting for hardware interrupt (CMD55)

The same card works before/after on Salvator-XS with R-Car H3ES2.0.

Tree is renesas-devel-2021-06-21-v5.13-rc7 without/with your patch.

Gr{oetje,eeting}s,

Geert

> > Geert, this fixes the issue you have seen on your Koelsch board on my
> > Lager board. Can you test again with this patch please?
>
> Unfortunately it does not fix the issue with my SanDisk Extreme 32GB
> A1 microSD card in the microSD adapter that came with the card
> (Conrad 1553726):

Pity. Interestingly, unlike yesterday, my card works fine with
SDHI0/SDR104 now even without my patch. It fails on SDHI2/SDR50 with a
timeout, despite no tuning.

Back to the drawing board, I guess...

While the patch does not fix the timeout issue (also verified using an
Alt board), it does fix this follow-up issue here:

[ 191.485194] ------------[ cut here ]------------
[ 191.490296] WARNING: CPU: 0 PID: 698 at kernel/dma/debug.c:498 add_dma_entry+0x158/0x180
[ 191.499397] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0117cc3a

So, I think we want this patch nonetheless but I need to rephrase the
commit message.

diff --git a/drivers/mmc/host/renesas_sdhi_sys_dmac.c b/drivers/mmc/host/renesas_sdhi_sys_dmac.c
index ffa64211f4de..6956b83469c8 100644
--- a/drivers/mmc/host/renesas_sdhi_sys_dmac.c
+++ b/drivers/mmc/host/renesas_sdhi_sys_dmac.c
@@ -108,9 +108,9 @@ static void renesas_sdhi_sys_dmac_abort_dma(struct tmio_mmc_host *host)
 	renesas_sdhi_sys_dmac_enable_dma(host, false);
 
 	if (host->chan_rx)
-		dmaengine_terminate_all(host->chan_rx);
+		dmaengine_terminate_sync(host->chan_rx);
 	if (host->chan_tx)
-		dmaengine_terminate_all(host->chan_tx);
+		dmaengine_terminate_sync(host->chan_tx);
 
 	renesas_sdhi_sys_dmac_enable_dma(host, true);
 }

When aborting DMA, we terminate the transfer without waiting for it to
succeed. This may lead to races which can e.g. lead to timeout problems
when tuning. Remove the deprecated dmaengine_terminate_all() function
and use the explicit dmaengine_terminate_sync().

Fixes: e3de2be7368d ("mmc: tmio_mmc: fix card eject during IO with DMA")
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---

Geert, this fixes the issue you have seen on your Koelsch board on my
Lager board. Can you test again with this patch please?

I noticed that Renesas driver are quite an active user of this
deprecated dmaengine function. I will audit and improve the other
drivers meanwhile.

 drivers/mmc/host/renesas_sdhi_sys_dmac.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
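
The difference the commit message describes, aborting without waiting versus aborting and waiting, can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct mock_chan` and every `mock_*` function are hypothetical names invented for this sketch, not the kernel's `struct dma_chan` or any dmaengine call, and the "completion callback on another CPU" is reduced to a single flag. The one thing it takes from the real API is the ordering: as far as I know, `dmaengine_terminate_sync()` behaves like an asynchronous terminate followed by a synchronize step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMA channel; NOT the kernel's struct dma_chan. */
struct mock_chan {
	bool transfer_queued;  /* a descriptor is still queued on the channel */
	bool callback_running; /* completion callback already started elsewhere */
};

/* Model a transfer whose completion callback has already begun to run. */
void mock_start(struct mock_chan *ch)
{
	assert(ch != NULL);
	ch->transfer_queued = true;
	ch->callback_running = true;
}

/* Asynchronous abort: drop the queued transfer, but do not wait. */
void mock_terminate_async(struct mock_chan *ch)
{
	assert(ch != NULL);
	ch->transfer_queued = false;
}

/* Synchronize: return only once no callback is running any more.
 * In this single-threaded model, "waiting" simply lets it finish. */
void mock_synchronize(struct mock_chan *ch)
{
	assert(ch != NULL);
	ch->callback_running = false;
}

/* Synchronous abort: terminate first, then wait for callbacks. */
void mock_terminate_sync(struct mock_chan *ch)
{
	mock_terminate_async(ch);
	mock_synchronize(ch);
}

/* True if the channel is quiescent and safe to reuse after an abort. */
bool mock_is_idle(const struct mock_chan *ch)
{
	assert(ch != NULL);
	return !ch->transfer_queued && !ch->callback_running;
}
```

In this model, a channel aborted with `mock_terminate_async()` alone is not idle when the abort returns, because the callback may still be touching driver state; after `mock_terminate_sync()` it is idle. The model says nothing about whether that window is the cause of the timeouts reported in this thread.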