[1/2] arm64: dts: rockchip: remove supports-cqe from rk3588 jaguar

Message ID 20250219093303.2320517-1-heiko@sntech.de (mailing list archive)
State New

Commit Message

Heiko Stübner Feb. 19, 2025, 9:33 a.m. UTC
From: Heiko Stuebner <heiko.stuebner@cherry.de>

The sdhci controller supports cqe it seems and necessary code also is in
place - in theory.

At this point Jaguar and Tiger are the only boards enabling cqe support
on the rk3588 and we are seeing reliability issues under load.

This can be caused by either a controller-, hw- or driver-issue and
definitely needs more investigation to work properly it seems.

So disable cqe support on Jaguar for now.

Fixes: d1b8b36a2cc5 ("arm64: dts: rockchip: add Theobroma Jaguar SBC")
Signed-off-by: Heiko Stuebner <heiko.stuebner@cherry.de>
---
 arch/arm64/boot/dts/rockchip/rk3588-jaguar.dts | 1 -
 1 file changed, 1 deletion(-)

Comments

Quentin Schulz Feb. 19, 2025, 4:06 p.m. UTC | #1
Hi Heiko,

On 2/19/25 10:33 AM, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@cherry.de>
> 
> The sdhci controller supports cqe it seems and necessary code also is in
> place - in theory.
> 
> At this point Jaguar and Tiger are the only boards enabling cqe support
> on the rk3588 and we are seeing reliability issues under load.
> 
> This can be caused by either a controller-, hw- or driver-issue and
> definitely needs more investigation to work properly it seems.
> 
> So disable cqe support on Jaguar for now.
> 

Seems more reasonable to me for the time being.

Aside from the reliability issues, I could also trigger a stack trace with:

$ mmc rpmb read-counter /dev/mmcblk0rpmb
[ 1119.647435] mmc0: Timeout waiting for hardware interrupt.
[ 1119.653480] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 1119.660676] mmc0: sdhci: Sys addr:  0x00000001 | Version:  0x00000005
[ 1119.667871] mmc0: sdhci: Blk size:  0x00007200 | Blk cnt:  0x00000000
[ 1119.675066] mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x0000002b
[ 1119.682261] mmc0: sdhci: Present:   0x03f701f6 | Host ctl: 0x00000035
[ 1119.689455] mmc0: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
[ 1119.696649] mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x00000407
[ 1119.703845] mmc0: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
[ 1119.711039] mmc0: sdhci: Int enab:  0x03ff000b | Sig enab: 0x03ff000b
[ 1119.718235] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
[ 1119.725429] mmc0: sdhci: Caps:      0x226dc881 | Caps_1:   0x08000007
[ 1119.732624] mmc0: sdhci: Cmd:       0x0000193a | Max curr: 0x00000000
[ 1119.739819] mmc0: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x00000000
[ 1119.747014] mmc0: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x000007d9
[ 1119.754209] mmc0: sdhci: Host ctl2: 0x0000000f
[ 1119.759169] mmc0: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x0057b200
[ 1119.766363] mmc0: sdhci: ============================================
[ 1119.773595] sdhci-dwcmshc fe2e0000.mmc: __mmc_blk_ioctl_cmd: data error -110

FWIW, the changes that Rockchip seems to have done on top of that driver 
in their 6.1 vendor fork are the following commits:

https://git.theobroma-systems.com/jaguar-linux.git/commit/drivers/mmc/host/sdhci-of-dwcmshc.c?id=2ef0767967138d333360ec0f399f1d68646741c3&h=linux-6.1-stan-rkr3.2-jaguar
https://git.theobroma-systems.com/jaguar-linux.git/commit/drivers/mmc/host/sdhci-of-dwcmshc.c?id=75dfde714bbe81e938190142d07307fa864fda34&h=linux-6.1-stan-rkr3.2-jaguar

Maybe something worth having a look at some time in the future.

Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de>

Thanks!
Quentin
Heiko Stübner Feb. 19, 2025, 4:56 p.m. UTC | #2
On Wednesday, 19 February 2025, 17:06:52 CET, Quentin Schulz wrote:
> Hi Heiko,
> 
> On 2/19/25 10:33 AM, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@cherry.de>
> > 
> > The sdhci controller supports cqe it seems and necessary code also is in
> > place - in theory.
> > 
> > At this point Jaguar and Tiger are the only boards enabling cqe support
> > on the rk3588 and we are seeing reliability issues under load.
> > 
> > This can be caused by either a controller-, hw- or driver-issue and
> > definitely needs more investigation to work properly it seems.
> > 
> > So disable cqe support on Jaguar for now.
> > 
> 
> Seems more reasonable to me for the time being.
> 
> Aside from the reliability issues, I could also trigger a stack trace with:
> 
> $ mmc rpmb read-counter /dev/mmcblk0rpmb
> [ 1119.647435] mmc0: Timeout waiting for hardware interrupt.
> [ 1119.653480] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> [ 1119.660676] mmc0: sdhci: Sys addr:  0x00000001 | Version:  0x00000005
> [ 1119.667871] mmc0: sdhci: Blk size:  0x00007200 | Blk cnt:  0x00000000
> [ 1119.675066] mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x0000002b
> [ 1119.682261] mmc0: sdhci: Present:   0x03f701f6 | Host ctl: 0x00000035
> [ 1119.689455] mmc0: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
> [ 1119.696649] mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x00000407
> [ 1119.703845] mmc0: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
> [ 1119.711039] mmc0: sdhci: Int enab:  0x03ff000b | Sig enab: 0x03ff000b
> [ 1119.718235] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
> [ 1119.725429] mmc0: sdhci: Caps:      0x226dc881 | Caps_1:   0x08000007
> [ 1119.732624] mmc0: sdhci: Cmd:       0x0000193a | Max curr: 0x00000000
> [ 1119.739819] mmc0: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x00000000
> [ 1119.747014] mmc0: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x000007d9
> [ 1119.754209] mmc0: sdhci: Host ctl2: 0x0000000f
> [ 1119.759169] mmc0: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x0057b200
> [ 1119.766363] mmc0: sdhci: ============================================
> [ 1119.773595] sdhci-dwcmshc fe2e0000.mmc: __mmc_blk_ioctl_cmd: data error -110

I can reproduce this timeout with CQE enabled.

With CQE disabled, the timeout goes away and the command returns the
expected response.


> FWIW, the changes that Rockchip seems to have done on top of that driver 
> in their 6.1 vendor fork are the following commits:
> 
> https://git.theobroma-systems.com/jaguar-linux.git/commit/drivers/mmc/host/sdhci-of-dwcmshc.c?id=2ef0767967138d333360ec0f399f1d68646741c3&h=linux-6.1-stan-rkr3.2-jaguar
> https://git.theobroma-systems.com/jaguar-linux.git/commit/drivers/mmc/host/sdhci-of-dwcmshc.c?id=75dfde714bbe81e938190142d07307fa864fda34&h=linux-6.1-stan-rkr3.2-jaguar
> 
> Maybe something worth having a look at some time in the future.
> 
> Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de>
> 
> Thanks!
> Quentin
>
Heiko Stübner Feb. 20, 2025, 9:31 a.m. UTC | #3
On Wed, 19 Feb 2025 10:33:02 +0100, Heiko Stuebner wrote:
> The sdhci controller supports cqe it seems and necessary code also is in
> place - in theory.
> 
> At this point Jaguar and Tiger are the only boards enabling cqe support
> on the rk3588 and we are seeing reliability issues under load.
> 
> This can be caused by either a controller-, hw- or driver-issue and
> definitely needs more investigation to work properly it seems.
> 
> [...]

Applied, thanks!

[1/2] arm64: dts: rockchip: remove supports-cqe from rk3588 jaguar
      commit: 304b0a60d38dc24bfbfc9adc7d254d1cf8f98317
[2/2] arm64: dts: rockchip: remove supports-cqe from rk3588 tiger
      commit: 3e0711f89e5e7b0c7b2ab4843dc92dcbbdbba777

Best regards,

Patch

diff --git a/arch/arm64/boot/dts/rockchip/rk3588-jaguar.dts b/arch/arm64/boot/dts/rockchip/rk3588-jaguar.dts
index 7e8f0a452ca0..be5426b61cac 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-jaguar.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-jaguar.dts
@@ -613,7 +613,6 @@  &sdhci {
 	non-removable;
 	pinctrl-names = "default";
 	pinctrl-0 = <&emmc_bus8 &emmc_cmd &emmc_clk &emmc_data_strobe>;
-	supports-cqe;
 	vmmc-supply = <&vcc_3v3_s3>;
 	vqmmc-supply = <&vcc_1v8_s3>;
 	status = "okay";
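
To check on a running board whether the booted device tree still carries
the property this patch removes, something like the sketch below might
help. It is not part of the patch. It assumes the live device tree is
exposed under /sys/firmware/devicetree/base and that a boolean property
such as supports-cqe appears there as an empty file; the helper name
list_cqe_nodes and its optional root-directory argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every device tree node that still has the
# boolean "supports-cqe" property.
# $1 (optional) is the device tree root to search; it defaults to the
# sysfs view of the live tree (an assumption about the running system).
list_cqe_nodes() {
	dt_root="${1:-/sys/firmware/devicetree/base}"
	# A boolean DT property is exposed as an empty file named after it,
	# so the node is the directory that contains that file.
	find "$dt_root" -type f -name supports-cqe 2>/dev/null |
	while IFS= read -r prop; do
		dirname "$prop"
	done
}
```

Calling list_cqe_nodes without arguments on a kernel built with this patch
should print nothing for the eMMC controller, which the log above shows as
fe2e0000.mmc. Any path it does print names a node that still enables CQE.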