Message ID | 20250219093303.2320517-1-heiko@sntech.de (mailing list archive) |
---|---|
State | New |
Series | [1/2] arm64: dts: rockchip: remove supports-cqe from rk3588 jaguar |
Hi Heiko,

On 2/19/25 10:33 AM, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@cherry.de>
>
> The sdhci controller supports cqe it seems and necessary code also is in
> place - in theory.
>
> At this point Jaguar and Tiger are the only boards enabling cqe support
> on the rk3588 and we are seeing reliability issues under load.
>
> This can be caused by either a controller-, hw- or driver-issue and
> definitly needs more investigation to work properly it seems.
>
> So disable cqe support on Jaguar for now.
>

Seems more reasonable to me for the time being.

Aside from the reliability issues, I could also trigger a stack trace with:

$ mmc rpmb read-counter /dev/mmcblk0rpmb
[ 1119.647435] mmc0: Timeout waiting for hardware interrupt.
[ 1119.653480] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 1119.660676] mmc0: sdhci: Sys addr:  0x00000001 | Version:  0x00000005
[ 1119.667871] mmc0: sdhci: Blk size:  0x00007200 | Blk cnt:  0x00000000
[ 1119.675066] mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x0000002b
[ 1119.682261] mmc0: sdhci: Present:   0x03f701f6 | Host ctl: 0x00000035
[ 1119.689455] mmc0: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
[ 1119.696649] mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x00000407
[ 1119.703845] mmc0: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
[ 1119.711039] mmc0: sdhci: Int enab:  0x03ff000b | Sig enab: 0x03ff000b
[ 1119.718235] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
[ 1119.725429] mmc0: sdhci: Caps:      0x226dc881 | Caps_1:   0x08000007
[ 1119.732624] mmc0: sdhci: Cmd:       0x0000193a | Max curr: 0x00000000
[ 1119.739819] mmc0: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x00000000
[ 1119.747014] mmc0: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x000007d9
[ 1119.754209] mmc0: sdhci: Host ctl2: 0x0000000f
[ 1119.759169] mmc0: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x0057b200
[ 1119.766363] mmc0: sdhci: ============================================
[ 1119.773595] sdhci-dwcmshc fe2e0000.mmc: __mmc_blk_ioctl_cmd: data error -110

FWIW, the changes that Rockchip seems to have done on top of that driver
in their 6.1 vendor fork are the following commits:

https://git.theobroma-systems.com/jaguar-linux.git/commit/drivers/mmc/host/sdhci-of-dwcmshc.c?id=2ef0767967138d333360ec0f399f1d68646741c3&h=linux-6.1-stan-rkr3.2-jaguar
https://git.theobroma-systems.com/jaguar-linux.git/commit/drivers/mmc/host/sdhci-of-dwcmshc.c?id=75dfde714bbe81e938190142d07307fa864fda34&h=linux-6.1-stan-rkr3.2-jaguar

Maybe something worth having a look at some time in the future.

Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de>

Thanks!
Quentin
On Wednesday, 19 February 2025, 17:06:52 CET, Quentin Schulz wrote:
> Hi Heiko,
>
> On 2/19/25 10:33 AM, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@cherry.de>
> >
> > The sdhci controller supports cqe it seems and necessary code also is in
> > place - in theory.
> >
> > At this point Jaguar and Tiger are the only boards enabling cqe support
> > on the rk3588 and we are seeing reliability issues under load.
> >
> > This can be caused by either a controller-, hw- or driver-issue and
> > definitly needs more investigation to work properly it seems.
> >
> > So disable cqe support on Jaguar for now.
> >
>
> Seems more reasonable to me for the time being.
>
> Aside from the reliability issues, I could also trigger a stack trace with:
>
> $ mmc rpmb read-counter /dev/mmcblk0rpmb
> [ 1119.647435] mmc0: Timeout waiting for hardware interrupt.
> [ 1119.653480] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> [ 1119.660676] mmc0: sdhci: Sys addr:  0x00000001 | Version:  0x00000005
> [ 1119.667871] mmc0: sdhci: Blk size:  0x00007200 | Blk cnt:  0x00000000
> [ 1119.675066] mmc0: sdhci: Argument:  0x00000000 | Trn mode: 0x0000002b
> [ 1119.682261] mmc0: sdhci: Present:   0x03f701f6 | Host ctl: 0x00000035
> [ 1119.689455] mmc0: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
> [ 1119.696649] mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x00000407
> [ 1119.703845] mmc0: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
> [ 1119.711039] mmc0: sdhci: Int enab:  0x03ff000b | Sig enab: 0x03ff000b
> [ 1119.718235] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
> [ 1119.725429] mmc0: sdhci: Caps:      0x226dc881 | Caps_1:   0x08000007
> [ 1119.732624] mmc0: sdhci: Cmd:       0x0000193a | Max curr: 0x00000000
> [ 1119.739819] mmc0: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x00000000
> [ 1119.747014] mmc0: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x000007d9
> [ 1119.754209] mmc0: sdhci: Host ctl2: 0x0000000f
> [ 1119.759169] mmc0: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x0057b200
> [ 1119.766363] mmc0: sdhci: ============================================
> [ 1119.773595] sdhci-dwcmshc fe2e0000.mmc: __mmc_blk_ioctl_cmd: data error -110

I can reproduce this timeout with CQE enabled. After disabling CQE, the
timeout goes away and the command returns the regularly expected response.

> FWIW, the changes that Rockchip seems to have done on top of that driver
> in their 6.1 vendor fork are the following commits:
>
> https://git.theobroma-systems.com/jaguar-linux.git/commit/drivers/mmc/host/sdhci-of-dwcmshc.c?id=2ef0767967138d333360ec0f399f1d68646741c3&h=linux-6.1-stan-rkr3.2-jaguar
> https://git.theobroma-systems.com/jaguar-linux.git/commit/drivers/mmc/host/sdhci-of-dwcmshc.c?id=75dfde714bbe81e938190142d07307fa864fda34&h=linux-6.1-stan-rkr3.2-jaguar
>
> Maybe something worth having a look at some time in the future.
>
> Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de>
>
> Thanks!
> Quentin
>
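
For anyone comparing dumps with and without CQE, the register dump quoted above can be turned into a dictionary for easier diffing. This is only a rough sketch: the helper names `parse_sdhci_dump` and `decode_errno` are made up for illustration and are not part of any kernel or mmc-utils tooling, and the errno table only covers the one value seen in this thread (110 is ETIMEDOUT in the generic Linux errno numbering).

```python
import re

# Generic Linux errno numbering; only the value seen in this thread is listed.
LINUX_ERRNO = {110: "ETIMEDOUT"}

# Matches "Name: 0x<hex>" pairs such as "Sys addr:  0x00000001" or "Resp[0]: 0x...".
FIELD_RE = re.compile(r"([A-Za-z][A-Za-z0-9_ \-\[\]]*?):\s+(0x[0-9a-fA-F]+)")


def parse_sdhci_dump(log_text):
    """Return {register name: integer value} for all 'mmcN: sdhci:' lines."""
    regs = {}
    marker = "sdhci: "
    for line in log_text.splitlines():
        idx = line.find(marker)
        if idx < 0:
            continue  # not a register dump line
        payload = line[idx + len(marker):]
        for name, value in FIELD_RE.findall(payload):
            regs[name.strip()] = int(value, 16)
    return regs


def decode_errno(code):
    """Map a negative kernel return code to its symbolic name, if known here."""
    return LINUX_ERRNO.get(abs(code), "unknown")


# Three lines copied from the dump in this thread.
SAMPLE = """\
[ 1119.660676] mmc0: sdhci: Sys addr:  0x00000001 | Version:  0x00000005
[ 1119.703845] mmc0: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
[ 1119.711039] mmc0: sdhci: Int enab:  0x03ff000b | Sig enab: 0x03ff000b
"""

regs = parse_sdhci_dump(SAMPLE)
for name, value in regs.items():
    print(f"{name:10s} 0x{value:08x}")
print("data error -110 ->", decode_errno(-110))
```

In the sample, `Int stat` parses to 0, i.e. no interrupt status bits were set when the dump was taken, which at least fits the "Timeout waiting for hardware interrupt" message and the -110 (ETIMEDOUT) return code.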
On Wed, 19 Feb 2025 10:33:02 +0100, Heiko Stuebner wrote:
> The sdhci controller supports cqe it seems and necessary code also is in
> place - in theory.
>
> At this point Jaguar and Tiger are the only boards enabling cqe support
> on the rk3588 and we are seeing reliability issues under load.
>
> This can be caused by either a controller-, hw- or driver-issue and
> definitly needs more investigation to work properly it seems.
>
> [...]

Applied, thanks!

[1/2] arm64: dts: rockchip: remove supports-cqe from rk3588 jaguar
      commit: 304b0a60d38dc24bfbfc9adc7d254d1cf8f98317
[2/2] arm64: dts: rockchip: remove supports-cqe from rk3588 tiger
      commit: 3e0711f89e5e7b0c7b2ab4843dc92dcbbdbba777

Best regards,
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-jaguar.dts b/arch/arm64/boot/dts/rockchip/rk3588-jaguar.dts
index 7e8f0a452ca0..be5426b61cac 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-jaguar.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-jaguar.dts
@@ -613,7 +613,6 @@ &sdhci {
 	non-removable;
 	pinctrl-names = "default";
 	pinctrl-0 = <&emmc_bus8 &emmc_cmd &emmc_clk &emmc_data_strobe>;
-	supports-cqe;
 	vmmc-supply = <&vcc_3v3_s3>;
 	vqmmc-supply = <&vcc_1v8_s3>;
 	status = "okay";