Message ID | 20241031070232.1793078-9-masterr3c0rd@epochal.quest (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | sunxi: A100/A133 second stage support | expand |
On Thu, 31 Oct 2024 04:02:21 -0300 Cody Eksal <masterr3c0rd@epochal.quest> wrote: > During testing, it was noted that MMC would fail to initialize, with > "mmc: fatal err update clk timeout" being printed in the log. It was > found that CLK_SET_RATE_NO_REPARENT was set on the MMC controllers, and > that removing this allows MMC to initialize. Therefore, remove > CLK_SET_RATE_NO_REPARENT from mmc0/1/2. Well, while this change indeed prevented that error message you mentioned, but the SD card still doesn't work for me: it probes and I can mount a filesystem on it, but then it hangs, for instance when running an "ls" on it. It could be my setup (lacking DT or device issue or missing kernel config), though, and the eMMC works for me this way, but it would be good to have that sorted. Also it would be good to know why CLK_SET_RATE_NO_REPARENT was put there in the first place: I don't see it in any other MMC clocks in sunxi-ng, so it wasn't just copied&pasted. So was there a problem that this flag was supposed to fix? Is that something that only applied to older kernels (back when the MMC patches were first posted), and which has now been fixed/changed elsewhere? I feel a bit uneasy of just removing this just because it works(TM), especially if it doesn't really (SD card for me, for instance). Cheers, Andre > Signed-off-by: Cody Eksal <masterr3c0rd@epochal.quest> > --- > drivers/clk/sunxi-ng/ccu-sun50i-a100.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a100.c b/drivers/clk/sunxi-ng/ccu-sun50i-a100.c > index bbaa82978716..a59e420b195d 100644 > --- a/drivers/clk/sunxi-ng/ccu-sun50i-a100.c > +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a100.c > @@ -436,7 +436,7 @@ static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0", mmc_parents, 0x830, > 24, 2, /* mux */ > BIT(31), /* gate */ > 2, /* post-div */ > - CLK_SET_RATE_NO_REPARENT); > + 0); > > static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1", mmc_parents, 0x834, > 0, 4, /* M */ > @@ -444,7 +444,7 @@ static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1", mmc_parents, 0x834, > 24, 2, /* mux */ > BIT(31), /* gate */ > 2, /* post-div */ > - CLK_SET_RATE_NO_REPARENT); > + 0); > > static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc2_clk, "mmc2", mmc_parents, 0x838, > 0, 4, /* M */ > @@ -452,7 +452,7 @@ static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc2_clk, "mmc2", mmc_parents, 0x838, > 24, 2, /* mux */ > BIT(31), /* gate */ > 2, /* post-div */ > - CLK_SET_RATE_NO_REPARENT); > + 0); > > static SUNXI_CCU_GATE(bus_mmc0_clk, "bus-mmc0", "ahb3", 0x84c, BIT(0), 0); > static SUNXI_CCU_GATE(bus_mmc1_clk, "bus-mmc1", "ahb3", 0x84c, BIT(1), 0);
On 2024/10/31 9:08 am, Andre Przywara wrote: > Well, while this change indeed prevented that error message you mentioned, > but the SD card still doesn't work for me: it probes and I can mount a > filesystem on it, but then it hangs, for instance when running an "ls" on > it. It could be my setup (lacking DT or device issue or missing kernel > config), though, and the eMMC works for me this way, but it would be good > to have that sorted. I'm investigating this now; it appears mmc2/eMMC is more consistent when CLK_NO_REPARENT is set > Also it would be good to know why CLK_SET_RATE_NO_REPARENT was put there > in the first place: I don't see it in any other MMC clocks in sunxi-ng, so > it wasn't just copied&pasted. Seeing that mmc2 acts better with the flag, perhaps it was copy + pasted from that config. Or perhaps the issues we're running into comes from elsewhere in the chain. At the moment, that's only speculation, though; I'm waiting on a device that has an SD card slot so I can perform more testing myself and debug these issues. > So was there a problem that this flag was supposed to fix? Is that > something that only applied to older kernels (back when the MMC patches > were first posted), and which has now been fixed/changed elsewhere? Yangtao Li/Frank Lee assumably no longer works at Allwinner, as the email he used to submit this originally no longer exists, but I believe the same Yangtao is now a maintainer of the Allwinner cpufreq subsystem, and is CC'd on these patches. I'm sending this reply to him as well; perhaps he may have some additional insight. > I feel a bit uneasy of just removing this just because it works(TM), > especially if it doesn't really (SD card for me, for instance). I agree; I was quickly preparing V2 to hopefully get this in before the 6.13 window for the sunxi tree closed, and added this in last minute after verifying it worked on my current device, which lacks an SD card slot. This patch can be skipped for now, as it's apparent MMC0/1 require a little more love before we can merge it in. I'll submit new patches in the future once this is figured out. Thanks! - Cody > Cheers, > Andre > >> Signed-off-by: Cody Eksal <masterr3c0rd@epochal.quest> >> --- >> drivers/clk/sunxi-ng/ccu-sun50i-a100.c | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-)
On Sat, 02 Nov 2024 18:44:41 -0300 Cody Eksal <masterr3c0rd@epochal.quest> wrote: Hi Cody, thanks for staying on this issue! > On 2024/10/31 9:08 am, Andre Przywara wrote: > > Well, while this change indeed prevented that error message you mentioned, > > but the SD card still doesn't work for me: it probes and I can mount a > > filesystem on it, but then it hangs, for instance when running an "ls" on > > it. It could be my setup (lacking DT or device issue or missing kernel > > config), though, and the eMMC works for me this way, but it would be good > > to have that sorted. > I'm investigating this now; it appears mmc2/eMMC is more consistent when > CLK_NO_REPARENT is set What do you mean with "more consistent", exactly? I still don't get why NO_REPARENT would help here in the first place: we have three clocks as potential parents: OSC24MHz, PLL_PERIPH0, PLL_PERIPH1. The first one is too slow for typical MMC rates, and PERIPH1 is typically disabled (it's the same rates as PERIPH0, so there is little need for it). So PERIPH0 is to clock to go, and I don't see what NO_REPARENT would change here. So those are my observations: With NO_REPARENT (current mainline): - SD card fails to probe: sunxi-mmc 4020000.mmc: fatal err update clk timeout - SD card is still parented to PERIPH0-2x (probably because U-Boot set that up), but uses a divider of 256 for a clock rate of 4687500 Hz. This probably leads to the failures. - eMMC works, but is parented to the 24MHz OSC, probably because U-Boot did not touch it. The clock rate is 12MHz, the read speed is 10MB/s. With removing NO_REPARENT, so with this patch: - SD cards probes, I can mount a VFAT fs on it, and sometimes "ls" that, but it hangs soon afterwards, for instance when trying to benchmark it. - SD clock is set up correctly: parent is PLL_PERIPH0-2x, rate is 50 MHz, correct for High Speed@4bit and its 25MB/s bus speed. - eMMC works fine, clock parent is PLL-PERIPH0-2x, rate is 100 MHz, correct for HS-200 (100 MHz * 8 bit * 2(DDR)). The read speed is 72MB/s, which sounds alright, and might be a limitation of the flash chip. So NO_REPARENT is always worse for me. > > Also it would be good to know why CLK_SET_RATE_NO_REPARENT was put there > > in the first place: I don't see it in any other MMC clocks in sunxi-ng, so > > it wasn't just copied&pasted. > Seeing that mmc2 acts better with the flag, perhaps it was copy + pasted > from that config. Or perhaps the issues we're running into comes from > elsewhere in the chain. At the moment, that's only speculation, though; > I'm waiting on a device that has an SD card slot so I can perform more > testing myself and debug these issues. > > > So was there a problem that this flag was supposed to fix? Is that > > something that only applied to older kernels (back when the MMC patches > > were first posted), and which has now been fixed/changed elsewhere? > Yangtao Li/Frank Lee assumably no longer works at Allwinner, as the email > he used to submit this originally no longer exists, but I believe the same > Yangtao is now a maintainer of the Allwinner cpufreq subsystem, and is > CC'd on these patches. I'm sending this reply to him as well; perhaps he > may have some additional insight. > > > I feel a bit uneasy of just removing this just because it works(TM), > > especially if it doesn't really (SD card for me, for instance). > I agree; I was quickly preparing V2 to hopefully get this in before the > 6.13 window for the sunxi tree closed, and added this in last minute after > verifying it worked on my current device, which lacks an SD card slot. > > This patch can be skipped for now, as it's apparent MMC0/1 require a little > more love before we can merge it in. I'll submit new patches in the future > once this is figured out. This patch would be a fix anyway (with a Fixes: tag), so we can push it still into 6.13, after -rc1, and it would be backported. So it's not as critical, timing-wise. Cheers, Andre > > Thanks! > - Cody > > > Cheers, > > Andre > > > >> Signed-off-by: Cody Eksal <masterr3c0rd@epochal.quest> > >> --- > >> drivers/clk/sunxi-ng/ccu-sun50i-a100.c | 6 +++--- > >> 1 file changed, 3 insertions(+), 3 deletions(-) >
On Sun, 3 Nov 2024 02:09:29 +0000 Andre Przywara <andre.przywara@arm.com> wrote: Hi, > On Sat, 02 Nov 2024 18:44:41 -0300 > Cody Eksal <masterr3c0rd@epochal.quest> wrote: > > Hi Cody, > > thanks for staying on this issue! > > > On 2024/10/31 9:08 am, Andre Przywara wrote: > > > Well, while this change indeed prevented that error message you mentioned, > > > but the SD card still doesn't work for me: it probes and I can mount a > > > filesystem on it, but then it hangs, for instance when running an "ls" on > > > it. It could be my setup (lacking DT or device issue or missing kernel > > > config), though, and the eMMC works for me this way, but it would be good > > > to have that sorted. > > I'm investigating this now; it appears mmc2/eMMC is more consistent when > > CLK_NO_REPARENT is set > > What do you mean with "more consistent", exactly? > I still don't get why NO_REPARENT would help here in the first place: > we have three clocks as potential parents: OSC24MHz, PLL_PERIPH0, > PLL_PERIPH1. The first one is too slow for typical MMC rates, and > PERIPH1 is typically disabled (it's the same rates as PERIPH0, so > there is little need for it). So PERIPH0 is to clock to go, and I don't > see what NO_REPARENT would change here. > > So those are my observations: > With NO_REPARENT (current mainline): > - SD card fails to probe: > sunxi-mmc 4020000.mmc: fatal err update clk timeout > - SD card is still parented to PERIPH0-2x (probably because U-Boot set > that up), but uses a divider of 256 for a clock rate of 4687500 Hz. > This probably leads to the failures. > - eMMC works, but is parented to the 24MHz OSC, probably because U-Boot > did not touch it. The clock rate is 12MHz, the read speed is 10MB/s. > With removing NO_REPARENT, so with this patch: > - SD cards probes, I can mount a VFAT fs on it, and sometimes "ls" > that, but it hangs soon afterwards, for instance when trying to > benchmark it. It turns out that this it due to a wrong DMA block size description in the MMC driver: the A100/A133 only supports 8K blocks, not 64K as currently advertised there. Patch for that here: [1] With that patch I see the SD card fully working, and at the correct speed, so this patch here is: Reviewed-by: Andre Przywara <andre.przywara@arm.com> Please add a Fixes tag and Cc: stable to the tags, like this: Fixes: fb038ce4db55 ("clk: sunxi-ng: add support for the Allwinner A100 CCU") Cc: stable@vger.kernel.org And make sure to also Cc: that address when sending that out. Or Chen-Yu adds the tags while committing, and we send this separately to stable? And maybe we can change the commit message to be a bit less vague, and just say while it's unknown why the flag was there in the first place, it doesn't make any sense and severely limits MMC transfer speeds. Cheers, Andre [1] https://lore.kernel.org/linux-sunxi/20241107014240.24669-1-andre.przywara@arm.com/T/#u > - SD clock is set up correctly: parent is PLL_PERIPH0-2x, rate is 50 > MHz, correct for High Speed@4bit and its 25MB/s bus speed. > - eMMC works fine, clock parent is PLL-PERIPH0-2x, rate is 100 MHz, > correct for HS-200 (100 MHz * 8 bit * 2(DDR)). The read speed is > 72MB/s, which sounds alright, and might be a limitation of the flash > chip. > > So NO_REPARENT is always worse for me. > > > > Also it would be good to know why CLK_SET_RATE_NO_REPARENT was put there > > > in the first place: I don't see it in any other MMC clocks in sunxi-ng, so > > > it wasn't just copied&pasted. > > Seeing that mmc2 acts better with the flag, perhaps it was copy + pasted > > from that config. Or perhaps the issues we're running into comes from > > elsewhere in the chain. At the moment, that's only speculation, though; > > I'm waiting on a device that has an SD card slot so I can perform more > > testing myself and debug these issues. > > > > > So was there a problem that this flag was supposed to fix? Is that > > > something that only applied to older kernels (back when the MMC patches > > > were first posted), and which has now been fixed/changed elsewhere? > > Yangtao Li/Frank Lee assumably no longer works at Allwinner, as the email > > he used to submit this originally no longer exists, but I believe the same > > Yangtao is now a maintainer of the Allwinner cpufreq subsystem, and is > > CC'd on these patches. I'm sending this reply to him as well; perhaps he > > may have some additional insight. > > > > > I feel a bit uneasy of just removing this just because it works(TM), > > > especially if it doesn't really (SD card for me, for instance). > > I agree; I was quickly preparing V2 to hopefully get this in before the > > 6.13 window for the sunxi tree closed, and added this in last minute after > > verifying it worked on my current device, which lacks an SD card slot. > > > > This patch can be skipped for now, as it's apparent MMC0/1 require a little > > more love before we can merge it in. I'll submit new patches in the future > > once this is figured out. > > This patch would be a fix anyway (with a Fixes: tag), so we can push it > still into 6.13, after -rc1, and it would be backported. So it's not as > critical, timing-wise. > > Cheers, > Andre > > > > > Thanks! > > - Cody > > > > > Cheers, > > > Andre > > > > > >> Signed-off-by: Cody Eksal <masterr3c0rd@epochal.quest> > > >> --- > > >> drivers/clk/sunxi-ng/ccu-sun50i-a100.c | 6 +++--- > > >> 1 file changed, 3 insertions(+), 3 deletions(-) > > > >
diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a100.c b/drivers/clk/sunxi-ng/ccu-sun50i-a100.c index bbaa82978716..a59e420b195d 100644 --- a/drivers/clk/sunxi-ng/ccu-sun50i-a100.c +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a100.c @@ -436,7 +436,7 @@ static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0", mmc_parents, 0x830, 24, 2, /* mux */ BIT(31), /* gate */ 2, /* post-div */ - CLK_SET_RATE_NO_REPARENT); + 0); static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1", mmc_parents, 0x834, 0, 4, /* M */ @@ -444,7 +444,7 @@ static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1", mmc_parents, 0x834, 24, 2, /* mux */ BIT(31), /* gate */ 2, /* post-div */ - CLK_SET_RATE_NO_REPARENT); + 0); static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc2_clk, "mmc2", mmc_parents, 0x838, 0, 4, /* M */ @@ -452,7 +452,7 @@ static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc2_clk, "mmc2", mmc_parents, 0x838, 24, 2, /* mux */ BIT(31), /* gate */ 2, /* post-div */ - CLK_SET_RATE_NO_REPARENT); + 0); static SUNXI_CCU_GATE(bus_mmc0_clk, "bus-mmc0", "ahb3", 0x84c, BIT(0), 0); static SUNXI_CCU_GATE(bus_mmc1_clk, "bus-mmc1", "ahb3", 0x84c, BIT(1), 0);
During testing, it was noted that MMC would fail to initialize, with "mmc: fatal err update clk timeout" being printed in the log. It was found that CLK_SET_RATE_NO_REPARENT was set on the MMC controllers, and that removing this allows MMC to initialize. Therefore, remove CLK_SET_RATE_NO_REPARENT from mmc0/1/2. Signed-off-by: Cody Eksal <masterr3c0rd@epochal.quest> --- drivers/clk/sunxi-ng/ccu-sun50i-a100.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)