Message ID | 20220926171002.62352-1-biju.das.jz@bp.renesas.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4] mmc: renesas_sdhi: Fix rounding errors | expand |
On Mon, Sep 26, 2022 at 06:10:02PM +0100, Biju Das wrote: > Due to clk rounding errors on RZ/G2L platforms, it selects a clock source > with a lower clock rate compared to a higher one. > For eg: The rounding error (533333333 Hz / 4 * 4 = 533333332 Hz < 5333333 > 33 Hz) selects a clk source of 400 MHz instead of 533.333333 MHz. > > This patch fixes this issue by adding a margin of (1/1024) higher to > the clock rate. > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> > Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Looks very good to me now! Thanks for keeping at it: Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Hi Biju, On Mon, Sep 26, 2022 at 7:10 PM Biju Das <biju.das.jz@bp.renesas.com> wrote: > Due to clk rounding errors on RZ/G2L platforms, it selects a clock source > with a lower clock rate compared to a higher one. > For eg: The rounding error (533333333 Hz / 4 * 4 = 533333332 Hz < 5333333 > 33 Hz) selects a clk source of 400 MHz instead of 533.333333 MHz. > > This patch fixes this issue by adding a margin of (1/1024) higher to > the clock rate. > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> > Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> > --- > v3->v4: > * Added Tested-by tag from Wolfram. > * Updated commit description and code comment with real example. Thanks for the update! Unfortunately this patch causes a change in the clock frequencies used on R-Car M2-W: -clk_summary: sd0 97500000 +clk_summary: sd0 32500000 -clk_summary: sdhi0 97500000 +clk_summary: sdhi0 32500000 -clk_summary: sd3 12786885 +clk_summary: sd3 12187500 -clk_summary: sdhi3 12786885 +clk_summary: sdhi3 12187500 -clk_summary: sd2 12786885 +clk_summary: sd2 12187500 -clk_summary: sdhi2 12786885 +clk_summary: sdhi2 12187500 > --- a/drivers/mmc/host/renesas_sdhi_core.c > +++ b/drivers/mmc/host/renesas_sdhi_core.c > @@ -153,10 +154,17 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host, > * greater than, new_clock. As we can divide by 1 << i for > * any i in [0, 9] we want the input clock to be as close as > * possible, but no greater than, new_clock << i. > + * > + * Add an upper limit of 1/1024 rate higher to the clock rate to fix > + * clk rate jumping to lower rate due to rounding error (eg: RZ/G2L has > + * 3 clk sources 533.333333 MHz, 400 MHz and 266.666666 MHz. The request > + * for 533.333333 MHz will selects a slower 400 MHz due to rounding > + * error (533333333 Hz / 4 * 4 = 533333332 Hz < 533333333 Hz)). > */ > + new_upper_limit = (new_clock << i) + ((new_clock << i) >> 10); Mea culpa: while new_clock is a constant inside the loop, i is not! So it should be moved back inside the loop below. With that change, R-Car M2-W is happy again, and I noticed no regression on R-Car H3 ES2.0. > for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) { > freq = clk_round_rate(ref_clk, new_clock << i); > - if (freq > (new_clock << i)) { > + if (freq > new_upper_limit) { > /* Too fast; look for a slightly slower option */ > freq = clk_round_rate(ref_clk, (new_clock << i) / 4 * 3); > if (freq > (new_clock << i)) ^^^^^^^^^^^^^^^^ Probably this should become new_upper_limit too, for consistency? It doesn't seem to matter in my testing, though. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Hi Geert, Thanks for the testing. > Subject: Re: [PATCH v4] mmc: renesas_sdhi: Fix rounding errors > > Hi Biju, > > On Mon, Sep 26, 2022 at 7:10 PM Biju Das <biju.das.jz@bp.renesas.com> > wrote: > > Due to clk rounding errors on RZ/G2L platforms, it selects a clock > > source with a lower clock rate compared to a higher one. > > For eg: The rounding error (533333333 Hz / 4 * 4 = 533333332 Hz < > > 5333333 > > 33 Hz) selects a clk source of 400 MHz instead of 533.333333 MHz. > > > > This patch fixes this issue by adding a margin of (1/1024) higher to > > the clock rate. > > > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> > > Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> > > --- > > v3->v4: > > * Added Tested-by tag from Wolfram. > > * Updated commit description and code comment with real example. > > Thanks for the update! > > Unfortunately this patch causes a change in the clock frequencies used > on R-Car M2-W: > > -clk_summary: sd0 97500000 > +clk_summary: sd0 32500000 > -clk_summary: sdhi0 97500000 > +clk_summary: sdhi0 32500000 > > -clk_summary: sd3 12786885 > +clk_summary: sd3 12187500 > -clk_summary: sdhi3 12786885 > +clk_summary: sdhi3 12187500 > > -clk_summary: sd2 12786885 > +clk_summary: sd2 12187500 > -clk_summary: sdhi2 12786885 > +clk_summary: sdhi2 12187500 That is not good. > > > --- a/drivers/mmc/host/renesas_sdhi_core.c > > +++ b/drivers/mmc/host/renesas_sdhi_core.c > > > @@ -153,10 +154,17 @@ static unsigned int > renesas_sdhi_clk_update(struct tmio_mmc_host *host, > > * greater than, new_clock. As we can divide by 1 << i for > > * any i in [0, 9] we want the input clock to be as close as > > * possible, but no greater than, new_clock << i. > > + * > > + * Add an upper limit of 1/1024 rate higher to the clock > rate to fix > > + * clk rate jumping to lower rate due to rounding error (eg: > RZ/G2L has > > + * 3 clk sources 533.333333 MHz, 400 MHz and 266.666666 MHz. > The request > > + * for 533.333333 MHz will selects a slower 400 MHz due to > rounding > > + * error (533333333 Hz / 4 * 4 = 533333332 Hz < 533333333 > Hz)). > > */ > > + new_upper_limit = (new_clock << i) + ((new_clock << i) >> > 10); > > Mea culpa: while new_clock is a constant inside the loop, i is not! > So it should be moved back inside the loop below. > With that change, R-Car M2-W is happy again, and I noticed no > regression on R-Car H3 ES2.0. OK. > > > for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) { > > freq = clk_round_rate(ref_clk, new_clock << i); > > - if (freq > (new_clock << i)) { > > + if (freq > new_upper_limit) { > > /* Too fast; look for a slightly slower > option */ > > freq = clk_round_rate(ref_clk, (new_clock << > i) / 4 * 3); > > if (freq > (new_clock << i)) > ^^^^^^^^^^^^^^^^ Probably this > should become new_upper_limit too, for consistency? > It doesn't seem to matter in my testing, though. OK. Will do the below change in next version. - new_upper_limit = (new_clock << i) + ((new_clock << i) >> 10); for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) { freq = clk_round_rate(ref_clk, new_clock << i); + new_upper_limit = (new_clock << i) + ((new_clock << i) >> 10); if (freq > new_upper_limit) { /* Too fast; look for a slightly slower option */ freq = clk_round_rate(ref_clk, (new_clock << i) / 4 * 3); - if (freq > (new_clock << i)) + if (freq > new_upper_limit) continue; } Cheers, Biju
Hi Geert, > Subject: RE: [PATCH v4] mmc: renesas_sdhi: Fix rounding errors > > Hi Geert, > > Thanks for the testing. > > > Subject: Re: [PATCH v4] mmc: renesas_sdhi: Fix rounding errors > > > > Hi Biju, > > > > On Mon, Sep 26, 2022 at 7:10 PM Biju Das > <biju.das.jz@bp.renesas.com> > > wrote: > > > Due to clk rounding errors on RZ/G2L platforms, it selects a clock > > > source with a lower clock rate compared to a higher one. > > > For eg: The rounding error (533333333 Hz / 4 * 4 = 533333332 Hz < > > > 5333333 > > > 33 Hz) selects a clk source of 400 MHz instead of 533.333333 MHz. > > > > > > This patch fixes this issue by adding a margin of (1/1024) higher > to > > > the clock rate. > > > > > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> > > > Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> > > > --- > > > v3->v4: > > > * Added Tested-by tag from Wolfram. > > > * Updated commit description and code comment with real example. > > > > Thanks for the update! > > > > Unfortunately this patch causes a change in the clock frequencies > used > > on R-Car M2-W: > > > > -clk_summary: sd0 97500000 > > +clk_summary: sd0 32500000 > > -clk_summary: sdhi0 97500000 > > +clk_summary: sdhi0 32500000 > > > > -clk_summary: sd3 12786885 > > +clk_summary: sd3 12187500 > > -clk_summary: sdhi3 12786885 > > +clk_summary: sdhi3 12187500 > > > > -clk_summary: sd2 12786885 > > +clk_summary: sd2 12187500 > > -clk_summary: sdhi2 12786885 > > +clk_summary: sdhi2 12187500 > > That is not good. Other than this, it is better to test performance as well to check any regression due to second fix in this patch. Note: After mounting, I use below command to check performance. dd if=/dev/zero of=test oflag=direct bs=8M count=64 Cheers, Biju > > > > > > --- a/drivers/mmc/host/renesas_sdhi_core.c > > > +++ b/drivers/mmc/host/renesas_sdhi_core.c > > > > > @@ -153,10 +154,17 @@ static unsigned int > > renesas_sdhi_clk_update(struct tmio_mmc_host *host, > > > * greater than, new_clock. As we can divide by 1 << i > for > > > * any i in [0, 9] we want the input clock to be as close > as > > > * possible, but no greater than, new_clock << i. > > > + * > > > + * Add an upper limit of 1/1024 rate higher to the clock > > rate to fix > > > + * clk rate jumping to lower rate due to rounding error > (eg: > > RZ/G2L has > > > + * 3 clk sources 533.333333 MHz, 400 MHz and 266.666666 > MHz. > > The request > > > + * for 533.333333 MHz will selects a slower 400 MHz due to > > rounding > > > + * error (533333333 Hz / 4 * 4 = 533333332 Hz < 533333333 > > Hz)). > > > */ > > > + new_upper_limit = (new_clock << i) + ((new_clock << i) >> > > 10); > > > > Mea culpa: while new_clock is a constant inside the loop, i is not! > > So it should be moved back inside the loop below. > > With that change, R-Car M2-W is happy again, and I noticed no > > regression on R-Car H3 ES2.0. > > OK. > > > > > > for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) > { > > > freq = clk_round_rate(ref_clk, new_clock << i); > > > - if (freq > (new_clock << i)) { > > > + if (freq > new_upper_limit) { > > > /* Too fast; look for a slightly slower > > option */ > > > freq = clk_round_rate(ref_clk, (new_clock > << > > i) / 4 * 3); > > > if (freq > (new_clock << i)) > > ^^^^^^^^^^^^^^^^ Probably this > > should become new_upper_limit too, for consistency? > > It doesn't seem to matter in my testing, though. > > OK. Will do the below change in next version. > > - new_upper_limit = (new_clock << i) + ((new_clock << i) >> 10); > for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) { > freq = clk_round_rate(ref_clk, new_clock << i); > + new_upper_limit = (new_clock << i) + ((new_clock << i) >> > 10); > if (freq > new_upper_limit) { > /* Too fast; look for a slightly slower option */ > freq = clk_round_rate(ref_clk, (new_clock << i) / 4 * > 3); > - if (freq > (new_clock << i)) > + if (freq > new_upper_limit) > continue; > } > > Cheers, > Biju
Hi kernel test robot, >; Prabhakar Mahadev Lad > <prabhakar.mahadev-lad.rj@bp.renesas.com> > Subject: Re: [PATCH v4] mmc: renesas_sdhi: Fix rounding errors > > Hi Biju, > > Thank you for the patch! Perhaps something to improve: > > If you fix the issue, kindly add following tag where applicable > | Reported-by: kernel test robot <lkp@intel.com> > > All warnings (new ones prefixed by >>): > > >> drivers/mmc/host/renesas_sdhi_core.c:164:34: warning: variable 'i' > is > >> uninitialized when used here [-Wuninitialized] > new_upper_limit = (new_clock << i) + ((new_clock << i) >> > 10); > ^ > drivers/mmc/host/renesas_sdhi_core.c:132:7: note: initialize the > variable 'i' to silence this warning > int i; > ^ > = 0 > 1 warning generated. > > > vim +/i +164 drivers/mmc/host/renesas_sdhi_core.c I have sent v5. https://patchwork.kernel.org/project/linux-renesas-soc/patch/20220928110755.849275-1-biju.das.jz@bp.renesas.com/ Cheers, Biju > > 123 > 124 static unsigned int renesas_sdhi_clk_update(struct > tmio_mmc_host *host, > 125 unsigned int wanted_clock) > 126 { > 127 struct renesas_sdhi *priv = host_to_priv(host); > 128 struct clk *ref_clk = priv->clk; > 129 unsigned int freq, diff, best_freq = 0, diff_min = > ~0; > 130 unsigned int new_clock, clkh_shift = 0; > 131 unsigned int new_upper_limit; > 132 int i; > 133 > 134 /* > 135 * We simply return the current rate if a) we are not > on a R-Car Gen2+ > 136 * SoC (may work for others, but untested) or b) if > the SCC needs its > 137 * clock during tuning, so we don't change the > external clock setup. > 138 */ > 139 if (!(host->pdata->flags & TMIO_MMC_MIN_RCAR2) || > mmc_doing_tune(host->mmc)) > 140 return clk_get_rate(priv->clk); > 141 > 142 if (priv->clkh) { > 143 /* HS400 with 4TAP needs different clock > settings */ > 144 bool use_4tap = priv->quirks && priv->quirks- > >hs400_4taps; > 145 bool need_slow_clkh = host->mmc->ios.timing == > MMC_TIMING_MMC_HS400; > 146 clkh_shift = use_4tap && need_slow_clkh ? 1 : > 2; > 147 ref_clk = priv->clkh; > 148 } > 149 > 150 new_clock = wanted_clock << clkh_shift; > 151 > 152 /* > 153 * We want the bus clock to be as close as possible > to, but no > 154 * greater than, new_clock. As we can divide by 1 << > i for > 155 * any i in [0, 9] we want the input clock to be as > close as > 156 * possible, but no greater than, new_clock << i. > 157 * > 158 * Add an upper limit of 1/1024 rate higher to the > clock rate to fix > 159 * clk rate jumping to lower rate due to rounding > error (eg: RZ/G2L has > 160 * 3 clk sources 533.333333 MHz, 400 MHz and > 266.666666 MHz. The request > 161 * for 533.333333 MHz will selects a slower 400 MHz > due to rounding > 162 * error (533333333 Hz / 4 * 4 = 533333332 Hz < > 533333333 Hz)). > 163 */ > > 164 new_upper_limit = (new_clock << i) + ((new_clock << > i) >> 10); > 165 for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; > i--) { > 166 freq = clk_round_rate(ref_clk, new_clock << i); > 167 if (freq > new_upper_limit) { > 168 /* Too fast; look for a slightly slower > option */ > 169 freq = clk_round_rate(ref_clk, (new_clock > << i) / 4 * 3); > 170 if (freq > (new_clock << i)) > 171 continue; > 172 } > 173 > 174 diff = new_clock - (freq >> i); > 175 if (diff <= diff_min) { > 176 best_freq = freq; > 177 diff_min = diff; > 178 } > 179 } > 180 > 181 clk_set_rate(ref_clk, best_freq); > 182 > 183 if (priv->clkh) > 184 clk_set_rate(priv->clk, best_freq >> > clkh_shift); > 185 > 186 return clk_get_rate(priv->clk); > 187 } > 188 > > -- > 0-DAY CI Kernel Test Service
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c index 6edbf5c161ab..d1b8130ee37f 100644 --- a/drivers/mmc/host/renesas_sdhi_core.c +++ b/drivers/mmc/host/renesas_sdhi_core.c @@ -128,6 +128,7 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host, struct clk *ref_clk = priv->clk; unsigned int freq, diff, best_freq = 0, diff_min = ~0; unsigned int new_clock, clkh_shift = 0; + unsigned int new_upper_limit; int i; /* @@ -153,10 +154,17 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host, * greater than, new_clock. As we can divide by 1 << i for * any i in [0, 9] we want the input clock to be as close as * possible, but no greater than, new_clock << i. + * + * Add an upper limit of 1/1024 rate higher to the clock rate to fix + * clk rate jumping to lower rate due to rounding error (eg: RZ/G2L has + * 3 clk sources 533.333333 MHz, 400 MHz and 266.666666 MHz. The request + * for 533.333333 MHz will selects a slower 400 MHz due to rounding + * error (533333333 Hz / 4 * 4 = 533333332 Hz < 533333333 Hz)). */ + new_upper_limit = (new_clock << i) + ((new_clock << i) >> 10); for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) { freq = clk_round_rate(ref_clk, new_clock << i); - if (freq > (new_clock << i)) { + if (freq > new_upper_limit) { /* Too fast; look for a slightly slower option */ freq = clk_round_rate(ref_clk, (new_clock << i) / 4 * 3); if (freq > (new_clock << i)) @@ -181,6 +189,7 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host, static void renesas_sdhi_set_clock(struct tmio_mmc_host *host, unsigned int new_clock) { + unsigned int clk_margin; u32 clk = 0, clock; sd_ctrl_write16(host, CTL_SD_CARD_CLK_CTL, ~CLK_CTL_SCLKEN & @@ -194,7 +203,13 @@ static void renesas_sdhi_set_clock(struct tmio_mmc_host *host, host->mmc->actual_clock = renesas_sdhi_clk_update(host, new_clock); clock = host->mmc->actual_clock / 512; - for (clk = 0x80000080; new_clock >= (clock << 1); clk >>= 1) + /* + * Add a margin of 1/1024 rate higher to the clock rate in order + * to avoid clk variable setting a value of 0 due to the margin + * provided for actual_clock in renesas_sdhi_clk_update(). + */ + clk_margin = new_clock >> 10; + for (clk = 0x80000080; new_clock + clk_margin >= (clock << 1); clk >>= 1) clock <<= 1; /* 1/1 clock is option */