Message ID | 20210627223959.188139-3-martin.blumenstingl@googlemail.com (mailing list archive) |
---|---|
State | Accepted, archived |
Series | clk: meson: rounding for fast clocks on 32-bit SoCs |
Quoting Martin Blumenstingl (2021-06-27 15:39:58)
> .determine_rate is meant to replace .round_rate. The former comes with a
> benefit which is especially relevant on 32-bit systems: since
> .determine_rate uses an "unsigned long" (compared to a "signed long"
> which is used by .round_rate) the maximum value on 32-bit systems
> increases from 2^31 (or approx. 2.14GHz) to 2^32 (or approx. 4.29GHz).
> Switch to a .determine_rate implementation by default so 32-bit systems
> can benefit from the increased maximum value as well as so we have one
> fewer user of .round_rate.
>
> Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> ---

Applied to clk-next
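Editorial sketch of the range argument in the commit message above. It is not part of the patch. It assumes a 32-bit `long`, modelled here with fixed-width types so that it behaves the same on a 64-bit host. The names `fits_round_rate` and `fits_determine_rate` are hypothetical helpers made up for this illustration.

```c
#include <stdint.h>

/*
 * Fixed-width stand-ins for "long" and "unsigned long" as they would be
 * sized on a 32-bit kernel (assumption of this sketch).
 */
#define RATE_MAX_SIGNED   ((uint64_t)INT32_MAX)   /* 2^31 - 1 Hz */
#define RATE_MAX_UNSIGNED ((uint64_t)UINT32_MAX)  /* 2^32 - 1 Hz */

/* Nonzero if hz can be returned through a signed 32-bit long (.round_rate). */
int fits_round_rate(uint64_t hz)
{
	return hz <= RATE_MAX_SIGNED;
}

/* Nonzero if hz fits an unsigned 32-bit long (.determine_rate request). */
int fits_determine_rate(uint64_t hz)
{
	return hz <= RATE_MAX_UNSIGNED;
}
```

Under these assumptions a 3 GHz rate is rejected by `fits_round_rate()` but accepted by `fits_determine_rate()`. Since .round_rate also reports errors as negative return values, the upper half of the 32-bit range is not usable for rates on that path.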
On Mon, Jun 28, 2021 at 12:39:58AM +0200, Martin Blumenstingl wrote:
> .determine_rate is meant to replace .round_rate. The former comes with a
> benefit which is especially relevant on 32-bit systems: since
> .determine_rate uses an "unsigned long" (compared to a "signed long"
> which is used by .round_rate) the maximum value on 32-bit systems
> increases from 2^31 (or approx. 2.14GHz) to 2^32 (or approx. 4.29GHz).
> Switch to a .determine_rate implementation by default so 32-bit systems
> can benefit from the increased maximum value as well as so we have one
> fewer user of .round_rate.
>
> Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>

In next-20210701:

[ 0.000000] 8<--- cut here ---
[ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 0.000000] pgd = (ptrval)
[ 0.000000] [00000000] *pgd=00000000
[ 0.000000] Internal error: Oops: 80000005 [#1] SMP ARM
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-next-20210701 #1
[ 0.000000] Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[ 0.000000] PC is at 0x0
[ 0.000000] LR is at clk_core_determine_round_nolock+0xb4/0xe0
[ 0.000000] pc : [<00000000>] lr : [<c07be330>] psr: a00000d3
[ 0.000000] sp : c1701e48 ip : 00000000 fp : c1f6b340
[ 0.000000] r10: c1f6b33c r9 : d082007c r8 : c1709a18
[ 0.000000] r7 : 05e69ec0 r6 : 00000000 r5 : c1701e58 r4 : c2090480
[ 0.000000] r3 : 00000000 r2 : c1701e64 r1 : 05e69ec0 r0 : c208fe80
[ 0.000000] Flags: NzCv IRQs off FIQs off Mode SVC_32 ISA ARM Segment none
[ 0.000000] Control: 10c5387d Table: 8000406a DAC: 00000051
[ 0.000000] Register r0 information: slab kmalloc-64 start c208fe80 pointer offset 0 size 64
[ 0.000000] Register r1 information: non-paged memory
[ 0.000000] Register r2 information: non-slab/vmalloc memory
[ 0.000000] Register r3 information: NULL pointer
[ 0.000000] Register r4 information: slab kmalloc-192 start c2090480 pointer offset 0 size 192
[ 0.000000] Register r5 information: non-slab/vmalloc memory
[ 0.000000] Register r6 information: NULL pointer
[ 0.000000] Register r7 information: non-paged memory
[ 0.000000] Register r8 information: non-slab/vmalloc memory
[ 0.000000] Register r9 information: 0-page vmalloc region starting at 0xd0820000 allocated at of_iomap+0x4c/0x68
[ 0.000000] Register r10 information: non-slab/vmalloc memory
[ 0.000000] Register r11 information: non-slab/vmalloc memory
[ 0.000000] Register r12 information: NULL pointer
[ 0.000000] Process swapper/0 (pid: 0, stack limit = 0x(ptrval))
[ 0.000000] Stack: (0xc1701e48 to 0xc1702000)
[ 0.000000] 1e40: c2090480 c1700000 00000000 c07c5480 05e69ec0 00000000
[ 0.000000] 1e60: ffffffff 179a7b00 c208d500 e6ff0344 c208ff00 05e69ec0 d0820080 00000000
[ 0.000000] 1e80: 0000001c c07c55c0 c1f6b324 c2012c04 d0820080 c163c310 c1f6b340 00000000
[ 0.000000] 1ea0: 00000804 d0820074 0000001c 00000000 c186052c c109a5b0 c186052c c0caf26c
[ 0.000000] 1ec0: cbdc67ac d0820000 c2012c04 d0820060 d0820048 d0820028 cbdcc0cc d0820018
[ 0.000000] 1ee0: d0820020 d0820038 d0820030 c2012c04 c1701f1c 00000000 c2005e80 c1701f1c
[ 0.000000] 1f00: 00000004 c2005e88 cbdcc0cc cbdcc138 00000001 c162a4e4 c1700000 c1700000
[ 0.000000] 1f20: 00000000 c2005e88 c2005e88 cbdc5884 00000000 c1701fc8 c17093cc c0838f44
[ 0.000000] 1f40: c1600dd0 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 0.000000] 1f60: 00000000 00000000 00000000 00000000 00000000 e6ff0344 00000000 c1902000
[ 0.000000] 1f80: c1663a80 00000012 c1700000 c1709380 00000000 c170caa8 c14cb424 c1604d34
[ 0.000000] 1fa0: c1902000 c1600e0c ffffffff ffffffff 00000000 c1600588 00000000 c1700000
[ 0.000000] 1fc0: 00000000 c1663a80 e6fa0e44 00000000 00000000 c1600330 00000051 10c0387d
[ 0.000000] 1fe0: ffffffff 8833b000 410fc075 10c5387d 00000000 00000000 00000000 00000000
[ 0.000000] [<c07be330>] (clk_core_determine_round_nolock) from [<c07c5480>] (clk_core_set_rate_nolock+0x184/0x294)
[ 0.000000] [<c07c5480>] (clk_core_set_rate_nolock) from [<c07c55c0>] (clk_set_rate+0x30/0x64)
[ 0.000000] [<c07c55c0>] (clk_set_rate) from [<c163c310>] (imx6ul_clocks_init+0x2798/0x2a44)
[ 0.000000] [<c163c310>] (imx6ul_clocks_init) from [<c162a4e4>] (of_clk_init+0x180/0x26c)
[ 0.000000] [<c162a4e4>] (of_clk_init) from [<c1604d34>] (time_init+0x20/0x30)
[ 0.000000] [<c1604d34>] (time_init) from [<c1600e0c>] (start_kernel+0x4c8/0x6cc)
[ 0.000000] [<c1600e0c>] (start_kernel) from [<00000000>] (0x0)
[ 0.000000] Code: bad PC value
[ 0.000000] ---[ end trace 7009a0f298fd39e9 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!

Bisect points to this patch as culprit. Reverting it fixes the problem.

Guenter
Hi Guenter,

On Thu, Jul 1, 2021 at 10:25 PM Guenter Roeck <linux@roeck-us.net> wrote:
[...]
> [ 0.000000] [<c07be330>] (clk_core_determine_round_nolock) from [<c07c5480>] (clk_core_set_rate_nolock+0x184/0x294)
> [ 0.000000] [<c07c5480>] (clk_core_set_rate_nolock) from [<c07c55c0>] (clk_set_rate+0x30/0x64)
> [ 0.000000] [<c07c55c0>] (clk_set_rate) from [<c163c310>] (imx6ul_clocks_init+0x2798/0x2a44)
> [ 0.000000] [<c163c310>] (imx6ul_clocks_init) from [<c162a4e4>] (of_clk_init+0x180/0x26c)
> [ 0.000000] [<c162a4e4>] (of_clk_init) from [<c1604d34>] (time_init+0x20/0x30)
> [ 0.000000] [<c1604d34>] (time_init) from [<c1600e0c>] (start_kernel+0x4c8/0x6cc)
> [ 0.000000] [<c1600e0c>] (start_kernel) from [<00000000>] (0x0)
> [ 0.000000] Code: bad PC value
> [ 0.000000] ---[ end trace 7009a0f298fd39e9 ]---
> [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
>
> Bisect points to this patch as culprit. Reverting it fixes the problem.
sorry for breaking imx6 - and at the same time: thanks for reporting this!

Do you have some additional information about this crash (which clock
this relates to, file and line number, etc.)?
I am struggling to understand the cause of this NULL dereference.
My patch doesn't change the clk_core_determine_round_nolock()
implementation and the new determine_rate code-path (inside that
function) doesn't seem to be more fragile in terms of NULL values
compared to the round_rate code-path.
Instead I think it's more likely that the problem is somewhere within
clk_divider_determine_rate() (or in any helper function it uses), but
that doesn't show up in the trace.

I don't have any imx6 board myself and so far I am unable to reproduce
this crash on any hardware I have.
However, if it's a problem in my clk-divider.c changes then I'd like
to find the cause (ASAP) because possibly more SoCs may be broken...


Best regards,
Martin
On 7/1/21 1:57 PM, Martin Blumenstingl wrote: > Hi Guenter, > > On Thu, Jul 1, 2021 at 10:25 PM Guenter Roeck <linux@roeck-us.net> wrote: > [...] >> [ 0.000000] [<c07be330>] (clk_core_determine_round_nolock) from [<c07c5480>] (clk_core_set_rate_nolock+0x184/0x294) >> [ 0.000000] [<c07c5480>] (clk_core_set_rate_nolock) from [<c07c55c0>] (clk_set_rate+0x30/0x64) >> [ 0.000000] [<c07c55c0>] (clk_set_rate) from [<c163c310>] (imx6ul_clocks_init+0x2798/0x2a44) >> [ 0.000000] [<c163c310>] (imx6ul_clocks_init) from [<c162a4e4>] (of_clk_init+0x180/0x26c) >> [ 0.000000] [<c162a4e4>] (of_clk_init) from [<c1604d34>] (time_init+0x20/0x30) >> [ 0.000000] [<c1604d34>] (time_init) from [<c1600e0c>] (start_kernel+0x4c8/0x6cc) >> [ 0.000000] [<c1600e0c>] (start_kernel) from [<00000000>] (0x0) >> [ 0.000000] Code: bad PC value >> [ 0.000000] ---[ end trace 7009a0f298fd39e9 ]--- >> [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! >> >> Bisct points to this patch as culprit. Reverting it fixes the problem. > sorry for breaking imx6 - and at the same time: thanks for reporting this! > > Do you have some additional information about this crash (which clock > this relates to, file and line number, etc.)? > I am struggling to understand the cause of this NULL dereference > My patch doesn't change the clk_core_determine_round_nolock() > implementation and the new determine_rate code-path (inside that > function) doesn't seem to be more fragile in terms of NULL values > compared to the round_rate code-path. > Instead I think it's more likely that the problem is somewhere within > clk_divider_determine_rate() (or in any helper function it uses), but > that doesn't show up in the trace > > I don't have any imx6 board myself and so far I am unable to reproduce > this crash on any hardware I have. > However, if it's a problem in my clk-divider.c changes then I'd like > to find the cause (ASAP) because possibly more SoCs may be broken... > I don't have such a board either. 
The problem shows up in my qemu boot tests. See https://kerneltests.org/builders/qemu-arm-v7-next/builds/38/steps/qemubuildcommand/logs/stdio for an example. The problem reproduces with qemu's mcimx6ul-evk and sabrelite emulations. Guenter
Quoting Guenter Roeck (2021-07-01 14:43:29) > On 7/1/21 1:57 PM, Martin Blumenstingl wrote: > > Hi Guenter, > > > > On Thu, Jul 1, 2021 at 10:25 PM Guenter Roeck <linux@roeck-us.net> wrote: > > [...] > >> [ 0.000000] [<c07be330>] (clk_core_determine_round_nolock) from [<c07c5480>] (clk_core_set_rate_nolock+0x184/0x294) > >> [ 0.000000] [<c07c5480>] (clk_core_set_rate_nolock) from [<c07c55c0>] (clk_set_rate+0x30/0x64) > >> [ 0.000000] [<c07c55c0>] (clk_set_rate) from [<c163c310>] (imx6ul_clocks_init+0x2798/0x2a44) > >> [ 0.000000] [<c163c310>] (imx6ul_clocks_init) from [<c162a4e4>] (of_clk_init+0x180/0x26c) > >> [ 0.000000] [<c162a4e4>] (of_clk_init) from [<c1604d34>] (time_init+0x20/0x30) > >> [ 0.000000] [<c1604d34>] (time_init) from [<c1600e0c>] (start_kernel+0x4c8/0x6cc) > >> [ 0.000000] [<c1600e0c>] (start_kernel) from [<00000000>] (0x0) > >> [ 0.000000] Code: bad PC value > >> [ 0.000000] ---[ end trace 7009a0f298fd39e9 ]--- > >> [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! > >> > >> Bisct points to this patch as culprit. Reverting it fixes the problem. > > sorry for breaking imx6 - and at the same time: thanks for reporting this! > > > > Do you have some additional information about this crash (which clock > > this relates to, file and line number, etc.)? > > I am struggling to understand the cause of this NULL dereference > > My patch doesn't change the clk_core_determine_round_nolock() > > implementation and the new determine_rate code-path (inside that > > function) doesn't seem to be more fragile in terms of NULL values > > compared to the round_rate code-path. > > Instead I think it's more likely that the problem is somewhere within > > clk_divider_determine_rate() (or in any helper function it uses), but > > that doesn't show up in the trace > > > > I don't have any imx6 board myself and so far I am unable to reproduce > > this crash on any hardware I have. 
> > However, if it's a problem in my clk-divider.c changes then I'd like
> > to find the cause (ASAP) because possibly more SoCs may be broken...
> >
>
> I don't have such a board either. The problem shows up in my qemu boot tests. See
> https://kerneltests.org/builders/qemu-arm-v7-next/builds/38/steps/qemubuildcommand/logs/stdio
> for an example. The problem reproduces with qemu's mcimx6ul-evk and sabrelite
> emulations.
>

Would be great if we had some kunit tests for the divider code. No doubt
it would have caught this. I think for now I'll just drop this patch
from clk-next. Thanks for testing.
Quoting Martin Blumenstingl (2021-07-01 13:57:28)
> Hi Guenter,
>
> On Thu, Jul 1, 2021 at 10:25 PM Guenter Roeck <linux@roeck-us.net> wrote:
> [...]
> > [ 0.000000] [<c07be330>] (clk_core_determine_round_nolock) from [<c07c5480>] (clk_core_set_rate_nolock+0x184/0x294)
> > [ 0.000000] [<c07c5480>] (clk_core_set_rate_nolock) from [<c07c55c0>] (clk_set_rate+0x30/0x64)
> > [ 0.000000] [<c07c55c0>] (clk_set_rate) from [<c163c310>] (imx6ul_clocks_init+0x2798/0x2a44)
> > [ 0.000000] [<c163c310>] (imx6ul_clocks_init) from [<c162a4e4>] (of_clk_init+0x180/0x26c)
> > [ 0.000000] [<c162a4e4>] (of_clk_init) from [<c1604d34>] (time_init+0x20/0x30)
> > [ 0.000000] [<c1604d34>] (time_init) from [<c1600e0c>] (start_kernel+0x4c8/0x6cc)
> > [ 0.000000] [<c1600e0c>] (start_kernel) from [<00000000>] (0x0)
> > [ 0.000000] Code: bad PC value
> > [ 0.000000] ---[ end trace 7009a0f298fd39e9 ]---
> > [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> >
> > Bisect points to this patch as culprit. Reverting it fixes the problem.
> sorry for breaking imx6 - and at the same time: thanks for reporting this!
>
> Do you have some additional information about this crash (which clock
> this relates to, file and line number, etc.)?
> I am struggling to understand the cause of this NULL dereference
> My patch doesn't change the clk_core_determine_round_nolock()
> implementation and the new determine_rate code-path (inside that
> function) doesn't seem to be more fragile in terms of NULL values
> compared to the round_rate code-path.
> Instead I think it's more likely that the problem is somewhere within
> clk_divider_determine_rate() (or in any helper function it uses), but
> that doesn't show up in the trace

My guess is that we have drivers copying the clk_ops from the
divider_ops structure and so they are copying over round_rate but not
determine_rate.

>
> I don't have any imx6 board myself and so far I am unable to reproduce
> this crash on any hardware I have.
> However, if it's a problem in my clk-divider.c changes then I'd like
> to find the cause (ASAP) because possibly more SoCs may be broken...
>
Quoting Martin Blumenstingl (2021-06-27 15:39:58)
> .determine_rate is meant to replace .round_rate. The former comes with a
> benefit which is especially relevant on 32-bit systems: since
> .determine_rate uses an "unsigned long" (compared to a "signed long"
> which is used by .round_rate) the maximum value on 32-bit systems
> increases from 2^31 (or approx. 2.14GHz) to 2^32 (or approx. 4.29GHz).
> Switch to a .determine_rate implementation by default so 32-bit systems
> can benefit from the increased maximum value as well as so we have one
> fewer user of .round_rate.
>
> Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> ---
>  drivers/clk/clk-divider.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
> index 87ba4966b0e8..9e05e81116af 100644
> --- a/drivers/clk/clk-divider.c
> +++ b/drivers/clk/clk-divider.c
> @@ -425,8 +425,8 @@ long divider_ro_round_rate_parent(struct clk_hw *hw, struct clk_hw *parent,
>  }
>  EXPORT_SYMBOL_GPL(divider_ro_round_rate_parent);
>
> -static long clk_divider_round_rate(struct clk_hw *hw, unsigned long rate,
> -				unsigned long *prate)
> +static int clk_divider_determine_rate(struct clk_hw *hw,
> +				      struct clk_rate_request *req)
>  {
>  	struct clk_divider *divider = to_clk_divider(hw);
>
> @@ -437,13 +437,13 @@ static long clk_divider_round_rate(struct clk_hw *hw, unsigned long rate,
>  		val = clk_div_readl(divider) >> divider->shift;
>  		val &= clk_div_mask(divider->width);
>
> -		return divider_ro_round_rate(hw, rate, prate, divider->table,
> -					     divider->width, divider->flags,
> -					     val);
> +		return divider_ro_determine_rate(hw, req, divider->table,
> +						 divider->width,
> +						 divider->flags, val);
>  	}
>
> -	return divider_round_rate(hw, rate, prate, divider->table,
> -				  divider->width, divider->flags);
> +	return divider_determine_rate(hw, req, divider->table, divider->width,
> +				      divider->flags);
>  }
>
>  int divider_get_val(unsigned long rate, unsigned long parent_rate,
> @@ -500,14 +500,14 @@ static int clk_divider_set_rate(struct clk_hw *hw, unsigned long rate,
>
>  const struct clk_ops clk_divider_ops = {
>  	.recalc_rate = clk_divider_recalc_rate,
> -	.round_rate = clk_divider_round_rate,
> +	.determine_rate = clk_divider_determine_rate,

Guess was right.

$ git grep clk_divider_ops -- drivers/clk/imx/
drivers/clk/imx/clk-divider-gate.c: return clk_divider_ops.round_rate(hw, rate, prate);

>  	.set_rate = clk_divider_set_rate,
>  };
>  EXPORT_SYMBOL_GPL(clk_divider_ops);
>
>  const struct clk_ops clk_divider_ro_ops = {
>  	.recalc_rate = clk_divider_recalc_rate,
> -	.round_rate = clk_divider_round_rate,
> +	.determine_rate = clk_divider_determine_rate,
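Editorial sketch of the failure mode found by the grep above. It is a userspace mock, not kernel code: `mock_clk_ops`, `mock_divider_ops` and `mock_driver_round_rate` are hypothetical names, and the struct is a trimmed-down stand-in rather than the kernel's struct clk_ops. It shows why a driver wrapper that reaches into the shared ops table breaks once .round_rate is no longer assigned there.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for an ops table (not struct clk_ops). */
struct mock_clk_ops {
	long (*round_rate)(unsigned long rate, unsigned long *prate);
	int (*determine_rate)(unsigned long *rate);
};

/* Placeholder implementation; accepts whatever rate was requested. */
int mock_determine_rate(unsigned long *rate)
{
	(void)rate;
	return 0;
}

/*
 * State of the shared table after the patch: only .determine_rate is set,
 * so .round_rate is implicitly initialized to NULL.
 */
const struct mock_clk_ops mock_divider_ops = {
	.determine_rate = mock_determine_rate,
};

/*
 * Driver-style wrapper that forwards to the shared table. The NULL check
 * exists only in this sketch, to make the problem observable without
 * crashing; an unguarded call would jump through the NULL pointer.
 */
long mock_driver_round_rate(const struct mock_clk_ops *ops,
			    unsigned long rate, unsigned long *prate)
{
	if (!ops->round_rate)
		return -1;
	return ops->round_rate(rate, prate);
}
```

Without the guard, the call through the NULL member is what would produce a "PC is at 0x0" oops like the one reported earlier in the thread.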
Hi Stephen,

On Fri, Jul 2, 2021 at 3:02 AM Stephen Boyd <sboyd@kernel.org> wrote:
[...]
> My guess is that we have drivers copying the clk_ops from the
> divider_ops structure and so they are copying over round_rate but not
> determine_rate.
I just learned something new - thanks for investigating this as well!

$ git grep "clk_divider_ops\.round_rate" drivers/
drivers/clk/bcm/clk-bcm2835.c: return clk_divider_ops.round_rate(hw, rate, parent_rate);
drivers/clk/clk-stm32f4.c: return clk_divider_ops.round_rate(hw, rate, prate);
drivers/clk/clk-stm32h7.c: return clk_divider_ops.round_rate(hw, rate, prate);
drivers/clk/clk-stm32mp1.c: req->rate = clk_divider_ops.round_rate(hw, req->rate, &best_parent_rate);
drivers/clk/imx/clk-divider-gate.c: return clk_divider_ops.round_rate(hw, rate, prate);
$ git grep "clk_divider_ro_ops\.round_rate" drivers/
$

Changing these over to use clk_divider_ops.determine_rate doesn't seem too hard.
The part that I am not sure about is how to organize the patches.
1) amend the changes to all relevant drivers (from above) to this patch
2) multiple patches:
- adding .determine_rate to the default divider ops (but not removing
.round_rate)
- a single patch for each relevant driver (from above)
- removing .round_rate from the default divider ops

Another approach is to first create clk_divider_determine_rate() (as
done here) and export it.
Then I could have one individual patch for each relevant driver (from
above) to use:
.determine_rate = clk_divider_determine_rate,
Then finally I could remove clk_divider_round_rate() and switch over
the default divider ops to .determine_rate as well.

Which way do you prefer?


Best regards,
Martin
Hi On 02.07.2021 11:19, Martin Blumenstingl wrote: > Hi Stephen, > > On Fri, Jul 2, 2021 at 3:02 AM Stephen Boyd <sboyd@kernel.org> wrote: > [...] >> My guess is that we have drivers copying the clk_ops from the >> divider_ops structure and so they are copying over round_rate but not >> determine_rate. > I just learned something new - thanks for investigating this as well! > > $ git grep "clk_divider_ops\.round_rate" drivers/ > drivers/clk/bcm/clk-bcm2835.c: return clk_divider_ops.round_rate(hw, > rate, parent_rate); I confirm that this issue appears also on Raspberry Pi 3b+ board. I was about to write a bug report, but you were faster. The funny thing is that is so nondeterministic, that automated bisecting failed to catch it. > drivers/clk/clk-stm32f4.c: return clk_divider_ops.round_rate(hw, > rate, prate); > drivers/clk/clk-stm32h7.c: return clk_divider_ops.round_rate(hw, > rate, prate); > drivers/clk/clk-stm32mp1.c: req->rate = > clk_divider_ops.round_rate(hw, req->rate, &best_parent_rate); > drivers/clk/imx/clk-divider-gate.c: return > clk_divider_ops.round_rate(hw, rate, prate); > $ git grep "clk_divider_ro_ops\.round_rate" drivers/ > $ > > Changing these over to use clk_divider_ops.determine_rate doesn't seem too hard. > The part that I am not sure about is how to organize the patches. > 1) amend the changes to all relevant drivers (from above) to this patch > 2) multiple patches: > - adding .determine_rate to the default divider ops (but not removing > .round_rate) > - a single patch for each relevant driver (from above) > - removing .round_rate from the default divider ops > > Another approach is to first create clk_divider_determine_rate() (as > done here) and export it. > Then I could have one individual patch for each relevant driver (from > above) to use: > .determine_rate = clk_divider_determine_rate, > Then finally I could remove clk_divider_round_rate() and switch over > the default divider ops to .determine_rate as well. 
> > Which way do you prefer? > > > Best regards, > Martin > Best regards
Quoting Martin Blumenstingl (2021-07-02 02:19:37)
> Hi Stephen,
>
> On Fri, Jul 2, 2021 at 3:02 AM Stephen Boyd <sboyd@kernel.org> wrote:
> [...]
> > My guess is that we have drivers copying the clk_ops from the
> > divider_ops structure and so they are copying over round_rate but not
> > determine_rate.
> I just learned something new - thanks for investigating this as well!
>
> $ git grep "clk_divider_ops\.round_rate" drivers/
> drivers/clk/bcm/clk-bcm2835.c: return clk_divider_ops.round_rate(hw,
> rate, parent_rate);
> drivers/clk/clk-stm32f4.c: return clk_divider_ops.round_rate(hw,
> rate, prate);
> drivers/clk/clk-stm32h7.c: return clk_divider_ops.round_rate(hw,
> rate, prate);
> drivers/clk/clk-stm32mp1.c: req->rate =
> clk_divider_ops.round_rate(hw, req->rate, &best_parent_rate);
> drivers/clk/imx/clk-divider-gate.c: return
> clk_divider_ops.round_rate(hw, rate, prate);
> $ git grep "clk_divider_ro_ops\.round_rate" drivers/
> $
>
> Changing these over to use clk_divider_ops.determine_rate doesn't seem too hard.
> The part that I am not sure about is how to organize the patches.
> 1) amend the changes to all relevant drivers (from above) to this patch
> 2) multiple patches:
> - adding .determine_rate to the default divider ops (but not removing
> .round_rate)
> - a single patch for each relevant driver (from above)
> - removing .round_rate from the default divider ops
>
> Another approach is to first create clk_divider_determine_rate() (as
> done here) and export it.
> Then I could have one individual patch for each relevant driver (from
> above) to use:
> .determine_rate = clk_divider_determine_rate,
> Then finally I could remove clk_divider_round_rate() and switch over
> the default divider ops to .determine_rate as well.
>
> Which way do you prefer?
>

I'd prefer we leave round_rate assigned in clk_divider_ops and
clk_divider_ro_ops but then also assign the determine_rate function. We
have some duplication but that's OK. Then make individual patches to
migrate each driver over to the new clk op.

We could stack a final patch on top to remove the round_rate function
from clk divider. Unfortunately, if some driver wants to use round_rate
then it will fail in interesting ways. Probably best to live with it
until we decide to drop round_rate entirely. Patches to convert all the
round_rate code over to determine_rate would be welcome in the meantime.
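Editorial sketch of the transitional layout described above, where the shared table keeps .round_rate and also gains .determine_rate. It is a userspace mock under stated assumptions: every `sketch_*` name is hypothetical, the request struct is a cut-down stand-in for struct clk_rate_request, and the divider policy (smallest divisor from 1 to 8 that does not exceed the requested rate) is made up for the example. It is not the code that was later submitted.

```c
#include <stddef.h>

/* Cut-down stand-in for a rate request (assumption of this sketch). */
struct sketch_rate_request {
	unsigned long rate;
	unsigned long best_parent_rate;
};

struct sketch_clk_ops {
	long (*round_rate)(unsigned long rate, unsigned long *prate);
	int (*determine_rate)(struct sketch_rate_request *req);
};

#define SKETCH_MAX_DIV 8UL

/* New-style op: updates req->rate in place, returns 0 or a negative error. */
int sketch_div_determine_rate(struct sketch_rate_request *req)
{
	unsigned long parent = req->best_parent_rate;
	unsigned long div;

	if (req->rate == 0 || parent == 0)
		return -1;

	/* Round the divisor up so the result never exceeds the request. */
	div = parent / req->rate;
	if (parent % req->rate)
		div++;
	if (div > SKETCH_MAX_DIV)
		div = SKETCH_MAX_DIV;

	req->rate = parent / div;
	return 0;
}

/* Legacy op kept for existing callers; it reuses the new implementation. */
long sketch_div_round_rate(unsigned long rate, unsigned long *prate)
{
	struct sketch_rate_request req = {
		.rate = rate,
		.best_parent_rate = *prate,
	};
	int ret = sketch_div_determine_rate(&req);

	if (ret)
		return ret;
	*prate = req.best_parent_rate;
	return (long)req.rate;
}

/* Both members assigned: old and new callers keep working. */
const struct sketch_clk_ops sketch_divider_ops = {
	.round_rate = sketch_div_round_rate,
	.determine_rate = sketch_div_determine_rate,
};

/* Unconverted driver path that still calls through .round_rate. */
long sketch_legacy_driver_round(unsigned long rate, unsigned long parent_rate)
{
	unsigned long prate = parent_rate;

	return sketch_divider_ops.round_rate(rate, &prate);
}
```

With a 48 MHz parent, `sketch_legacy_driver_round(10000000UL, 48000000UL)` picks divisor 5 and returns 9600000. The point of the layout is that unconverted drivers keep a valid .round_rate pointer while converted ones can move to .determine_rate one patch at a time.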
Quoting Marek Szyprowski (2021-07-02 05:46:11)
> Hi
>
> On 02.07.2021 11:19, Martin Blumenstingl wrote:
> > Hi Stephen,
> >
> > On Fri, Jul 2, 2021 at 3:02 AM Stephen Boyd <sboyd@kernel.org> wrote:
> > [...]
> >> My guess is that we have drivers copying the clk_ops from the
> >> divider_ops structure and so they are copying over round_rate but not
> >> determine_rate.
> > I just learned something new - thanks for investigating this as well!
> >
> > $ git grep "clk_divider_ops\.round_rate" drivers/
> > drivers/clk/bcm/clk-bcm2835.c: return clk_divider_ops.round_rate(hw,
> > rate, parent_rate);
>
> I confirm that this issue appears also on Raspberry Pi 3b+ board. I was
> about to write a bug report, but you were faster. The funny thing is
> that is so nondeterministic, that automated bisecting failed to catch it.
>

I'd think it was deterministic. The function pointer is NULL after this
patch so it should always try to set the PC to 0 and fail to execute.
Unless that is somehow executable?
Hi Stephen,

On Fri, Jul 2, 2021 at 10:59 PM Stephen Boyd <sboyd@kernel.org> wrote:
[...]
> I'd prefer we leave round_rate assigned in clk_divider_ops and
> clk_divider_ro_ops but then also assign the determine_rate function. We
> have some duplication but that's OK. Then make individual patches to
> migrate each driver over to the new clk op.
I just sent a series with those changes: [0]

> We could stack a final patch on top to remove the round_rate function
> from clk divider. Unfortunately, if some driver wants to use round_rate
> then it will fail in interesting ways. Probably best to live with it
> until we decide to drop round_rate entirely. Patches to convert all the
> round_rate code over to determine_rate would be welcome in the meantime.
For now I have omitted the patch to remove .round_rate from clk_divider_ops.
Also I will start migrating .round_rate over to .determine_rate in
drivers/clk/meson/ (as it's the only hardware with CCF support that I
have).


Best regards,
Martin


[0] https://patchwork.kernel.org/project/linux-clk/cover/20210702225145.2643303-1-martin.blumenstingl@googlemail.com/
diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
index 87ba4966b0e8..9e05e81116af 100644
--- a/drivers/clk/clk-divider.c
+++ b/drivers/clk/clk-divider.c
@@ -425,8 +425,8 @@ long divider_ro_round_rate_parent(struct clk_hw *hw, struct clk_hw *parent,
 }
 EXPORT_SYMBOL_GPL(divider_ro_round_rate_parent);
 
-static long clk_divider_round_rate(struct clk_hw *hw, unsigned long rate,
-				unsigned long *prate)
+static int clk_divider_determine_rate(struct clk_hw *hw,
+				      struct clk_rate_request *req)
 {
 	struct clk_divider *divider = to_clk_divider(hw);
 
@@ -437,13 +437,13 @@ static long clk_divider_round_rate(struct clk_hw *hw, unsigned long rate,
 		val = clk_div_readl(divider) >> divider->shift;
 		val &= clk_div_mask(divider->width);
 
-		return divider_ro_round_rate(hw, rate, prate, divider->table,
-					     divider->width, divider->flags,
-					     val);
+		return divider_ro_determine_rate(hw, req, divider->table,
+						 divider->width,
+						 divider->flags, val);
 	}
 
-	return divider_round_rate(hw, rate, prate, divider->table,
-				  divider->width, divider->flags);
+	return divider_determine_rate(hw, req, divider->table, divider->width,
+				      divider->flags);
 }
 
 int divider_get_val(unsigned long rate, unsigned long parent_rate,
@@ -500,14 +500,14 @@ static int clk_divider_set_rate(struct clk_hw *hw, unsigned long rate,
 
 const struct clk_ops clk_divider_ops = {
 	.recalc_rate = clk_divider_recalc_rate,
-	.round_rate = clk_divider_round_rate,
+	.determine_rate = clk_divider_determine_rate,
 	.set_rate = clk_divider_set_rate,
 };
 EXPORT_SYMBOL_GPL(clk_divider_ops);
 
 const struct clk_ops clk_divider_ro_ops = {
 	.recalc_rate = clk_divider_recalc_rate,
-	.round_rate = clk_divider_round_rate,
+	.determine_rate = clk_divider_determine_rate,
 };
 EXPORT_SYMBOL_GPL(clk_divider_ro_ops);