Message ID | 20200305175138.92075-1-thierry.reding@gmail.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | clk: Do not recalc rate for reparented clocks | expand |
Quoting Thierry Reding (2020-03-05 09:51:38) > From: Thierry Reding <treding@nvidia.com> > > As part of the clock frequency change sequence, a driver may need to > reparent a clock. In that case, the rate will already have been updated > and the cached parent rate will no longer be valid, so just skip the > recalculation. Can you describe more about what's going on and why this needs to change? For example, we have 'set_rate_and_parent' for cases where a driver reparents a clk during a rate change. Is that not sufficient here? > > Signed-off-by: Thierry Reding <treding@nvidia.com> > --- > drivers/clk/clk.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c > index ebfc1e2103cb..49d92f4785a2 100644 > --- a/drivers/clk/clk.c > +++ b/drivers/clk/clk.c > @@ -2079,7 +2079,14 @@ static void clk_change_rate(struct clk_core *core) > > trace_clk_set_rate_complete(core, core->new_rate); > > - core->rate = clk_recalc(core, best_parent_rate); > + /* > + * Some drivers need to change the parent of a clock as part of the > + * rate change sequence. In that case, best_parent_rate is no longer > + * valid. However, reparenting already recalculates the rate for the > + * entire clock subtree, so we can safely skip this here. > + */ > + if (core->parent == parent) > + core->rate = clk_recalc(core, best_parent_rate); > I wonder if we can undo the recursion and figure out a way to make this only happen once, when we want it to happen.
On 3/10/20 3:48 AM, Stephen Boyd wrote: > Quoting Thierry Reding (2020-03-05 09:51:38) >> From: Thierry Reding <treding@nvidia.com> >> >> As part of the clock frequency change sequence, a driver may need to >> reparent a clock. In that case, the rate will already have been updated >> and the cached parent rate will no longer be valid, so just skip the >> recalculation. > > Can you describe more about what's going on and why this needs to > change? For example, we have 'set_rate_and_parent' for cases where a > driver reparents a clk during a rate change. Is that not sufficient > here? > I believe this is related to the memory clock change sequence on Tegra, where the EMC clock may have to be reparented multiple times during the rate change sequence. If EMC is running off parent A, and the requested new OPP also uses parent A, we have to temporarily switch to a different OPP with parent B so that the rate change sequence can be executed on parent A without affecting the system's memory accesses. Mikko >> >> Signed-off-by: Thierry Reding <treding@nvidia.com> >> --- >> drivers/clk/clk.c | 9 ++++++++- >> 1 file changed, 8 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c >> index ebfc1e2103cb..49d92f4785a2 100644 >> --- a/drivers/clk/clk.c >> +++ b/drivers/clk/clk.c >> @@ -2079,7 +2079,14 @@ static void clk_change_rate(struct clk_core *core) >> >> trace_clk_set_rate_complete(core, core->new_rate); >> >> - core->rate = clk_recalc(core, best_parent_rate); >> + /* >> + * Some drivers need to change the parent of a clock as part of the >> + * rate change sequence. In that case, best_parent_rate is no longer >> + * valid. However, reparenting already recalculates the rate for the >> + * entire clock subtree, so we can safely skip this here. >> + */ >> + if (core->parent == parent) >> + core->rate = clk_recalc(core, best_parent_rate); >> > > I wonder if we can undo the recursion and figure out a way to make this > only happen once, when we want it to happen. >
On Mon, Mar 09, 2020 at 06:48:47PM -0700, Stephen Boyd wrote: > Quoting Thierry Reding (2020-03-05 09:51:38) > > From: Thierry Reding <treding@nvidia.com> > > > > As part of the clock frequency change sequence, a driver may need to > > reparent a clock. In that case, the rate will already have been updated > > and the cached parent rate will no longer be valid, so just skip the > > recalculation. > > Can you describe more about what's going on and why this needs to > change? For example, we have 'set_rate_and_parent' for cases where a > driver reparents a clk during a rate change. Is that not sufficient > here? By "driver" above, I mean clock driver. One example where we need this, and currently work around it, is the Tegra124 EMC (external memory controller) clock driver. You can find this here: drivers/clk/clk-emc.c If you look at the emc_recalc_rate() function, it needs to override the parent clock rate because CCF calls it with the clock rate of the parent prior to the rate change, whereas the parent (and/or the rate) potentially change during the rate change. The reason why we can't use ->set_rate_and_parent() here is because the consumers never actually change the parent for the EMC clock. Instead it is up to the EMC driver to select the parent based on the specific rate that is requested. The parent clock is encoded in a table of supported EMC frequencies that a particular board can support. This table is passed to the EMC driver from the firmware or hardcoded in DT, depending on the particular SoC generation. For a bit more background: the use-case is to accumulate a set of bandwidth requests within the EMC driver, which can then be converted into a clock frequency that is required for the EMC to provide that bandwidth. Since the controls for this are sprinkled around a bit, the nicest way to represent this is using an "emc" clock (in the clock and reset controller) and the EMC driver that deals with programming the EMC (which has registers separate from the clock and reset controller). So basically how this works is that the "emc" clock driver needs to call back into the EMC driver to perform the clock change. This being a clock that is needed to operate system memory, we have to be very careful when things happen. So a frequency switch typically requires the old parent to remain enabled while the EMC is programmed with new settings and then switched to the new parent. Only then can the old parent be disabled. The parent change happens fairly deep inside the EMC sequence and can't be isolated from that, unfortunately. In order to deal with that, we basically "fixup" the clock tree after the frequency change by calling clk_hw_reparent() manually. This in turn causes the parent to change, but clk_set_rate() doesn't know about that. Instead, it has the parent rate cached in a local variable, which will no longer be the correct value after the switch has happened. Fortunately, clk_hw_reparent() already causes the rates of the entire subtree of clocks to be recalculated, so there isn't anything left to do when we return from the rate change. This patch detects when this has happened by checking that the current parent against the "cached" parent that the cached parent rate is based on. If they aren't equal, it assumes that the clock driver has manually reparented the clock and does not have to do anything else. Does that clarify things? Thierry > > Signed-off-by: Thierry Reding <treding@nvidia.com> > > --- > > drivers/clk/clk.c | 9 ++++++++- > > 1 file changed, 8 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c > > index ebfc1e2103cb..49d92f4785a2 100644 > > --- a/drivers/clk/clk.c > > +++ b/drivers/clk/clk.c > > @@ -2079,7 +2079,14 @@ static void clk_change_rate(struct clk_core *core) > > > > trace_clk_set_rate_complete(core, core->new_rate); > > > > - core->rate = clk_recalc(core, best_parent_rate); > > + /* > > + * Some drivers need to change the parent of a clock as part of the > > + * rate change sequence. In that case, best_parent_rate is no longer > > + * valid. However, reparenting already recalculates the rate for the > > + * entire clock subtree, so we can safely skip this here. > > + */ > > + if (core->parent == parent) > > + core->rate = clk_recalc(core, best_parent_rate); > > > > I wonder if we can undo the recursion and figure out a way to make this > only happen once, when we want it to happen.
On Tue, Mar 10, 2020 at 12:25:01PM +0200, Mikko Perttunen wrote: > On 3/10/20 3:48 AM, Stephen Boyd wrote: > > Quoting Thierry Reding (2020-03-05 09:51:38) > > > From: Thierry Reding <treding@nvidia.com> > > > > > > As part of the clock frequency change sequence, a driver may need to > > > reparent a clock. In that case, the rate will already have been updated > > > and the cached parent rate will no longer be valid, so just skip the > > > recalculation. > > > > Can you describe more about what's going on and why this needs to > > change? For example, we have 'set_rate_and_parent' for cases where a > > driver reparents a clk during a rate change. Is that not sufficient > > here? > > > > I believe this is related to the memory clock change sequence on Tegra, > where the EMC clock may have to be reparented multiple times during the rate > change sequence. > > If EMC is running off parent A, and the requested new OPP also uses parent > A, we have to temporarily switch to a different OPP with parent B so that > the rate change sequence can be executed on parent A without affecting the > system's memory accesses. It's not only that. The mismatch can happen every time that we change the operating point because each such change could cause either the parent and/or the parent rate to change. The CCF always caches the parent rate from before the rate change, assuming that the parent will not change. Perhaps a clearer way of saying this is that the parent change is an internal part of the rate change. clk_hw_reparent() already allows us to notify the CCF of the hierarchical change in the clock tree. But then we need to additionally either override the parent rate in the driver, like we currently do on Tegra124, or have the core ignore the cached parent rate if the parent is no longer what it expects. I should probably have followed up this patch with a corresponding patch to the Tegra124 EMC driver that removes the workaround, like below: --- >8 --- diff --git a/drivers/clk/tegra/clk-emc.c b/drivers/clk/tegra/clk-emc.c index 745f9faa98d8..f29f0a0e48de 100644 --- a/drivers/clk/tegra/clk-emc.c +++ b/drivers/clk/tegra/clk-emc.c @@ -92,12 +92,6 @@ static unsigned long emc_recalc_rate(struct clk_hw *hw, tegra = container_of(hw, struct tegra_clk_emc, hw); - /* - * CCF wrongly assumes that the parent won't change during set_rate, - * so get the parent rate explicitly. - */ - parent_rate = clk_hw_get_rate(clk_hw_get_parent(hw)); - val = readl(tegra->clk_regs + CLK_SOURCE_EMC); div = val & CLK_SOURCE_EMC_EMC_2X_CLK_DIVISOR_MASK; --- >8 --- But I haven't actually tested that yet because I don't have a Tegra124 board set up at the moment. I do have a working Tegra210 EMC scaling implementation that I've tested this with and that has exactly the same issue and this fixes it. Thierry
On Tue, Mar 10, 2020 at 11:42:27AM +0100, Thierry Reding wrote: > On Tue, Mar 10, 2020 at 12:25:01PM +0200, Mikko Perttunen wrote: > > On 3/10/20 3:48 AM, Stephen Boyd wrote: > > > Quoting Thierry Reding (2020-03-05 09:51:38) > > > > From: Thierry Reding <treding@nvidia.com> > > > > > > > > As part of the clock frequency change sequence, a driver may need to > > > > reparent a clock. In that case, the rate will already have been updated > > > > and the cached parent rate will no longer be valid, so just skip the > > > > recalculation. > > > > > > Can you describe more about what's going on and why this needs to > > > change? For example, we have 'set_rate_and_parent' for cases where a > > > driver reparents a clk during a rate change. Is that not sufficient > > > here? > > > > > > > I believe this is related to the memory clock change sequence on Tegra, > > where the EMC clock may have to be reparented multiple times during the rate > > change sequence. > > > > If EMC is running off parent A, and the requested new OPP also uses parent > > A, we have to temporarily switch to a different OPP with parent B so that > > the rate change sequence can be executed on parent A without affecting the > > system's memory accesses. > > It's not only that. The mismatch can happen every time that we change > the operating point because each such change could cause either the > parent and/or the parent rate to change. The CCF always caches the > parent rate from before the rate change, assuming that the parent will > not change. > > Perhaps a clearer way of saying this is that the parent change is an > internal part of the rate change. clk_hw_reparent() already allows us to > notify the CCF of the hierarchical change in the clock tree. But then we > need to additionally either override the parent rate in the driver, like > we currently do on Tegra124, or have the core ignore the cached parent > rate if the parent is no longer what it expects. > > I should probably have followed up this patch with a corresponding patch > to the Tegra124 EMC driver that removes the workaround, like below: > > --- >8 --- > diff --git a/drivers/clk/tegra/clk-emc.c b/drivers/clk/tegra/clk-emc.c > index 745f9faa98d8..f29f0a0e48de 100644 > --- a/drivers/clk/tegra/clk-emc.c > +++ b/drivers/clk/tegra/clk-emc.c > @@ -92,12 +92,6 @@ static unsigned long emc_recalc_rate(struct clk_hw *hw, > > tegra = container_of(hw, struct tegra_clk_emc, hw); > > - /* > - * CCF wrongly assumes that the parent won't change during set_rate, > - * so get the parent rate explicitly. > - */ > - parent_rate = clk_hw_get_rate(clk_hw_get_parent(hw)); > - > val = readl(tegra->clk_regs + CLK_SOURCE_EMC); > div = val & CLK_SOURCE_EMC_EMC_2X_CLK_DIVISOR_MASK; > > --- >8 --- > > But I haven't actually tested that yet because I don't have a Tegra124 > board set up at the moment. I did test the above change along with the core change on a Jetson TK1 board (which is Tegra124) and unfortunately that doesn't work for all rate changes. I'll need to dig deeper to see what exactly is going on, and to see if there's a simple fix for that. If not it might be better to leave these types of workarounds in the drivers instead. Thierry
diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index ebfc1e2103cb..49d92f4785a2 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -2079,7 +2079,14 @@ static void clk_change_rate(struct clk_core *core) trace_clk_set_rate_complete(core, core->new_rate); - core->rate = clk_recalc(core, best_parent_rate); + /* + * Some drivers need to change the parent of a clock as part of the + * rate change sequence. In that case, best_parent_rate is no longer + * valid. However, reparenting already recalculates the rate for the + * entire clock subtree, so we can safely skip this here. + */ + if (core->parent == parent) + core->rate = clk_recalc(core, best_parent_rate); if (core->flags & CLK_SET_RATE_UNGATE) { unsigned long flags;