Message ID | 20211109022558.14529-1-shawn.guo@linaro.org (mailing list archive) |
---|---|
Series | clk: qcom: smd-rpm: Report enable state to framework |
Hi Shawn,

On Tue, Nov 09, 2021 at 10:25:55AM +0800, Shawn Guo wrote:
> Currently the enable state of smd-rpm clocks are not properly reported
> back to framework due to missing .is_enabled and .is_prepared hooks.
> This causes a couple of issues.
>
> - All those unused clocks are not voted for off, because framework has
>   no knowledge that they are unused. It becomes a problem for vlow
>   power mode support, as we do not have every single RPM clock claimed
>   and voted for off by client devices, and rely on clock framework to
>   disable those unused RPM clocks.
>

I posted a similar patch a bit more than a year ago [1]. Back then one
of the concerns was that we might disable critical clocks just because
they have no driver using it actively. For example, not all of the
platforms using clk-smd-rpm already have an interconnect driver.
Disabling the interconnect related clocks will almost certainly make the
device lock up completely. (I tried it back then, it definitely does...)

I proposed adding CLK_IGNORE_UNUSED for the interconnect related clocks
back then [2] which would allow disabling most of the clocks at least.
Stephen Boyd had an alternative proposal to instead move the
interconnect related clocks completely out of clk-smd-rpm [3].
But I'm still unsure how this would work in a backwards compatible way. [4]

Since your patches are more or less identical I'm afraid the same
concerns still need to be solved somehow. :)

Thanks,
Stephan

[1]: https://lore.kernel.org/linux-arm-msm/20200817140908.185976-1-stephan@gerhold.net/
[2]: https://lore.kernel.org/linux-arm-msm/20200818080738.GA46574@gerhold.net/
[3]: https://lore.kernel.org/linux-arm-msm/159796605593.334488.8355244657387381953@swboyd.mtv.corp.google.com/
[4]: https://lore.kernel.org/linux-arm-msm/20200821064857.GA905@gerhold.net/

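To make the mechanism under discussion concrete, here is a minimal
user-space sketch of how an "unused clock" sweep can behave. It is a
simplified model, not kernel code: every name starting with model_ or
MODEL_ is hypothetical, and the fallback to the software enable count
when no .is_enabled hook exists reflects my understanding of the common
clk framework rather than anything stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's CLK_IGNORE_UNUSED flag. */
#define MODEL_CLK_IGNORE_UNUSED (1u << 0)

struct model_clk;

struct model_clk_ops {
	bool (*is_enabled)(const struct model_clk *clk); /* may be NULL */
	void (*disable)(struct model_clk *clk);
};

struct model_clk {
	const char *name;
	const struct model_clk_ops *ops;
	unsigned int flags;
	unsigned int enable_count; /* consumers currently holding the clock */
	bool hw_on;                /* what the modelled hardware is doing */
};

/* Without an .is_enabled hook, only the software count is known. */
static bool model_is_enabled(const struct model_clk *clk)
{
	if (!clk->ops->is_enabled)
		return clk->enable_count > 0;
	return clk->ops->is_enabled(clk);
}

/* Returns true if the sweep actually turned the clock off. */
static bool model_disable_unused(struct model_clk *clk)
{
	assert(clk && clk->ops);

	if (clk->enable_count > 0)
		return false; /* in use by a consumer */
	if (clk->flags & MODEL_CLK_IGNORE_UNUSED)
		return false; /* explicitly protected */
	if (!model_is_enabled(clk))
		return false; /* believed to be off already */
	if (clk->ops->disable)
		clk->ops->disable(clk);
	return true;
}

static bool hw_is_enabled(const struct model_clk *clk)
{
	return clk->hw_on;
}

static void hw_disable(struct model_clk *clk)
{
	clk->hw_on = false;
}

/* A driver that reports no enable state, and one that does. */
static const struct model_clk_ops ops_without_hook = {
	.is_enabled = NULL,
	.disable = hw_disable,
};

static const struct model_clk_ops ops_with_hook = {
	.is_enabled = hw_is_enabled,
	.disable = hw_disable,
};
```

In this model a running clock using ops_without_hook is never swept,
which matches the power problem Shawn describes. Switching it to
ops_with_hook makes the sweep turn it off, which is the lock-up risk
raised above unless the clock carries the ignore flag.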
On Tue 09 Nov 02:26 PST 2021, Stephan Gerhold wrote:

> Hi Shawn,
>
> On Tue, Nov 09, 2021 at 10:25:55AM +0800, Shawn Guo wrote:
> > Currently the enable state of smd-rpm clocks are not properly reported
> > back to framework due to missing .is_enabled and .is_prepared hooks.
> > This causes a couple of issues.
> >
> > - All those unused clocks are not voted for off, because framework has
> >   no knowledge that they are unused. It becomes a problem for vlow
> >   power mode support, as we do not have every single RPM clock claimed
> >   and voted for off by client devices, and rely on clock framework to
> >   disable those unused RPM clocks.
> >
>
> I posted a similar patch a bit more than a year ago [1]. Back then one
> of the concerns was that we might disable critical clocks just because
> they have no driver using it actively. For example, not all of the
> platforms using clk-smd-rpm already have an interconnect driver.
> Disabling the interconnect related clocks will almost certainly make the
> device lock up completely. (I tried it back then, it definitely does...)
>
> I proposed adding CLK_IGNORE_UNUSED for the interconnect related clocks
> back then [2] which would allow disabling most of the clocks at least.
> Stephen Boyd had an alternative proposal to instead move the
> interconnect related clocks completely out of clk-smd-rpm [3].
> But I'm still unsure how this would work in a backwards compatible way. [4]
>

With the introduction of QoS the interconnect drivers need to be mmio
devices, and plural, while in order to talk to the RPM we need something
on the rpmsg bus. So I don't think Stephen's proposal will work, unless
we, like in the RPMh case, come up with an equivalent of the bcm-voter
(which just moved the clocks from one clock driver to a "clock" driver).

On the other hand, if clocks and the clk-smd-rpm driver in particular
move to sync_state then this wouldn't be a problem...

Regards,
Bjorn

> Since your patches are more or less identical I'm afraid the same
> concerns still need to be solved somehow. :)
>
> Thanks,
> Stephan
>
> [1]: https://lore.kernel.org/linux-arm-msm/20200817140908.185976-1-stephan@gerhold.net/
> [2]: https://lore.kernel.org/linux-arm-msm/20200818080738.GA46574@gerhold.net/
> [3]: https://lore.kernel.org/linux-arm-msm/159796605593.334488.8355244657387381953@swboyd.mtv.corp.google.com/
> [4]: https://lore.kernel.org/linux-arm-msm/20200821064857.GA905@gerhold.net/

Hi Stephan,

On Tue, Nov 09, 2021 at 11:26:21AM +0100, Stephan Gerhold wrote:
> Hi Shawn,
>
> On Tue, Nov 09, 2021 at 10:25:55AM +0800, Shawn Guo wrote:
> > Currently the enable state of smd-rpm clocks are not properly reported
> > back to framework due to missing .is_enabled and .is_prepared hooks.
> > This causes a couple of issues.
> >
> > - All those unused clocks are not voted for off, because framework has
> >   no knowledge that they are unused. It becomes a problem for vlow
> >   power mode support, as we do not have every single RPM clock claimed
> >   and voted for off by client devices, and rely on clock framework to
> >   disable those unused RPM clocks.
> >
>
> I posted a similar patch a bit more than a year ago [1].

Ouch, that's unfortunate! If your patch landed, I wouldn't have had to
spend such a long time to figure out why my platform fails to reach vlow
power mode :(

> Back then one
> of the concerns was that we might disable critical clocks just because
> they have no driver using it actively. For example, not all of the
> platforms using clk-smd-rpm already have an interconnect driver.
> Disabling the interconnect related clocks will almost certainly make the
> device lock up completely. (I tried it back then, it definitely does...)
>
> I proposed adding CLK_IGNORE_UNUSED for the interconnect related clocks
> back then [2] which would allow disabling most of the clocks at least.
> Stephen Boyd had an alternative proposal to instead move the
> interconnect related clocks completely out of clk-smd-rpm [3].
> But I'm still unsure how this would work in a backwards compatible way. [4]
>
> Since your patches are more or less identical I'm afraid the same
> concerns still need to be solved somehow. :)

I do not really understand why smd-rpm clock driver needs to be a special
case. This is a very common issue, mostly in device early support phase
where not all clock consumer drivers are ready. Flag CLK_IGNORE_UNUSED
and kernel cmdline 'clk_ignore_unused' are created just for that. Those
"broken" platforms should be booted with 'clk_ignore_unused' until they
have related consumer drivers in place. IMHO, properly reporting enable
state to framework is definitely the right thing to do, and should have
been done from day one.

Shawn

> [1]: https://lore.kernel.org/linux-arm-msm/20200817140908.185976-1-stephan@gerhold.net/
> [2]: https://lore.kernel.org/linux-arm-msm/20200818080738.GA46574@gerhold.net/
> [3]: https://lore.kernel.org/linux-arm-msm/159796605593.334488.8355244657387381953@swboyd.mtv.corp.google.com/
> [4]: https://lore.kernel.org/linux-arm-msm/20200821064857.GA905@gerhold.net/

On Wed 10 Nov 05:15 PST 2021, Shawn Guo wrote:

> Hi Stephan,
>
> On Tue, Nov 09, 2021 at 11:26:21AM +0100, Stephan Gerhold wrote:
> > Hi Shawn,
> >
> > On Tue, Nov 09, 2021 at 10:25:55AM +0800, Shawn Guo wrote:
> > > Currently the enable state of smd-rpm clocks are not properly reported
> > > back to framework due to missing .is_enabled and .is_prepared hooks.
> > > This causes a couple of issues.
> > >
> > > - All those unused clocks are not voted for off, because framework has
> > >   no knowledge that they are unused. It becomes a problem for vlow
> > >   power mode support, as we do not have every single RPM clock claimed
> > >   and voted for off by client devices, and rely on clock framework to
> > >   disable those unused RPM clocks.
> > >
> >
> > I posted a similar patch a bit more than a year ago [1].
>
> Ouch, that's unfortunate! If your patch landed, I wouldn't have had to
> spend such a long time to figure out why my platform fails to reach vlow
> power mode :(
>
> > Back then one
> > of the concerns was that we might disable critical clocks just because
> > they have no driver using it actively. For example, not all of the
> > platforms using clk-smd-rpm already have an interconnect driver.
> > Disabling the interconnect related clocks will almost certainly make the
> > device lock up completely. (I tried it back then, it definitely does...)
> >
> > I proposed adding CLK_IGNORE_UNUSED for the interconnect related clocks
> > back then [2] which would allow disabling most of the clocks at least.
> > Stephen Boyd had an alternative proposal to instead move the
> > interconnect related clocks completely out of clk-smd-rpm [3].
> > But I'm still unsure how this would work in a backwards compatible way. [4]
> >
> > Since your patches are more or less identical I'm afraid the same
> > concerns still need to be solved somehow. :)
>
> I do not really understand why smd-rpm clock driver needs to be a special
> case. This is a very common issue, mostly in device early support phase
> where not all clock consumer drivers are ready. Flag CLK_IGNORE_UNUSED
> and kernel cmdline 'clk_ignore_unused' are created just for that. Those
> "broken" platforms should be booted with 'clk_ignore_unused' until they
> have related consumer drivers in place.

Afaict we still have the problem that if the interconnect driver is
compiled as a module, or for other reasons doesn't probe until after
late_initcall(), clk-smd-rpm will turn off these clocks and we will never
get a chance to probe the interconnect provider.

I believe the way to handle that is to rely on sync_state, but there
seems to be a lot of corner cases here.

But with that in place, I agree that we should handle this temporarily
during bringup by the use of clk_ignore_unused.

> IMHO, properly reporting enable state to framework is definitely the
> right thing to do, and should have been done from day one.
>

I always thought is_enabled() should reflect the hardware state - in
particular for clk_summary. The particular concern being that by
initializing the is_enabled() state to either true or false, we're
making an assumption about the hardware state. And if something were to
do if (enabled) disable (or if (disabled) enable), we might skip a
critical operation just because we tricked the logic.

So, do you need it for anything other than clk_disable_unused()?

I have a clock in the MDP with a similar issue, where we don't have
is_enabled() but I need it to be disabled by clk_disable_unused(),
because the next iteration turns off the parent and locks up the still
"active" rcg. So far I've not received any feedback on this though...

https://lore.kernel.org/all/20210707043859.195870-1-bjorn.andersson@linaro.org/

With this approach we don't make any assumptions about the hardware
state, beyond the fact that we will issue a disable in
clk_disable_unused() if no one has yet enabled the clock - which at
worst turns off a clock that is already off.

Regards,
Bjorn

> Shawn
>
> > [1]: https://lore.kernel.org/linux-arm-msm/20200817140908.185976-1-stephan@gerhold.net/
> > [2]: https://lore.kernel.org/linux-arm-msm/20200818080738.GA46574@gerhold.net/
> > [3]: https://lore.kernel.org/linux-arm-msm/159796605593.334488.8355244657387381953@swboyd.mtv.corp.google.com/
> > [4]: https://lore.kernel.org/linux-arm-msm/20200821064857.GA905@gerhold.net/

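The "if (enabled) disable" concern can be shown with a small user-space
sketch. It is only a model under stated assumptions: struct cached_clk
and both disable helpers are hypothetical names, and the unconditional
variant merely illustrates the idea of issuing a disable without
trusting cached state. It is not the code from the patch linked above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical clock whose real state cannot be read back. */
struct cached_clk {
	bool sw_enabled;        /* what the driver believes */
	bool hw_on;             /* what the modelled hardware is doing */
	unsigned int hw_writes; /* number of requests sent to hardware */
};

static void hw_write(struct cached_clk *clk, bool on)
{
	assert(clk != NULL);
	clk->hw_on = on;
	clk->hw_writes++;
}

/* Trusts the cached state: a wrong initial guess skips the request. */
static void disable_if_enabled(struct cached_clk *clk)
{
	if (!clk->sw_enabled)
		return;
	hw_write(clk, false);
	clk->sw_enabled = false;
}

/* Makes no assumption: at worst it turns off a clock that is off. */
static void disable_unconditionally(struct cached_clk *clk)
{
	hw_write(clk, false);
	clk->sw_enabled = false;
}
```

If the cache starts out as "disabled" while the hardware is running,
disable_if_enabled() leaves the clock on and sends nothing. The
unconditional variant costs one redundant request when the clock was
already off.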
On Wed, Nov 10, 2021 at 09:15:11PM +0800, Shawn Guo wrote:
> On Tue, Nov 09, 2021 at 11:26:21AM +0100, Stephan Gerhold wrote:
> > On Tue, Nov 09, 2021 at 10:25:55AM +0800, Shawn Guo wrote:
> > > Currently the enable state of smd-rpm clocks are not properly reported
> > > back to framework due to missing .is_enabled and .is_prepared hooks.
> > > This causes a couple of issues.
> > >
> > > - All those unused clocks are not voted for off, because framework has
> > >   no knowledge that they are unused. It becomes a problem for vlow
> > >   power mode support, as we do not have every single RPM clock claimed
> > >   and voted for off by client devices, and rely on clock framework to
> > >   disable those unused RPM clocks.
> > >
> >
> > I posted a similar patch a bit more than a year ago [1].
>
> Ouch, that's unfortunate! If your patch landed, I wouldn't have had to
> spend such a long time to figure out why my platform fails to reach vlow
> power mode :(
>

Sorry, I was waiting for Stephen to reply and eventually decided to
shift focus to other things first. :)

The whole low-power topic is kind of frustrating on older platforms
because they currently still lack almost everything that is necessary to
reach those low power states. Even things that you already consider
natural for newer platforms (such as interconnect) are still very much
work in progress on all older ones.

> > Back then one
> > of the concerns was that we might disable critical clocks just because
> > they have no driver using it actively. For example, not all of the
> > platforms using clk-smd-rpm already have an interconnect driver.
> > Disabling the interconnect related clocks will almost certainly make the
> > device lock up completely. (I tried it back then, it definitely does...)
> >
> > I proposed adding CLK_IGNORE_UNUSED for the interconnect related clocks
> > back then [2] which would allow disabling most of the clocks at least.
> > Stephen Boyd had an alternative proposal to instead move the
> > interconnect related clocks completely out of clk-smd-rpm [3].
> > But I'm still unsure how this would work in a backwards compatible way. [4]
> >
> > Since your patches are more or less identical I'm afraid the same
> > concerns still need to be solved somehow. :)
>
> I do not really understand why smd-rpm clock driver needs to be a special
> case. This is a very common issue, mostly in device early support phase
> where not all clock consumer drivers are ready. Flag CLK_IGNORE_UNUSED
> and kernel cmdline 'clk_ignore_unused' are created just for that. Those
> "broken" platforms should be booted with 'clk_ignore_unused' until they
> have related consumer drivers in place. IMHO, properly reporting enable
> state to framework is definitely the right thing to do, and should have
> been done from day one.
>

... And therefore I think we should be careful with such changes,
especially if they would prevent devices from booting completely.
Unfortunately the users trying to make use of old platforms are also
often the ones who might not be aware that they suddenly need
"clk_ignore_unused" just to boot a system that was previously working
(mostly) fine, except for the whole low-power topic.

I fully agree with you that disabling the unused clocks here is the
right thing to do, but I think we should try to carefully flag the most
important clocks in the driver to avoid causing too many regressions.

Thanks,
Stephan

On Wed 10 Nov 06:09 PST 2021, Stephan Gerhold wrote:

> On Wed, Nov 10, 2021 at 09:15:11PM +0800, Shawn Guo wrote:
> > On Tue, Nov 09, 2021 at 11:26:21AM +0100, Stephan Gerhold wrote:
> > > On Tue, Nov 09, 2021 at 10:25:55AM +0800, Shawn Guo wrote:
> > > > Currently the enable state of smd-rpm clocks are not properly reported
> > > > back to framework due to missing .is_enabled and .is_prepared hooks.
> > > > This causes a couple of issues.
> > > >
> > > > - All those unused clocks are not voted for off, because framework has
> > > >   no knowledge that they are unused. It becomes a problem for vlow
> > > >   power mode support, as we do not have every single RPM clock claimed
> > > >   and voted for off by client devices, and rely on clock framework to
> > > >   disable those unused RPM clocks.
> > > >
> > >
> > > I posted a similar patch a bit more than a year ago [1].
> >
> > Ouch, that's unfortunate! If your patch landed, I wouldn't have had to
> > spend such a long time to figure out why my platform fails to reach vlow
> > power mode :(
> >
>
> Sorry, I was waiting for Stephen to reply and eventually decided to
> shift focus to other things first. :)
>
> The whole low-power topic is kind of frustrating on older platforms
> because they currently still lack almost everything that is necessary to
> reach those low power states. Even things that you already consider
> natural for newer platforms (such as interconnect) are still very much
> work in progress on all older ones.
>
> > > Back then one
> > > of the concerns was that we might disable critical clocks just because
> > > they have no driver using it actively. For example, not all of the
> > > platforms using clk-smd-rpm already have an interconnect driver.
> > > Disabling the interconnect related clocks will almost certainly make the
> > > device lock up completely. (I tried it back then, it definitely does...)
> > >
> > > I proposed adding CLK_IGNORE_UNUSED for the interconnect related clocks
> > > back then [2] which would allow disabling most of the clocks at least.
> > > Stephen Boyd had an alternative proposal to instead move the
> > > interconnect related clocks completely out of clk-smd-rpm [3].
> > > But I'm still unsure how this would work in a backwards compatible way. [4]
> > >
> > > Since your patches are more or less identical I'm afraid the same
> > > concerns still need to be solved somehow. :)
> >
> > I do not really understand why smd-rpm clock driver needs to be a special
> > case. This is a very common issue, mostly in device early support phase
> > where not all clock consumer drivers are ready. Flag CLK_IGNORE_UNUSED
> > and kernel cmdline 'clk_ignore_unused' are created just for that. Those
> > "broken" platforms should be booted with 'clk_ignore_unused' until they
> > have related consumer drivers in place. IMHO, properly reporting enable
> > state to framework is definitely the right thing to do, and should have
> > been done from day one.
> >
>
> ... And therefore I think we should be careful with such changes,
> especially if they would prevent devices from booting completely.
> Unfortunately the users trying to make use of old platforms are also
> often the ones who might not be aware that they suddenly need
> "clk_ignore_unused" just to boot a system that was previously working
> (mostly) fine, except for the whole low-power topic.
>
> I fully agree with you that disabling the unused clocks here is the
> right thing to do, but I think we should try to carefully flag the most
> important clocks in the driver to avoid causing too many regressions.
>

I don't fancy the idea of forcing everyone to run with specific kernel
command line parameters - in particular not as a means to avoid
"regressions".

I think the only way around this problem is to figure out how to move
the clk disablement to sync_state - probably per clock driver.

Regards,
Bjorn

On Wed, Nov 10, 2021 at 05:48:10AM -0800, Bjorn Andersson wrote:
> > IMHO, properly reporting enable state to framework is definitely the
> > right thing to do, and should have been done from day one.
> >
>
> I always thought is_enabled() should reflect the hardware state - in
> particular for clk_summary. The particular concern being that by
> initializing the is_enabled() state to either true or false, we're
> making an assumption about the hardware state. And if something were to
> do if (enabled) disable (or if (disabled) enable), we might skip a
> critical operation just because we tricked the logic.

That's probably why clk_smd_rpm_handoff() is called. As there is no way
to query RPM for resource state, we send enable request for all RPM
clocks to get hardware and software state in sync.

> So, do you need it for anything other than clk_disable_unused()?

Not critical, but I need it for debugfs clk_summary as well.

Shawn

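The handoff idea described above, writing a known state because nothing
can be read back, can be sketched in user space as follows. This is a
model only: struct rpm_model, struct rpm_clk_model and model_handoff()
are hypothetical names, and the rate argument is an assumption added for
illustration (the follow-up reply says the real driver requests the
maximum rate). It is not the body of clk_smd_rpm_handoff().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Modelled remote processor state; the driver can write it, not read it. */
struct rpm_model {
	bool on;
	unsigned long rate_khz;
};

/* Driver-side bookkeeping for one clock. */
struct rpm_clk_model {
	bool sw_enabled;
	unsigned long sw_rate_khz;
	struct rpm_model *rpm;
};

/* The only interface to the remote side: a write-only request. */
static void rpm_send(struct rpm_model *rpm, bool on, unsigned long rate_khz)
{
	rpm->on = on;
	rpm->rate_khz = rate_khz;
}

/*
 * Force a known state and then record it, so that software and hardware
 * agree without ever querying the hardware.
 */
static void model_handoff(struct rpm_clk_model *clk, unsigned long rate_khz)
{
	assert(clk && clk->rpm);
	rpm_send(clk->rpm, true, rate_khz);
	clk->sw_enabled = true;
	clk->sw_rate_khz = rate_khz;
}
```

After model_handoff() the cached state is correct whatever the boot
firmware left behind, which is what makes reporting it through an
enable-state hook defensible in this model.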
On Thu 11 Nov 03:39 CST 2021, Shawn Guo wrote:

> On Wed, Nov 10, 2021 at 05:48:10AM -0800, Bjorn Andersson wrote:
> > > IMHO, properly reporting enable state to framework is definitely the
> > > right thing to do, and should have been done from day one.
> > >
> >
> > I always thought is_enabled() should reflect the hardware state - in
> > particular for clk_summary. The particular concern being that by
> > initializing the is_enabled() state to either true or false, we're
> > making an assumption about the hardware state. And if something were to
> > do if (enabled) disable (or if (disabled) enable), we might skip a
> > critical operation just because we tricked the logic.
>
> That's probably why clk_smd_rpm_handoff() is called. As there is no way
> to query RPM for resource state, we send enable request for all RPM
> clocks to get hardware and software state in sync.
>

clk_smd_rpm_handoff() will ensure that all SMD clocks are enabled, and
at max speed, during rpm_smd_clk_probe(). Once clients start actually
voting for rates, that will change.

(Un)fortunately as we don't provide an implementation of is_enabled(),
clk_disable_unused() won't try to turn them off.

This is similar to a problem I have elsewhere, for which I proposed:

https://lore.kernel.org/linux-arm-msm/20211203035436.3505743-1-bjorn.andersson@linaro.org/

We should at some point introduce this for the SMD clocks as well.

However, we have two problems:

1) Compiling e.g. the interconnect provider as a module would mean that
   clk_disable_unused() kicks in before the client has had a chance to
   vote for the clock.

2) One client may enable the clock during its probe and then disable it.
   Being the last active user, the clock framework happily turns off the
   clock.

For both of these cases, we need to ensure that the clocks aren't
disabled until sync_state() kicks in.

Regards,
Bjorn

> > So, do you need it for anything other than clk_disable_unused()?
>
> Not critical, but I need it for debugfs clk_summary as well.
>
> Shawn

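The two problems listed above, and the idea of holding clocks on until
sync_state(), can be modelled in a few lines of user-space C. This is a
sketch under assumptions: sync_state() is a real driver-core callback,
but struct provider and every provider_*() helper here are hypothetical,
and nothing in the thread says the clk framework gates disables this way
today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CLKS 3

struct gated_clk {
	unsigned int enable_count; /* votes from consumers */
	bool hw_on;                /* modelled hardware state */
};

struct provider {
	bool synced; /* set once all consumers have probed */
	struct gated_clk clks[NUM_CLKS];
};

/* Turn off every clock nobody is voting for; returns how many. */
static unsigned int sweep_unused(struct provider *p)
{
	unsigned int turned_off = 0;

	for (size_t i = 0; i < NUM_CLKS; i++) {
		struct gated_clk *clk = &p->clks[i];

		if (clk->enable_count == 0 && clk->hw_on) {
			clk->hw_on = false;
			turned_off++;
		}
	}
	return turned_off;
}

static void provider_enable(struct provider *p, size_t i)
{
	p->clks[i].enable_count++;
	p->clks[i].hw_on = true;
}

/* Problem 2: the last user going away must not gate before sync. */
static void provider_disable(struct provider *p, size_t i)
{
	assert(p->clks[i].enable_count > 0);
	if (--p->clks[i].enable_count == 0 && p->synced)
		p->clks[i].hw_on = false;
}

/* Problem 1: the late-init sweep is a no-op until sync has happened. */
static unsigned int provider_late_init(struct provider *p)
{
	return p->synced ? sweep_unused(p) : 0;
}

/* All consumers have probed: deferred disables can now take effect. */
static unsigned int provider_sync_state(struct provider *p)
{
	p->synced = true;
	return sweep_unused(p);
}
```

In this model a consumer built as a module can probe arbitrarily late,
and a probe-time enable/disable pair cannot gate a clock that another
not-yet-probed consumer still depends on.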
Quoting Stephan Gerhold (2021-11-09 02:26:21)
> Hi Shawn,
>
> On Tue, Nov 09, 2021 at 10:25:55AM +0800, Shawn Guo wrote:
> > Currently the enable state of smd-rpm clocks are not properly reported
> > back to framework due to missing .is_enabled and .is_prepared hooks.
> > This causes a couple of issues.
> >
> > - All those unused clocks are not voted for off, because framework has
> >   no knowledge that they are unused. It becomes a problem for vlow
> >   power mode support, as we do not have every single RPM clock claimed
> >   and voted for off by client devices, and rely on clock framework to
> >   disable those unused RPM clocks.
> >
>
> I posted a similar patch a bit more than a year ago [1]. Back then one
> of the concerns was that we might disable critical clocks just because
> they have no driver using it actively. For example, not all of the
> platforms using clk-smd-rpm already have an interconnect driver.
> Disabling the interconnect related clocks will almost certainly make the
> device lock up completely. (I tried it back then, it definitely does...)
>
> I proposed adding CLK_IGNORE_UNUSED for the interconnect related clocks
> back then [2] which would allow disabling most of the clocks at least.
> Stephen Boyd had an alternative proposal to instead move the
> interconnect related clocks completely out of clk-smd-rpm [3].
> But I'm still unsure how this would work in a backwards compatible way. [4]

We should stop adding to the pile of smd-rpm clks that are clearly
interconnects. I'm ready to stop accepting patches like 78b727d02815
("clk: qcom: smd-rpm: Add QCM2290 RPM clock support"). Someone needs to
put in the work to make an interconnect provider that directly talks to
the rpm, without going through the clk framework just because the rpm
talks in kHz for these resources. These clks have no parent and are
essentially a proxy for some firmware interface to the rpm, but we put
them behind the clk framework for reasons I don't know.

I honestly don't understand the backwards incompatibility argument for
this either. If we're adding more SoC support for this driver we need to
stop and figure out a better approach. Make a new interconnect driver,
plug it in via DT, wait a release cycle, and finally dump the smd-rpm
clk node from older platforms that were using the clk framework. At
least for new SoCs this problem doesn't exist.

There's that one graphics clk (RPM_SMD_GFX3D_CLK_SRC) but I don't see it
used anywhere. So it's not really important? Maybe we need to set some
bandwidth in the graphics clk driver? I dunno.

>
> Since your patches are more or less identical I'm afraid the same
> concerns still need to be solved somehow. :)
>
> Thanks,
> Stephan
>
> [1]: https://lore.kernel.org/linux-arm-msm/20200817140908.185976-1-stephan@gerhold.net/
> [2]: https://lore.kernel.org/linux-arm-msm/20200818080738.GA46574@gerhold.net/
> [3]: https://lore.kernel.org/linux-arm-msm/159796605593.334488.8355244657387381953@swboyd.mtv.corp.google.com/
> [4]: https://lore.kernel.org/linux-arm-msm/20200821064857.GA905@gerhold.net/