Message ID | 20190129015547.213276-1-swboyd@chromium.org (mailing list archive) |
---|---|
Headers | show |
Series | DVFS in the OPP core | expand |
Adding few folks to the thread who might be interested in this stuff. On 28-01-19, 17:55, Stephen Boyd wrote: > This patch series is an RFC around how we can implement DVFS for devices > that aren't your typical OPPish device (i.e. GPU/CPU). They don't have a > strict set of frequencies that they have been tested at to derive some > operating performance point. Instead they have a coarser set of > frequency max or 'fmax' OPPs that describe the maiximum frequency the > device can operate at with a given voltage. > > The approach we take is to let the devm_pm_opp_set_rate() API accept 0 > as a valid frequency indicating the frequency isn't required anymore and > to make the same API use the clk framework to round the frequency passed > in instead of relying on the OPP table to specify each frequency that > can be used. Once we have these two patches in place, we can use the OPP > API to change clk rates instead of clk_set_rate() and use all the recent > OPP enhancements that have been made around required-opps and genpd > performance states to do DVFS for all devices. Generally speaking I am fine with such an approach but I am not sure about what others would say on this as they had objections to using OPP core for setting the rate itself. FWIW, I suggested exactly this solution sometime back [1] - Drivers need to use two API sets to change clock rate (OPP helpers) and enable/disable them (CLK framework helpers) and this leads us to exactly that combination. Is that acceptable ? It doesn't look great to me as well.. - Do we expect the callers will disable clk before calling opp-set-rate with 0 ? We should remove the regulator requirements as well along with perf-state. - What about enabling/disabling clock as well from OPP framework. We can enable it on the very first call to opp-set-rate and disable when freq is 0. That will simplify the drivers as well. > One nice feature of this approach is that we don't need to change the > OPP binding to support this. We can specify only the max frequencies and > the voltage requirements in DT with the existing binding and slightly > tweak the OPP code to achieve these results. > > This series includes a conversion of the uart and spi drivers on > qcom sdm845 where these patches are being developed. It shows how a > driver is converted from the clk APIs to the OPP APIs and how > enable/disable state of the clk is communicated to the OPP layer. > > Some open topics and initial points for discussion are: > > 1) The dev_pm_opp_set_rate() API changes may break something that's > relying on the rate rounding that OPP provides. If those exist, > we may need to implement another API that is more explicit about using > the clk API instead of the OPP tables. > > 2) dev_pm_opp_set_rate(0) is an interesting solution to the problem of > dropping the rate requirement. Is there anything better than this? > > 3) How do we handle devices that already have power-domains specified in > DT? The opp binding for required-opps doesn't let us specify the power > domain to target, instead it assumes that whatever power domain is > attached to a device is the one that OPP needs to use to change the > genpd performance state. Do we need a > dev_pm_opp_set_required_opps_name() or something to be explicit about > this? Can we have some way for the power domain that required-opps > correspond to be expressed in the OPP tables themselves? > > 4) How do we achieve the "full constraint" state? i.e. what do we do > about some devices being active and others being inactive during boot > and making sure that the voltage for the shared power domains doesn't > drop until all devices requiring it have informed OPP about their > power requirements? > > Rajendra Nayak (4): > OPP: Make dev_pm_opp_set_rate() with freq=0 as valid > tty: serial: qcom_geni_serial: Use OPP API to set clk/perf state > spi: spi-geni-qcom: Use OPP API to set clk/perf state > arm64: dts: sdm845: Add OPP table for all qup devices > > Stephen Boyd (1): > OPP: Don't overwrite rounded clk rate > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 115 ++++++++++++++++++++++++++ > drivers/opp/core.c | 26 ++++-- > drivers/spi/spi-geni-qcom.c | 12 ++- > drivers/tty/serial/qcom_geni_serial.c | 15 +++- > 4 files changed, 155 insertions(+), 13 deletions(-) > > For the interested, these patches are located here: > > https://github.com/rrnayak/linux/ v5.0-rc3/opp-corners-wip > > I've generated these patches by cutting off the top of that tree and > massaging the commit text a bit for the first patch. > > base-commit: 49a57857aeea06ca831043acbb0fa5e0f50602fd > prerequisite-patch-id: 9c3ee728603596b8b0ba06ffd66084bdc21b1271 > prerequisite-patch-id: f160e050bcd74f5de6fad66329381853869a6a97 > prerequisite-patch-id: aa23548d2b486c29489b4304d85799d08762254e > prerequisite-patch-id: 40dd117c45fecb4308298352346546612db94b64 > prerequisite-patch-id: cd102fa42bab19897c2295e8b990b2156626054a > prerequisite-patch-id: 3b9e5c8ed65ee96cc0f1c50166cf6bbb597ef582 > prerequisite-patch-id: 7e71b957b90ad51d0619944d5ebc859380e8e3e4 > prerequisite-patch-id: 5abd2bd6b3ae3e91551e2b8f9295169019ba82c7 > prerequisite-patch-id: 68bb3e44cf4e5dbd363a1a1750e4d378971ed393 > prerequisite-patch-id: 882b14ef9527b15d22cfddbb5fa2b9d43df1ff04 > prerequisite-patch-id: 6fc14ddeb074fb0826f1f40031e9d161f1361666 > -- > Sent by a computer through tubes
On Thu, Jan 31, 2019 at 10:23 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > Adding few folks to the thread who might be interested in this stuff. > > On 28-01-19, 17:55, Stephen Boyd wrote: > > This patch series is an RFC around how we can implement DVFS for devices > > that aren't your typical OPPish device (i.e. GPU/CPU). They don't have a > > strict set of frequencies that they have been tested at to derive some > > operating performance point. Instead they have a coarser set of > > frequency max or 'fmax' OPPs that describe the maiximum frequency the > > device can operate at with a given voltage. > > > > The approach we take is to let the devm_pm_opp_set_rate() API accept 0 > > as a valid frequency indicating the frequency isn't required anymore and > > to make the same API use the clk framework to round the frequency passed > > in instead of relying on the OPP table to specify each frequency that > > can be used. Once we have these two patches in place, we can use the OPP > > API to change clk rates instead of clk_set_rate() and use all the recent > > OPP enhancements that have been made around required-opps and genpd > > performance states to do DVFS for all devices. > > Generally speaking I am fine with such an approach but I am not sure > about what others would say on this as they had objections to using > OPP core for setting the rate itself. > > FWIW, I suggested exactly this solution sometime back [1] > > - Drivers need to use two API sets to change clock rate (OPP helpers) > and enable/disable them (CLK framework helpers) and this leads us to > exactly that combination. Is that acceptable ? It doesn't look great > to me as well.. I agree here. > - Do we expect the callers will disable clk before calling > opp-set-rate with 0 ? We should remove the regulator requirements as > well along with perf-state. Well, disabling clk affects HW in general, doesn't it? > - What about enabling/disabling clock as well from OPP framework. We > can enable it on the very first call to opp-set-rate and disable > when freq is 0. That will simplify the drivers as well. That sounds compelling, but I guess there are cases in which you can gate the clock regardless of the frequency setting. How would that work then?
On 31-01-19, 10:58, Rafael J. Wysocki wrote: > On Thu, Jan 31, 2019 at 10:23 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > Adding few folks to the thread who might be interested in this stuff. > > > > On 28-01-19, 17:55, Stephen Boyd wrote: > > > This patch series is an RFC around how we can implement DVFS for devices > > > that aren't your typical OPPish device (i.e. GPU/CPU). They don't have a > > > strict set of frequencies that they have been tested at to derive some > > > operating performance point. Instead they have a coarser set of > > > frequency max or 'fmax' OPPs that describe the maiximum frequency the > > > device can operate at with a given voltage. > > > > > > The approach we take is to let the devm_pm_opp_set_rate() API accept 0 > > > as a valid frequency indicating the frequency isn't required anymore and > > > to make the same API use the clk framework to round the frequency passed > > > in instead of relying on the OPP table to specify each frequency that > > > can be used. Once we have these two patches in place, we can use the OPP > > > API to change clk rates instead of clk_set_rate() and use all the recent > > > OPP enhancements that have been made around required-opps and genpd > > > performance states to do DVFS for all devices. > > > > Generally speaking I am fine with such an approach but I am not sure > > about what others would say on this as they had objections to using > > OPP core for setting the rate itself. > > > > FWIW, I suggested exactly this solution sometime back [1] > > > > - Drivers need to use two API sets to change clock rate (OPP helpers) > > and enable/disable them (CLK framework helpers) and this leads us to > > exactly that combination. Is that acceptable ? It doesn't look great > > to me as well.. > > I agree here. > > > - Do we expect the callers will disable clk before calling > > opp-set-rate with 0 ? We should remove the regulator requirements as > > well along with perf-state. > > Well, disabling clk affects HW in general, doesn't it? Yeah, but the regulator may be shared and is running at higher voltages just because of the clock requirement of the device getting disabled here. Or did I misunderstood what you wanted to say ? > > - What about enabling/disabling clock as well from OPP framework. We > > can enable it on the very first call to opp-set-rate and disable > > when freq is 0. That will simplify the drivers as well. > > That sounds compelling, but I guess there are cases in which you can > gate the clock regardless of the frequency setting. How would that > work then? Can you give any example here ? I am not sure I understood the concern here.
On Thu, Jan 31, 2019 at 11:06 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 31-01-19, 10:58, Rafael J. Wysocki wrote: > > On Thu, Jan 31, 2019 at 10:23 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > > > Adding few folks to the thread who might be interested in this stuff. > > > > > > On 28-01-19, 17:55, Stephen Boyd wrote: > > > > This patch series is an RFC around how we can implement DVFS for devices > > > > that aren't your typical OPPish device (i.e. GPU/CPU). They don't have a > > > > strict set of frequencies that they have been tested at to derive some > > > > operating performance point. Instead they have a coarser set of > > > > frequency max or 'fmax' OPPs that describe the maiximum frequency the > > > > device can operate at with a given voltage. > > > > > > > > The approach we take is to let the devm_pm_opp_set_rate() API accept 0 > > > > as a valid frequency indicating the frequency isn't required anymore and > > > > to make the same API use the clk framework to round the frequency passed > > > > in instead of relying on the OPP table to specify each frequency that > > > > can be used. Once we have these two patches in place, we can use the OPP > > > > API to change clk rates instead of clk_set_rate() and use all the recent > > > > OPP enhancements that have been made around required-opps and genpd > > > > performance states to do DVFS for all devices. > > > > > > Generally speaking I am fine with such an approach but I am not sure > > > about what others would say on this as they had objections to using > > > OPP core for setting the rate itself. > > > > > > FWIW, I suggested exactly this solution sometime back [1] > > > > > > - Drivers need to use two API sets to change clock rate (OPP helpers) > > > and enable/disable them (CLK framework helpers) and this leads us to > > > exactly that combination. Is that acceptable ? It doesn't look great > > > to me as well.. > > > > I agree here. > > > > > - Do we expect the callers will disable clk before calling > > > opp-set-rate with 0 ? We should remove the regulator requirements as > > > well along with perf-state. > > > > Well, disabling clk affects HW in general, doesn't it? > > Yeah, but the regulator may be shared and is running at higher > voltages just because of the clock requirement of the device getting > disabled here. Or did I misunderstood what you wanted to say ? What I wanted to say is that if the caller is required to disable clk beforehand, that may be as good as setting its rate to zero already. > > > - What about enabling/disabling clock as well from OPP framework. We > > > can enable it on the very first call to opp-set-rate and disable > > > when freq is 0. That will simplify the drivers as well. > > > > That sounds compelling, but I guess there are cases in which you can > > gate the clock regardless of the frequency setting. How would that > > work then? > > Can you give any example here ? I am not sure I understood the concern > here. It looks like I was confused somehow, never mind. :-)
On 31-01-19, 11:36, Rafael J. Wysocki wrote: > On Thu, Jan 31, 2019 at 11:06 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > On 31-01-19, 10:58, Rafael J. Wysocki wrote: > > > On Thu, Jan 31, 2019 at 10:23 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > > > > > Adding few folks to the thread who might be interested in this stuff. > > > > > > > > On 28-01-19, 17:55, Stephen Boyd wrote: > > > > > This patch series is an RFC around how we can implement DVFS for devices > > > > > that aren't your typical OPPish device (i.e. GPU/CPU). They don't have a > > > > > strict set of frequencies that they have been tested at to derive some > > > > > operating performance point. Instead they have a coarser set of > > > > > frequency max or 'fmax' OPPs that describe the maiximum frequency the > > > > > device can operate at with a given voltage. > > > > > > > > > > The approach we take is to let the devm_pm_opp_set_rate() API accept 0 > > > > > as a valid frequency indicating the frequency isn't required anymore and > > > > > to make the same API use the clk framework to round the frequency passed > > > > > in instead of relying on the OPP table to specify each frequency that > > > > > can be used. Once we have these two patches in place, we can use the OPP > > > > > API to change clk rates instead of clk_set_rate() and use all the recent > > > > > OPP enhancements that have been made around required-opps and genpd > > > > > performance states to do DVFS for all devices. > > > > > > > > Generally speaking I am fine with such an approach but I am not sure > > > > about what others would say on this as they had objections to using > > > > OPP core for setting the rate itself. > > > > > > > > FWIW, I suggested exactly this solution sometime back [1] > > > > > > > > - Drivers need to use two API sets to change clock rate (OPP helpers) > > > > and enable/disable them (CLK framework helpers) and this leads us to > > > > exactly that combination. Is that acceptable ? It doesn't look great > > > > to me as well.. > > > > > > I agree here. > > > > > > > - Do we expect the callers will disable clk before calling > > > > opp-set-rate with 0 ? We should remove the regulator requirements as > > > > well along with perf-state. > > > > > > Well, disabling clk affects HW in general, doesn't it? > > > > Yeah, but the regulator may be shared and is running at higher > > voltages just because of the clock requirement of the device getting > > disabled here. Or did I misunderstood what you wanted to say ? > > What I wanted to say is that if the caller is required to disable clk > beforehand, that may be as good as setting its rate to zero already. Right, but that doesn't mean that the requirements put on the regulator and genpd are gone as well and that is why I favor the OPP core to handle all the parts otherwise write the rules explicitly. If the OPP core doesn't disable the clock and the caller also hasn't done the same, then disabling the genpd/regulator may break things up.
> 3) How do we handle devices that already have power-domains specified in > DT? The opp binding for required-opps doesn't let us specify the power > domain to target, instead it assumes that whatever power domain is > attached to a device is the one that OPP needs to use to change the > genpd performance state. Do we need a > dev_pm_opp_set_required_opps_name() or something to be explicit about > this? Can we have some way for the power domain that required-opps > correspond to be expressed in the OPP tables themselves? I was converting a few more drivers to use the proposed approach in this RFC, in order to identify all outstanding issues we need to deal with, and specifically for UFS, I end up with this exact scenario where UFS already has an existing power domain (gdsc) and I need to add another one (rpmhpd) for setting the performance state. If I use dev_pm_opp_of_add_table() to add the opp table from DT, the opp layer assumes its the same device on which it can do a dev_pm_genpd_set_performance_state() with, however the device that's actually associated with the pm_domain when we have multiple power domains is infact the one (dummy) that we create when the driver makes a call to dev_pm_domain_attach_by_name/id(). Any thoughts on whats a good way to handle this?
Quoting Viresh Kumar (2019-01-31 01:23:49) > Adding few folks to the thread who might be interested in this stuff. > > On 28-01-19, 17:55, Stephen Boyd wrote: > > This patch series is an RFC around how we can implement DVFS for devices > > that aren't your typical OPPish device (i.e. GPU/CPU). They don't have a > > strict set of frequencies that they have been tested at to derive some > > operating performance point. Instead they have a coarser set of > > frequency max or 'fmax' OPPs that describe the maiximum frequency the > > device can operate at with a given voltage. > > > > The approach we take is to let the devm_pm_opp_set_rate() API accept 0 > > as a valid frequency indicating the frequency isn't required anymore and > > to make the same API use the clk framework to round the frequency passed > > in instead of relying on the OPP table to specify each frequency that > > can be used. Once we have these two patches in place, we can use the OPP > > API to change clk rates instead of clk_set_rate() and use all the recent > > OPP enhancements that have been made around required-opps and genpd > > performance states to do DVFS for all devices. > > Generally speaking I am fine with such an approach but I am not sure > about what others would say on this as they had objections to using > OPP core for setting the rate itself. > > FWIW, I suggested exactly this solution sometime back [1] > > - Drivers need to use two API sets to change clock rate (OPP helpers) > and enable/disable them (CLK framework helpers) and this leads us to > exactly that combination. Is that acceptable ? It doesn't look great > to me as well.. Agreed. I don't think anyone thinks this looks great, but I'll argue that it's improving OPP for devices that already use it so that we can remove voltage requirements when their clk is off. Think about CPUs that are in their own clk domain where we want to remove the voltage requirement when those CPUs are offline, or a GPU that wants to remove its voltage requirement when it turns off clks. The combination is already out there, just OPP hasn't solved this problem. The only other plan I had was to implement another API like dev_pm_set_state() or something like that and have that do something like what the OPP rate API does right now. The main difference being that the argument to the function would be some opaque u64 that's converted by the bus/class/genpd attached to the device into whatever frequency/voltage/performance state is desired (and sequenced in the right order too). And then I was thinking that runtime PM or explicit dev_pm_set_state() calls would be used to tell this new layer that the device was going to a lower power mode with some other number (sub-kHz integer?) and have that be translated again into some frequency/voltage/performance state. Either way, each driver will have to change from using the clk APIs to changes rates to something else like one of these APIs, so I don't see a huge difference. Drivers will have to change. > > - Do we expect the callers will disable clk before calling > opp-set-rate with 0 ? We should remove the regulator requirements as > well along with perf-state. Yes, that's the plan. Problems come along with this though, like shared resource constraints and actually knowing the clk on/off state, frequency, voltage, etc. at boot time and making sure to keep those constraints satisfied during normal operation. > > - What about enabling/disabling clock as well from OPP framework. We > can enable it on the very first call to opp-set-rate and disable > when freq is 0. That will simplify the drivers as well. It works when those drivers aren't calling clk_disable() directly from some irq handler. I don't think that's very common, but in those cases we would probably want to allow drivers to quickly gate and ungate their clks but then defer the sleeping stuff (voltages and off chip clks) to the schedulable contexts. We'll still be left with a small number of drivers that want to only call clk_prepare() and clk_unprepare() from within OPP and keep calling clk_enable() and clk_disable() from their driver. So introduce different APIs for those drivers to indicate this to OPP? And only do that when it becomes a requirement? Otherwise I don't really see a problem with the OPP call handling the enable state of the clk as well. > > > One nice feature of this approach is that we don't need to change the > > OPP binding to support this. We can specify only the max frequencies and > > the voltage requirements in DT with the existing binding and slightly > > tweak the OPP code to achieve these results. > > > > This series includes a conversion of the uart and spi drivers on > > qcom sdm845 where these patches are being developed. It shows how a > > driver is converted from the clk APIs to the OPP APIs and how > > enable/disable state of the clk is communicated to the OPP layer. > > > > Some open topics and initial points for discussion are: > > > > 1) The dev_pm_opp_set_rate() API changes may break something that's > > relying on the rate rounding that OPP provides. If those exist, > > we may need to implement another API that is more explicit about using > > the clk API instead of the OPP tables. > > > > 2) dev_pm_opp_set_rate(0) is an interesting solution to the problem of > > dropping the rate requirement. Is there anything better than this? > > > > 3) How do we handle devices that already have power-domains specified in > > DT? The opp binding for required-opps doesn't let us specify the power > > domain to target, instead it assumes that whatever power domain is > > attached to a device is the one that OPP needs to use to change the > > genpd performance state. Do we need a > > dev_pm_opp_set_required_opps_name() or something to be explicit about > > this? Can we have some way for the power domain that required-opps > > correspond to be expressed in the OPP tables themselves? > > > > 4) How do we achieve the "full constraint" state? i.e. what do we do > > about some devices being active and others being inactive during boot > > and making sure that the voltage for the shared power domains doesn't > > drop until all devices requiring it have informed OPP about their > > power requirements? > > Any comments on these topics?
On Thu, 7 Feb 2019 at 08:58, Stephen Boyd <swboyd@chromium.org> wrote: > > Quoting Viresh Kumar (2019-01-31 01:23:49) > > Adding few folks to the thread who might be interested in this stuff. > > > > On 28-01-19, 17:55, Stephen Boyd wrote: > > > This patch series is an RFC around how we can implement DVFS for devices > > > that aren't your typical OPPish device (i.e. GPU/CPU). They don't have a > > > strict set of frequencies that they have been tested at to derive some > > > operating performance point. Instead they have a coarser set of > > > frequency max or 'fmax' OPPs that describe the maiximum frequency the > > > device can operate at with a given voltage. > > > > > > The approach we take is to let the devm_pm_opp_set_rate() API accept 0 > > > as a valid frequency indicating the frequency isn't required anymore and > > > to make the same API use the clk framework to round the frequency passed > > > in instead of relying on the OPP table to specify each frequency that > > > can be used. Once we have these two patches in place, we can use the OPP > > > API to change clk rates instead of clk_set_rate() and use all the recent > > > OPP enhancements that have been made around required-opps and genpd > > > performance states to do DVFS for all devices. > > > > Generally speaking I am fine with such an approach but I am not sure > > about what others would say on this as they had objections to using > > OPP core for setting the rate itself. > > > > FWIW, I suggested exactly this solution sometime back [1] > > > > - Drivers need to use two API sets to change clock rate (OPP helpers) > > and enable/disable them (CLK framework helpers) and this leads us to > > exactly that combination. Is that acceptable ? It doesn't look great > > to me as well.. > > Agreed. I don't think anyone thinks this looks great, but I'll argue > that it's improving OPP for devices that already use it so that we can > remove voltage requirements when their clk is off. Think about CPUs that > are in their own clk domain where we want to remove the voltage > requirement when those CPUs are offline, or a GPU that wants to remove > its voltage requirement when it turns off clks. The combination is > already out there, just OPP hasn't solved this problem. > > The only other plan I had was to implement another API like > dev_pm_set_state() or something like that and have that do something > like what the OPP rate API does right now. The main difference being > that the argument to the function would be some opaque u64 that's > converted by the bus/class/genpd attached to the device into whatever > frequency/voltage/performance state is desired (and sequenced in the > right order too). And then I was thinking that runtime PM or explicit > dev_pm_set_state() calls would be used to tell this new layer that the > device was going to a lower power mode with some other number (sub-kHz > integer?) and have that be translated again into some > frequency/voltage/performance state. > > Either way, each driver will have to change from using the clk APIs to > changes rates to something else like one of these APIs, so I don't see a > huge difference. Drivers will have to change. > > > > > - Do we expect the callers will disable clk before calling > > opp-set-rate with 0 ? We should remove the regulator requirements as > > well along with perf-state. > > Yes, that's the plan. Problems come along with this though, like shared > resource constraints and actually knowing the clk on/off state, > frequency, voltage, etc. at boot time and making sure to keep those > constraints satisfied during normal operation. > > > > > - What about enabling/disabling clock as well from OPP framework. We > > can enable it on the very first call to opp-set-rate and disable > > when freq is 0. That will simplify the drivers as well. > > It works when those drivers aren't calling clk_disable() directly from > some irq handler. I don't think that's very common, but in those cases > we would probably want to allow drivers to quickly gate and ungate their > clks but then defer the sleeping stuff (voltages and off chip clks) to > the schedulable contexts. We'll still be left with a small number of > drivers that want to only call clk_prepare() and clk_unprepare() from > within OPP and keep calling clk_enable() and clk_disable() from their > driver. So introduce different APIs for those drivers to indicate this > to OPP? And only do that when it becomes a requirement? > > Otherwise I don't really see a problem with the OPP call handling the > enable state of the clk as well. I think we also need to consider cross SoC drivers. One SoC may have both clocks and OPPs to manage, while another may have only clocks. Even it this may be fairly uncommon, we should consider it, before we decide to fold in additional clock management, like clk_prepare|unprepare() for example, behind the dev_pm_opp_set_rate() API. The point is, the driver may need to call clk_prepare|enable() anyways, unless we make that conditional depending on a DT compatible string, for example. Of course, because the clock prepare/enable is reference counted, there may not be a problem in practice to have both the OPP and driver to deal with it. > > > > > > One nice feature of this approach is that we don't need to change the > > > OPP binding to support this. We can specify only the max frequencies and > > > the voltage requirements in DT with the existing binding and slightly > > > tweak the OPP code to achieve these results. > > > > > > This series includes a conversion of the uart and spi drivers on > > > qcom sdm845 where these patches are being developed. It shows how a > > > driver is converted from the clk APIs to the OPP APIs and how > > > enable/disable state of the clk is communicated to the OPP layer. > > > > > > Some open topics and initial points for discussion are: > > > > > > 1) The dev_pm_opp_set_rate() API changes may break something that's > > > relying on the rate rounding that OPP provides. If those exist, > > > we may need to implement another API that is more explicit about using > > > the clk API instead of the OPP tables. > > > > > > 2) dev_pm_opp_set_rate(0) is an interesting solution to the problem of > > > dropping the rate requirement. Is there anything better than this? This seems like a reasonable approach to me. At least it seams silly to invent a separate API and I can't think of anything else that makes sense. > > > > > > 3) How do we handle devices that already have power-domains specified in > > > DT? The opp binding for required-opps doesn't let us specify the power > > > domain to target, instead it assumes that whatever power domain is > > > attached to a device is the one that OPP needs to use to change the > > > genpd performance state. Do we need a > > > dev_pm_opp_set_required_opps_name() or something to be explicit about > > > this? Can we have some way for the power domain that required-opps > > > correspond to be expressed in the OPP tables themselves? Is the about the multiple PM domains per device? I thought we have already covered this case, or at least part of it [1]. Doesn't dev_pm_opp_set_genpd_virt_dev() and friends, help you with this, no? > > > > > > 4) How do we achieve the "full constraint" state? i.e. what do we do > > > about some devices being active and others being inactive during boot > > > and making sure that the voltage for the shared power domains doesn't > > > drop until all devices requiring it have informed OPP about their > > > power requirements? Good question. What kind of problems do you see. Will drivers fail to probe? Will the HW operate outside valid conditions? > > > > > Any comments on these topics? > Kind regards Uffe [1] https://lkml.org/lkml/2018/10/25/204
Quoting Rajendra Nayak (2019-02-06 22:57:12) > > > 3) How do we handle devices that already have power-domains specified in > > DT? The opp binding for required-opps doesn't let us specify the power > > domain to target, instead it assumes that whatever power domain is > > attached to a device is the one that OPP needs to use to change the > > genpd performance state. Do we need a > > dev_pm_opp_set_required_opps_name() or something to be explicit about > > this? Can we have some way for the power domain that required-opps > > correspond to be expressed in the OPP tables themselves? > > I was converting a few more drivers to use the proposed approach in this > RFC, in order to identify all outstanding issues we need to deal with, > and specifically for UFS, I end up with this exact scenario where UFS already > has an existing power domain (gdsc) and I need to add another one (rpmhpd) for > setting the performance state. > > If I use dev_pm_opp_of_add_table() to add the opp table from DT, the opp > layer assumes its the same device on which it can do a dev_pm_genpd_set_performance_state() > with, however the device that's actually associated with the pm_domain when we > have multiple power domains is infact the one (dummy) that we create when > the driver makes a call to dev_pm_domain_attach_by_name/id(). > > Any thoughts on whats a good way to handle this? > Ulf mentioned that we can use dev_pm_opp_set_genpd_virt_dev() for this. Does that API work here?
On 2/8/2019 1:17 AM, Stephen Boyd wrote: > Quoting Rajendra Nayak (2019-02-06 22:57:12) >> >>> 3) How do we handle devices that already have power-domains specified in >>> DT? The opp binding for required-opps doesn't let us specify the power >>> domain to target, instead it assumes that whatever power domain is >>> attached to a device is the one that OPP needs to use to change the >>> genpd performance state. Do we need a >>> dev_pm_opp_set_required_opps_name() or something to be explicit about >>> this? Can we have some way for the power domain that required-opps >>> correspond to be expressed in the OPP tables themselves? >> >> I was converting a few more drivers to use the proposed approach in this >> RFC, in order to identify all outstanding issues we need to deal with, >> and specifically for UFS, I end up with this exact scenario where UFS already >> has an existing power domain (gdsc) and I need to add another one (rpmhpd) for >> setting the performance state. >> >> If I use dev_pm_opp_of_add_table() to add the opp table from DT, the opp >> layer assumes its the same device on which it can do a dev_pm_genpd_set_performance_state() >> with, however the device that's actually associated with the pm_domain when we >> have multiple power domains is infact the one (dummy) that we create when >> the driver makes a call to dev_pm_domain_attach_by_name/id(). >> >> Any thoughts on whats a good way to handle this? >> > > Ulf mentioned that we can use dev_pm_opp_set_genpd_virt_dev() for this. > Does that API work here? Ah, yes, that should work, I hadn't noticed this API existed.
On 06-02-19, 23:58, Stephen Boyd wrote: > Quoting Viresh Kumar (2019-01-31 01:23:49) > > FWIW, I suggested exactly this solution sometime back [1] > > > > - Drivers need to use two API sets to change clock rate (OPP helpers) > > and enable/disable them (CLK framework helpers) and this leads us to > > exactly that combination. Is that acceptable ? It doesn't look great > > to me as well.. > > Agreed. I don't think anyone thinks this looks great, but I'll argue > that it's improving OPP for devices that already use it so that we can > remove voltage requirements when their clk is off. Think about CPUs that > are in their own clk domain where we want to remove the voltage > requirement when those CPUs are offline, or a GPU that wants to remove > its voltage requirement when it turns off clks. The combination is > already out there, just OPP hasn't solved this problem. > > The only other plan I had was to implement another API like > dev_pm_set_state() or something like that and have that do something > like what the OPP rate API does right now. The main difference being > that the argument to the function would be some opaque u64 that's > converted by the bus/class/genpd attached to the device into whatever > frequency/voltage/performance state is desired (and sequenced in the > right order too). And then I was thinking that runtime PM or explicit > dev_pm_set_state() calls would be used to tell this new layer that the > device was going to a lower power mode with some other number (sub-kHz > integer?) and have that be translated again into some > frequency/voltage/performance state. > > Either way, each driver will have to change from using the clk APIs to > changes rates to something else like one of these APIs, so I don't see a > huge difference. Drivers will have to change. I agree, that's why I wrote the dev_pm_opp_set_rate() API initially. > > > > - Do we expect the callers will disable clk before calling > > opp-set-rate with 0 ? We should remove the regulator requirements as > > well along with perf-state. > > Yes, that's the plan. Problems come along with this though, like shared > resource constraints and actually knowing the clk on/off state, > frequency, voltage, etc. at boot time and making sure to keep those > constraints satisfied during normal operation. But that isn't any different from drivers doing clk_disable directly, right ? So that shouldn't worry us. > > - What about enabling/disabling clock as well from OPP framework. We > > can enable it on the very first call to opp-set-rate and disable > > when freq is 0. That will simplify the drivers as well. > > It works when those drivers aren't calling clk_disable() directly from > some irq handler. I don't think that's very common, but in those cases > we would probably want to allow drivers to quickly gate and ungate their > clks but then defer the sleeping stuff (voltages and off chip clks) to > the schedulable contexts. We'll still be left with a small number of > drivers that want to only call clk_prepare() and clk_unprepare() from > within OPP and keep calling clk_enable() and clk_disable() from their > driver. So introduce different APIs for those drivers to indicate this > to OPP? And only do that when it becomes a requirement? I am not sure I understood this well. The reference counting within clk/regulator should let both the layers (driver and opp core) work just fine. Why would a driver don't want OPP core to call clk_prepare_enable() all the time ? > Otherwise I don't really see a problem with the OPP call handling the > enable state of the clk as well. Right, so I would like that to be part of this series when this gets implemented. > > > One nice feature of this approach is that we don't need to change the > > > OPP binding to support this. We can specify only the max frequencies and > > > the voltage requirements in DT with the existing binding and slightly > > > tweak the OPP code to achieve these results. > > > > > > This series includes a conversion of the uart and spi drivers on > > > qcom sdm845 where these patches are being developed. It shows how a > > > driver is converted from the clk APIs to the OPP APIs and how > > > enable/disable state of the clk is communicated to the OPP layer. > > > > > > Some open topics and initial points for discussion are: > > > > > > 1) The dev_pm_opp_set_rate() API changes may break something that's > > > relying on the rate rounding that OPP provides. If those exist, > > > we may need to implement another API that is more explicit about using > > > the clk API instead of the OPP tables. I don't remember any such cases, I may have forgotten about those though. > > > 2) dev_pm_opp_set_rate(0) is an interesting solution to the problem of > > > dropping the rate requirement. Is there anything better than this? I am okay with it. I don't want to invent another set of APIs to enable / disable the resources. > > > 3) How do we handle devices that already have power-domains specified in > > > DT? The opp binding for required-opps doesn't let us specify the power > > > domain to target, instead it assumes that whatever power domain is > > > attached to a device is the one that OPP needs to use to change the > > > genpd performance state. Do we need a > > > dev_pm_opp_set_required_opps_name() or something to be explicit about Yeah, we may need to come up with something like this eventually. I had written something like that earlier, but then it wasn't required. > > > this? Can we have some way for the power domain that required-opps > > > correspond to be expressed in the OPP tables themselves? Not sure I understood it. Can you explain with some example please ? > > > 4) How do we achieve the "full constraint" state? i.e. what do we do > > > about some devices being active and others being inactive during boot > > > and making sure that the voltage for the shared power domains doesn't > > > drop until all devices requiring it have informed OPP about their > > > power requirements? We need the boot-constraint framework for that. I think this is a problem which we have currently as well. I am waiting for the bus-scaling framework to get in, after that we will have lot of cases where boot-constraints would be required and it won't be limited to just clcd then.
On 07-02-19, 14:37, Ulf Hansson wrote: > I think we also need to consider cross SoC drivers. One SoC may have > both clocks and OPPs to manage, while another may have only clocks. We already have that case with CPUs as well and dev_pm_opp_set_rate() takes care of it. > Even it this may be fairly uncommon, we should consider it, before we > decide to fold in additional clock management, like > clk_prepare|unprepare() for example, behind the dev_pm_opp_set_rate() > API. > > The point is, the driver may need to call clk_prepare|enable() > anyways, unless we make that conditional depending on a DT compatible > string, for example. Of course, because the clock prepare/enable is > reference counted, there may not be a problem in practice to have both > the OPP and driver to deal with it.
On Fri, 8 Feb 2019 at 08:17, Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 07-02-19, 14:37, Ulf Hansson wrote: > > I think we also need to consider cross SoC drivers. One SoC may have > > both clocks and OPPs to manage, while another may have only clocks. > > We already have that case with CPUs as well and dev_pm_opp_set_rate() > takes care of it. I think you may have misunderstood my point. Or maybe I don't get yours. :-) What if there is no OPP at all to use, then dev_pm_opp_set_rate() is just a noop, right? In this scenario the driver still need to call clk_set_rate(). How do we cope with these cases? > > > Even it this may be fairly uncommon, we should consider it, before we > > decide to fold in additional clock management, like > > clk_prepare|unprepare() for example, behind the dev_pm_opp_set_rate() > > API. > > > > The point is, the driver may need to call clk_prepare|enable() > > anyways, unless we make that conditional depending on a DT compatible > > string, for example. Of course, because the clock prepare/enable is > > reference counted, there may not be a problem in practice to have both > > the OPP and driver to deal with it. > Kind regards Uffe
On 08-02-19, 10:45, Ulf Hansson wrote: > On Fri, 8 Feb 2019 at 08:17, Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > On 07-02-19, 14:37, Ulf Hansson wrote: > > > I think we also need to consider cross SoC drivers. One SoC may have > > > both clocks and OPPs to manage, while another may have only clocks. > > > > We already have that case with CPUs as well and dev_pm_opp_set_rate() > > takes care of it. > > I think you may have misunderstood my point. Or maybe I don't get yours. :-) It was me. I thought you are talking about regulators and that is what is already managed, i.e. to work with or without regulators. > What if there is no OPP at all to use, then dev_pm_opp_set_rate() is > just a noop, right? In this scenario the driver still need to call > clk_set_rate(). > > How do we cope with these cases? Yeah, that would be a problem and hacking the OPP core may not be the right solution :(
On Fri, 8 Feb 2019 at 11:05, Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 08-02-19, 10:45, Ulf Hansson wrote: > > On Fri, 8 Feb 2019 at 08:17, Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > > > On 07-02-19, 14:37, Ulf Hansson wrote: > > > > I think we also need to consider cross SoC drivers. One SoC may have > > > > both clocks and OPPs to manage, while another may have only clocks. > > > > > > We already have that case with CPUs as well and dev_pm_opp_set_rate() > > > takes care of it. > > > > I think you may have misunderstood my point. Or maybe I don't get yours. :-) > > It was me. I thought you are talking about regulators and that is what > is already managed, i.e. to work with or without regulators. > > > What if there is no OPP at all to use, then dev_pm_opp_set_rate() is > > just a noop, right? In this scenario the driver still need to call > > clk_set_rate(). > > > > How do we cope with these cases? > > Yeah, that would be a problem and hacking the OPP core may not be the > right solution :( I guess one simple way forward could just be to check if there is an OPP handle/table available, then use dev_pm_opp_set_rate(). When no OPP handle/table, use clk_set_rate() *instead*, not both. That could work, don't you think? Kind regards Uffe
On 08-02-19, 11:31, Ulf Hansson wrote: > On Fri, 8 Feb 2019 at 11:05, Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > On 08-02-19, 10:45, Ulf Hansson wrote: > > > On Fri, 8 Feb 2019 at 08:17, Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > > > > > On 07-02-19, 14:37, Ulf Hansson wrote: > > > > > I think we also need to consider cross SoC drivers. One SoC may have > > > > > both clocks and OPPs to manage, while another may have only clocks. > > > > > > > > We already have that case with CPUs as well and dev_pm_opp_set_rate() > > > > takes care of it. > > > > > > I think you may have misunderstood my point. Or maybe I don't get yours. :-) > > > > It was me. I thought you are talking about regulators and that is what > > is already managed, i.e. to work with or without regulators. > > > > > What if there is no OPP at all to use, then dev_pm_opp_set_rate() is > > > just a noop, right? In this scenario the driver still need to call > > > clk_set_rate(). > > > > > > How do we cope with these cases? > > > > Yeah, that would be a problem and hacking the OPP core may not be the > > right solution :( > > I guess one simple way forward could just be to check if there is an > OPP handle/table available, then use dev_pm_opp_set_rate(). When no > OPP handle/table, use clk_set_rate() *instead*, not both. > > That could work, don't you think? Yeah, just that it adds more conditional code in drivers, while we wanted to make them light-weight :)