diff mbox series

[RFC,1/2] soc: rockchip: power-domain: Manage resource conflicts with firmware

Message ID 20220405184816.RFC.1.Ib865f199d15221eab4ff77f70bd7e9e2eb04d32f@changeid (mailing list archive)
State New, archived
Headers show
Series rockchip / devfreq: Coordinate DRAM controller resources between ATF and kernel | expand

Commit Message

Brian Norris April 6, 2022, 1:48 a.m. UTC
On RK3399 platforms, power domains are managed mostly by the kernel
(drivers/soc/rockchip/pm_domains.c), but there are a few exceptions
where ARM Trusted Firmware has to be involved:

(1) system suspend/resume
(2) DRAM DVFS (a.k.a., "ddrfreq")

Exception (1) does not cause much conflict, since the kernel has
quiesced itself by the time we make the relevant PSCI call.

Exception (2) can cause conflict, because of two actions:

(a) ARM Trusted Firmware needs to read/modify/write the PMU_BUS_IDLE_REQ
    register to idle the memory controller domain; the kernel driver
    also has to touch this register for other domains.
(b) ARM Trusted Firmware needs to manage the clocks associated with
    these domains.

To elaborate on (b): idling a power domain has always required ungating
an array of clocks; see this old explanation from Rockchip:
https://lore.kernel.org/linux-arm-kernel/54503C19.9060607@rock-chips.com/

Historically, ARM Trusted Firmware has avoided this issue by using a
special PMU_CRU_GATEDIS_CON0 register -- this register ungates all the
necessary clocks -- when idling the memory controller. Unfortunately,
we've found that this register is not 100% sufficient; it does not turn
the relevant PLLs on [0].

So it's possible to trigger issues with something like the following:

1. enable a power domain (e.g., RK3399_PD_VDU) -- kernel will
   temporarily enable relevant clocks/PLLs, then turn them back off
   2. a PLL (e.g., PLL_NPLL) is part of the clock tree for
      RK3399_PD_VDU's clocks but otherwise unused; NPLL is disabled
3. perform a ddrfreq transition (rk3399_dmcfreq_target() -> ...
   drivers/clk/rockchip/clk-ddr.c / ROCKCHIP_SIP_DRAM_FREQ)
   4. ARM Trusted Firmware unagates VDU clocks (via PMU_CRU_GATEDIS_CON0)
   5. ARM Trusted firmware idles the memory controller domain
   6. Step 5 waits on the VDU domain/clocks, but NPLL is still off

i.e., we hang the system.

So for (b), we need to at a minimum manage the relevant PLLs on behalf
of firmware. It's easier to simply manage the whole clock tree, in a
similar way we do in rockchip_pd_power().

For (a), we need to provide mutual exclusion betwen rockchip_pd_power()
and firmware. To resolve that, we simply grab the PMU mutex and release
it when ddrfreq is done.

The Chromium OS kernel has been carrying versions of part of this hack
for a while, based on some new custom notifiers [1]. I've rewritten as a
simple function call between the drivers, which is OK because:

 * the PMU driver isn't enabled, and we don't have this problem at all
   (the firmware should have left us in an OK state, and there are no
   runtime conflicts); or
 * the PMU driver is present, and is a single instance.

And the power-domain driver cannot be removed, so there's no lifetime
management to worry about.

For completeness, there's a 'dmc_pmu_mutex' to guard (likely
theoretical?) probe()-time races. It's OK for the memory controller
driver to start running before the PMU, because the PMU will avoid any
critical actions during the block() sequence.

[0] The RK3399 TRM for PMU_CRU_GATEDIS_CON0 only talks about ungating
    clocks. Based on experimentation, we've found that it does not power
    up the necessary PLLs.

[1] CHROMIUM: soc: rockchip: power-domain: Add notifier to dmc driver
    https://chromium-review.googlesource.com/q/I242dbd706d352f74ff706f5cbf42ebb92f9bcc60
    Notably, the Chromium solution only handled conflict (a), not (b).
    In practice, item (b) wasn't a problem in many cases because we
    never managed to fully power off PLLs. Now that the (upstream) video
    decoder driver performs runtime clock management, we often power off
    NPLL.

Signed-off-by: Brian Norris <briannorris@chromium.org>
---

 drivers/soc/rockchip/pm_domains.c | 118 ++++++++++++++++++++++++++++++
 include/soc/rockchip/pm_domains.h |  25 +++++++
 2 files changed, 143 insertions(+)
 create mode 100644 include/soc/rockchip/pm_domains.h

Comments

Doug Anderson April 6, 2022, 2:26 a.m. UTC | #1
Hi,

On Tue, Apr 5, 2022 at 6:49 PM Brian Norris <briannorris@chromium.org> wrote:
>
> On RK3399 platforms, power domains are managed mostly by the kernel
> (drivers/soc/rockchip/pm_domains.c), but there are a few exceptions
> where ARM Trusted Firmware has to be involved:
>
> (1) system suspend/resume
> (2) DRAM DVFS (a.k.a., "ddrfreq")
>
> Exception (1) does not cause much conflict, since the kernel has
> quiesced itself by the time we make the relevant PSCI call.
>
> Exception (2) can cause conflict, because of two actions:
>
> (a) ARM Trusted Firmware needs to read/modify/write the PMU_BUS_IDLE_REQ
>     register to idle the memory controller domain; the kernel driver
>     also has to touch this register for other domains.
> (b) ARM Trusted Firmware needs to manage the clocks associated with
>     these domains.
>
> To elaborate on (b): idling a power domain has always required ungating
> an array of clocks; see this old explanation from Rockchip:
> https://lore.kernel.org/linux-arm-kernel/54503C19.9060607@rock-chips.com/
>
> Historically, ARM Trusted Firmware has avoided this issue by using a
> special PMU_CRU_GATEDIS_CON0 register -- this register ungates all the
> necessary clocks -- when idling the memory controller. Unfortunately,
> we've found that this register is not 100% sufficient; it does not turn
> the relevant PLLs on [0].
>
> So it's possible to trigger issues with something like the following:
>
> 1. enable a power domain (e.g., RK3399_PD_VDU) -- kernel will
>    temporarily enable relevant clocks/PLLs, then turn them back off
>    2. a PLL (e.g., PLL_NPLL) is part of the clock tree for
>       RK3399_PD_VDU's clocks but otherwise unused; NPLL is disabled
> 3. perform a ddrfreq transition (rk3399_dmcfreq_target() -> ...
>    drivers/clk/rockchip/clk-ddr.c / ROCKCHIP_SIP_DRAM_FREQ)
>    4. ARM Trusted Firmware unagates VDU clocks (via PMU_CRU_GATEDIS_CON0)
>    5. ARM Trusted firmware idles the memory controller domain
>    6. Step 5 waits on the VDU domain/clocks, but NPLL is still off
>
> i.e., we hang the system.
>
> So for (b), we need to at a minimum manage the relevant PLLs on behalf
> of firmware. It's easier to simply manage the whole clock tree, in a
> similar way we do in rockchip_pd_power().
>
> For (a), we need to provide mutual exclusion betwen rockchip_pd_power()
> and firmware. To resolve that, we simply grab the PMU mutex and release
> it when ddrfreq is done.
>
> The Chromium OS kernel has been carrying versions of part of this hack
> for a while, based on some new custom notifiers [1]. I've rewritten as a
> simple function call between the drivers, which is OK because:
>
>  * the PMU driver isn't enabled, and we don't have this problem at all
>    (the firmware should have left us in an OK state, and there are no
>    runtime conflicts); or
>  * the PMU driver is present, and is a single instance.
>
> And the power-domain driver cannot be removed, so there's no lifetime
> management to worry about.
>
> For completeness, there's a 'dmc_pmu_mutex' to guard (likely
> theoretical?) probe()-time races. It's OK for the memory controller
> driver to start running before the PMU, because the PMU will avoid any
> critical actions during the block() sequence.
>
> [0] The RK3399 TRM for PMU_CRU_GATEDIS_CON0 only talks about ungating
>     clocks. Based on experimentation, we've found that it does not power
>     up the necessary PLLs.
>
> [1] CHROMIUM: soc: rockchip: power-domain: Add notifier to dmc driver
>     https://chromium-review.googlesource.com/q/I242dbd706d352f74ff706f5cbf42ebb92f9bcc60
>     Notably, the Chromium solution only handled conflict (a), not (b).
>     In practice, item (b) wasn't a problem in many cases because we
>     never managed to fully power off PLLs. Now that the (upstream) video
>     decoder driver performs runtime clock management, we often power off
>     NPLL.
>
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> ---
>
>  drivers/soc/rockchip/pm_domains.c | 118 ++++++++++++++++++++++++++++++
>  include/soc/rockchip/pm_domains.h |  25 +++++++
>  2 files changed, 143 insertions(+)

I've already done several pre-review of a few versions of this, so at
this point I'm pretty happy with where things are. Feel free to add:

Reviewed-by: Douglas Anderson <dianders@chromium.org>
Chanwoo Choi April 7, 2022, 5:04 a.m. UTC | #2
Hi Brian,

Thanks for very detailed description.
I understand this issue.

Instead of adding the specific function for only rockchip,
how about adding new function pointer (like block/unblock or start/stop and others)
into 'struct generic_pm_domain'? And add new pm_genpd_* function
to control the power domain. Because it is better to use subsystem interface.

Best Regards,
Chanwoo Choi


On 4/6/22 10:48 AM, Brian Norris wrote:
> On RK3399 platforms, power domains are managed mostly by the kernel
> (drivers/soc/rockchip/pm_domains.c), but there are a few exceptions
> where ARM Trusted Firmware has to be involved:
> 
> (1) system suspend/resume
> (2) DRAM DVFS (a.k.a., "ddrfreq")
> 
> Exception (1) does not cause much conflict, since the kernel has
> quiesced itself by the time we make the relevant PSCI call.
> 
> Exception (2) can cause conflict, because of two actions:
> 
> (a) ARM Trusted Firmware needs to read/modify/write the PMU_BUS_IDLE_REQ
>     register to idle the memory controller domain; the kernel driver
>     also has to touch this register for other domains.
> (b) ARM Trusted Firmware needs to manage the clocks associated with
>     these domains.
> 
> To elaborate on (b): idling a power domain has always required ungating
> an array of clocks; see this old explanation from Rockchip:
> https://lore.kernel.org/linux-arm-kernel/54503C19.9060607@rock-chips.com/
> 
> Historically, ARM Trusted Firmware has avoided this issue by using a
> special PMU_CRU_GATEDIS_CON0 register -- this register ungates all the
> necessary clocks -- when idling the memory controller. Unfortunately,
> we've found that this register is not 100% sufficient; it does not turn
> the relevant PLLs on [0].
> 
> So it's possible to trigger issues with something like the following:
> 
> 1. enable a power domain (e.g., RK3399_PD_VDU) -- kernel will
>    temporarily enable relevant clocks/PLLs, then turn them back off
>    2. a PLL (e.g., PLL_NPLL) is part of the clock tree for
>       RK3399_PD_VDU's clocks but otherwise unused; NPLL is disabled
> 3. perform a ddrfreq transition (rk3399_dmcfreq_target() -> ...
>    drivers/clk/rockchip/clk-ddr.c / ROCKCHIP_SIP_DRAM_FREQ)
>    4. ARM Trusted Firmware unagates VDU clocks (via PMU_CRU_GATEDIS_CON0)
>    5. ARM Trusted firmware idles the memory controller domain
>    6. Step 5 waits on the VDU domain/clocks, but NPLL is still off
> 
> i.e., we hang the system.
> 
> So for (b), we need to at a minimum manage the relevant PLLs on behalf
> of firmware. It's easier to simply manage the whole clock tree, in a
> similar way we do in rockchip_pd_power().
> 
> For (a), we need to provide mutual exclusion betwen rockchip_pd_power()
> and firmware. To resolve that, we simply grab the PMU mutex and release
> it when ddrfreq is done.
> 
> The Chromium OS kernel has been carrying versions of part of this hack
> for a while, based on some new custom notifiers [1]. I've rewritten as a
> simple function call between the drivers, which is OK because:
> 
>  * the PMU driver isn't enabled, and we don't have this problem at all
>    (the firmware should have left us in an OK state, and there are no
>    runtime conflicts); or
>  * the PMU driver is present, and is a single instance.
> 
> And the power-domain driver cannot be removed, so there's no lifetime
> management to worry about.
> 
> For completeness, there's a 'dmc_pmu_mutex' to guard (likely
> theoretical?) probe()-time races. It's OK for the memory controller
> driver to start running before the PMU, because the PMU will avoid any
> critical actions during the block() sequence.
> 
> [0] The RK3399 TRM for PMU_CRU_GATEDIS_CON0 only talks about ungating
>     clocks. Based on experimentation, we've found that it does not power
>     up the necessary PLLs.
> 
> [1] CHROMIUM: soc: rockchip: power-domain: Add notifier to dmc driver
>     https://chromium-review.googlesource.com/q/I242dbd706d352f74ff706f5cbf42ebb92f9bcc60
>     Notably, the Chromium solution only handled conflict (a), not (b).
>     In practice, item (b) wasn't a problem in many cases because we
>     never managed to fully power off PLLs. Now that the (upstream) video
>     decoder driver performs runtime clock management, we often power off
>     NPLL.
> 
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> ---
> 
>  drivers/soc/rockchip/pm_domains.c | 118 ++++++++++++++++++++++++++++++
>  include/soc/rockchip/pm_domains.h |  25 +++++++
>  2 files changed, 143 insertions(+)
>  create mode 100644 include/soc/rockchip/pm_domains.h
> 
> diff --git a/drivers/soc/rockchip/pm_domains.c b/drivers/soc/rockchip/pm_domains.c
> index 1b029e494274..bc0afc52299b 100644
> --- a/drivers/soc/rockchip/pm_domains.c
> +++ b/drivers/soc/rockchip/pm_domains.c
> @@ -8,6 +8,7 @@
>  #include <linux/io.h>
>  #include <linux/iopoll.h>
>  #include <linux/err.h>
> +#include <linux/mutex.h>
>  #include <linux/pm_clock.h>
>  #include <linux/pm_domain.h>
>  #include <linux/of_address.h>
> @@ -16,6 +17,7 @@
>  #include <linux/clk.h>
>  #include <linux/regmap.h>
>  #include <linux/mfd/syscon.h>
> +#include <soc/rockchip/pm_domains.h>
>  #include <dt-bindings/power/px30-power.h>
>  #include <dt-bindings/power/rk3036-power.h>
>  #include <dt-bindings/power/rk3066-power.h>
> @@ -139,6 +141,109 @@ struct rockchip_pmu {
>  #define DOMAIN_RK3568(name, pwr, req, wakeup)		\
>  	DOMAIN_M(name, pwr, pwr, req, req, req, wakeup)
>  
> +/*
> + * Dynamic Memory Controller may need to coordinate with us -- see
> + * rockchip_pmu_block().
> + *
> + * dmc_pmu_mutex protects registration-time races, so DMC driver doesn't try to
> + * block() while we're initializing the PMU.
> + */
> +static DEFINE_MUTEX(dmc_pmu_mutex);
> +static struct rockchip_pmu *dmc_pmu;
> +
> +/*
> + * Block PMU transitions and make sure they don't interfere with ARM Trusted
> + * Firmware operations. There are two conflicts, noted in the comments below.
> + *
> + * Caller must unblock PMU transitions via rockchip_pmu_unblock().
> + */
> +int rockchip_pmu_block(void)
> +{
> +	struct rockchip_pmu *pmu;
> +	struct generic_pm_domain *genpd;
> +	struct rockchip_pm_domain *pd;
> +	int i, ret;
> +
> +	mutex_lock(&dmc_pmu_mutex);
> +
> +	/* No PMU (yet)? Then we just block rockchip_pmu_probe(). */
> +	if (!dmc_pmu)
> +		return 0;
> +	pmu = dmc_pmu;
> +
> +	/*
> +	 * mutex blocks all idle transitions: we can't touch the
> +	 * PMU_BUS_IDLE_REQ (our ".idle_offset") register while ARM Trusted
> +	 * Firmware might be using it.
> +	 */
> +	mutex_lock(&pmu->mutex);
> +
> +	/*
> +	 * Power domain clocks: Per Rockchip, we *must* keep certain clocks
> +	 * enabled for the duration of power-domain transitions. Most
> +	 * transitions are handled by this driver, but some cases (in
> +	 * particular, DRAM DVFS / memory-controller idle) must be handled by
> +	 * firmware. Firmware can handle most clock management via a special
> +	 * "ungate" register (PMU_CRU_GATEDIS_CON0), but unfortunately, this
> +	 * doesn't handle PLLs. We can assist this transition by doing the
> +	 * clock management on behalf of firmware.
> +	 */
> +	for (i = 0; i < pmu->genpd_data.num_domains; i++) {
> +		genpd = pmu->genpd_data.domains[i];
> +		if (genpd) {
> +			pd = to_rockchip_pd(genpd);
> +			ret = clk_bulk_enable(pd->num_clks, pd->clks);
> +			if (ret < 0) {
> +				dev_err(pmu->dev,
> +					"failed to enable clks for domain '%s': %d\n",
> +					genpd->name, ret);
> +				goto err;
> +			}
> +		}
> +	}
> +
> +	return 0;
> +
> +err:
> +	for (i = i - 1; i >= 0; i--) {
> +		genpd = pmu->genpd_data.domains[i];
> +		if (genpd) {
> +			pd = to_rockchip_pd(genpd);
> +			clk_bulk_disable(pd->num_clks, pd->clks);
> +		}
> +	}
> +	mutex_unlock(&pmu->mutex);
> +	mutex_unlock(&dmc_pmu_mutex);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(rockchip_pmu_block);
> +
> +/* Unblock PMU transitions. */
> +void rockchip_pmu_unblock(void)
> +{
> +	struct rockchip_pmu *pmu;
> +	struct generic_pm_domain *genpd;
> +	struct rockchip_pm_domain *pd;
> +	int i;
> +
> +	if (dmc_pmu) {
> +		pmu = dmc_pmu;
> +		for (i = 0; i < pmu->genpd_data.num_domains; i++) {
> +			genpd = pmu->genpd_data.domains[i];
> +			if (genpd) {
> +				pd = to_rockchip_pd(genpd);
> +				clk_bulk_disable(pd->num_clks, pd->clks);
> +			}
> +		}
> +
> +		mutex_unlock(&pmu->mutex);
> +	}
> +
> +	mutex_unlock(&dmc_pmu_mutex);
> +}
> +EXPORT_SYMBOL_GPL(rockchip_pmu_unblock);
> +
>  static bool rockchip_pmu_domain_is_idle(struct rockchip_pm_domain *pd)
>  {
>  	struct rockchip_pmu *pmu = pd->pmu;
> @@ -690,6 +795,12 @@ static int rockchip_pm_domain_probe(struct platform_device *pdev)
>  
>  	error = -ENODEV;
>  
> +	/*
> +	 * Prevent any rockchip_pmu_block() from racing with the remainder of
> +	 * setup (clocks, register initialization).
> +	 */
> +	mutex_lock(&dmc_pmu_mutex);
> +
>  	for_each_available_child_of_node(np, node) {
>  		error = rockchip_pm_add_one_domain(pmu, node);
>  		if (error) {
> @@ -719,10 +830,17 @@ static int rockchip_pm_domain_probe(struct platform_device *pdev)
>  		goto err_out;
>  	}
>  
> +	/* We only expect one PMU. */
> +	if (!WARN_ON_ONCE(dmc_pmu))
> +		dmc_pmu = pmu;
> +
> +	mutex_unlock(&dmc_pmu_mutex);
> +
>  	return 0;
>  
>  err_out:
>  	rockchip_pm_domain_cleanup(pmu);
> +	mutex_unlock(&dmc_pmu_mutex);
>  	return error;
>  }
>  
> diff --git a/include/soc/rockchip/pm_domains.h b/include/soc/rockchip/pm_domains.h
> new file mode 100644
> index 000000000000..7dbd941fc937
> --- /dev/null
> +++ b/include/soc/rockchip/pm_domains.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright 2022, The Chromium OS Authors. All rights reserved.
> + */
> +
> +#ifndef __SOC_ROCKCHIP_PM_DOMAINS_H__
> +#define __SOC_ROCKCHIP_PM_DOMAINS_H__
> +
> +#ifdef CONFIG_ROCKCHIP_PM_DOMAINS
> +
> +int rockchip_pmu_block(void);
> +void rockchip_pmu_unblock(void);
> +
> +#else /* CONFIG_ROCKCHIP_PM_DOMAINS */
> +
> +static inline int rockchip_pmu_block(void)
> +{
> +	return 0;
> +}
> +
> +static inline void rockchip_pmu_unblock(void) { }
> +
> +#endif /* CONFIG_ROCKCHIP_PM_DOMAINS */
> +
> +#endif /* __SOC_ROCKCHIP_PM_DOMAINS_H__ */
>
Brian Norris April 9, 2022, 3:34 a.m. UTC | #3
Hi Chanwoo,

On Wed, Apr 6, 2022 at 9:38 PM Chanwoo Choi <cw00.choi@samsung.com> wrote:
> Instead of adding the specific function for only rockchip,
> how about adding new function pointer (like block/unblock or start/stop and others)
> into 'struct generic_pm_domain'? And add new pm_genpd_* function
> to control the power domain.

I suppose that is technically possible, but I'm not sure it makes a
ton of sense.

First, genpd doesn't seem to typically expose operations directly to
client device drivers. It's mostly about abstract handling of the
dependencies of "how do I power on this device?" behind the scenes of
things like pm_runtime_*(). I guess maybe something like
dev_pm_genpd_set_performance_state() is an approximately similar API
though (i.e., a genpd operation exposed to client drivers)? I could
try to go that route, if the genpd maintainers think this makes sense.

But secondly, this isn't exactly an operation on one power domain.
It's an operation on the entire power controller. I suppose I could
make a new domain here for the memory controller, and teach that
domain to implicitly manipulate all the other domains provided by the
PMU, but that feels like a fake abstraction to me.

Lastly, and perhaps least importantly: this likely would require a
device tree binding change. So far, the memory controller hasn't had
its own power domain. I guess one could argue that it has some
similarities to a power domain, albeit one that is managed in firmware
-- so maybe this is a reasonable "bug" to fix, if it really comes down
to it.

> Because it is better to use subsystem interface.

I don't agree this is universally true. It makes sense when there are
truly abstract concepts represented, which are likely to appear across
multiple implementations. Or maybe if the object model is complex. But
this operation seems very SoC-specific to me, and it's pretty simple
to implement this way. Or, do you think this is really something that
others will need -- pausing (and powering) a power controller so
another entity can manage it?

I guess I'd also like some thoughts from the genpd maintainers (CC'd),
of whether this seems like a good fit for a new genpd callback and
API.

Brian
Peter Geis April 12, 2022, 4:01 p.m. UTC | #4
On Fri, Apr 8, 2022 at 11:36 PM Brian Norris <briannorris@chromium.org> wrote:
>
> Hi Chanwoo,
>
> On Wed, Apr 6, 2022 at 9:38 PM Chanwoo Choi <cw00.choi@samsung.com> wrote:
> > Instead of adding the specific function for only rockchip,
> > how about adding new function pointer (like block/unblock or start/stop and others)
> > into 'struct generic_pm_domain'? And add new pm_genpd_* function
> > to control the power domain.

Good Morning,

As requested, I tested this on the rockpro64.
It solves the issue of the hang on initial frequency adjustment down to 400mhz.
It seems there are still issues with the ddr4 calculation in mainline
atf, as when we jump back up to 800mhz it hangs.

Thank you, this is greatly appreciated!
Tested-by: Peter Geis <pgwipeout@gmail.com>
on rk3399-rockpro64

Very Respectfully,
Peter Geis

>
> I suppose that is technically possible, but I'm not sure it makes a
> ton of sense.
>
> First, genpd doesn't seem to typically expose operations directly to
> client device drivers. It's mostly about abstract handling of the
> dependencies of "how do I power on this device?" behind the scenes of
> things like pm_runtime_*(). I guess maybe something like
> dev_pm_genpd_set_performance_state() is an approximately similar API
> though (i.e., a genpd operation exposed to client drivers)? I could
> try to go that route, if the genpd maintainers think this makes sense.
>
> But secondly, this isn't exactly an operation on one power domain.
> It's an operation on the entire power controller. I suppose I could
> make a new domain here for the memory controller, and teach that
> domain to implicitly manipulate all the other domains provided by the
> PMU, but that feels like a fake abstraction to me.
>
> Lastly, and perhaps least importantly: this likely would require a
> device tree binding change. So far, the memory controller hasn't had
> its own power domain. I guess one could argue that it has some
> similarities to a power domain, albeit one that is managed in firmware
> -- so maybe this is a reasonable "bug" to fix, if it really comes down
> to it.
>
> > Because it is better to use subsystem interface.
>
> I don't agree this is universally true. It makes sense when there are
> truly abstract concepts represented, which are likely to appear across
> multiple implementations. Or maybe if the object model is complex. But
> this operation seems very SoC-specific to me, and it's pretty simple
> to implement this way. Or, do you think this is really something that
> others will need -- pausing (and powering) a power controller so
> another entity can manage it?
>
> I guess I'd also like some thoughts from the genpd maintainers (CC'd),
> of whether this seems like a good fit for a new genpd callback and
> API.
>
> Brian
>
> _______________________________________________
> Linux-rockchip mailing list
> Linux-rockchip@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rockchip
Chanwoo Choi April 13, 2022, 10:14 p.m. UTC | #5
Hi Brian,

On 22. 4. 9. 12:34, Brian Norris wrote:
> Hi Chanwoo,
> 
> On Wed, Apr 6, 2022 at 9:38 PM Chanwoo Choi <cw00.choi@samsung.com> wrote:
>> Instead of adding the specific function for only rockchip,
>> how about adding new function pointer (like block/unblock or start/stop and others)
>> into 'struct generic_pm_domain'? And add new pm_genpd_* function
>> to control the power domain.
> 
> I suppose that is technically possible, but I'm not sure it makes a
> ton of sense.
> 
> First, genpd doesn't seem to typically expose operations directly to
> client device drivers. It's mostly about abstract handling of the
> dependencies of "how do I power on this device?" behind the scenes of
> things like pm_runtime_*(). I guess maybe something like
> dev_pm_genpd_set_performance_state() is an approximately similar API
> though (i.e., a genpd operation exposed to client drivers)? I could
> try to go that route, if the genpd maintainers think this makes sense.
> 
> But secondly, this isn't exactly an operation on one power domain.
> It's an operation on the entire power controller. I suppose I could
> make a new domain here for the memory controller, and teach that
> domain to implicitly manipulate all the other domains provided by the
> PMU, but that feels like a fake abstraction to me.
> 
> Lastly, and perhaps least importantly: this likely would require a
> device tree binding change. So far, the memory controller hasn't had
> its own power domain. I guess one could argue that it has some
> similarities to a power domain, albeit one that is managed in firmware
> -- so maybe this is a reasonable "bug" to fix, if it really comes down
> to it.
> 
>> Because it is better to use subsystem interface.
> 
> I don't agree this is universally true. It makes sense when there are
> truly abstract concepts represented, which are likely to appear across
> multiple implementations. Or maybe if the object model is complex. But
> this operation seems very SoC-specific to me, and it's pretty simple
> to implement this way. Or, do you think this is really something that
> others will need -- pausing (and powering) a power controller so
> another entity can manage it?

Thanks for detailed reply.

I agree your thinking. If possible, just I prefer to use standard
subsystem interface. But as you commented, it is not general case
that this issue is related to all power domains. Also, there is dt
binding issue as you described.

I agree this approach. Thanks.

(snip)
Heiko Stübner May 8, 2022, 3:05 p.m. UTC | #6
Am Mittwoch, 6. April 2022, 03:48:41 CEST schrieb Brian Norris:
> On RK3399 platforms, power domains are managed mostly by the kernel
> (drivers/soc/rockchip/pm_domains.c), but there are a few exceptions
> where ARM Trusted Firmware has to be involved:
> 
> (1) system suspend/resume
> (2) DRAM DVFS (a.k.a., "ddrfreq")
> 
> Exception (1) does not cause much conflict, since the kernel has
> quiesced itself by the time we make the relevant PSCI call.
> 
> Exception (2) can cause conflict, because of two actions:
> 
> (a) ARM Trusted Firmware needs to read/modify/write the PMU_BUS_IDLE_REQ
>     register to idle the memory controller domain; the kernel driver
>     also has to touch this register for other domains.
> (b) ARM Trusted Firmware needs to manage the clocks associated with
>     these domains.
> 
> To elaborate on (b): idling a power domain has always required ungating
> an array of clocks; see this old explanation from Rockchip:
> https://lore.kernel.org/linux-arm-kernel/54503C19.9060607@rock-chips.com/
> 
> Historically, ARM Trusted Firmware has avoided this issue by using a
> special PMU_CRU_GATEDIS_CON0 register -- this register ungates all the
> necessary clocks -- when idling the memory controller. Unfortunately,
> we've found that this register is not 100% sufficient; it does not turn
> the relevant PLLs on [0].
> 
> So it's possible to trigger issues with something like the following:
> 
> 1. enable a power domain (e.g., RK3399_PD_VDU) -- kernel will
>    temporarily enable relevant clocks/PLLs, then turn them back off
>    2. a PLL (e.g., PLL_NPLL) is part of the clock tree for
>       RK3399_PD_VDU's clocks but otherwise unused; NPLL is disabled
> 3. perform a ddrfreq transition (rk3399_dmcfreq_target() -> ...
>    drivers/clk/rockchip/clk-ddr.c / ROCKCHIP_SIP_DRAM_FREQ)
>    4. ARM Trusted Firmware unagates VDU clocks (via PMU_CRU_GATEDIS_CON0)
>    5. ARM Trusted firmware idles the memory controller domain
>    6. Step 5 waits on the VDU domain/clocks, but NPLL is still off
> 
> i.e., we hang the system.
> 
> So for (b), we need to at a minimum manage the relevant PLLs on behalf
> of firmware. It's easier to simply manage the whole clock tree, in a
> similar way we do in rockchip_pd_power().
> 
> For (a), we need to provide mutual exclusion betwen rockchip_pd_power()
> and firmware. To resolve that, we simply grab the PMU mutex and release
> it when ddrfreq is done.
> 
> The Chromium OS kernel has been carrying versions of part of this hack
> for a while, based on some new custom notifiers [1]. I've rewritten as a
> simple function call between the drivers, which is OK because:
> 
>  * the PMU driver isn't enabled, and we don't have this problem at all
>    (the firmware should have left us in an OK state, and there are no
>    runtime conflicts); or
>  * the PMU driver is present, and is a single instance.
> 
> And the power-domain driver cannot be removed, so there's no lifetime
> management to worry about.
> 
> For completeness, there's a 'dmc_pmu_mutex' to guard (likely
> theoretical?) probe()-time races. It's OK for the memory controller
> driver to start running before the PMU, because the PMU will avoid any
> critical actions during the block() sequence.
> 
> [0] The RK3399 TRM for PMU_CRU_GATEDIS_CON0 only talks about ungating
>     clocks. Based on experimentation, we've found that it does not power
>     up the necessary PLLs.
> 
> [1] CHROMIUM: soc: rockchip: power-domain: Add notifier to dmc driver
>     https://chromium-review.googlesource.com/q/I242dbd706d352f74ff706f5cbf42ebb92f9bcc60
>     Notably, the Chromium solution only handled conflict (a), not (b).
>     In practice, item (b) wasn't a problem in many cases because we
>     never managed to fully power off PLLs. Now that the (upstream) video
>     decoder driver performs runtime clock management, we often power off
>     NPLL.
> 
> Signed-off-by: Brian Norris <briannorris@chromium.org>

Reviewed-by: Heiko Stuebner <heiko@sntech.de>
diff mbox series

Patch

diff --git a/drivers/soc/rockchip/pm_domains.c b/drivers/soc/rockchip/pm_domains.c
index 1b029e494274..bc0afc52299b 100644
--- a/drivers/soc/rockchip/pm_domains.c
+++ b/drivers/soc/rockchip/pm_domains.c
@@ -8,6 +8,7 @@ 
 #include <linux/io.h>
 #include <linux/iopoll.h>
 #include <linux/err.h>
+#include <linux/mutex.h>
 #include <linux/pm_clock.h>
 #include <linux/pm_domain.h>
 #include <linux/of_address.h>
@@ -16,6 +17,7 @@ 
 #include <linux/clk.h>
 #include <linux/regmap.h>
 #include <linux/mfd/syscon.h>
+#include <soc/rockchip/pm_domains.h>
 #include <dt-bindings/power/px30-power.h>
 #include <dt-bindings/power/rk3036-power.h>
 #include <dt-bindings/power/rk3066-power.h>
@@ -139,6 +141,109 @@  struct rockchip_pmu {
 #define DOMAIN_RK3568(name, pwr, req, wakeup)		\
 	DOMAIN_M(name, pwr, pwr, req, req, req, wakeup)
 
+/*
+ * Dynamic Memory Controller may need to coordinate with us -- see
+ * rockchip_pmu_block().
+ *
+ * dmc_pmu_mutex protects registration-time races, so DMC driver doesn't try to
+ * block() while we're initializing the PMU.
+ */
+static DEFINE_MUTEX(dmc_pmu_mutex);
+static struct rockchip_pmu *dmc_pmu;
+
+/*
+ * Block PMU transitions and make sure they don't interfere with ARM Trusted
+ * Firmware operations. There are two conflicts, noted in the comments below.
+ *
+ * Caller must unblock PMU transitions via rockchip_pmu_unblock().
+ */
+int rockchip_pmu_block(void)
+{
+	struct rockchip_pmu *pmu;
+	struct generic_pm_domain *genpd;
+	struct rockchip_pm_domain *pd;
+	int i, ret;
+
+	mutex_lock(&dmc_pmu_mutex);
+
+	/* No PMU (yet)? Then we just block rockchip_pmu_probe(). */
+	if (!dmc_pmu)
+		return 0;
+	pmu = dmc_pmu;
+
+	/*
+	 * mutex blocks all idle transitions: we can't touch the
+	 * PMU_BUS_IDLE_REQ (our ".idle_offset") register while ARM Trusted
+	 * Firmware might be using it.
+	 */
+	mutex_lock(&pmu->mutex);
+
+	/*
+	 * Power domain clocks: Per Rockchip, we *must* keep certain clocks
+	 * enabled for the duration of power-domain transitions. Most
+	 * transitions are handled by this driver, but some cases (in
+	 * particular, DRAM DVFS / memory-controller idle) must be handled by
+	 * firmware. Firmware can handle most clock management via a special
+	 * "ungate" register (PMU_CRU_GATEDIS_CON0), but unfortunately, this
+	 * doesn't handle PLLs. We can assist this transition by doing the
+	 * clock management on behalf of firmware.
+	 */
+	for (i = 0; i < pmu->genpd_data.num_domains; i++) {
+		genpd = pmu->genpd_data.domains[i];
+		if (genpd) {
+			pd = to_rockchip_pd(genpd);
+			ret = clk_bulk_enable(pd->num_clks, pd->clks);
+			if (ret < 0) {
+				dev_err(pmu->dev,
+					"failed to enable clks for domain '%s': %d\n",
+					genpd->name, ret);
+				goto err;
+			}
+		}
+	}
+
+	return 0;
+
+err:
+	for (i = i - 1; i >= 0; i--) {
+		genpd = pmu->genpd_data.domains[i];
+		if (genpd) {
+			pd = to_rockchip_pd(genpd);
+			clk_bulk_disable(pd->num_clks, pd->clks);
+		}
+	}
+	mutex_unlock(&pmu->mutex);
+	mutex_unlock(&dmc_pmu_mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(rockchip_pmu_block);
+
+/* Unblock PMU transitions. */
+void rockchip_pmu_unblock(void)
+{
+	struct rockchip_pmu *pmu;
+	struct generic_pm_domain *genpd;
+	struct rockchip_pm_domain *pd;
+	int i;
+
+	if (dmc_pmu) {
+		pmu = dmc_pmu;
+		for (i = 0; i < pmu->genpd_data.num_domains; i++) {
+			genpd = pmu->genpd_data.domains[i];
+			if (genpd) {
+				pd = to_rockchip_pd(genpd);
+				clk_bulk_disable(pd->num_clks, pd->clks);
+			}
+		}
+
+		mutex_unlock(&pmu->mutex);
+	}
+
+	mutex_unlock(&dmc_pmu_mutex);
+}
+EXPORT_SYMBOL_GPL(rockchip_pmu_unblock);
+
 static bool rockchip_pmu_domain_is_idle(struct rockchip_pm_domain *pd)
 {
 	struct rockchip_pmu *pmu = pd->pmu;
@@ -690,6 +795,12 @@  static int rockchip_pm_domain_probe(struct platform_device *pdev)
 
 	error = -ENODEV;
 
+	/*
+	 * Prevent any rockchip_pmu_block() from racing with the remainder of
+	 * setup (clocks, register initialization).
+	 */
+	mutex_lock(&dmc_pmu_mutex);
+
 	for_each_available_child_of_node(np, node) {
 		error = rockchip_pm_add_one_domain(pmu, node);
 		if (error) {
@@ -719,10 +830,17 @@  static int rockchip_pm_domain_probe(struct platform_device *pdev)
 		goto err_out;
 	}
 
+	/* We only expect one PMU. */
+	if (!WARN_ON_ONCE(dmc_pmu))
+		dmc_pmu = pmu;
+
+	mutex_unlock(&dmc_pmu_mutex);
+
 	return 0;
 
 err_out:
 	rockchip_pm_domain_cleanup(pmu);
+	mutex_unlock(&dmc_pmu_mutex);
 	return error;
 }
 
diff --git a/include/soc/rockchip/pm_domains.h b/include/soc/rockchip/pm_domains.h
new file mode 100644
index 000000000000..7dbd941fc937
--- /dev/null
+++ b/include/soc/rockchip/pm_domains.h
@@ -0,0 +1,25 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright 2022, The Chromium OS Authors. All rights reserved.
+ */
+
+#ifndef __SOC_ROCKCHIP_PM_DOMAINS_H__
+#define __SOC_ROCKCHIP_PM_DOMAINS_H__
+
+#ifdef CONFIG_ROCKCHIP_PM_DOMAINS
+
+int rockchip_pmu_block(void);
+void rockchip_pmu_unblock(void);
+
+#else /* CONFIG_ROCKCHIP_PM_DOMAINS */
+
+static inline int rockchip_pmu_block(void)
+{
+	return 0;
+}
+
+static inline void rockchip_pmu_unblock(void) { }
+
+#endif /* CONFIG_ROCKCHIP_PM_DOMAINS */
+
+#endif /* __SOC_ROCKCHIP_PM_DOMAINS_H__ */