diff mbox series

[BUG] cpufreq/armada-37xx: Parent clock issues after boot

Message ID 20190124103052.dvkg43uwieejoupo@vireshk-i7 (mailing list archive)
State New, archived
Headers show
Series [BUG] cpufreq/armada-37xx: Parent clock issues after boot | expand

Commit Message

Viresh Kumar Jan. 24, 2019, 10:30 a.m. UTC
Hi Gregory,

Ilias has an espressobin board and he reported performance regression
while playing with cpufreq governors.

I have tried to track it down in the last couple of days with Ilias
testing the kernel with my suggestions. We strongly believe that there
is something wrong with parent clock selection in the clk driver.

Simple way to reproduce the issue:

- configure kernel with performance and powersave governors, make
  performance governor default and boot the kernel.

- Run "sysbench --test=cpu run" and note down performance numbers.
- Do following to force a freq change by kernel:
  echo powersave > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
  echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor

- Run sysbench again and you will see 20% drop in performance.

The problem happens as soon as the kernel changes the frequency for
the very first time. Probably because that's when we change the
clk-parent for the very first time.

Few more things I noticed:

- During boot when the kernel adds the CPU clk to clk-framework, we
  read the registers from the non-DVFS set as DVFS wasn't enabled yet.
  So both parent and clock rate are read from there.

- But cpufreq starts working with the other set of registers which are
  available only after DVFS is enabled.

- We call clk_get_parent() followed by clk_set_parent() in
  armada37xx_cpufreq_dvfs_setup(). I am not sure what you wanted to do
  here as these two statements may not have any affect as you pass the
  return value of clk_get_parent() to clk_set_parent(). Because the
  clk framework will match the new parent with cached value of old
  one, it wouldn't change anything at all at hardware level. Over
  that, this all is done before enabling DVFS specific bits, which
  will make us play with non-DVFS registers and we don't want that,
  isn't it ?

- We checked the divider values right from the registers before moving
  to powersave and after moving back to performance. We are at
  load_level 0, which means parent clock gets passed as is. So the
  divider should be fine, only thing left is parent clock.

I am attaching two files here, one of them is the debug patch I wrote
to test this and the second one shows those debug prints during
different phase of testing.

Thanks in advance for helping out.

Comments

Gregory CLEMENT Jan. 24, 2019, 10:39 a.m. UTC | #1
Hi Viresh,
 
 On jeu., janv. 24 2019, Viresh Kumar <viresh.kumar@linaro.org> wrote:

> Hi Gregory,
>
> Ilias has an espressobin board and he reported performance regression
> while playing with cpufreq governors.
>
> I have tried to track it down in the last couple of days with Ilias
> testing the kernel with my suggestions. We strongly believe that there
> is something wrong with parent clock selection in the clk driver.
>
> Simple way to reproduce the issue:
>
> - configure kernel with performance and powersave governors, make
>   performance governor default and boot the kernel.
>
> - Run "sysbench --test=cpu run" and note down performance numbers.
> - Do following to force a freq change by kernel:
>   echo powersave > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
>   echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
>
> - Run sysbench again and you will see 20% drop in performance.
>
> The problem happens as soon as the kernel changes the frequency for
> the very first time. Probably because that's when we change the
> clk-parent for the very first time.
>
> Few more things I noticed:
>
> - During boot when the kernel adds the CPU clk to clk-framework, we
>   read the registers from the non-DVFS set as DVFS wasn't enabled yet.
>   So both parent and clock rate are read from there.
>
> - But cpufreq starts working with the other set of registers which are
>   available only after DVFS is enabled.
>
> - We call clk_get_parent() followed by clk_set_parent() in
>   armada37xx_cpufreq_dvfs_setup(). I am not sure what you wanted to do
>   here as these two statements may not have any affect as you pass the
>   return value of clk_get_parent() to clk_set_parent(). Because the
>   clk framework will match the new parent with cached value of old
>   one, it wouldn't change anything at all at hardware level. Over
>   that, this all is done before enabling DVFS specific bits, which
>   will make us play with non-DVFS registers and we don't want that,
>   isn't it ?
>
> - We checked the divider values right from the registers before moving
>   to powersave and after moving back to performance. We are at
>   load_level 0, which means parent clock gets passed as is. So the
>   divider should be fine, only thing left is parent clock.
>
> I am attaching two files here, one of them is the debug patch I wrote
> to test this and the second one shows those debug prints during
> different phase of testing.
>
> Thanks in advance for helping out.

Thanks for your very detailed bug report and for your first
investigations. I will have a look on it very soon.

Gregory

>
> -- 
> viresh
>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
Gregory CLEMENT Jan. 25, 2019, 5:54 p.m. UTC | #2
Hi Viresh,
 
 On jeu., janv. 24 2019, Viresh Kumar <viresh.kumar@linaro.org> wrote:

> Hi Gregory,
>
> Ilias has an espressobin board and he reported performance regression
> while playing with cpufreq governors.
>
> I have tried to track it down in the last couple of days with Ilias
> testing the kernel with my suggestions. We strongly believe that there
> is something wrong with parent clock selection in the clk driver.
>
> Simple way to reproduce the issue:
>
> - configure kernel with performance and powersave governors, make
>   performance governor default and boot the kernel.
>
> - Run "sysbench --test=cpu run" and note down performance numbers.
> - Do following to force a freq change by kernel:
>   echo powersave > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
>   echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
>
> - Run sysbench again and you will see 20% drop in performance.
>
> The problem happens as soon as the kernel changes the frequency for
> the very first time. Probably because that's when we change the
> clk-parent for the very first time.
>
> Few more things I noticed:
>
> - During boot when the kernel adds the CPU clk to clk-framework, we
>   read the registers from the non-DVFS set as DVFS wasn't enabled yet.
>   So both parent and clock rate are read from there.
>
> - But cpufreq starts working with the other set of registers which are
>   available only after DVFS is enabled.
>
> - We call clk_get_parent() followed by clk_set_parent() in
>   armada37xx_cpufreq_dvfs_setup(). I am not sure what you wanted to do
>   here as these two statements may not have any affect as you pass the
>   return value of clk_get_parent() to clk_set_parent(). Because the
>   clk framework will match the new parent with cached value of old
>   one, it wouldn't change anything at all at hardware level. Over

This is exactly where the issue is come from. My intent was to use the
same parent that was used at boot time. And the trick was that when we
set the clock parent actually it only set the DVFS case. But I didn't
realized that it didn't match at all what it is done in clock framework.

>   that, this all is done before enabling DVFS specific bits, which
>   will make us play with non-DVFS registers and we don't want that,
>   isn't it ?

Well the idea was to modify the DVFS register before we enable
DVFS. Indeed modifying the parent clock of a CPU while we use it could
be a little tricky.

I have some ideas to fix this issue: I won't use anymore
get_parent/set_parent in the cpufreq driver and I will set the DVFS
parent clock during the initialization at the clock driver level. I need
to do some tests to be sure of the implementation and once it was done I
will CC you with the patch for the fix.

Gregory

>
> - We checked the divider values right from the registers before moving
>   to powersave and after moving back to performance. We are at
>   load_level 0, which means parent clock gets passed as is. So the
>   divider should be fine, only thing left is parent clock.
>
> I am attaching two files here, one of them is the debug patch I wrote
> to test this and the second one shows those debug prints during
> different phase of testing.
>
> Thanks in advance for helping out.
>
> -- 
> viresh
>
>
>
Viresh Kumar Feb. 22, 2019, 3:32 a.m. UTC | #3
On 25-01-19, 18:54, Gregory CLEMENT wrote:
> Well the idea was to modify the DVFS register before we enable
> DVFS. Indeed modifying the parent clock of a CPU while we use it could
> be a little tricky.
> 
> I have some ideas to fix this issue: I won't use anymore
> get_parent/set_parent in the cpufreq driver and I will set the DVFS
> parent clock during the initialization at the clock driver level. I need
> to do some tests to be sure of the implementation and once it was done I
> will CC you with the patch for the fix.

Hi Gregory,

I was wondering if you had any updates on this as it has been almost a
month since you shared an update.
diff mbox series

Patch

From 9cd83f9c1713126aef291d05f973d10e9df5f7eb Mon Sep 17 00:00:00 2001
Message-Id: <9cd83f9c1713126aef291d05f973d10e9df5f7eb.1548325580.git.viresh.kumar@linaro.org>
From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Wed, 23 Jan 2019 15:18:19 +0530
Subject: [PATCH] armada-debug-prints

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/clk/mvebu/armada-37xx-periph.c | 30 ++++++++++++++++++++++----
 drivers/cpufreq/armada-37xx-cpufreq.c  |  8 +++++++
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/mvebu/armada-37xx-periph.c b/drivers/clk/mvebu/armada-37xx-periph.c
index 1f1cff428d78..31f2f2cca6a5 100644
--- a/drivers/clk/mvebu/armada-37xx-periph.c
+++ b/drivers/clk/mvebu/armada-37xx-periph.c
@@ -393,6 +393,7 @@  static unsigned int armada_3700_pm_dvfs_get_cpu_div(struct regmap *base)
 	armada_3700_pm_dvfs_update_regs(load_level, &reg, &offset);
 
 	regmap_read(base, reg, &div);
+	pr_info("CLKDEBUG: %s: %d: %x: %x: %d\n", __func__, __LINE__, reg, offset, load_level);
 
 	return (div >> offset) & ARMADA_37XX_NB_TBG_DIV_MASK;
 }
@@ -469,12 +470,22 @@  static unsigned long clk_pm_cpu_recalc_rate(struct clk_hw *hw,
 {
 	struct clk_pm_cpu *pm_cpu = to_clk_pm_cpu(hw);
 	unsigned int div;
+	unsigned long rate;
 
-	if (armada_3700_pm_dvfs_is_enabled(pm_cpu->nb_pm_base))
+	pr_info("CLKDEBUG: %s: %d: %lu: %p\n", __func__, __LINE__, parent_rate, hw);
+
+	if (armada_3700_pm_dvfs_is_enabled(pm_cpu->nb_pm_base)) {
+		pr_info("CLKDEBUG: %s: %d\n", __func__, __LINE__);
 		div = armada_3700_pm_dvfs_get_cpu_div(pm_cpu->nb_pm_base);
-	else
+	} else {
+		pr_info("CLKDEBUG: %s: %d\n", __func__, __LINE__);
 		div = get_div(pm_cpu->reg_div, pm_cpu->shift_div);
-	return DIV_ROUND_UP_ULL((u64)parent_rate, div);
+	}
+
+	rate = DIV_ROUND_UP_ULL((u64)parent_rate, div);
+
+	pr_info("CLKDEBUG: %s: %d: %lu: %u\n", __func__, __LINE__, rate, div);
+	return rate;
 }
 
 static long clk_pm_cpu_round_rate(struct clk_hw *hw, unsigned long rate,
@@ -533,6 +544,8 @@  static void clk_pm_cpu_set_rate_wa(unsigned long rate, struct regmap *base)
 	if (rate != 1200 * 1000 * 1000)
 		return;
 
+	pr_info("CLKDEBUG: %s: %d\n", __func__, __LINE__);
+
 	regmap_read(base, ARMADA_37XX_NB_CPU_LOAD, &cur_level);
 	cur_level &= ARMADA_37XX_NB_CPU_LOAD_MASK;
 	if (cur_level <= ARMADA_37XX_DVFS_LOAD_1)
@@ -552,10 +565,14 @@  static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate,
 	unsigned int div = parent_rate / rate;
 	unsigned int load_level;
 
+	pr_info("CLKDEBUG: %s: %d\n", __func__, __LINE__);
+
 	/* only available when DVFS is enabled */
 	if (!armada_3700_pm_dvfs_is_enabled(base))
 		return -EINVAL;
 
+	pr_info("CLKDEBUG: %s: %d\n", __func__, __LINE__);
+
 	for (load_level = 0; load_level < LOAD_LEVEL_NR; load_level++) {
 		unsigned int reg, mask, val,
 			offset = ARMADA_37XX_NB_TBG_DIV_OFF;
@@ -563,10 +580,13 @@  static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate,
 		armada_3700_pm_dvfs_update_regs(load_level, &reg, &offset);
 
 		regmap_read(base, reg, &val);
+		pr_info("CLKDEBUG: %s: %d: %x: %x: %x\n", __func__, __LINE__, reg, offset, val);
 		val >>= offset;
 		val &= ARMADA_37XX_NB_TBG_DIV_MASK;
 
 		if (val == div) {
+			pr_info("CLKDEBUG: %s: %d: %u\n", __func__, __LINE__, div);
+
 			/*
 			 * We found a load level matching the target
 			 * divider, switch to this load level and
@@ -578,6 +598,7 @@  static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate,
 			clk_pm_cpu_set_rate_wa(rate, base);
 
 			regmap_update_bits(base, reg, mask, load_level);
+			pr_info("CLKDEBUG: %s: %d: %x: %x: %d\n", __func__, __LINE__, reg, offset, load_level);
 
 			return rate;
 		}
@@ -676,7 +697,8 @@  static int armada_3700_add_composite_clk(const struct clk_periph_data *data,
 	*hw = clk_hw_register_composite(dev, data->name, data->parent_names,
 					data->num_parents, mux_hw,
 					mux_ops, rate_hw, rate_ops,
-					gate_hw, gate_ops, CLK_IGNORE_UNUSED);
+					gate_hw, gate_ops, CLK_IGNORE_UNUSED |
+					CLK_GET_RATE_NOCACHE);
 
 	return PTR_ERR_OR_ZERO(*hw);
 }
diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c
index 75491fc841a6..17031f833777 100644
--- a/drivers/cpufreq/armada-37xx-cpufreq.c
+++ b/drivers/cpufreq/armada-37xx-cpufreq.c
@@ -159,6 +159,7 @@  static void __init armada37xx_cpufreq_dvfs_setup(struct regmap *base,
 		mask <<= offset;
 
 		regmap_update_bits(base, reg, mask, val);
+		pr_info("CLKDEBUG: %s: %d: %x: %x: %x\n", __func__, __LINE__, reg, offset, val);
 	}
 
 	/*
@@ -399,6 +400,8 @@  static int __init armada37xx_cpufreq_driver_init(void)
 		return PTR_ERR(clk);
 	}
 
+	pr_info("CLKDEBUG: %s: %d\n", __func__, __LINE__);
+
 	/* Get nominal (current) CPU frequency */
 	cur_frequency = clk_get_rate(clk);
 	if (!cur_frequency) {
@@ -445,6 +448,11 @@  static int __init armada37xx_cpufreq_driver_init(void)
 	pdata.suspend = armada37xx_cpufreq_suspend;
 	pdata.resume = armada37xx_cpufreq_resume;
 
+	ret = clk_set_rate(clk, 500000000);
+	pr_info("CLKDEBUG: %s: %d: %d\n", __func__, __LINE__, ret);
+	ret = clk_set_rate(clk, 1000000000);
+	pr_info("CLKDEBUG: %s: %d: %d\n", __func__, __LINE__, ret);
+
 	pdev = platform_device_register_data(NULL, "cpufreq-dt", -1, &pdata,
 					     sizeof(pdata));
 	ret = PTR_ERR_OR_ZERO(pdev);
-- 
2.20.1.321.g9e740568ce00