[v2] cpufreq: powernv: Add checks to report cpu frequency throttling conditions

Message ID 1427395282-31875-1-git-send-email-shilpa.bhat@linux.vnet.ibm.com (mailing list archive)
State Superseded, archived
Commit Message

Shilpasri G Bhat March 26, 2015, 6:41 p.m. UTC
Cpu frequency can be throttled due to failures of components like OCC,
power supply and fan. It can also be throttled due to temperature and
power limit. We can detect the throttling by checking 1)if max frequency
is reduced, 2)if the core is put to safe frequency 3)if the SPR based
frequency management is disabled.

The current status of the core is read from the Power Management Status
Register (PMSR) to check whether any of the throttling conditions has
occurred, and the appropriate throttling message is reported.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
Changes from V1: Removed unused value of PMCR register

 drivers/cpufreq/powernv-cpufreq.c | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

Comments

Viresh Kumar March 27, 2015, 4:35 a.m. UTC | #1
Hi Shilpa,

On 27 March 2015 at 00:11, Shilpasri G Bhat
<shilpa.bhat@linux.vnet.ibm.com> wrote:
> Cpu frequency can be throttled due to failures of components like OCC,
> power supply and fan. It can also be throttled due to temperature and
> power limit. We can detect the throttling by checking 1)if max frequency

Please put these points on separate lines, with a space after each ). It's
not readable this way.

> is reduced, 2)if the core is put to safe frequency 3)if the SPR based
> frequency management is disabled.

Do all three of these points refer to the state the CPU has shifted to? Sorry,
it wasn't clear to an outsider :). Perhaps add some more detail on why the CPU
would have done that.

> The current status of the core is read from the Power Management Status
> Register (PMSR) to check whether any of the throttling conditions has
> occurred, and the appropriate throttling message is reported.

So, what do we want to do on throttling? Just print a warning? Is that
enough? What if the CPU heats up to the point that it burns out?

> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
> Changes from V1: Removed unused value of PMCR register
>
>  drivers/cpufreq/powernv-cpufreq.c | 39 ++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 2dfd4fd..4837eed 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -36,7 +36,7 @@
>  #define POWERNV_MAX_PSTATES    256
>
>  static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
> -static bool rebooting;
> +static bool rebooting, throttled;
>
>  /*
>   * Note: The set of pstates consists of contiguous integers, the
> @@ -294,6 +294,40 @@ static inline unsigned int get_nominal_index(void)
>         return powernv_pstate_info.max - powernv_pstate_info.nominal;
>  }
>
> +static void powernv_cpufreq_throttle_check(unsigned int cpu)
> +{
> +       unsigned long pmsr;
> +       int pmsr_pmax, pmsr_lp;
> +
> +       pmsr = get_pmspr(SPRN_PMSR);
> +
> +       /* Check for Pmax Capping */
> +       pmsr_pmax = (s8)((pmsr >> 32) & 0xFF);

u8 ?

> +       if (pmsr_pmax != powernv_pstate_info.max) {
> +               throttled = true;
> +               pr_warn("Cpu %d Pmax is reduced to %d\n", cpu, pmsr_pmax);
> +       }
> +
> +       /* Check for Psafe by reading LocalPstate
> +        * or check if Psafe_mode_active- 34th bit is set in PMSR.
> +        */

Proper multi-line comment format is:

/*
 * ....
 */


> +       pmsr_lp = (s8)((pmsr >> 48) & 0xFF);
> +       if ((pmsr_lp < powernv_pstate_info.min) || ((pmsr >> 30) & 1)) {
> +               throttled = true;
> +               pr_warn("Cpu %d in Psafe %d PMSR[34]=%lx\n", cpu,
> +                               pmsr_lp, ((pmsr >> 30) & 1));
> +       }
> +
> +       /* Check if SPR_EM_DISABLED- 33rd bit is set in PMSR */
> +       if ((pmsr >> 31) & 1) {
> +               throttled = true;
> +               pr_warn("Frequency management disabled cpu %d PMSR[33]=%lx\n",
> +                               cpu, ((pmsr >> 31) & 1));
> +       }
> +       if (throttled)
> +               pr_warn("Cpu Frequency is throttled\n");
> +}
> +
>  /*
>   * powernv_cpufreq_target_index: Sets the frequency corresponding to
>   * the cpufreq table entry indexed by new_index on the cpus in the
> @@ -307,6 +341,9 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
>         if (unlikely(rebooting) && new_index != get_nominal_index())
>                 return 0;
>
> +       if (!throttled)
> +               powernv_cpufreq_throttle_check(smp_processor_id());

And the CPU can't come out of throttling again?

> +
>         freq_data.pstate_id = powernv_freqs[new_index].driver_data;
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shilpasri G Bhat March 27, 2015, 6:32 a.m. UTC | #2
Hi Viresh,

On 03/27/2015 10:05 AM, Viresh Kumar wrote:
> Hi Shilpa,
> 
> On 27 March 2015 at 00:11, Shilpasri G Bhat
> <shilpa.bhat@linux.vnet.ibm.com> wrote:
>> Cpu frequency can be throttled due to failures of components like OCC,
>> power supply and fan. It can also be throttled due to temperature and
>> power limit. We can detect the throttling by checking 1)if max frequency
> 
> Please put these points on separate lines, with a space after each ). It's
> not readable this way.

Will do.
> 
>> is reduced, 2)if the core is put to safe frequency 3)if the SPR based
>> frequency management is disabled.
> 
> Do all three of these points refer to the state the CPU has shifted to? Sorry,
> it wasn't clear to an outsider :). Perhaps add some more detail on why the CPU
> would have done that.

The power and thermal safety of the system is taken care of by an
On-Chip-Controller (OCC), which is a real-time subsystem embedded within the
POWER8 processor. The OCC continuously monitors the memory and core
temperatures, the total system power, and the state of the power supply and fan.

The cpu frequency can be throttled for the following reasons:
1) If a processor crosses its power and temperature limits, the OCC will lower
   its Pmax to reduce the frequency and voltage.
2) If the OCC crashes, the system is forced to the Psafe frequency.
3) If the OCC fails to recover, the kernel is not allowed to make any further
   frequency changes and the chip will remain in Psafe.
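
For illustration only, the three cases above can be mapped onto the PMSR fields
that the patch reads. The sketch below is plain userspace C, not driver code.
The shifts (32 for Pmax, 48 for LocalPstate, 30 and 31 for the two flags) are
taken from the patch; the names `struct throttle_info`, `pmsr_field`,
`msb_ordinal_to_shift` and `pmsr_decode` are made up for this example. It also
assumes that the patch comments ("34th bit", "33rd bit") count bits from the
most significant end starting at 1, which is the reading under which the
comments and the shifts agree (64 - 34 = 30, 64 - 33 = 31).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the shift values mirror the patch under review. */
#define PMSR_PMAX_SHIFT 32
#define PMSR_LP_SHIFT   48

struct throttle_info {
	bool pmax_capped;     /* case 1: OCC lowered Pmax */
	bool psafe;           /* case 2: core forced to Psafe */
	bool spr_em_disabled; /* case 3: SPR based frequency management is off */
	int pmax;             /* signed Pmax field from PMSR */
	int local_pstate;     /* signed LocalPstate field from PMSR */
};

/* Convert "n-th bit, counted from the MSB starting at 1" to a right shift. */
static unsigned int msb_ordinal_to_shift(unsigned int n)
{
	assert(n >= 1 && n <= 64);
	return 64 - n;
}

/* Extract an 8-bit field and sign-extend it, like the (s8) cast in the patch. */
static int pmsr_field(uint64_t pmsr, unsigned int shift)
{
	return (int8_t)((pmsr >> shift) & 0xFF);
}

/* Decode one raw PMSR value against the driver's known max and min pstates. */
static struct throttle_info pmsr_decode(uint64_t pmsr, int pstate_max,
					int pstate_min)
{
	struct throttle_info t;

	t.pmax = pmsr_field(pmsr, PMSR_PMAX_SHIFT);
	t.local_pstate = pmsr_field(pmsr, PMSR_LP_SHIFT);
	t.pmax_capped = t.pmax != pstate_max;
	/* Psafe: LocalPstate below min, or the Psafe_mode_active flag is set. */
	t.psafe = t.local_pstate < pstate_min ||
		  ((pmsr >> msb_ordinal_to_shift(34)) & 1);
	/* SPR_EM_DISABLED flag. */
	t.spr_em_disabled = (pmsr >> msb_ordinal_to_shift(33)) & 1;
	return t;
}
```

Returning all three conditions in one struct keeps them separate, so a caller
could report each one on its own, as the patch does with its three pr_warn()
calls.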

The user can see a drop in performance when the frequency is throttled, without
being aware of the throttling. So we want to report such a condition so that the
user can check the OCC status to reboot the system, or check for power supply or
fan failures.

> 
>> The current status of the core is read from the Power Management Status
>> Register (PMSR) to check whether any of the throttling conditions has
>> occurred, and the appropriate throttling message is reported.
> 
> So, what do we want to do on throttling? Just print a warning? Is that
> enough? What if the CPU heats up to the point that it burns out?

On over-temperature, safety measures are taken by the OCC, one of which is
throttling the frequency. As the chip frequency and voltage are already lowered,
I am not sure what we can do apart from reporting. Maybe on detection of
throttling the kernel can take corrective measures, such as migrating the tasks
away from that cpu or forcing the cpu to idle.
> 
>> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
>> ---
>> Changes from V1: Removed unused value of PMCR register
>>
>>  drivers/cpufreq/powernv-cpufreq.c | 39 ++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 38 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
>> index 2dfd4fd..4837eed 100644
>> --- a/drivers/cpufreq/powernv-cpufreq.c
>> +++ b/drivers/cpufreq/powernv-cpufreq.c
>> @@ -36,7 +36,7 @@
>>  #define POWERNV_MAX_PSTATES    256
>>
>>  static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
>> -static bool rebooting;
>> +static bool rebooting, throttled;
>>
>>  /*
>>   * Note: The set of pstates consists of contiguous integers, the
>> @@ -294,6 +294,40 @@ static inline unsigned int get_nominal_index(void)
>>         return powernv_pstate_info.max - powernv_pstate_info.nominal;
>>  }
>>
>> +static void powernv_cpufreq_throttle_check(unsigned int cpu)
>> +{
>> +       unsigned long pmsr;
>> +       int pmsr_pmax, pmsr_lp;
>> +
>> +       pmsr = get_pmspr(SPRN_PMSR);
>> +
>> +       /* Check for Pmax Capping */
>> +       pmsr_pmax = (s8)((pmsr >> 32) & 0xFF);
> 
> u8 ?

The Pstate value is negative, so I want the cast to propagate the sign.
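
To make the difference concrete, here is a small standalone sketch (userspace C,
with made-up helper names `field_s8`, `field_u8` and `below_min`) that reads the
same 8-bit field both ways and then applies the `pmsr_lp < min` comparison from
the patch. The example values in the note after the code are illustrative, not
taken from real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Read an 8-bit PMSR field as signed, like the (s8) cast in the patch. */
static int field_s8(uint64_t pmsr, unsigned int shift)
{
	assert(shift <= 56);
	return (int8_t)((pmsr >> shift) & 0xFF);
}

/* Read the same field as unsigned, as a u8 cast would. */
static int field_u8(uint64_t pmsr, unsigned int shift)
{
	assert(shift <= 56);
	return (uint8_t)((pmsr >> shift) & 0xFF);
}

/* The Psafe comparison from the patch, on an already extracted value. */
static bool below_min(int local_pstate, int pstate_min)
{
	return local_pstate < pstate_min;
}
```

With a field byte of 0xF0 at shift 48 and a minimum pstate of -10, field_s8()
yields -16, so below_min() is true. field_u8() yields 240, so the same
comparison is false and the Psafe case would go unnoticed.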
> 
>> +       if (pmsr_pmax != powernv_pstate_info.max) {
>> +               throttled = true;
>> +               pr_warn("Cpu %d Pmax is reduced to %d\n", cpu, pmsr_pmax);
>> +       }
>> +
>> +       /* Check for Psafe by reading LocalPstate
>> +        * or check if Psafe_mode_active- 34th bit is set in PMSR.
>> +        */
> 
> Proper multi-line comment format is:
> 
> /*
>  * ....
>  */
> 
> 
Will do.
>> +       pmsr_lp = (s8)((pmsr >> 48) & 0xFF);
>> +       if ((pmsr_lp < powernv_pstate_info.min) || ((pmsr >> 30) & 1)) {
>> +               throttled = true;
>> +               pr_warn("Cpu %d in Psafe %d PMSR[34]=%lx\n", cpu,
>> +                               pmsr_lp, ((pmsr >> 30) & 1));
>> +       }
>> +
>> +       /* Check if SPR_EM_DISABLED- 33rd bit is set in PMSR */
>> +       if ((pmsr >> 31) & 1) {
>> +               throttled = true;
>> +               pr_warn("Frequency management disabled cpu %d PMSR[33]=%lx\n",
>> +                               cpu, ((pmsr >> 31) & 1));
>> +       }
>> +       if (throttled)
>> +               pr_warn("Cpu Frequency is throttled\n");
>> +}
>> +
>>  /*
>>   * powernv_cpufreq_target_index: Sets the frequency corresponding to
>>   * the cpufreq table entry indexed by new_index on the cpus in the
>> @@ -307,6 +341,9 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
>>         if (unlikely(rebooting) && new_index != get_nominal_index())
>>                 return 0;
>>
>> +       if (!throttled)
>> +               powernv_cpufreq_throttle_check(smp_processor_id());
> 
> And the CPU can't come out of throttling again?

Yes, we can come out of throttling if the OCC recovers. We need a separate
notification from firmware when we try to recover. I will send a separate patch
in which the driver registers for the recovery notification, and on successful
recovery we can reset 'throttled' to false.
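
As a rough sketch of that idea, and not the follow-up patch itself: the flag can
be treated as a latch that is set on the first detection and cleared only by a
recovery event. Everything below is hypothetical userspace C. The struct
`throttle_state` and the functions `throttle_note` and `throttle_on_recovery`
are invented for illustration, and no real firmware notification interface is
shown or implied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct throttle_state {
	bool throttled;       /* mirrors the patch's 'throttled' flag */
	unsigned int reports; /* number of times throttling was reported */
};

/*
 * Called from the frequency-change path. Reports throttling once and then
 * stays quiet until the flag is cleared again.
 */
static void throttle_note(struct throttle_state *s, bool condition_seen)
{
	assert(s != NULL);
	if (s->throttled || !condition_seen)
		return;
	s->throttled = true;
	s->reports++;
}

/* Hypothetical handler for a firmware "OCC recovery" notification. */
static void throttle_on_recovery(struct throttle_state *s, bool recovered)
{
	assert(s != NULL);
	if (recovered)
		s->throttled = false; /* allow the next event to be reported */
}
```

With this shape, repeated checks while throttled do not flood the log, and a
later throttling event after a successful recovery is reported again.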

> 
>> +
>>         freq_data.pstate_id = powernv_freqs[new_index].driver_data;
> 

Viresh Kumar March 27, 2015, 6:44 a.m. UTC | #3
On 27 March 2015 at 12:02, Shilpasri G Bhat
<shilpa.bhat@linux.vnet.ibm.com> wrote:
> The power and thermal safety of the system is taken care of by an
> On-Chip-Controller (OCC), which is a real-time subsystem embedded within the
> POWER8 processor. The OCC continuously monitors the memory and core
> temperatures, the total system power, and the state of the power supply and fan.
>
> The cpu frequency can be throttled for the following reasons:
> 1) If a processor crosses its power and temperature limits, the OCC will lower
>    its Pmax to reduce the frequency and voltage.
> 2) If the OCC crashes, the system is forced to the Psafe frequency.
> 3) If the OCC fails to recover, the kernel is not allowed to make any further
>    frequency changes and the chip will remain in Psafe.
>
> The user can see a drop in performance when the frequency is throttled, without
> being aware of the throttling. So we want to report such a condition so that the
> user can check the OCC status to reboot the system, or check for power supply or
> fan failures.

All these details need to be part of the commit message, so that reviewers can
understand it better.

Patch

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 2dfd4fd..4837eed 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -36,7 +36,7 @@ 
 #define POWERNV_MAX_PSTATES	256
 
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
-static bool rebooting;
+static bool rebooting, throttled;
 
 /*
  * Note: The set of pstates consists of contiguous integers, the
@@ -294,6 +294,40 @@  static inline unsigned int get_nominal_index(void)
 	return powernv_pstate_info.max - powernv_pstate_info.nominal;
 }
 
+static void powernv_cpufreq_throttle_check(unsigned int cpu)
+{
+	unsigned long pmsr;
+	int pmsr_pmax, pmsr_lp;
+
+	pmsr = get_pmspr(SPRN_PMSR);
+
+	/* Check for Pmax Capping */
+	pmsr_pmax = (s8)((pmsr >> 32) & 0xFF);
+	if (pmsr_pmax != powernv_pstate_info.max) {
+		throttled = true;
+		pr_warn("Cpu %d Pmax is reduced to %d\n", cpu, pmsr_pmax);
+	}
+
+	/* Check for Psafe by reading LocalPstate
+	 * or check if Psafe_mode_active- 34th bit is set in PMSR.
+	 */
+	pmsr_lp = (s8)((pmsr >> 48) & 0xFF);
+	if ((pmsr_lp < powernv_pstate_info.min) || ((pmsr >> 30) & 1)) {
+		throttled = true;
+		pr_warn("Cpu %d in Psafe %d PMSR[34]=%lx\n", cpu,
+				pmsr_lp, ((pmsr >> 30) & 1));
+	}
+
+	/* Check if SPR_EM_DISABLED- 33rd bit is set in PMSR */
+	if ((pmsr >> 31) & 1) {
+		throttled = true;
+		pr_warn("Frequency management disabled cpu %d PMSR[33]=%lx\n",
+				cpu, ((pmsr >> 31) & 1));
+	}
+	if (throttled)
+		pr_warn("Cpu Frequency is throttled\n");
+}
+
 /*
  * powernv_cpufreq_target_index: Sets the frequency corresponding to
  * the cpufreq table entry indexed by new_index on the cpus in the
@@ -307,6 +341,9 @@  static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
 	if (unlikely(rebooting) && new_index != get_nominal_index())
 		return 0;
 
+	if (!throttled)
+		powernv_cpufreq_throttle_check(smp_processor_id());
+
 	freq_data.pstate_id = powernv_freqs[new_index].driver_data;
 
 	/*