cpufreq / CPPC: Set platform specific transition_delay_us

Message ID

1524611559-20138-1-git-send-email-pprakash@codeaurora.org (mailing list archive)

State

Superseded, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org BBF1B6071A
From: Prashanth Prakash <pprakash@codeaurora.org>
To: linux-pm@vger.kernel.org
Cc: rjw@rjwysocki.net, viresh.kumar@linaro.org,
	Prashanth Prakash <pprakash@codeaurora.org>
Subject: [PATCH] cpufreq / CPPC: Set platform specific transition_delay_us
Date: Tue, 24 Apr 2018 17:12:39 -0600
Message-Id: <1524611559-20138-1-git-send-email-pprakash@codeaurora.org>
Sender: linux-pm-owner@vger.kernel.org
Precedence: bulk

Commit Message

Prakash, Prashanth April 24, 2018, 11:12 p.m. UTC

Add support to specify platform specific transition_delay_us instead
of using the transition delay derived from PCC.

With commit "45f39cb5071c: cpufreq: CPPC: Use transition_delay_us
depending transition_latency" we are setting transition_delay_us
directly and not applying the LATENCY_MULTIPLIER. With this on Qualcomm
Centriq we can end up with a very high rate of frequency change requests
when using schedutil governor (default rate_limit_us=10 compared to an
earlier value of 10000).

The PCC subspace describes the rate at which the platform can accept
commands on the CPPC's PCC channel. This includes read and write
commands on the PCC channel that can be used for reasons other than
frequency transitions. Moreover, the same PCC subspace can be used by
multiple freq domains, so deriving transition_delay_us from it as we do
now can be sub-optimal.

Moreover, if a platform does not use PCC for the desired_perf register
then there is no way to compute the transition latency or the delay_us.

CPPC does not have a standard defined mechanism to get the transition
rate or the latency at the moment.

Given the above limitations, it is simpler to have a platform specific
transition_delay_us and rely on the PCC derived value only if a
platform specific value is not available.

Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Fixes: 45f39cb5071c ("cpufreq: CPPC: Use transition_delay_us depending transition_latency")
---
 drivers/cpufreq/cppc_cpufreq.c | 43 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)
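
The change from 10000 to 10 described in the commit message can be
reproduced with a small userspace model. This is only a sketch: the
struct and function names are hypothetical, the multiplier of 1000 and
the 10 ms cap are assumptions about how the cpufreq core derived the
default rate limit at the time, and the 10000 ns latency is inferred
from the rate_limit_us=10 figure quoted above.

```c
#define NSEC_PER_USEC      1000u
#define LATENCY_MULTIPLIER 1000u   /* assumed value of the core multiplier */
#define DELAY_CAP_US       10000u  /* assumed 10 ms cap on the derived delay */

/* Hypothetical stand-in for the two policy fields a governor consults. */
struct policy_model {
	unsigned int transition_delay_us;   /* 0 means "not set by the driver" */
	unsigned int transition_latency_ns; /* latency reported by the driver */
};

/* Models how a default rate limit could be derived from the policy. */
static unsigned int model_rate_limit_us(const struct policy_model *p)
{
	unsigned int latency_us;

	/* A driver-provided delay is used as-is, with no multiplier. */
	if (p->transition_delay_us)
		return p->transition_delay_us;

	/* Otherwise scale the latency and cap the result. */
	latency_us = p->transition_latency_ns / NSEC_PER_USEC;
	if (latency_us) {
		unsigned int scaled = latency_us * LATENCY_MULTIPLIER;

		return scaled < DELAY_CAP_US ? scaled : DELAY_CAP_US;
	}

	return LATENCY_MULTIPLIER;
}
```

With a 10000 ns latency and no driver-provided delay the model yields
10000 us. Once the driver sets transition_delay_us to 10000 / 1000 = 10
directly, the multiplier is bypassed and the model yields 10 us.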

Comments

Viresh Kumar April 25, 2018, 2:54 a.m. UTC | #1

On 24-04-18, 17:12, Prashanth Prakash wrote:
> Add support to specify platform specific transition_delay_us instead
> of using the transition delay derived from PCC.
> 
> With commit "45f39cb5071c: cpufreq: CPPC: Use transition_delay_us
> depending transition_latency" we are setting transition_delay_us
> directly and not applying the LATENCY_MULTIPLIER. With this on Qualcomm
> Centriq we can end up with a very high rate of frequency change requests
> when using schedutil governor (default rate_limit_us=10 compared to an
> earlier value of 10000).
> 
> The PCC subspace describes the rate at which the platform can accept
> commands on the CPPC's PCC channel. This includes read and write
> command on the PCC channel that can be used for reasons other than
> frequency transitions. Moreover the same PCC subspace can be used by
> multiple freq domains and deriving transition_delay_us from it as we do
> now can be sub-optimal.
> 
> Moreover if a platform does not use PCC for desired_perf register then
> there is no way to compute the transition latency or the delay_us.
> 
> CPPC does not have a standard defined mechanism to get the transition
> rate or the latency at the moment.
> 
> Given the above limitations, it is simpler to have a platform specific
> transition_delay_us and rely only on PCC derived value only if a
> platform specific value is not available.
> 
> Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
> Fixes: 45f39cb5071c ("cpufreq: CPPC: Use transition_delay_us depending
> transition_latency)
> ---
>  drivers/cpufreq/cppc_cpufreq.c | 43 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index bc5fc16..e935e43 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -126,6 +126,43 @@ static void cppc_cpufreq_stop_cpu(struct cpufreq_policy *policy)
>  				cpu->perf_caps.lowest_perf, cpu_num, ret);
>  }
>  
> +/*
> + * The PCC subspace describes the rate at which platform can accept commands
> + * on the shared PCC channel (including READs which do not count towards freq
> + * transition requests), so ideally we need to use the PCC values as a fallback
> + * if we don't have a platform specific transition_delay_us
> + */
> +#if defined(CONFIG_ARM64)
> +#include <asm/cputype.h>
> +
> +static unsigned int cppc_cpufreq_get_transition_delay_us(void)
> +{
> +	unsigned long implementor = read_cpuid_implementor();
> +	unsigned long part_num = read_cpuid_part_number();
> +	unsigned int delay_us = 0;
> +
> +	switch (implementor) {
> +	case ARM_CPU_IMP_QCOM:
> +		switch (part_num) {
> +		case QCOM_CPU_PART_FALKOR_V1:
> +		case QCOM_CPU_PART_FALKOR:
> +			delay_us = 10000;
> +			break;
> +		}
> +		break;
> +	}
> +
> +	return delay_us;
> +}
> +
> +#else
> +
> +static unsigned int cppc_cpufreq_get_transition_delay_us(void)
> +{
> +	return 0;
> +}
> +#endif
> +
>  static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>  {
>  	struct cppc_cpudata *cpu;
> @@ -162,8 +199,10 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>  		cpu->perf_caps.highest_perf;
>  	policy->cpuinfo.max_freq = cppc_dmi_max_khz;
>  
> -	policy->transition_delay_us = cppc_get_transition_latency(cpu_num) /
> -		NSEC_PER_USEC;
> +	policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us();
> +	if (!policy->transition_delay_us)
> +		policy->transition_delay_us = cppc_get_transition_latency(cpu_num) /
> +			NSEC_PER_USEC;

What about returning this value directly from
cppc_cpufreq_get_transition_delay_us() instead of 0 ?

>  	policy->shared_type = cpu->shared_type;
>  
>  	if (policy->shared_type == CPUFREQ_SHARED_TYPE_ANY) {
> -- 
> Qualcomm Datacenter Technologies on behalf of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.
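
One way to read the suggestion above is to fold the PCC fallback into
the helper, so the caller needs no zero check. The following userspace
sketch is not the posted v2: both stub functions are hypothetical
stand-ins (for the platform lookup and for cppc_get_transition_latency()
respectively), and the 10000 ns stub value is an assumption. It also
shows that the helper would then need the CPU number as a parameter.

```c
#define NSEC_PER_USEC 1000u

/* Hypothetical stub: platform specific delay in us, 0 if none is known. */
static unsigned int stub_platform_delay_us(int is_falkor)
{
	return is_falkor ? 10000u : 0u;
}

/* Hypothetical stub: PCC derived transition latency in ns for a CPU. */
static unsigned int stub_pcc_latency_ns(int cpu)
{
	(void)cpu;
	return 10000u; /* assumed value, for illustration only */
}

/*
 * Returns the platform specific delay when one exists, otherwise the
 * PCC derived value, so the caller can assign the result directly.
 */
static unsigned int get_transition_delay_us(int cpu, int is_falkor)
{
	unsigned int delay_us = stub_platform_delay_us(is_falkor);

	if (delay_us)
		return delay_us;

	return stub_pcc_latency_ns(cpu) / NSEC_PER_USEC;
}
```

In this form the init path would reduce to a single assignment of the
helper's return value to policy->transition_delay_us.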

Prakash, Prashanth April 25, 2018, 3:39 p.m. UTC | #2

On 4/24/2018 8:54 PM, Viresh Kumar wrote:
> On 24-04-18, 17:12, Prashanth Prakash wrote:
>> Add support to specify platform specific transition_delay_us instead
>> of using the transition delay derived from PCC.
>>
>> With commit "45f39cb5071c: cpufreq: CPPC: Use transition_delay_us
>> depending transition_latency" we are setting transition_delay_us
>> directly and not applying the LATENCY_MULTIPLIER. With this on Qualcomm
>> Centriq we can end up with a very high rate of frequency change requests
>> when using schedutil governor (default rate_limit_us=10 compared to an
>> earlier value of 10000).
>>
>> The PCC subspace describes the rate at which the platform can accept
>> commands on the CPPC's PCC channel. This includes read and write
>> command on the PCC channel that can be used for reasons other than
>> frequency transitions. Moreover the same PCC subspace can be used by
>> multiple freq domains and deriving transition_delay_us from it as we do
>> now can be sub-optimal.
>>
>> Moreover if a platform does not use PCC for desired_perf register then
>> there is no way to compute the transition latency or the delay_us.
>>
>> CPPC does not have a standard defined mechanism to get the transition
>> rate or the latency at the moment.
>>
>> Given the above limitations, it is simpler to have a platform specific
>> transition_delay_us and rely only on PCC derived value only if a
>> platform specific value is not available.
>>
>> Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>> Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
>> Fixes: 45f39cb5071c ("cpufreq: CPPC: Use transition_delay_us depending
>> transition_latency)
>> ---
>>  drivers/cpufreq/cppc_cpufreq.c | 43 ++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index bc5fc16..e935e43 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -126,6 +126,43 @@ static void cppc_cpufreq_stop_cpu(struct cpufreq_policy *policy)
>>  				cpu->perf_caps.lowest_perf, cpu_num, ret);
>>  }
>>  
>> +/*
>> + * The PCC subspace describes the rate at which platform can accept commands
>> + * on the shared PCC channel (including READs which do not count towards freq
>> + * transition requests), so ideally we need to use the PCC values as a fallback
>> + * if we don't have a platform specific transition_delay_us
>> + */
>> +#if defined(CONFIG_ARM64)
>> +#include <asm/cputype.h>
>> +
>> +static unsigned int cppc_cpufreq_get_transition_delay_us(void)
>> +{
>> +	unsigned long implementor = read_cpuid_implementor();
>> +	unsigned long part_num = read_cpuid_part_number();
>> +	unsigned int delay_us = 0;
>> +
>> +	switch (implementor) {
>> +	case ARM_CPU_IMP_QCOM:
>> +		switch (part_num) {
>> +		case QCOM_CPU_PART_FALKOR_V1:
>> +		case QCOM_CPU_PART_FALKOR:
>> +			delay_us = 10000;
>> +			break;
>> +		}
>> +		break;
>> +	}
>> +
>> +	return delay_us;
>> +}
>> +
>> +#else
>> +
>> +static unsigned int cppc_cpufreq_get_transition_delay_us(void)
>> +{
>> +	return 0;
>> +}
>> +#endif
>> +
>>  static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>  {
>>  	struct cppc_cpudata *cpu;
>> @@ -162,8 +199,10 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>  		cpu->perf_caps.highest_perf;
>>  	policy->cpuinfo.max_freq = cppc_dmi_max_khz;
>>  
>> -	policy->transition_delay_us = cppc_get_transition_latency(cpu_num) /
>> -		NSEC_PER_USEC;
>> +	policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us();
>> +	if (!policy->transition_delay_us)
>> +		policy->transition_delay_us = cppc_get_transition_latency(cpu_num) /
>> +			NSEC_PER_USEC;
> What about returning this value directly from
> cppc_cpufreq_get_transition_delay_us() instead of 0 ?
Thanks Viresh!
I will make the change and post v2 today.

-Prashanth

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index bc5fc16..e935e43 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -126,6 +126,43 @@  static void cppc_cpufreq_stop_cpu(struct cpufreq_policy *policy)
 				cpu->perf_caps.lowest_perf, cpu_num, ret);
 }
 
+/*
+ * The PCC subspace describes the rate at which platform can accept commands
+ * on the shared PCC channel (including READs which do not count towards freq
+ * transition requests), so ideally we need to use the PCC values as a fallback
+ * if we don't have a platform specific transition_delay_us
+ */
+#if defined(CONFIG_ARM64)
+#include <asm/cputype.h>
+
+static unsigned int cppc_cpufreq_get_transition_delay_us(void)
+{
+	unsigned long implementor = read_cpuid_implementor();
+	unsigned long part_num = read_cpuid_part_number();
+	unsigned int delay_us = 0;
+
+	switch (implementor) {
+	case ARM_CPU_IMP_QCOM:
+		switch (part_num) {
+		case QCOM_CPU_PART_FALKOR_V1:
+		case QCOM_CPU_PART_FALKOR:
+			delay_us = 10000;
+			break;
+		}
+		break;
+	}
+
+	return delay_us;
+}
+
+#else
+
+static unsigned int cppc_cpufreq_get_transition_delay_us(void)
+{
+	return 0;
+}
+#endif
+
 static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 {
 	struct cppc_cpudata *cpu;
@@ -162,8 +199,10 @@  static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 		cpu->perf_caps.highest_perf;
 	policy->cpuinfo.max_freq = cppc_dmi_max_khz;
 
-	policy->transition_delay_us = cppc_get_transition_latency(cpu_num) /
-		NSEC_PER_USEC;
+	policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us();
+	if (!policy->transition_delay_us)
+		policy->transition_delay_us = cppc_get_transition_latency(cpu_num) /
+			NSEC_PER_USEC;
 	policy->shared_type = cpu->shared_type;
 
 	if (policy->shared_type == CPUFREQ_SHARED_TYPE_ANY) {
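
The platform check in the patch keys on the implementer and part number
fields of the arm64 MIDR_EL1 register. A minimal userspace sketch of
that decode follows. The function names are hypothetical, and the
numeric constants (0x51, 0x800, 0xC00) are assumed to match the kernel's
ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1 and QCOM_CPU_PART_FALKOR
definitions rather than taken from this patch.

```c
#include <stdint.h>

#define IMP_QCOM       0x51u   /* assumed value of ARM_CPU_IMP_QCOM */
#define PART_FALKOR_V1 0x800u  /* assumed value of QCOM_CPU_PART_FALKOR_V1 */
#define PART_FALKOR    0xC00u  /* assumed value of QCOM_CPU_PART_FALKOR */

/* MIDR_EL1 bits [31:24] hold the implementer code. */
static unsigned int midr_implementor(uint32_t midr)
{
	return (midr >> 24) & 0xffu;
}

/* MIDR_EL1 bits [15:4] hold the primary part number. */
static unsigned int midr_part_number(uint32_t midr)
{
	return (midr >> 4) & 0xfffu;
}

/* Returns the platform specific delay in us, or 0 if none is defined. */
static unsigned int platform_delay_us(uint32_t midr)
{
	if (midr_implementor(midr) != IMP_QCOM)
		return 0;

	switch (midr_part_number(midr)) {
	case PART_FALKOR_V1:
	case PART_FALKOR:
		return 10000;
	default:
		return 0;
	}
}
```

A return value of 0 keeps the same meaning as in the patch: no platform
specific value is known, so the caller falls back to the PCC derived one.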
