diff mbox

[RFCv5,40/46] sched/cpufreq_sched: compute freq_new based on capacity_orig_of()

Message ID 1436293469-25707-41-git-send-email-morten.rasmussen@arm.com (mailing list archive)
State RFC
Headers show

Commit Message

Morten Rasmussen July 7, 2015, 6:24 p.m. UTC
From: Juri Lelli <juri.lelli@arm.com>

capacity is both cpu and freq scaled with EAS. We thus need to compute
freq_new using capacity_orig_of(), so that we properly scale back the thing
on heterogeneous architectures. In fact, capacity_orig_of() returns only
the cpu scaled part of capacity (see update_cpu_capacity()). So, to scale
freq_new correctly, we multiply policy->max by capacity/capacity_orig_of()
instead of capacity/1024.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 kernel/sched/cpufreq_sched.c | 2 +-
 kernel/sched/fair.c          | 2 +-
 kernel/sched/sched.h         | 2 ++
 3 files changed, 4 insertions(+), 2 deletions(-)

Comments

Michael Turquette July 8, 2015, 3:22 p.m. UTC | #1
Quoting Morten Rasmussen (2015-07-07 11:24:23)
> From: Juri Lelli <juri.lelli@arm.com>
> 
> capacity is both cpu and freq scaled with EAS. We thus need to compute
> freq_new using capacity_orig_of(), so that we properly scale back the thing
> on heterogeneous architectures. In fact, capacity_orig_of() returns only
> the cpu scaled part of capacity (see update_cpu_capacity()). So, to scale
> freq_new correctly, we multiply policy->max by capacity/capacity_orig_of()
> instead of capacity/1024.
> 
> cc: Ingo Molnar <mingo@redhat.com>
> cc: Peter Zijlstra <peterz@infradead.org>
> 
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>

Looks good to me. Please feel free to add my Reviewed-by or Acked-by as
appropriate.

Regards,
Mike

> ---
>  kernel/sched/cpufreq_sched.c | 2 +-
>  kernel/sched/fair.c          | 2 +-
>  kernel/sched/sched.h         | 2 ++
>  3 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/cpufreq_sched.c b/kernel/sched/cpufreq_sched.c
> index 2968f3a..7071528 100644
> --- a/kernel/sched/cpufreq_sched.c
> +++ b/kernel/sched/cpufreq_sched.c
> @@ -184,7 +184,7 @@ void cpufreq_sched_set_cap(int cpu, unsigned long capacity)
>                 goto out;
>  
>         /* Convert the new maximum capacity request into a cpu frequency */
> -       freq_new = capacity * policy->max >> SCHED_CAPACITY_SHIFT;
> +       freq_new = (capacity * policy->max) / capacity_orig_of(cpu);
>  
>         /* No change in frequency? Bail and return current capacity. */
>         if (freq_new == policy->cur)
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d395bc9..f74e9d2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4625,7 +4625,7 @@ static unsigned long capacity_of(int cpu)
>         return cpu_rq(cpu)->cpu_capacity;
>  }
>  
> -static unsigned long capacity_orig_of(int cpu)
> +unsigned long capacity_orig_of(int cpu)
>  {
>         return cpu_rq(cpu)->cpu_capacity_orig;
>  }
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index b5e27d9..1327dc7 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1476,6 +1476,8 @@ unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
>  }
>  #endif
>  
> +unsigned long capacity_orig_of(int cpu);
> +
>  extern struct static_key __sched_energy_freq;
>  static inline bool sched_energy_freq(void)
>  {
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Juri Lelli July 9, 2015, 4:21 p.m. UTC | #2
Hi Mike,

On 08/07/15 16:22, Michael Turquette wrote:
> Quoting Morten Rasmussen (2015-07-07 11:24:23)
>> From: Juri Lelli <juri.lelli@arm.com>
>>
>> capacity is both cpu and freq scaled with EAS. We thus need to compute
>> freq_new using capacity_orig_of(), so that we properly scale back the thing
>> on heterogeneous architectures. In fact, capacity_orig_of() returns only
>> the cpu scaled part of capacity (see update_cpu_capacity()). So, to scale
>> freq_new correctly, we multiply policy->max by capacity/capacity_orig_of()
>> instead of capacity/1024.
>>
>> cc: Ingo Molnar <mingo@redhat.com>
>> cc: Peter Zijlstra <peterz@infradead.org>
>>
>> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> 
> Looks good to me. Please feel free to add my Reviewed-by or Acked-by as
> appropriate.
> 

Thanks for reviewing it!

Best,

- Juri

> Regards,
> Mike
> 
>> ---
>>  kernel/sched/cpufreq_sched.c | 2 +-
>>  kernel/sched/fair.c          | 2 +-
>>  kernel/sched/sched.h         | 2 ++
>>  3 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/cpufreq_sched.c b/kernel/sched/cpufreq_sched.c
>> index 2968f3a..7071528 100644
>> --- a/kernel/sched/cpufreq_sched.c
>> +++ b/kernel/sched/cpufreq_sched.c
>> @@ -184,7 +184,7 @@ void cpufreq_sched_set_cap(int cpu, unsigned long capacity)
>>                 goto out;
>>  
>>         /* Convert the new maximum capacity request into a cpu frequency */
>> -       freq_new = capacity * policy->max >> SCHED_CAPACITY_SHIFT;
>> +       freq_new = (capacity * policy->max) / capacity_orig_of(cpu);
>>  
>>         /* No change in frequency? Bail and return current capacity. */
>>         if (freq_new == policy->cur)
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index d395bc9..f74e9d2 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4625,7 +4625,7 @@ static unsigned long capacity_of(int cpu)
>>         return cpu_rq(cpu)->cpu_capacity;
>>  }
>>  
>> -static unsigned long capacity_orig_of(int cpu)
>> +unsigned long capacity_orig_of(int cpu)
>>  {
>>         return cpu_rq(cpu)->cpu_capacity_orig;
>>  }
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index b5e27d9..1327dc7 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -1476,6 +1476,8 @@ unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
>>  }
>>  #endif
>>  
>> +unsigned long capacity_orig_of(int cpu);
>> +
>>  extern struct static_key __sched_energy_freq;
>>  static inline bool sched_energy_freq(void)
>>  {
>> -- 
>> 1.9.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra Aug. 15, 2015, 12:46 p.m. UTC | #3
On Tue, Jul 07, 2015 at 07:24:23PM +0100, Morten Rasmussen wrote:
> diff --git a/kernel/sched/cpufreq_sched.c b/kernel/sched/cpufreq_sched.c
> index 2968f3a..7071528 100644
> --- a/kernel/sched/cpufreq_sched.c
> +++ b/kernel/sched/cpufreq_sched.c
> @@ -184,7 +184,7 @@ void cpufreq_sched_set_cap(int cpu, unsigned long capacity)
>  		goto out;
>  
>  	/* Convert the new maximum capacity request into a cpu frequency */
> -	freq_new = capacity * policy->max >> SCHED_CAPACITY_SHIFT;
> +	freq_new = (capacity * policy->max) / capacity_orig_of(cpu);
>  
>  	/* No change in frequency? Bail and return current capacity. */
>  	if (freq_new == policy->cur)

Can't we avoid exporting that lot by simply passing in the right values
to begin with?
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Turquette Aug. 16, 2015, 4:03 a.m. UTC | #4
Quoting Peter Zijlstra (2015-08-15 05:46:38)
> On Tue, Jul 07, 2015 at 07:24:23PM +0100, Morten Rasmussen wrote:
> > diff --git a/kernel/sched/cpufreq_sched.c b/kernel/sched/cpufreq_sched.c
> > index 2968f3a..7071528 100644
> > --- a/kernel/sched/cpufreq_sched.c
> > +++ b/kernel/sched/cpufreq_sched.c
> > @@ -184,7 +184,7 @@ void cpufreq_sched_set_cap(int cpu, unsigned long capacity)
> >               goto out;
> >  
> >       /* Convert the new maximum capacity request into a cpu frequency */
> > -     freq_new = capacity * policy->max >> SCHED_CAPACITY_SHIFT;
> > +     freq_new = (capacity * policy->max) / capacity_orig_of(cpu);
> >  
> >       /* No change in frequency? Bail and return current capacity. */
> >       if (freq_new == policy->cur)
> 
> Can't we avoid exporting that lot by simply passing in the right values
> to begin with?

By "right value" do you mean, "pass the frequency from cfs"?

If that is what you mean, then the answer is "yes". But it also means
that cfs will need access to either:

1) the cpu frequncy-domain topology described in struct cpufreq.cpus
OR
2) duplicate that frequency-domain knowledge, perhaps in sched_domain

If that isn't what you mean by "right value" then let me know.

Regards,
Mike

> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra Aug. 16, 2015, 8:24 p.m. UTC | #5
On Sat, Aug 15, 2015 at 09:03:33PM -0700, Michael Turquette wrote:
> Quoting Peter Zijlstra (2015-08-15 05:46:38)
> > On Tue, Jul 07, 2015 at 07:24:23PM +0100, Morten Rasmussen wrote:
> > > diff --git a/kernel/sched/cpufreq_sched.c b/kernel/sched/cpufreq_sched.c
> > > index 2968f3a..7071528 100644
> > > --- a/kernel/sched/cpufreq_sched.c
> > > +++ b/kernel/sched/cpufreq_sched.c
> > > @@ -184,7 +184,7 @@ void cpufreq_sched_set_cap(int cpu, unsigned long capacity)
> > >               goto out;
> > >  
> > >       /* Convert the new maximum capacity request into a cpu frequency */
> > > -     freq_new = capacity * policy->max >> SCHED_CAPACITY_SHIFT;
> > > +     freq_new = (capacity * policy->max) / capacity_orig_of(cpu);
> > >  
> > >       /* No change in frequency? Bail and return current capacity. */
> > >       if (freq_new == policy->cur)
> > 
> > Can't we avoid exporting that lot by simply passing in the right values
> > to begin with?
> 
> By "right value" do you mean, "pass the frequency from cfs"?

Nah, just maybe: (capacity << SCHED_CAPACITY_SHIFT) / capacity_orig_of()
such that you don't have to export that knowledge to this thing.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Juri Lelli Aug. 17, 2015, 12:19 p.m. UTC | #6
On 16/08/15 21:24, Peter Zijlstra wrote:
> On Sat, Aug 15, 2015 at 09:03:33PM -0700, Michael Turquette wrote:
>> Quoting Peter Zijlstra (2015-08-15 05:46:38)
>>> On Tue, Jul 07, 2015 at 07:24:23PM +0100, Morten Rasmussen wrote:
>>>> diff --git a/kernel/sched/cpufreq_sched.c b/kernel/sched/cpufreq_sched.c
>>>> index 2968f3a..7071528 100644
>>>> --- a/kernel/sched/cpufreq_sched.c
>>>> +++ b/kernel/sched/cpufreq_sched.c
>>>> @@ -184,7 +184,7 @@ void cpufreq_sched_set_cap(int cpu, unsigned long capacity)
>>>>               goto out;
>>>>  
>>>>       /* Convert the new maximum capacity request into a cpu frequency */
>>>> -     freq_new = capacity * policy->max >> SCHED_CAPACITY_SHIFT;
>>>> +     freq_new = (capacity * policy->max) / capacity_orig_of(cpu);
>>>>  
>>>>       /* No change in frequency? Bail and return current capacity. */
>>>>       if (freq_new == policy->cur)
>>>
>>> Can't we avoid exporting that lot by simply passing in the right values
>>> to begin with?
>>
>> By "right value" do you mean, "pass the frequency from cfs"?
> 
> Nah, just maybe: (capacity << SCHED_CAPACITY_SHIFT) / capacity_orig_of()
> such that you don't have to export that knowledge to this thing.
> 

Oh, right. I guess we can just go with something like:

 req_cap = get_cpu_usage(cpu) * capacity_margin / capacity_orig_of(cpu);

on fair.c side and switch back to

 freq_new = capacity * policy->max >> SCHED_CAPACITY_SHIFT;

on cpufreq_sched.c side. That saves us exporting capacity_orig_of().

Thanks,

- Juri

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Muckle Oct. 13, 2015, 7:47 p.m. UTC | #7
On 08/17/2015 05:19 AM, Juri Lelli wrote:
>> Nah, just maybe: (capacity << SCHED_CAPACITY_SHIFT) / capacity_orig_of()
>> > such that you don't have to export that knowledge to this thing.
>> > 
> Oh, right. I guess we can just go with something like:
> 
>  req_cap = get_cpu_usage(cpu) * capacity_margin / capacity_orig_of(cpu);
> 
> on fair.c side and switch back to
> 
>  freq_new = capacity * policy->max >> SCHED_CAPACITY_SHIFT;
> 
> on cpufreq_sched.c side. That saves us exporting capacity_orig_of().

Another good reason to make this change - I believe the current code
will not support a cpufreq policy covering CPUs of different types (such
as a b.L system in a single frequency domain).

The max requested capacity is calculated using the absolute/global
capacity values requested from each CPU. A small CPU needing a higher
OPP may get ignored because of a request from a big CPU for a lower OPP,
because the big CPU @ lower OPP shows up as the max capacity request.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/kernel/sched/cpufreq_sched.c b/kernel/sched/cpufreq_sched.c
index 2968f3a..7071528 100644
--- a/kernel/sched/cpufreq_sched.c
+++ b/kernel/sched/cpufreq_sched.c
@@ -184,7 +184,7 @@  void cpufreq_sched_set_cap(int cpu, unsigned long capacity)
 		goto out;
 
 	/* Convert the new maximum capacity request into a cpu frequency */
-	freq_new = capacity * policy->max >> SCHED_CAPACITY_SHIFT;
+	freq_new = (capacity * policy->max) / capacity_orig_of(cpu);
 
 	/* No change in frequency? Bail and return current capacity. */
 	if (freq_new == policy->cur)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d395bc9..f74e9d2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4625,7 +4625,7 @@  static unsigned long capacity_of(int cpu)
 	return cpu_rq(cpu)->cpu_capacity;
 }
 
-static unsigned long capacity_orig_of(int cpu)
+unsigned long capacity_orig_of(int cpu)
 {
 	return cpu_rq(cpu)->cpu_capacity_orig;
 }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b5e27d9..1327dc7 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1476,6 +1476,8 @@  unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 }
 #endif
 
+unsigned long capacity_orig_of(int cpu);
+
 extern struct static_key __sched_energy_freq;
 static inline bool sched_energy_freq(void)
 {