Message ID | 1499189651-18797-2-git-send-email-patrick.bellasi@arm.com (mailing list archive) |
---|---|
State | Deferred |
On 04-07-17, 18:34, Patrick Bellasi wrote:
> In systems where multiple CPUs share the same frequency domain, a small
> workload on a CPU can still be subject to frequency spikes generated by
> the activation of the sugov kthread.
>
> Since the sugov kthread is a special RT task whose only goal is to
> activate a frequency transition, it does not make sense for it to bias
> schedutil's frequency selection policy.
>
> This patch uses the identity of the current task to silently ignore
> cpufreq_update_this_cpu() calls coming from the RT scheduler while
> the sugov kthread is running.
>
> Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-pm@vger.kernel.org
>
> ---
> Changes from v1:
> - move check before policy spinlock (JuriL)
> ---
>  kernel/sched/cpufreq_schedutil.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index c982dd0..eaba6d6 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -218,6 +218,10 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>  	unsigned int next_f;
>  	bool busy;
>
> +	/* Skip updates generated by sugov kthreads */
> +	if (unlikely(current == sg_policy->thread))
> +		return;
> +
>  	sugov_set_iowait_boost(sg_cpu, time, flags);
>  	sg_cpu->last_update = time;
>
> @@ -290,6 +294,10 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
>  	unsigned long util, max;
>  	unsigned int next_f;
>
> +	/* Skip updates generated by sugov kthreads */
> +	if (unlikely(current == sg_policy->thread))
> +		return;
> +
>  	sugov_get_util(&util, &max);

Yes, we discussed this last time as well (I looked again at those discussions and
am still a bit confused), but I wanted to clarify one more time.

After the 2nd patch of this series is applied, why will we still have this
problem? As we concluded last time, the problem wouldn't happen while the
sugov RT thread is running (hint: work_in_progress). And once the sugov
RT thread is gone, one of the other scheduling classes will take over and should
update the flag pretty quickly.

Are we worried about the time between the sugov RT thread finishing and the
CFS or IDLE sched class calling the util handler again? If yes, then we will still
have that problem for any normal RT/DL task. Isn't it?
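Viresh's hint about work_in_progress can be pictured with a small model. This is a sketch under assumptions, not the schedutil source: `policy_model`, `request_freq()` and `kthread_done()` are hypothetical names, and the only behaviour modelled is that a new frequency request is refused while an earlier one is still being carried out by the kthread.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the governor's per-policy state. */
struct policy_model {
	bool work_in_progress;   /* a frequency change is queued for the kthread */
	unsigned int next_freq;  /* last frequency handed to the kthread */
};

/*
 * Model of an update callback asking for a new frequency.
 * Returns true if the request was accepted and queued.
 */
bool request_freq(struct policy_model *p, unsigned int freq)
{
	if (p->work_in_progress)
		return false;        /* kthread still busy: request is ignored */

	p->next_freq = freq;
	p->work_in_progress = true;  /* cleared when the kthread finishes */
	return true;
}

/* Model of the kthread completing the queued frequency change. */
void kthread_done(struct policy_model *p)
{
	p->work_in_progress = false;
}
```

In this model a request that arrives between `request_freq()` and `kthread_done()` is dropped, so the kthread's own RT activity cannot change `next_freq` while it runs. The open question in the thread is what happens just after `kthread_done()`.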
On 05-Jul 10:30, Viresh Kumar wrote:
> Yes we discussed this last time as well (I looked again at those discussions and
> am still confused a bit), but wanted to clarify one more time.
>
> After the 2nd patch of this series is applied, why will we still have this
> problem? As we concluded it last time, the problem wouldn't happen until the
> time the sugov RT thread is running (Hint: work_in_progress). And once the sugov
> RT thread is gone, one of the other scheduling classes will take over and should
> update the flag pretty quickly.
>
> Are we worried about the time between the sugov RT thread finishes and when the
> CFS or IDLE sched class call the util handler again? If yes, then we will still
> have that problem for any normal RT/DL task. Isn't it ?

Yes, we are worried about that time: without this we can generate
spikes to the max OPP even when only relatively small FAIR tasks are
running.

The same problem is not there for the other "normal RT/DL" tasks, just
because for those tasks this is the expected behavior: we wanna go to
max.

On the contrary, the sugov kthread, although it is an RT task, is only
there to make the "machinery" work: it's an actuator. Thus, IMO it
makes no sense from a design standpoint for it to interfere whatsoever
with what the "machinery" is doing.

Finally, the second patch of this series fixes a kind-of symmetrical
issue: while this one avoids going to the max OPP, the next one avoids
staying at the max OPP once it is no longer needed.

Cheers
Patrick
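The spike Patrick describes can be reproduced in a toy model. The sketch below assumes, as a simplification of a shared frequency domain, that each CPU records the flags of its last update and that any CPU with a recorded RT flag forces the whole domain to the maximum frequency. All names (`domain_model`, `cpu_update()`, `FLAG_RT`) are hypothetical, the utilization-to-frequency mapping is only illustrative, and work_in_progress and rate limiting are left out on purpose.

```c
#include <stdbool.h>

#define NR_CPUS  2
#define FLAG_RT  0x1u   /* hypothetical: update triggered by an RT task */

struct cpu_model {
	unsigned int flags;   /* flags recorded by the last update on this CPU */
	unsigned long util;   /* utilization recorded by the last update (0..1024) */
};

struct domain_model {
	struct cpu_model cpu[NR_CPUS];
	unsigned int max_freq;
	unsigned int next_freq;
	bool skip_kthread_updates;   /* true models the check added by the patch */
};

/* Frequency for the domain: max_freq if any CPU recorded an RT flag. */
unsigned int pick_freq(const struct domain_model *d)
{
	unsigned long util = 0;

	for (int i = 0; i < NR_CPUS; i++) {
		if (d->cpu[i].flags & FLAG_RT)
			return d->max_freq;
		if (d->cpu[i].util > util)
			util = d->cpu[i].util;
	}
	/* Illustrative linear mapping of utilization onto frequency. */
	return (unsigned int)((unsigned long long)d->max_freq * util / 1024);
}

/* Update callback from CPU 'cpu'; from_kthread marks the governor kthread. */
void cpu_update(struct domain_model *d, int cpu, unsigned int flags,
		unsigned long util, bool from_kthread)
{
	if (d->skip_kthread_updates && from_kthread)
		return;   /* ignore the kthread's own RT activation */

	d->cpu[cpu].flags = flags;
	d->cpu[cpu].util = util;
	d->next_freq = pick_freq(d);
}
```

With `skip_kthread_updates` false, a small FAIR update on CPU 0, a kthread update on CPU 1 and another small FAIR update on CPU 0 leave `next_freq` at `max_freq`, because the RT flag recorded for CPU 1 is still there. With the flag true, the same sequence keeps `next_freq` at the utilization-based value.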
On 05-07-17, 12:38, Patrick Bellasi wrote:
> On 05-Jul 10:30, Viresh Kumar wrote:
> > Yes we discussed this last time as well (I looked again at those discussions and
> > am still confused a bit), but wanted to clarify one more time.
> >
> > After the 2nd patch of this series is applied, why will we still have this
> > problem? As we concluded it last time, the problem wouldn't happen until the
> > time the sugov RT thread is running (Hint: work_in_progress). And once the sugov
> > RT thread is gone, one of the other scheduling classes will take over and should
> > update the flag pretty quickly.
> >
> > Are we worried about the time between the sugov RT thread finishes and when the
> > CFS or IDLE sched class call the util handler again? If yes, then we will still
> > have that problem for any normal RT/DL task. Isn't it ?
>
> Yes, we are worried about that time,

But isn't that a very very small amount of time? i.e. as soon as the RT thread
is finished, we will select the next task from CFS or go to the IDLE class (of
course, if there is nothing left in DL/RT). And this should happen very quickly.
Are we sure we really see problems in that short time? Sure, it can happen, but
it looks like an extreme corner case, and I just wanted to check whether it
really happened for you after the 2nd patch.

> without this we can generate
> spikes to the max OPP even when only relatively small FAIR tasks are
> running.
>
> The same problem is not there for the other "normal RT/DL" tasks, just
> because for those tasks this is the expected behavior: we wanna go to
> max.

By "same problem" I meant that after the last RT task has finished and before
the pick_next_task of the IDLE_CLASS (or CFS) is called, we can still get a
callback into schedutil, and that may raise the frequency to MAX. It's a
similar kind of problem, but yes, we never wanted the freq to go to max for
the sugov thread.

> To the contrary the sugov kthread, although being a RT task, is just
> functional to the "machinery" to work, it's an actuator. Thus, IMO it
> makes no sense from a design standpoint for it to interfere whatsoever
> with what the "machinery" is doing.

I think everyone agrees on this. I was just exploring whether that can be
achieved without any special code like what this patch proposes.

I was wondering what will happen in a case where we have two RT tasks (one of
them is the sugov thread) and, when we land in schedutil, the current task is
sugov. With this patch we will not set the flag, but we actually have another
task which is RT.
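The corner case Viresh raises can be written down as a predicate. This is a sketch with hypothetical names (`task_model`, `rq_model`, `check_hides_rt_demand()`); it models only the identity check from the patch and a list of runnable tasks, nothing else from the scheduler.

```c
#include <stdbool.h>
#include <stddef.h>

struct task_model {
	bool is_rt;   /* task belongs to the RT class */
};

struct rq_model {
	const struct task_model *runnable[4];  /* tasks runnable on this CPU */
	size_t nr_runnable;
	const struct task_model *curr;          /* task currently on the CPU */
};

/* The check proposed by the patch: is the caller the governor kthread? */
bool update_skipped(const struct rq_model *rq, const struct task_model *kthread)
{
	return rq->curr == kthread;
}

/*
 * True when the update is skipped although another RT task, different
 * from the governor kthread, is runnable on the same CPU.
 */
bool check_hides_rt_demand(const struct rq_model *rq,
			   const struct task_model *kthread)
{
	if (!update_skipped(rq, kthread))
		return false;

	for (size_t i = 0; i < rq->nr_runnable; i++) {
		const struct task_model *t = rq->runnable[i];

		if (t != kthread && t->is_rt)
			return true;
	}
	return false;
}
```

The predicate is true only for the combination described above: the kthread is current and a second RT task is runnable. Whether a later callback on behalf of that second task would correct the missed update is not settled in this thread, and the model does not try to answer it.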
On Wednesday, July 05, 2017 12:38:34 PM Patrick Bellasi wrote:
> On 05-Jul 10:30, Viresh Kumar wrote:
> > Are we worried about the time between the sugov RT thread finishes and when the
> > CFS or IDLE sched class call the util handler again? If yes, then we will still
> > have that problem for any normal RT/DL task. Isn't it ?
>
> Yes, we are worried about that time, without this we can generate
> spikes to the max OPP even when only relatively small FAIR tasks are
> running.
>
> The same problem is not there for the other "normal RT/DL" tasks, just
> because for those tasks this is the expected behavior: we wanna go to
> max.
>
> To the contrary the sugov kthread, although being a RT task, is just
> functional to the "machinery" to work, it's an actuator. Thus, IMO it
> makes no sense from a design standpoint for it to interfere whatsoever
> with what the "machinery" is doing.

How is this related to Juri's series?

Thanks,
Rafael
On 07/04/2017 10:34 AM, Patrick Bellasi wrote:
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index c982dd0..eaba6d6 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -218,6 +218,10 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>  	unsigned int next_f;
>  	bool busy;
>
> +	/* Skip updates generated by sugov kthreads */
> +	if (unlikely(current == sg_policy->thread))
> +		return;
> +
>  	sugov_set_iowait_boost(sg_cpu, time, flags);
>  	sg_cpu->last_update = time;
>
> @@ -290,6 +294,10 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
>  	unsigned long util, max;
>  	unsigned int next_f;
>
> +	/* Skip updates generated by sugov kthreads */
> +	if (unlikely(current == sg_policy->thread))
> +		return;
> +

This seems super race-y, especially when combined with rate_limit_us.
Deciding not to update the frequency for a policy just because the callback
happened in the context of the kthread is not right, especially when it's
combined with the remote CPU callback patches Viresh is putting out (which
I think is a well-intended patch series).

-Saravana
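Saravana does not spell out the race. One possible reading, sketched below under stated assumptions, is that a callback dropped because it ran in the kthread's context is not remembered, so the request it carried has to wait for a later callback. The names (`gov_model`, `handle_callback()`) are hypothetical, `rate_limit_ns` stands in for the rate_limit_us tunable, and the model makes no claim about how often this happens in practice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified governor state for one policy. */
struct gov_model {
	uint64_t last_freq_update_ns;  /* time of the last accepted update */
	uint64_t rate_limit_ns;        /* minimum spacing between updates */
	unsigned int cur_freq;
};

/*
 * Model of an update callback. Returns true if the frequency request
 * was applied, false if it was dropped.
 */
bool handle_callback(struct gov_model *g, uint64_t now_ns,
		     unsigned int wanted_freq, bool in_kthread_context)
{
	/* Models the check from the patch: nothing is recorded. */
	if (in_kthread_context)
		return false;

	/* Rate limiting: too soon after the previous accepted update. */
	if (now_ns - g->last_freq_update_ns < g->rate_limit_ns)
		return false;

	g->cur_freq = wanted_freq;
	g->last_freq_update_ns = now_ns;
	return true;
}
```

In this model a request that arrives in kthread context is lost even when the rate-limit window has already elapsed, and the next accepted request starts a new window. If callbacks can also be delivered on behalf of other CPUs, as the remote callback series mentioned above suggests, the dropped request need not belong to the kthread at all; that last point is an inference, not something the thread confirms.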
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index c982dd0..eaba6d6 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -218,6 +218,10 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 	unsigned int next_f;
 	bool busy;
 
+	/* Skip updates generated by sugov kthreads */
+	if (unlikely(current == sg_policy->thread))
+		return;
+
 	sugov_set_iowait_boost(sg_cpu, time, flags);
 	sg_cpu->last_update = time;
 
@@ -290,6 +294,10 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 	unsigned long util, max;
 	unsigned int next_f;
 
+	/* Skip updates generated by sugov kthreads */
+	if (unlikely(current == sg_policy->thread))
+		return;
+
 	sugov_get_util(&util, &max);
 
 	raw_spin_lock(&sg_policy->update_lock);
In systems where multiple CPUs share the same frequency domain, a small
workload on a CPU can still be subject to frequency spikes generated by
the activation of the sugov kthread.

Since the sugov kthread is a special RT task whose only goal is to
activate a frequency transition, it does not make sense for it to bias
schedutil's frequency selection policy.

This patch uses the identity of the current task to silently ignore
cpufreq_update_this_cpu() calls coming from the RT scheduler while
the sugov kthread is running.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org

---
Changes from v1:
- move check before policy spinlock (JuriL)
---
 kernel/sched/cpufreq_schedutil.c | 8 ++++++++
 1 file changed, 8 insertions(+)