Message ID | 1302200311-24263-1-git-send-email-khilman@ti.com (mailing list archive) |
---|---|
State | New, archived |
On 4/7/2011 11:18 AM, Kevin Hilman wrote:
> From: Nicole Chalhoub <n-chalhoub@ti.com>
>
> While there is CPU load, continue the periodic tick in order to give
> CPUidle another opportunity to pick a deeper C-state instead of
> spending potentially long i

so I don't really like this patch. It's actually a pretty bad hack
(I'm sure it'll work somewhat)
[and I mean that in the most positive sense of the word ;-) ]

what we really need instead, and this is inside cpuidle, is the option
to set a timer when we enter the non-deepest C state,
so that if that timer fires we then reevaluate.
The duration of that timer will be dependent on the C state (so should
come from the C state structure of the state we pick).

For the most shallow one this will be a relatively short time, but for
the deepest-but-one this might be a lot longer time.

your patch abuses a completely different, unrelated timer for this,
with a pretty much unspecified frequency, that also has other side
effects that we probably don't want.

it shouldn't be hard to do the right thing instead and make it a
separate timer with a per C state timeout.

(and I would say a default timeout of 10x the break even time that we
already have in the structure)

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
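The per-C-state timeout suggested in this reply can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct idle_state`, `REEVAL_FACTOR`, and `reeval_timeout_us()` are hypothetical names, not the kernel's cpuidle structures or API, and the factor of 10 is the default proposed in the message above.

```c
/*
 * Hypothetical, simplified stand-in for a C-state description.
 * This is NOT the kernel's struct cpuidle_state.
 */
struct idle_state {
	const char *name;
	unsigned int break_even_us;	/* residency needed for the state to pay off */
};

/* Default multiplier proposed in the thread (assumption: applied uniformly). */
#define REEVAL_FACTOR 10u

/*
 * Return the re-evaluation timeout in microseconds for the state at
 * index idx. States are assumed to be ordered from shallowest to
 * deepest. The deepest state gets 0, meaning "no timer": there is
 * nothing deeper to re-evaluate into. Invalid indices also return 0.
 */
unsigned int reeval_timeout_us(const struct idle_state *states, int count, int idx)
{
	if (idx < 0 || idx >= count - 1)
		return 0;
	return states[idx].break_even_us * REEVAL_FACTOR;
}
```

With made-up break-even times of 10 us, 100 us, and 5000 us for three states, the two shallower states would be re-evaluated after 100 us and 1000 us respectively, and the deepest state would arm no timer.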
Hi Arjan,

Arjan van de Ven <arjan@linux.intel.com> writes:

> On 4/7/2011 11:18 AM, Kevin Hilman wrote:
>> From: Nicole Chalhoub <n-chalhoub@ti.com>
>>
>> While there is CPU load, continue the periodic tick in order to give
>> CPUidle another opportunity to pick a deeper C-state instead of
>> spending potentially long i
>
>
> so I don't really like this patch. It's actually a pretty bad hack
> (I'm sure it'll work somewhat)
> [and I mean that in the most positive sense of the word ;-) ]

I'll take it as a compliment then. :)

I agree though, it did feel somewhat like we were attempting to fix the
problem in the wrong place.

> what we really need instead, and this is inside cpuidle, is the option
> to set a timer when we enter the non-deepest C state,
> so that if that timer fires we then reevaluate.
> The duration of that timer will be dependent on the C state (so should
> come from the C state structure of the state we pick).

OK, this sounds like a good idea. Will experiment.

Of course, setting new timers can affect the governor's decision. To
avoid that, I guess this timer will need to be one-shot, and only set
after the CPUidle governor has made a decision; otherwise that timer
itself will affect tick_nohz_get_sleep_length(), which the governor uses
to pick a C-state.

> For the most shallow one this will be a relatively short time, but for
> the deepest-but-one this might be a lot longer time.
>
> your patch abuses a completely different, unrelated timer for this,
> with a pretty much unspecified frequency, that also has other side
> effects that we probably don't want.

What side effects come to mind?

The only side effects that I could think of were (potentially) unwanted
wakeups from C1. However, since C1 is presumably cheap to enter (and
exit), it seemed like a worthwhile cost since you're almost certain to
pick a deeper C state after wakeup.

That being said, your idea of a per C-state timer is much better than
relying on the scheduler tick. On most ARM systems, HZ is still pretty
low (around 100), so the time between ticks is relatively long, but on a
HZ=1000 setup, I could see the extra wakeups having a penalty of their
own.

> it shouldn't be hard to do the right thing instead and make it a
> separate timer with a per C state timeout.

Agreed. Will give it a try.

> (and I would say a default timeout of 10x the break even time that we
> already have in the structure)

OK.

Thanks for the review and suggestions,

Kevin
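The ordering constraint discussed in this reply (pick the C-state first, arm the one-shot timer afterwards) and the HZ arithmetic can be sketched as follows. This is a userspace illustration under assumptions, not kernel code: `struct cstate`, `struct idle_plan`, `tick_period_us()`, `pick_state()`, and `plan_idle()` are hypothetical names, the selection rule is a deliberately naive stand-in for a real governor, and the 10x factor is the default suggested earlier in the thread.

```c
/* Hypothetical stand-in for a C-state; not the kernel's struct cpuidle_state. */
struct cstate {
	unsigned int break_even_us;
};

/* Result of one idle-entry decision in this sketch. */
struct idle_plan {
	int state;			/* index of the chosen state */
	unsigned int wake_after_us;	/* when the CPU is next woken */
};

/* Period of the periodic tick in microseconds for a given HZ value. */
unsigned int tick_period_us(unsigned int hz)
{
	return 1000000u / hz;
}

/*
 * Naive governor: choose the deepest state whose break-even time fits
 * in the predicted sleep length. States are ordered shallow to deep;
 * state 0 is the fallback.
 */
int pick_state(const struct cstate *states, int count, unsigned int sleep_us)
{
	int i, best = 0;

	for (i = 1; i < count; i++)
		if (states[i].break_even_us <= sleep_us)
			best = i;
	return best;
}

struct idle_plan plan_idle(const struct cstate *states, int count,
			   unsigned int sleep_us)
{
	struct idle_plan plan;
	unsigned int timeout;

	/* Step 1: decide from the sleep length as seen BEFORE any
	 * re-evaluation timer exists, so the timer cannot skew the choice. */
	plan.state = pick_state(states, count, sleep_us);
	plan.wake_after_us = sleep_us;

	/* Step 2: only now arm the one-shot timer, and only when a
	 * deeper state remains to re-evaluate into. */
	if (plan.state < count - 1) {
		timeout = states[plan.state].break_even_us * 10u;
		if (timeout < sleep_us)
			plan.wake_after_us = timeout;
	}
	return plan;
}
```

For comparison, `tick_period_us(100)` is 10000 us and `tick_period_us(1000)` is 1000 us, which is the gap between the HZ=100 and HZ=1000 cases mentioned above. A per-state timeout decouples the re-evaluation interval from HZ entirely.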
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index d5097c4..418066c 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -324,7 +324,7 @@ void tick_nohz_stop_sched_tick(int inidle)
 	} while (read_seqretry(&xtime_lock, seq));
 
 	if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
-	    arch_needs_cpu(cpu)) {
+	    arch_needs_cpu(cpu) || this_cpu_load()) {
 		next_jiffies = last_jiffies + 1;
 		delta_jiffies = 1;
 	} else {