Message ID | 1313766244-22313-3-git-send-email-tom.leiming@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, 2011/8/20 Thomas Renninger <trenn@suse.de>: > On Friday, August 19, 2011 05:04:04 PM tom.leiming@gmail.com wrote: >> From: Ming Lei <tom.leiming@gmail.com> >> >> This patch removes the 'cpu_id' parameter of the cpu_idle >> trace point, based on the ideas below: >> >> - the cpu_id which is passed to trace point is always the current >> cpu > Are you sure this will always be true? It is sure at least now, the only place to pass 'dev->cpu' is inside cpuidle_idle_call, but the cpuidle_device is got from __this_cpu_read(cpuidle_devices), so it should be true for such case. > >> - the current cpu info has been included into the trace result >> already >> - smp_processor_id() can't be used safely in preemptible context. > > The cpuid has been added to idle events on purpose for example to be > able to pass C-state dependencies. > 2 cores may only enter a deeper sleep state if the 2nd core enters it > as well. > There may be architectures where you could trigger a sleep state on > a foreign cpu. > > This may not be used now, but if the kernel does want to use it, you do > not want to adjust a dozen userspace apps. > > Not sure how to quickly solve the: > "smp_processor_id() can't be used safely in preemptible context." > problem, though. > A convention could be declared that if -1 is passed, it's the same cpu > entering the sleep state as the running one. This will probably > break userspace apps as well... > > Best would be to clean up x86 and let idle routines always be entered > from cpuidle subsystem which knows the cpu id already and eliminate > all idle functions in arch/x86/kernel/process.c. > > Thomas > > PS: I do not care about patch 1 and 2 as these events are ARM specific > afaik. But I vote against this one. > thanks,
On Saturday, August 20, 2011 04:40:09 AM Ming Lei wrote: > Hi, > > 2011/8/20 Thomas Renninger <trenn@suse.de>: > > On Friday, August 19, 2011 05:04:04 PM tom.leiming@gmail.com wrote: > >> From: Ming Lei <tom.leiming@gmail.com> > >> > >> This patch removes the 'cpu_id' parameter of the cpu_idle > >> trace point, based on the ideas below: > >> > >> - the cpu_id which is passed to trace point is always the current > >> cpu > > Are you sure this will always be true? > > It is sure at least now, the only place to pass 'dev->cpu' is inside > cpuidle_idle_call, It was known that cpu_id is always the current cpu with current implementation when this got introduced. But the perf events API must not change back and forth for userspace compatibility. Therefore the cpu_id was added in case that future implementations want to pass info where the current cpu is not the cpu which is sent to the sleep state. > smp_processor_id() can't be used safely in preemptible context. I expect the only side effect that could happen is that if smp_process_id is interrupted you get the wrong core id on a cpu idle trace event. This only happens if cpuidle is not used and even then should happen very rarely, nothing to worry for a debug tool like that. And it should get fixed if these idle functions get fully integrated into cpuidle at some point of time. Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Ming Lei, Thomas, Sorry if it is a bit late to jump in. On Mon, Aug 22, 2011 at 10:27 AM, Thomas Renninger <trenn@suse.de> wrote: > On Saturday, August 20, 2011 04:40:09 AM Ming Lei wrote: >> Hi, >> >> 2011/8/20 Thomas Renninger <trenn@suse.de>: >> > On Friday, August 19, 2011 05:04:04 PM tom.leiming@gmail.com wrote: >> >> From: Ming Lei <tom.leiming@gmail.com> >> >> >> >> This patch removes the 'cpu_id' parameter of the cpu_idle >> >> trace point, based on the ideas below: >> >> >> >> - the cpu_id which is passed to trace point is always the current >> >> cpu >> > Are you sure this will always be true? >> >> It is sure at least now, the only place to pass 'dev->cpu' is inside >> cpuidle_idle_call, > It was known that cpu_id is always the current cpu with current > implementation when this got introduced. > But the perf events API must not change back and forth for userspace > compatibility. Therefore the cpu_id was added in case > that future implementations want to pass info where the current cpu > is not the cpu which is sent to the sleep state. Agree. Let's keep the cpu_id field. > >> smp_processor_id() can't be used safely in preemptible context. > I expect the only side effect that could happen is that if smp_process_id > is interrupted you get the wrong core id on a cpu idle trace event. > This only happens if cpuidle is not used and even then should happen > very rarely, nothing to worry for a debug tool like that. > And it should get fixed if these idle functions get fully integrated into > cpuidle at some point of time. > > Thomas > Regards, Jean -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, On Fri, Sep 2, 2011 at 3:26 PM, Jean Pihet <jean.pihet@newoldbits.com> wrote: >> It was known that cpu_id is always the current cpu with current >> implementation when this got introduced. >> But the perf events API must not change back and forth for userspace >> compatibility. Therefore the cpu_id was added in case >> that future implementations want to pass info where the current cpu >> is not the cpu which is sent to the sleep state. > Agree. Let's keep the cpu_id field. OK, let's keep it. How about removing it in clock_enalbe/clock_disable/power_domain_target as did in [1/3] and [2/3]? I don't see any usefulness of 'cpu_id' in the tree trace points. thanks,
diff --git a/Documentation/trace/events-power.txt b/Documentation/trace/events-power.txt index e9d5fe3..9f2f96c 100644 --- a/Documentation/trace/events-power.txt +++ b/Documentation/trace/events-power.txt @@ -23,7 +23,7 @@ Cf. include/trace/events/power.h for the events definitions. A 'cpu' event class gathers the CPU-related events: cpuidle and cpufreq. -cpu_idle "state=%lu cpu_id=%lu" +cpu_idle "state=%lu" cpu_frequency "state=%lu cpu_id=%lu" A suspend event is used to indicate the system going in and out of the @@ -33,8 +33,8 @@ machine_suspend "state=%lu" Note: the value of '-1' or '4294967295' for state means an exit from the current state, -i.e. trace_cpu_idle(4, smp_processor_id()) means that the system -enters the idle state 4, while trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id()) +i.e. trace_cpu_idle(4) means that the system +enters the idle state 4, while trace_cpu_idle(PWR_EVENT_EXIT) means that the system exits the previous idle state. The event which has 'state=4294967295' in the trace is very important to the user diff --git a/arch/arm/mach-omap2/pm34xx.c b/arch/arm/mach-omap2/pm34xx.c index 7255d9b..cde9a11 100644 --- a/arch/arm/mach-omap2/pm34xx.c +++ b/arch/arm/mach-omap2/pm34xx.c @@ -502,12 +502,12 @@ static void omap3_pm_idle(void) goto out; trace_power_start(POWER_CSTATE, 1, smp_processor_id()); - trace_cpu_idle(1, smp_processor_id()); + trace_cpu_idle(1); omap_sram_idle(); trace_power_end(smp_processor_id()); - trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id()); + trace_cpu_idle(PWR_EVENT_EXIT); out: local_fiq_enable(); diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index e7e3b01..fb9e92b 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -378,7 +378,7 @@ void default_idle(void) { if (hlt_use_halt()) { trace_power_start(POWER_CSTATE, 1, smp_processor_id()); - trace_cpu_idle(1, smp_processor_id()); + trace_cpu_idle(1); current_thread_info()->status &= ~TS_POLLING; /* * TS_POLLING-cleared state must be visible before we @@ -392,7 +392,7 @@ void default_idle(void) local_irq_enable(); current_thread_info()->status |= TS_POLLING; trace_power_end(smp_processor_id()); - trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id()); + trace_cpu_idle(PWR_EVENT_EXIT); } else { local_irq_enable(); /* loop is done by the caller */ @@ -443,7 +443,7 @@ static void mwait_idle(void) { if (!need_resched()) { trace_power_start(POWER_CSTATE, 1, smp_processor_id()); - trace_cpu_idle(1, smp_processor_id()); + trace_cpu_idle(1); if (this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) clflush((void *)¤t_thread_info()->flags); @@ -454,7 +454,7 @@ static void mwait_idle(void) else local_irq_enable(); trace_power_end(smp_processor_id()); - trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id()); + trace_cpu_idle(PWR_EVENT_EXIT); } else local_irq_enable(); } @@ -467,12 +467,12 @@ static void mwait_idle(void) static void poll_idle(void) { trace_power_start(POWER_CSTATE, 0, smp_processor_id()); - trace_cpu_idle(0, smp_processor_id()); + trace_cpu_idle(0); local_irq_enable(); while (!need_resched()) cpu_relax(); trace_power_end(smp_processor_id()); - trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id()); + trace_cpu_idle(PWR_EVENT_EXIT); } /* diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index d4c5423..7980732 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -106,12 +106,12 @@ int cpuidle_idle_call(void) dev->last_state = target_state; trace_power_start(POWER_CSTATE, next_state, dev->cpu); - trace_cpu_idle(next_state, dev->cpu); + trace_cpu_idle(next_state); dev->last_residency = target_state->enter(dev, target_state); trace_power_end(dev->cpu); - trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu); + trace_cpu_idle(PWR_EVENT_EXIT); if (dev->last_state) target_state = dev->last_state; diff --git a/include/trace/events/power.h b/include/trace/events/power.h index 3878edc..8a579bd 100644 --- a/include/trace/events/power.h +++ b/include/trace/events/power.h @@ -27,11 +27,20 @@ DECLARE_EVENT_CLASS(cpu, (unsigned long)__entry->cpu_id) ); -DEFINE_EVENT(cpu, cpu_idle, +TRACE_EVENT(cpu_idle, + TP_PROTO(unsigned int state), - TP_PROTO(unsigned int state, unsigned int cpu_id), + TP_ARGS(state), + + TP_STRUCT__entry( + __field( u32, state ) + ), + + TP_fast_assign( + __entry->state = state; + ), - TP_ARGS(state, cpu_id) + TP_printk("state=%lu", (unsigned long)__entry->state) ); /* This file can get included multiple times, TRACE_HEADER_MULTI_READ at top */