Message ID | 20230306123418.720679-4-dedekind1@gmail.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | Sapphire Rapids C0.x idle states support | expand |
On Mon, Mar 06, 2023 at 02:34:18PM +0200, Artem Bityutskiy wrote: > @@ -225,6 +229,27 @@ static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev, > return 0; > } > > +/** > + * intel_idle_umwait_irq - Request C0.x using the 'umwait' instruction. > + * @dev: cpuidle device of the target CPU. > + * @drv: cpuidle driver (assumed to point to intel_idle_driver). > + * @index: Target idle state index. > + * > + * Request C0.1 or C0.2 using 'umwait' instruction with interrupts enabled. > + */ > +static __cpuidle int intel_idle_umwait_irq(struct cpuidle_device *dev, > + struct cpuidle_driver *drv, > + int index) > +{ > + u32 state = flg2MWAIT(drv->states[index].flags); > + > + raw_local_irq_enable(); > + umwait_idle(rdtsc() + umwait_limit, state); > + local_irq_disable(); > + > + return index; > +} Bad copy paste job there... vmlinux.o: warning: objtool: intel_idle_umwait_irq+0x8c: call to trace_hardirqs_off() leaves .noinstr.text section
On Mon, Mar 6, 2023 at 4:32 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Mon, Mar 06, 2023 at 02:34:18PM +0200, Artem Bityutskiy wrote: > > @@ -225,6 +229,27 @@ static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev, > > return 0; > > } > > > > +/** > > + * intel_idle_umwait_irq - Request C0.x using the 'umwait' instruction. > > + * @dev: cpuidle device of the target CPU. > > + * @drv: cpuidle driver (assumed to point to intel_idle_driver). > > + * @index: Target idle state index. > > + * > > + * Request C0.1 or C0.2 using 'umwait' instruction with interrupts enabled. > > + */ > > +static __cpuidle int intel_idle_umwait_irq(struct cpuidle_device *dev, > > + struct cpuidle_driver *drv, > > + int index) > > +{ > > + u32 state = flg2MWAIT(drv->states[index].flags); > > + > > + raw_local_irq_enable(); > > + umwait_idle(rdtsc() + umwait_limit, state); > > + local_irq_disable(); > > + > > + return index; > > +} > > Bad copy paste job there... > > vmlinux.o: warning: objtool: intel_idle_umwait_irq+0x8c: call to trace_hardirqs_off() leaves .noinstr.text section Well, it would be kind of nice to say that this is related to commit 6d9c7f51b1d9 ("cpuidle, intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE *again*") that is present in 6.3-rc1.
On Tue, Mar 07, 2023 at 01:39:09PM +0100, Rafael J. Wysocki wrote: > On Mon, Mar 6, 2023 at 4:32 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Mon, Mar 06, 2023 at 02:34:18PM +0200, Artem Bityutskiy wrote: > > > @@ -225,6 +229,27 @@ static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev, > > > return 0; > > > } > > > > > > +/** > > > + * intel_idle_umwait_irq - Request C0.x using the 'umwait' instruction. > > > + * @dev: cpuidle device of the target CPU. > > > + * @drv: cpuidle driver (assumed to point to intel_idle_driver). > > > + * @index: Target idle state index. > > > + * > > > + * Request C0.1 or C0.2 using 'umwait' instruction with interrupts enabled. > > > + */ > > > +static __cpuidle int intel_idle_umwait_irq(struct cpuidle_device *dev, > > > + struct cpuidle_driver *drv, > > > + int index) > > > +{ > > > + u32 state = flg2MWAIT(drv->states[index].flags); > > > + > > > + raw_local_irq_enable(); > > > + umwait_idle(rdtsc() + umwait_limit, state); > > > + local_irq_disable(); > > > + > > > + return index; > > > +} > > > > Bad copy paste job there... > > > > vmlinux.o: warning: objtool: intel_idle_umwait_irq+0x8c: call to trace_hardirqs_off() leaves .noinstr.text section > > Well, it would be kind of nice to say that this is related to commit > 6d9c7f51b1d9 ("cpuidle, intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE > *again*") that is present in 6.3-rc1. Right, but he said the patches were based on -next, which would've had that commit for a fair while too.
On Wed, 2023-03-08 at 13:32 +0100, Peter Zijlstra wrote: > > Well, it would be kind of nice to say that this is related to commit > > 6d9c7f51b1d9 ("cpuidle, intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE > > *again*") that is present in 6.3-rc1. > > Right, but he said the patches were based on -next, which would've had > that commit for a fair while too. I can see what is the problem from the above mentioned commit. But I struggle to reproduce it. I tried 'make W=1', and 'tools/objtool/objtool -n vmlinux.o'. I have 'CONFIG_HAVE_NOINSTR_VALIDATION' too. I also tries this: 1. Build kernel with this patch-set 2. tools/objtool/objtool -n vmlinux.o > ~/tmp/before.txt 2>&1 3. Make clean; Build kernel with a fix (local_irq_disable() -> raw_local_irq_disable()) 4. tools/objtool/objtool -n vmlinux.o > ~/tmp/after.txt 2>&1 5. diff -u ~/tmp/before.txt ~/tmp/after.txt And no difference. Could you please help by giving a hint how to verify?
On Thu, Mar 09, 2023 at 10:01:57AM +0200, Artem Bityutskiy wrote:
> Could you please help by giving a hint how to verify?
I think you need both:
DEBUG_ENTRY and TRACE_IRQFLAGS
I can reproduce with:
make O=defconfig-build defconfig
cd defconfig-build
../scripts/config --enable INTEL_IDLE
../scripts/config --enable DEBUG_ENTRY
../scripts/config --enable IRQSOFF_TRACER
cd ..
make O=defconfig-build -j32 vmlinux
But typically it is useful to have a .config with lots of tracing and
debug (including lockdep and debug_entry) enabled for this stuff.
On Tue, 2023-03-14 at 13:24 +0100, Peter Zijlstra wrote: > ../scripts/config --enable INTEL_IDLE > ../scripts/config --enable DEBUG_ENTRY > ../scripts/config --enable IRQSOFF_TRACER Thank you! I've checked that v2 of these patches addresses this issue. I've added these to my defconfig too. Artem.
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 938c17f25d94..f7705a64d0e6 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -51,11 +51,13 @@ #include <linux/notifier.h> #include <linux/cpu.h> #include <linux/moduleparam.h> +#include <linux/units.h> #include <asm/cpu_device_id.h> #include <asm/intel-family.h> #include <asm/nospec-branch.h> #include <asm/mwait.h> #include <asm/msr.h> +#include <asm/tsc.h> #include <asm/fpu/api.h> #define INTEL_IDLE_VERSION "0.5.1" @@ -73,6 +75,8 @@ static struct cpuidle_device __percpu *intel_idle_cpuidle_devices; static unsigned long auto_demotion_disable_flags; +static u64 umwait_limit; + static enum { C1E_PROMOTION_PRESERVE, C1E_PROMOTION_ENABLE, @@ -225,6 +229,27 @@ static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev, return 0; } +/** + * intel_idle_umwait_irq - Request C0.x using the 'umwait' instruction. + * @dev: cpuidle device of the target CPU. + * @drv: cpuidle driver (assumed to point to intel_idle_driver). + * @index: Target idle state index. + * + * Request C0.1 or C0.2 using 'umwait' instruction with interrupts enabled. + */ +static __cpuidle int intel_idle_umwait_irq(struct cpuidle_device *dev, + struct cpuidle_driver *drv, + int index) +{ + u32 state = flg2MWAIT(drv->states[index].flags); + + raw_local_irq_enable(); + umwait_idle(rdtsc() + umwait_limit, state); + local_irq_disable(); + + return index; +} + /* * States are indexed by the cstate number, * which is also the index into the MWAIT hint array. @@ -968,6 +993,13 @@ static struct cpuidle_state adl_n_cstates[] __initdata = { }; static struct cpuidle_state spr_cstates[] __initdata = { + { + .name = "C0.2", + .desc = "UMWAIT C0.2", + .flags = MWAIT2flg(TPAUSE_C02_STATE) | CPUIDLE_FLAG_IRQ_ENABLE, + .exit_latency_ns = 100, + .target_residency_ns = 100, + .enter = &intel_idle_umwait_irq, }, { .name = "C1", .desc = "MWAIT 0x00", @@ -1894,7 +1926,8 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv) /* Structure copy. */ drv->states[drv->state_count] = cpuidle_state_table[cstate]; - if ((cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE) || force_irq_on) { + if (cpuidle_state_table[cstate].enter == intel_idle && + ((cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE) || force_irq_on)) { printk("intel_idle: forced intel_idle_irq for state %d\n", cstate); drv->states[drv->state_count].enter = intel_idle_irq; } @@ -1926,6 +1959,29 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv) } } +/** + * umwait_limit_init - initialize time limit value for 'umwait'. + * @drv: cpuidle driver structure to initialize. + * + * C0.1 and C0.2 (later C0.x) idle states are requested via the 'umwait' + * instruction. The 'umwait' instruction requires the "deadline" - the TSC + * counter value to break out of C0.x (unless it broke out because of an + * interrupt or some other event). + * + * The deadline is specified as an absolute TSC value, and it is calculated as + * current TSC value + 'umwait_limit'. This function initializes the + * 'umwait_limit' variable to count of cycles per tick. The motivation is: + * * the tick is not disabled for shallow states like C0.x so, so idle will + * not last longer than a tick anyway + * * limit idle time to give cpuidle a chance to re-evaluate its C-state + * selection decision and possibly select a deeper C-state. + */ +static void __init umwait_limit_init(void) +{ + umwait_limit = (u64)TICK_NSEC * tsc_khz; + do_div(umwait_limit, MICRO); +} + /** * intel_idle_cpuidle_driver_init - Create the list of available idle states. * @drv: cpuidle driver structure to initialize. @@ -1933,6 +1989,7 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv) static void __init intel_idle_cpuidle_driver_init(struct cpuidle_driver *drv) { cpuidle_poll_state_init(drv); + umwait_limit_init(); if (disabled_states_mask & BIT(0)) drv->states[0].flags |= CPUIDLE_FLAG_OFF;