Message ID | 2db7bc79-4fe4-abf6-9e5a-83055af9a78a@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4] x86/mwait-idle: re-order state entry/exit code a little | expand |
On Fri, Feb 18, 2022 at 09:35:10AM +0100, Jan Beulich wrote: > The initial observation is that unlike the original ACPI idle driver we > have a 2nd cpu_is_haltable() in here. By making the actual state entry > conditional, the emitted trace records as well as the subsequent stats > update are at least misleading in case the state wasn't actually entered. > Hence they would want moving inside the conditional. At which point the > cpuidle_get_tick() invocations could (and hence should) move as well. > cstate_restore_tsc() also isn't needed if we didn't actually enter the > state. > > This leaves only the errata_c6_workaround() and lapic_timer_off() > invocations outside the conditional. As a result it looks easier to > drop the conditional (and come back in sync with the other driver again) > than to move almost everything into the conditional. > > While there also move the TRACE_6D() out of the IRQ-disabled region. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> > --- > Moving the TRACE_6D() may be a little controversial, as this could lead > to a sequence of trace records not actually representing the sequence of > events, in case further records get emitted by interrupt handlers. But > with us now conditionally enabling interrupts around MWAIT, that issue > exists already anyway. I think that's OK. We have to priority interrupt latency over trace readability. > Unlike said in the earlier outline of the alternative approach, > errata_c6_workaround() cannot be moved: cpu_has_pending_apic_eoi() needs > to be called when IRQs are already off. > --- > v4: Different approach (and title), as was previously outlined as an > alternative. > v3: Also move cstate_restore_tsc() invocation and split ones to > update_idle_stats(). > v2: New. > > --- a/xen/arch/x86/cpu/mwait-idle.c > +++ b/xen/arch/x86/cpu/mwait-idle.c > @@ -847,26 +847,25 @@ static void mwait_idle(void) > > update_last_cx_stat(power, cx, before); > > - if (cpu_is_haltable(cpu)) { > - if (cx->irq_enable_early) > - local_irq_enable(); > + if (cx->irq_enable_early) > + local_irq_enable(); Now that I look at this again, we need to be careful with the enabling interrupts and the interaction with errata_c6_workaround. Enabling interrupts here could change the result of the check for pending EOIs, and thus enter mwait with a condition that could trigger the erratas. Hopefully CPUIDLE_FLAG_IRQ_ENABLE is only set for C1. It might be prudent to only allow setting CPUIDLE_FLAG_IRQ_ENABLE for states <= 2. Thanks, Roger.
On 22.02.2022 10:08, Roger Pau Monné wrote: > On Fri, Feb 18, 2022 at 09:35:10AM +0100, Jan Beulich wrote: >> The initial observation is that unlike the original ACPI idle driver we >> have a 2nd cpu_is_haltable() in here. By making the actual state entry >> conditional, the emitted trace records as well as the subsequent stats >> update are at least misleading in case the state wasn't actually entered. >> Hence they would want moving inside the conditional. At which point the >> cpuidle_get_tick() invocations could (and hence should) move as well. >> cstate_restore_tsc() also isn't needed if we didn't actually enter the >> state. >> >> This leaves only the errata_c6_workaround() and lapic_timer_off() >> invocations outside the conditional. As a result it looks easier to >> drop the conditional (and come back in sync with the other driver again) >> than to move almost everything into the conditional. >> >> While there also move the TRACE_6D() out of the IRQ-disabled region. >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Thanks. >> --- a/xen/arch/x86/cpu/mwait-idle.c >> +++ b/xen/arch/x86/cpu/mwait-idle.c >> @@ -847,26 +847,25 @@ static void mwait_idle(void) >> >> update_last_cx_stat(power, cx, before); >> >> - if (cpu_is_haltable(cpu)) { >> - if (cx->irq_enable_early) >> - local_irq_enable(); >> + if (cx->irq_enable_early) >> + local_irq_enable(); > > Now that I look at this again, we need to be careful with the enabling > interrupts and the interaction with errata_c6_workaround. Enabling > interrupts here could change the result of the check for pending EOIs, > and thus enter mwait with a condition that could trigger the erratas. > Hopefully CPUIDLE_FLAG_IRQ_ENABLE is only set for C1. > > It might be prudent to only allow setting CPUIDLE_FLAG_IRQ_ENABLE for > states <= 2. Well, the justification for enabling was the low exit time. I don't expect states > 2 to satisfy this criteria, so I think we're okay without further precautions added right away. Jan
--- a/xen/arch/x86/cpu/mwait-idle.c +++ b/xen/arch/x86/cpu/mwait-idle.c @@ -847,26 +847,25 @@ static void mwait_idle(void) update_last_cx_stat(power, cx, before); - if (cpu_is_haltable(cpu)) { - if (cx->irq_enable_early) - local_irq_enable(); + if (cx->irq_enable_early) + local_irq_enable(); - mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK); + mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK); - local_irq_disable(); - } + local_irq_disable(); after = alternative_call(cpuidle_get_tick); cstate_restore_tsc(); trace_exit_reason(irq_traced); - TRACE_6D(TRC_PM_IDLE_EXIT, cx->type, after, - irq_traced[0], irq_traced[1], irq_traced[2], irq_traced[3]); /* Now back in C0. */ update_idle_stats(power, cx, before, after); local_irq_enable(); + TRACE_6D(TRC_PM_IDLE_EXIT, cx->type, after, + irq_traced[0], irq_traced[1], irq_traced[2], irq_traced[3]); + if (!(lapic_timer_reliable_states & (1 << cx->type))) lapic_timer_on();
The initial observation is that unlike the original ACPI idle driver we have a 2nd cpu_is_haltable() in here. By making the actual state entry conditional, the emitted trace records as well as the subsequent stats update are at least misleading in case the state wasn't actually entered. Hence they would want moving inside the conditional. At which point the cpuidle_get_tick() invocations could (and hence should) move as well. cstate_restore_tsc() also isn't needed if we didn't actually enter the state. This leaves only the errata_c6_workaround() and lapic_timer_off() invocations outside the conditional. As a result it looks easier to drop the conditional (and come back in sync with the other driver again) than to move almost everything into the conditional. While there also move the TRACE_6D() out of the IRQ-disabled region. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- Moving the TRACE_6D() may be a little controversial, as this could lead to a sequence of trace records not actually representing the sequence of events, in case further records get emitted by interrupt handlers. But with us now conditionally enabling interrupts around MWAIT, that issue exists already anyway. Unlike said in the earlier outline of the alternative approach, errata_c6_workaround() cannot be moved: cpu_has_pending_apic_eoi() needs to be called when IRQs are already off. --- v4: Different approach (and title), as was previously outlined as an alternative. v3: Also move cstate_restore_tsc() invocation and split ones to update_idle_stats(). v2: New.