[v3,0/3] SRF: Fix offline CPU preventing pc6 entry

Message ID	20241108122909.763663-1-patryk.wlazlyn@linux.intel.com (mailing list archive)
Headers	show Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1567C1E631B; Fri, 8 Nov 2024 12:29:26 +0000 (UTC) From: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael.j.wysocki@intel.com, len.brown@intel.com, artem.bityutskiy@linux.intel.com, dave.hansen@linux.intel.com, patryk.wlazlyn@linux.intel.com Subject: [PATCH v3 0/3] SRF: Fix offline CPU preventing pc6 entry Date: Fri, 8 Nov 2024 13:29:06 +0100 Message-ID: <20241108122909.763663-1-patryk.wlazlyn@linux.intel.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	SRF: Fix offline CPU preventing pc6 entry \| expand [v3,0/3] SRF: Fix offline CPU preventing pc6 entry [v3,1/3] x86/smp: Allow calling mwait_play_dead with arbitrary hint [v3,2/3] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead() [v3,3/3] intel_idle: Provide enter_dead() handler for SRF

Message ID

20241108122909.763663-1-patryk.wlazlyn@linux.intel.com (mailing list archive)

Headers

From: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org,
	linux-pm@vger.kernel.org,
	rafael.j.wysocki@intel.com,
	len.brown@intel.com,
	artem.bityutskiy@linux.intel.com,
	dave.hansen@linux.intel.com,
	patryk.wlazlyn@linux.intel.com
Subject: [PATCH v3 0/3] SRF: Fix offline CPU preventing pc6 entry
Date: Fri,  8 Nov 2024 13:29:06 +0100
Message-ID: <20241108122909.763663-1-patryk.wlazlyn@linux.intel.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

SRF: Fix offline CPU preventing pc6 entry | expand

Message

Patryk Wlazlyn Nov. 8, 2024, 12:29 p.m. UTC

Applied suggestions from Dave and Rafael.

Because we now call a smp function, we have to have ifdefs for
CONFIG_SMP or rely on mwait_play_dead_with_hint to return immediatelly
with an error on non-smp builds. I decided to do the later, but maybe we
should return -ENODEV (or other error constant) instead of 1. I am open
for suggestions.

Removing the existing "kexec hack" by bringing all offlined CPUs back
online before proceeding with the kexec would make it even simpler, but
I am not sure we can do that. It looks kind of obvious to me, but for
some reason the hack exist.

Changes since v2:
  The whole approach changed, but there main things are below.

  * Split mwait_play_dead (old code) into two parts:
    1. Computing mwait hint based on cpuid leaf 0x5
    2. Entering while(1) mwait loop with "kexec hack" handling

  * Prefer cpuidle_play_dead over mwait_play_dead by reordering calls in
    native_play_dead.

  * Add implementation for enter_dead() handler in intel_idle that calls
    executes old mwait_play_dead code, but with the mwait hint from the
    cpuidle state table.


Patryk Wlazlyn (3):
  x86/smp: Allow calling mwait_play_dead with arbitrary hint
  x86/smp native_play_dead: Prefer cpuidle_play_dead() over
    mwait_play_dead()
  intel_idle: Provide enter_dead() handler for SRF

 arch/x86/include/asm/smp.h |  6 ++++++
 arch/x86/kernel/smpboot.c  | 29 ++++++++++++++++++++---------
 drivers/idle/intel_idle.c  | 16 ++++++++++++++++
 3 files changed, 42 insertions(+), 9 deletions(-)

Comments

Dave Hansen Nov. 8, 2024, 4:22 p.m. UTC | #1

On 11/8/24 04:29, Patryk Wlazlyn wrote:
> Applied suggestions from Dave and Rafael.

The basic approach here is looking pretty sound, so thanks for that
Patryk.  I mostly have mechanical nits left.

Peter Zijlstra Nov. 12, 2024, 11:45 a.m. UTC | #2

On Fri, Nov 08, 2024 at 01:29:06PM +0100, Patryk Wlazlyn wrote:

> Removing the existing "kexec hack" by bringing all offlined CPUs back
> online before proceeding with the kexec would make it even simpler, but
> I am not sure we can do that. It looks kind of obvious to me, but for
> some reason the hack exist.

There's a comment there that explains why this is done. If you don't
understand this, then please don't touch this code.

Patryk Wlazlyn Nov. 12, 2024, 3:43 p.m. UTC | #3

> There's a comment there that explains why this is done. If you don't
> understand this, then please don't touch this code.

/*
 * Kexec is about to happen. Don't go back into mwait() as
 * the kexec kernel might overwrite text and data including
 * page tables and stack. So mwait() would resume when the
 * monitor cache line is written to and then the CPU goes
 * south due to overwritten text, page tables and stack.
 *
 * Note: This does _NOT_ protect against a stray MCE, NMI,
 * SMI. They will resume execution at the instruction
 * following the HLT instruction and run into the problem
 * which this is trying to prevent.
 */

If you are referring to this comment above, I do understand the need to
enter hlt loop before the kexec happens. I thought that I could bring
all of the offlined CPUs back online, effectively getting them out of
the mwait loop.

Thomas Gleixner Nov. 13, 2024, 1:19 a.m. UTC | #4

On Tue, Nov 12 2024 at 16:43, Patryk Wlazlyn wrote:
>> There's a comment there that explains why this is done. If you don't
>> understand this, then please don't touch this code.
>
> /*
>  * Kexec is about to happen. Don't go back into mwait() as
>  * the kexec kernel might overwrite text and data including
>  * page tables and stack. So mwait() would resume when the
>  * monitor cache line is written to and then the CPU goes
>  * south due to overwritten text, page tables and stack.
>  *
>  * Note: This does _NOT_ protect against a stray MCE, NMI,
>  * SMI. They will resume execution at the instruction
>  * following the HLT instruction and run into the problem
>  * which this is trying to prevent.
>  */
>
> If you are referring to this comment above, I do understand the need to
> enter hlt loop before the kexec happens. I thought that I could bring
> all of the offlined CPUs back online, effectively getting them out of
> the mwait loop.

That's not really working:

  1) Regular kexec offlines them again.

  2) Kexec in panic can't do any of that.

Thanks

        tglx

Patryk Wlazlyn Nov. 14, 2024, 5:13 p.m. UTC | #5

> That's not really working:
>
>   1) Regular kexec offlines them again.

But then, they execute hlt loop in the reboot vector right?
I think that's fine. We just don't want to be woken up from the mwait
when the RIP points to garbage.

>   2) Kexec in panic can't do any of that.

Does it make sense to change the kexec, so that every CPU is forced
into the reboot vector, which I believe happens for online CPUs?
We don't have to online them all the way. Maybe minimal bringup just
to be able to send them the IPI?

As a side note, It's not that important for this patchset. I think I
can make it work without simplifying the existing hack, but crossed my
mind at some point that maybe we could do that.