diff mbox

[213/474] x86, hotplug: Use mwait to offline a processor, fix the legacy case

Message ID 20110317212903.394948496@clark.kroah.org (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Greg Kroah-Hartman March 17, 2011, 9:24 p.m. UTC
None

Comments

Jonathan Nieder Aug. 28, 2011, 6:40 p.m. UTC | #1
Hi,

On 2011-03-17, Greg KH wrote:

> 2.6.33-longterm review patch.  If anyone has any objections, please let us know.
[...]
> From: H. Peter Anvin <hpa@linux.intel.com>
>
> upstream ea53069231f9317062910d6e772cca4ce93de8c8
> x86, hotplug: Use mwait to offline a processor, fix the legacy case
>
> Here included also some small follow-on patches to the same code:
[...]
> https://bugzilla.kernel.org/show_bug.cgi?id=5471
>
> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
> Signed-off-by: Len Brown <len.brown@intel.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Sorry for the sloow response.  Unfortunately this is reported to have
broken hibernation on two machines: an EeePC 1002HA and an Ideapad S10-3.
Frédéric writes:

| I've observed that when resuming from hibernation, sometimes my computer was
| returning to Grub2's menu (without any error message), and I had to do a
| "normal" boot on my Debian Squeeze system (with file systems corrections),
| loosing my hibernation's state.
|   The fail isn't automatic, but seems to happen more frequently after a while
| my computer was disconnected from AC and battery.

Noticed on Debian squeeze (which is based on v2.6.32.y), confirmed
with unpatched v2.6.32.45 and v2.6.33.18.  Backing out the patch
mentioned above avoids trouble.  A newer kernel (based on v2.6.39.y)
does _not_ exhibit the same problem, so it looks like the problem was
introduced in backporting.  http://bugs.debian.org/622259 has
details.

Ideas?

Thanks,
Jonathan
Greg KH Aug. 30, 2011, 10:34 p.m. UTC | #2
On Sun, Aug 28, 2011 at 01:40:04PM -0500, Jonathan Nieder wrote:
> Hi,
> 
> On 2011-03-17, Greg KH wrote:
> 
> > 2.6.33-longterm review patch.  If anyone has any objections, please let us know.
> [...]
> > From: H. Peter Anvin <hpa@linux.intel.com>
> >
> > upstream ea53069231f9317062910d6e772cca4ce93de8c8
> > x86, hotplug: Use mwait to offline a processor, fix the legacy case
> >
> > Here included also some small follow-on patches to the same code:
> [...]
> > https://bugzilla.kernel.org/show_bug.cgi?id=5471
> >
> > Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
> > Signed-off-by: Len Brown <len.brown@intel.com>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
> 
> Sorry for the sloow response.  Unfortunately this is reported to have
> broken hibernation on two machines: an EeePC 1002HA and an Ideapad S10-3.
> Frédéric writes:
> 
> | I've observed that when resuming from hibernation, sometimes my computer was
> | returning to Grub2's menu (without any error message), and I had to do a
> | "normal" boot on my Debian Squeeze system (with file systems corrections),
> | loosing my hibernation's state.
> |   The fail isn't automatic, but seems to happen more frequently after a while
> | my computer was disconnected from AC and battery.
> 
> Noticed on Debian squeeze (which is based on v2.6.32.y), confirmed
> with unpatched v2.6.32.45 and v2.6.33.18.  Backing out the patch
> mentioned above avoids trouble.  A newer kernel (based on v2.6.39.y)
> does _not_ exhibit the same problem, so it looks like the problem was
> introduced in backporting.  http://bugs.debian.org/622259 has
> details.

Thanks for letting me know, I'll go revert that patch for the next .32
and .33 longterm releases and let everyone else work out exactly what
the problem is, if they want to.

greg k-h
diff mbox

Patch

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -767,29 +767,6 @@  extern unsigned long		boot_option_idle_o
 extern unsigned long		idle_halt;
 extern unsigned long		idle_nomwait;
 
-/*
- * on systems with caches, caches must be flashed as the absolute
- * last instruction before going into a suspended halt.  Otherwise,
- * dirty data can linger in the cache and become stale on resume,
- * leading to strange errors.
- *
- * perform a variety of operations to guarantee that the compiler
- * will not reorder instructions.  wbinvd itself is serializing
- * so the processor will not reorder.
- *
- * Systems without cache can just go into halt.
- */
-static inline void wbinvd_halt(void)
-{
-	mb();
-	/* check for clflush to determine if wbinvd is legal */
-	if (cpu_has_clflush)
-		asm volatile("cli; wbinvd; 1: hlt; jmp 1b" : : : "memory");
-	else
-		while (1)
-			halt();
-}
-
 extern void enable_sep_cpu(void);
 extern int sysenter_setup(void);
 
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1329,11 +1329,94 @@  void play_dead_common(void)
 	local_irq_disable();
 }
 
+#define MWAIT_SUBSTATE_MASK		0xf
+#define MWAIT_SUBSTATE_SIZE		4
+
+#define CPUID_MWAIT_LEAF		5
+#define CPUID5_ECX_EXTENSIONS_SUPPORTED 0x1
+
+/*
+ * We need to flush the caches before going to sleep, lest we have
+ * dirty data in our caches when we come back up.
+ */
+static inline void mwait_play_dead(void)
+{
+	unsigned int eax, ebx, ecx, edx;
+	unsigned int highest_cstate = 0;
+	unsigned int highest_subcstate = 0;
+	int i;
+	void *mwait_ptr;
+
+	if (!cpu_has(&current_cpu_data, X86_FEATURE_MWAIT))
+		return;
+	if (!cpu_has(&current_cpu_data, X86_FEATURE_CLFLSH))
+		return;
+	if (current_cpu_data.cpuid_level < CPUID_MWAIT_LEAF)
+		return;
+
+	eax = CPUID_MWAIT_LEAF;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+
+	/*
+	 * eax will be 0 if EDX enumeration is not valid.
+	 * Initialized below to cstate, sub_cstate value when EDX is valid.
+	 */
+	if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED)) {
+		eax = 0;
+	} else {
+		edx >>= MWAIT_SUBSTATE_SIZE;
+		for (i = 0; i < 7 && edx; i++, edx >>= MWAIT_SUBSTATE_SIZE) {
+			if (edx & MWAIT_SUBSTATE_MASK) {
+				highest_cstate = i;
+				highest_subcstate = edx & MWAIT_SUBSTATE_MASK;
+			}
+		}
+		eax = (highest_cstate << MWAIT_SUBSTATE_SIZE) |
+			(highest_subcstate - 1);
+	}
+
+	/*
+	 * This should be a memory location in a cache line which is
+	 * unlikely to be touched by other processors.  The actual
+	 * content is immaterial as it is not actually modified in any way.
+	 */
+	mwait_ptr = &current_thread_info()->flags;
+
+	wbinvd();
+
+	while (1) {
+		/*
+		 * The CLFLUSH is a workaround for erratum AAI65 for
+		 * the Xeon 7400 series.  It's not clear it is actually
+		 * needed, but it should be harmless in either case.
+		 * The WBINVD is insufficient due to the spurious-wakeup
+		 * case where we return around the loop.
+		 */
+		clflush(mwait_ptr);
+		__monitor(mwait_ptr, 0, 0);
+		mb();
+		__mwait(eax, 0);
+	}
+}
+
+static inline void hlt_play_dead(void)
+{
+	if (current_cpu_data.x86 >= 4)
+		wbinvd();
+
+	while (1) {
+		native_halt();
+	}
+}
+
 void native_play_dead(void)
 {
 	play_dead_common();
 	tboot_shutdown(TB_SHUTDOWN_WFS);
-	wbinvd_halt();
+
+	mwait_play_dead();	/* Only returns on failure */
+	hlt_play_dead();
 }
 
 #else /* ... !CONFIG_HOTPLUG_CPU */