intel_idle: Don't use on Lenovo Ideapad S10-3t

Message ID 1446499360-31984-1-git-send-email-ville.syrjala@linux.intel.com (mailing list archive)
State Not Applicable, archived

Commit Message

Ville Syrjälä Nov. 2, 2015, 9:22 p.m. UTC
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
The two workarounds that seem to help are "intel_idle.max_cstate=0"
or "nohz=off highres=off".

At first glance quirk_tigerpoint_bm_sts() seemed promising, but
even when moved to early_resume it didn't do anything.

I have no idea what's wrong here, so let's just disable intel_idle
for these machines using a DMI match.

Cc: Len Brown <len.brown@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
If anyone has any better ideas, I can try out some patches.

 drivers/idle/intel_idle.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
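
For anyone reproducing this, one way to confirm that a workaround
parameter from the commit message actually reached the kernel is to look
for it in /proc/cmdline. A minimal sketch, assuming a POSIX shell; the
function name cmdline_has and its optional file argument are made up for
illustration and are not part of the patch:

```shell
# Sketch: succeed if the exact boot parameter appears on the kernel command line.
# cmdline_has and the optional second argument are hypothetical helpers.
cmdline_has() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"

    # Report "unknown" (2) rather than "absent" (1) if the file is unreadable.
    [ -r "$cmdline_file" ] || return 2

    # Split the command line into one word per line and look for an exact,
    # whole-line, fixed-string match.
    tr -s ' \t' '\n\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

Example use: cmdline_has intel_idle.max_cstate=0 && echo "workaround active"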

Comments

Len Brown Nov. 3, 2015, 3:04 a.m. UTC | #1
> Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
> The two workaround that seem to help are "intel_idle.max_cstate=0"
> or "nohz=off highres=off".
>
> At a first glance quirk_tigerpoint_bm_sts() seemed promising, but
> even when moved to early_resume it didn't do anything.
>
> I have no idea what's wrong here, so let's just disable intel_idle
> for these machines using a DMI match.

Ville,

It is great that several workarounds have been discovered.

But it would be better to get a good idea of the root-cause
before permanently ignoring the problem via a new
black-list in the upstream kernel.

Is it possible for you to file a bug at bugzilla.kernel.org
against Product: power-management; component: intel_idle?

In it, please put the following information.

If this is a regression, the oldest kernel that broke.

When booted with intel_idle, and then without:

dmesg | grep idle
grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
# turbostat --debug sleep 10 2> turbostat.out

# cd /sys/devices/system/clocksource/clocksource0
grep . available_clocksource current_clocksource


Other boot options to test:

maxcpus=1
nohpet

intel_idle.max_cstate=3
and if that fails
intel_idle.max_cstate=2
and if that fails
intel_idle.max_cstate=1

thanks,
-Len
Ville Syrjälä Nov. 4, 2015, 4:21 p.m. UTC | #2
On Tue, Nov 03, 2015 at 03:04:21AM +0000, Brown, Len wrote:
> > Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
> > The two workaround that seem to help are "intel_idle.max_cstate=0"
> > or "nohz=off highres=off".
> > 
> > At a first glance quirk_tigerpoint_bm_sts() seemed promising, but
> > even when moved to early_resume it didn't do anything.
> > 
> > I have no idea what's wrong here, so let's just disable intel_idle
> > for these machines using a DMI match.
> 
> Ville, 
> 
> It is great that several workarounds have been discovered.
> 
> But it would be better to get a good idea of the root-cause
> before permanently ignoring the problem via a new
> black-list in the upstream kernel.
> 
> Is it possible for you to file a bug at bugzilla.kernel.org
> against Product: power-management; component: intel_idle?

https://bugzilla.kernel.org/show_bug.cgi?id=107151

> 
> In it, please put the following information.
> 
> If this is a regression, the oldest kernel that broke.

I'll repeat the details here just in case people are too lazy to look at
the bug:

It seems intel_idle has always been flaky with this hardware. It used to
work a little better in the past, but not perfectly.

I tried to bisect the total breakage starting from v3.11, and found the
following commit to be at fault:
commit a8d46b9e4e487301affe84fa53de40b890898604
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Tue Sep 30 02:29:01 2014 +0200

    ACPI / sleep: Rework the handling of ACPI GPE wakeup from
    suspend-to-idle

Before that I observed three different behaviours for S3 resume:
a) sometimes it resumed OK
b) sometimes it resumed part way, but kbd/network etc. were still
   dead, but then pressing the power button made it finish the resume
   somehow
c) same as the previous, except the power button press also made it
   all the way to userspace and initiated a normal shutdown
The same kernel could exhibit both a) and b), or both a) and c).

After a8d46b9e4e48 it never resumes, no matter if I press the power
button or not.

> When booted with intel_idle, and then without:
> 
> dmesg | grep idle
> grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
> # turbostat --debug sleep 10 2> turbostat.out
> 
> # cd /sys/devices/system/clocksource/clocksource0
> grep . available_clocksource current_clocksource
> 
> 
> Other boot options to test:
> 
> maxcpus=1
> nohpet
> 
> intel_idle.max_cstate=3
> and if that fails
> intel_idle.max_cstate=2
> and if that fails
> intel_idle.max_cstate=1

None of these help.
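
The sysfs data requested earlier in the thread can be captured in one
pass, once per boot configuration. A rough sketch, assuming a POSIX
shell; collect_idle_info and its optional root argument are hypothetical
helpers, and the dmesg and turbostat steps (which typically need root)
are left out:

```shell
# Sketch: print the cpuidle and clocksource details asked for in this thread.
# The optional argument is a hypothetical override for the sysfs root, so the
# function can be pointed at a saved copy of the tree; it defaults to /sys.
collect_idle_info() {
    root="${1:-/sys}"

    echo "== cpuidle driver =="
    # Typically reports intel_idle, acpi_idle or none.
    cat "$root/devices/system/cpu/cpuidle/current_driver" 2>/dev/null \
        || echo "(no cpuidle driver file)"

    echo "== cpu0 cpuidle states =="
    # Same as: grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
    grep . "$root"/devices/system/cpu/cpu0/cpuidle/*/* 2>/dev/null \
        || echo "(no cpuidle states exposed)"

    echo "== clocksource =="
    ( cd "$root/devices/system/clocksource/clocksource0" 2>/dev/null \
        && grep . available_clocksource current_clocksource ) \
        || echo "(no clocksource info)"
}
```

Running "collect_idle_info > with-intel_idle.txt" on one boot and
"collect_idle_info > without-intel_idle.txt" on the other gives two files
that can be diffed or attached to the bug.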
Henrique de Moraes Holschuh Nov. 8, 2015, 7:20 p.m. UTC | #3
On Wed, 04 Nov 2015, Ville Syrjälä wrote:
> On Tue, Nov 03, 2015 at 03:04:21AM +0000, Brown, Len wrote:
> > > Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
> > > The two workaround that seem to help are "intel_idle.max_cstate=0"
> > > or "nohz=off highres=off".
> > > 
> > > At a first glance quirk_tigerpoint_bm_sts() seemed promising, but
> > > even when moved to early_resume it didn't do anything.
> > > 
> > > I have no idea what's wrong here, so let's just disable intel_idle
> > > for these machines using a DMI match.
> > 
> > Ville, 
> > 
> > It is great that several workarounds have been discovered.
> > 
> > But it would be better to get a good idea of the root-cause
> > before permanently ignoring the problem via a new
> > black-list in the upstream kernel.
> > 
> > Is it possible for you to file a bug at bugzilla.kernel.org
> > against Product: power-management; component: intel_idle?
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=107151

This is a shot in the dark, but...

The Atom N450 public errata sheet is nightmare fuel, and Ville's Ideapad is
running that hellish processor on outdated microcode.

http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/atom-processor-n400-specification-update.pdf

Ville, you might want to ensure you're at the latest BIOS for the S10-3t:
http://support.lenovo.com/us/en/products/laptops-and-netbooks/ideapad-s-series-netbooks/ideapad-s10-3t/downloads/DS010786

(don't bother with changelogs to make any sort of update-or-not decisions:
Lenovo changelogs are known to often be incomplete).

I also strongly recommend that you install your distro's microcode
update package: your microcode should be at revision 0x107 or higher, which
has been in the public microcode distribution since Q2/Q3-2010...

Now, as I said, this is all a shot in the dark, and it might not help at all
with the hangs.  But it looks like something worth trying...
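
Whether the running microcode is at or above the revision mentioned
above can be read from the "microcode" field of /proc/cpuinfo on x86. A
small sketch, assuming a POSIX shell with awk; microcode_at_least and its
optional file argument are illustrative names, not an existing tool:

```shell
# Sketch: succeed if the first CPU's microcode revision is >= the wanted one.
# Usage: microcode_at_least 0x107 [cpuinfo-file]
microcode_at_least() {
    want="$1"
    cpuinfo="${2:-/proc/cpuinfo}"

    # Pick the revision from the first "microcode : 0x..." line.
    have=$(awk -F': *' '/^microcode/ { print $2; exit }' "$cpuinfo" 2>/dev/null)

    # Return 2 if the field is missing (non-x86, or file unreadable).
    [ -n "$have" ] || return 2

    # Arithmetic expansion understands 0x-prefixed hexadecimal constants.
    [ $(($have)) -ge $(($want)) ]
}
```

Example use: microcode_at_least 0x107 && echo "microcode is recent enough"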
Ville Syrjälä Nov. 9, 2015, 10:28 a.m. UTC | #4
On Sun, Nov 08, 2015 at 05:20:16PM -0200, Henrique de Moraes Holschuh wrote:
> On Wed, 04 Nov 2015, Ville Syrjälä wrote:
> > On Tue, Nov 03, 2015 at 03:04:21AM +0000, Brown, Len wrote:
> > > > Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
> > > > The two workaround that seem to help are "intel_idle.max_cstate=0"
> > > > or "nohz=off highres=off".
> > > > 
> > > > At a first glance quirk_tigerpoint_bm_sts() seemed promising, but
> > > > even when moved to early_resume it didn't do anything.
> > > > 
> > > > I have no idea what's wrong here, so let's just disable intel_idle
> > > > for these machines using a DMI match.
> > > 
> > > Ville, 
> > > 
> > > It is great that several workarounds have been discovered.
> > > 
> > > But it would be better to get a good idea of the root-cause
> > > before permanently ignoring the problem via a new
> > > black-list in the upstream kernel.
> > > 
> > > Is it possible for you to file a bug at bugzilla.kernel.org
> > > against Product: power-management; component: intel_idle?
> > 
> > https://bugzilla.kernel.org/show_bug.cgi?id=107151
> 
> This is a shot in the dark, but...
> 
> The Atom N450 public errata sheet is nightmare fuel, and Ville's Ideapad is
> running that hellish processor on outdated microcode.
> 
> http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/atom-processor-n400-specification-update.pdf
> 
> Ville, you might want to insure you're at the latest BIOS for the S10-3t:
> http://support.lenovo.com/us/en/products/laptops-and-netbooks/ideapad-s-series-netbooks/ideapad-s10-3t/downloads/DS010786

Seems to require Windows, so no can do without major pain.

Considering acpi_idle works just fine, I doubt a new BIOS version would
have anything to fix this, except maybe by accident.

> (don't bother with changelogs to make any sort of update-or-not decisions:
> Lenovo changelogs are known to often be incomplete).
> 
> I also strongly recommend that you should install your distro's microcode
> update package: your microcode should have been revision 0x107 or higher, it
> has been in the public microcode distribution since Q2/Q3-2010...

[    0.000000] microcode: CPU0 microcode updated early to revision 0x107, date = 2009-08-25
[    0.990876] microcode: CPU0 sig=0x106ca, pf=0x4, revision=0x107
[    0.990972] microcode: CPU1 sig=0x106ca, pf=0x4, revision=0x107

No help unfortunately.

> 
> Now, as I said, this is all a shot in the dark, and it might not help at all
> with the hangs.  But it looks like something worth trying...
> 
> -- 
>   "One disk to rule them all, One disk to find them. One disk to bring
>   them all and in the darkness grind them. In the Land of Redmond
>   where the shadows lie." -- The Silicon Valley Tarot
>   Henrique Holschuh
Patch

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index cd4510a..c4a6888 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -61,6 +61,7 @@ 
 #include <linux/notifier.h>
 #include <linux/cpu.h>
 #include <linux/module.h>
+#include <linux/dmi.h>
 #include <asm/cpu_device_id.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
@@ -925,6 +926,25 @@  static const struct x86_cpu_id intel_idle_ids[] __initconst = {
 };
 MODULE_DEVICE_TABLE(x86cpu, intel_idle_ids);
 
+static int intel_idle_disable_callback(const struct dmi_system_id *id)
+{
+	pr_debug(PREFIX "problematic system (%s), disabling\n", id->ident);
+	return 1;
+}
+
+static const struct dmi_system_id intel_idle_disable_dmi[] = {
+	{
+		/* Lenovo Ideapad S10-3t, hangs coming out of S3 */
+		.callback = intel_idle_disable_callback,
+		.ident = "Lenovo Ideapad S10-3t",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+			DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo Ideapad S10-3t"),
+		},
+	},
+	{}
+};
+
 /*
  * intel_idle_probe()
  */
@@ -957,6 +977,9 @@  static int __init intel_idle_probe(void)
 	    !mwait_substates)
 			return -ENODEV;
 
+	if (dmi_check_system(intel_idle_disable_dmi))
+		return -ENODEV;
+
 	pr_debug(PREFIX "MWAIT substates: 0x%x\n", mwait_substates);
 
 	icpu = (const struct idle_cpu *)id->driver_data;
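
To check from userspace whether a given machine would hit the DMI table
added above, the same two strings can be compared against what the
kernel exports under /sys/class/dmi/id. DMI_MATCH() does a substring
comparison, so the sketch below uses substring patterns as well. The
function name and the optional directory argument are hypothetical and
only there to make the example easy to try:

```shell
# Sketch: mimic the two DMI_MATCH() entries from the patch in userspace.
# Succeeds only if both strings are found as substrings, as in the kernel.
dmi_matches_s10_3t() {
    dmi_dir="${1:-/sys/class/dmi/id}"

    vendor=$(cat "$dmi_dir/sys_vendor" 2>/dev/null) || return 1
    version=$(cat "$dmi_dir/product_version" 2>/dev/null) || return 1

    # DMI_MATCH(DMI_SYS_VENDOR, "LENOVO")
    case "$vendor" in
        *LENOVO*) ;;
        *) return 1 ;;
    esac

    # DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo Ideapad S10-3t")
    case "$version" in
        *"Lenovo Ideapad S10-3t"*) ;;
        *) return 1 ;;
    esac

    return 0
}
```

Example use: dmi_matches_s10_3t && echo "intel_idle would be disabled by this patch"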