diff mbox

[DEBUG] x86, pat/mtrr: MTRR/PAT init earlier for each APs

Message ID 1482229288-30913-1-git-send-email-yu.c.chen@intel.com (mailing list archive)
State Deferred
Headers show

Commit Message

Chen Yu Dec. 20, 2016, 10:21 a.m. UTC
This is a debug patch to descibe/workaround the issue
we encountered recently.

Problem and the cause:
Currently we are suffering from *extremely* slow CPU online
speed during system resuming from S3. Say, the MacBookPro 2015
has 4 CPUs, and it took more than 1 second each for both CPU1
and CPU3 to be brought back to idle thread again. Further ftrace
result showed that, *each* instruction the CPU1 and CPU3 execute
will take much longer time than it will take during normal cpu
online via sysfs(without S3 involved). And more interesting
thing was found that after resumed back, every instruction CPU1 and
CPU3 execute is back to its normal speed(unixbench has the same score
before/after S3). So it smells like there is something wrong with
the cache/tlb settings only during resuming back from S3.
Finally we have found this might be related to BIOS who has
scribbled the mtrr/pat before it resumed back to the OS, and every
instruction seems to be run in an uncached behavior, fortunately
later after all the APs have been brought up again, mtrr_aps_init()
will be invoked to synchronize the mtrr on these APs to the value
once saved by CPU0 before suspended, thus everything is back
to normal after resumed.

Workaround:
So it turns out to be that if we can synchronize the APs with boot CPU
ASAP, rather than waiting till all CPUS online, it might reduce the
impact of the bogus BIOS who scribbled the mtrr/pat. So here is the
hack patch to let the users to synchronize mtrr on APs earlier.
With the following debug patch applied, the resume time for CPU1 and
CPU3 have dropped a lot.

(Notice, the following result were tested with ftrace function_graph enabled
during suspend/resume, by this tool:
https://01.org/suspendresume


Before patch applied:
[  619.810899] Enabling non-boot CPUs ...
[  619.825528] x86: Booting SMP configuration:
[  619.825537] smpboot: Booting Node 0 Processor 1 APIC 0x2
-------skip--------
[  621.723809] CPU1 is up
[  621.762843] smpboot: Booting Node 0 Processor 2 APIC 0x1
-------skip--------
[  621.766679] CPU2 is up
[  621.840228] smpboot: Booting Node 0 Processor 3 APIC 0x3
-------skip--------
[  626.690900] CPU3 is up

So it took CPU1 621.723809 - 619.825537 = 1898.272 ms, and
CPU3 626.690900 - 621.840228 = 4850.672 ms !


After patch applied:
[  106.931790] smpboot: Booting Node 0 Processor 1 APIC 0x2
-------skip--------
[  106.948360] CPU1 is up
[  106.963975] smpboot: Booting Node 0 Processor 2 APIC 0x1
-------skip--------
[  106.968087] CPU2 is up
[  106.986534] smpboot: Booting Node 0 Processor 3 APIC 0x3
-------skip--------
[  106.990702] CPU3 is up

It took CPU1 106.948360 - 106.931790 = 16.57 ms, and
CPU3 106.990702 - 106.986534 = 4.16 ms

Question:
So it turns out to be a BIOS issue, but Linux should also deal with
this bogus BIOS, right? I studied the commit we delay the synchronization
until all the APs are brought up, and according to:

Commit d0af9eed5aa9 ("x86, pat/mtrr: Rendezvous all the cpus
for MTRR/PAT init")

It seems that there would be problem if we do not sync APs at the same
time(some CPUs run with cache disabled will hang the system, because its
sibling is trying to adjust the mtrr which might disable its cache) on
some special platforms? But I have a question that, even in our current
solution which defers the synchronization, the scenario mentioned above
can not be avoided because at the time CPU3 is trying to restore mtrr,
its sibling CPU1 might also be doing some kworker or ticking tasks,
the CPU1 might also run with cache disabled?
I'm not sure if I understand the code correctly, and it would be
appreciated if people could give any comments/suggestions on how to deal
with this situation found on MacProBook, and if you need me to do any test
please feel free to let me know.

Thanks,
Yu

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: "Brandt, Todd E" <todd.e.brandt@intel.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 arch/x86/kernel/cpu/mtrr/main.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Lukas Wunner Dec. 20, 2016, 10:56 a.m. UTC | #1
On Tue, Dec 20, 2016 at 06:21:28PM +0800, Chen Yu wrote:
> This is a debug patch to descibe/workaround the issue
> we encountered recently.
> 
> Problem and the cause:
> Currently we are suffering from *extremely* slow CPU online
> speed during system resuming from S3. Say, the MacBookPro 2015
> has 4 CPUs, and it took more than 1 second each for both CPU1
> and CPU3 to be brought back to idle thread again. Further ftrace
> result showed that, *each* instruction the CPU1 and CPU3 execute
> will take much longer time than it will take during normal cpu
> online via sysfs(without S3 involved). And more interesting
> thing was found that after resumed back, every instruction CPU1 and
> CPU3 execute is back to its normal speed(unixbench has the same score
> before/after S3). So it smells like there is something wrong with
> the cache/tlb settings only during resuming back from S3.
> Finally we have found this might be related to BIOS who has
> scribbled the mtrr/pat before it resumed back to the OS, and every
> instruction seems to be run in an uncached behavior, fortunately
> later after all the APs have been brought up again, mtrr_aps_init()
> will be invoked to synchronize the mtrr on these APs to the value
> once saved by CPU0 before suspended, thus everything is back
> to normal after resumed.

I'm seeing the same issue on a MacBookPro9,1 (Ivy Bridge, 2012,
BIOS MBP91.88Z.00D3.B0C.1509111653 09/11/2015).

I was already wondering what's going on, so thanks a lot for looking
into this.

For me this is a regression because with 4.7.0-rc7 resume was
sufficiently quick.  The issue only started to appear after
I rebased my working branch on 4.8.0-rc8.  I can confirm that
your patch fixes the issue, so FWIW:

Tested-by: Lukas Wunner <lukas@wunner.de>


This is with 4.7.0-rc7, resume was consistently at or below 1 sec:

[  107.080920] ACPI: Low-level resume complete
[  107.080965] ACPI : EC: EC started
[  107.080966] PM: Restoring platform NVS memory
[  107.081341] Enabling non-boot CPUs ...
[  107.101059] x86: Booting SMP configuration:
[  107.101060] smpboot: Booting Node 0 Processor 1 APIC 0x2
[  107.108619]  cache: parent cpu1 should not be sleeping
[  107.135789] CPU1 is up
[  107.199203] smpboot: Booting Node 0 Processor 2 APIC 0x4
[  107.209808]  cache: parent cpu2 should not be sleeping
[  107.240896] CPU2 is up
[  107.314086] smpboot: Booting Node 0 Processor 3 APIC 0x6
[  107.326220]  cache: parent cpu3 should not be sleeping
[  107.357507] CPU3 is up
[  107.449002] smpboot: Booting Node 0 Processor 4 APIC 0x1
[  107.451740]  cache: parent cpu4 should not be sleeping
[  107.452106] CPU4 is up
[  107.549902] smpboot: Booting Node 0 Processor 5 APIC 0x3
[  107.566487]  cache: parent cpu5 should not be sleeping
[  107.624016] CPU5 is up
[  107.787000] smpboot: Booting Node 0 Processor 6 APIC 0x5
[  107.804491]  cache: parent cpu6 should not be sleeping
[  107.869075] CPU6 is up
[  108.057328] smpboot: Booting Node 0 Processor 7 APIC 0x7
[  108.075840]  cache: parent cpu7 should not be sleeping
[  108.149616] CPU7 is up
[  108.155910] ACPI: Waking up from system sleep state S3


This is with 4.8.0-rc8 without your patch, around 4 sec:

[  608.277228] ACPI: Low-level resume complete
[  608.277273] ACPI : EC: EC started
[  608.277274] PM: Restoring platform NVS memory
[  608.277612] Enabling non-boot CPUs ...
[  608.301165] x86: Booting SMP configuration:
[  608.301165] smpboot: Booting Node 0 Processor 1 APIC 0x2
[  608.350399]  cache: parent cpu1 should not be sleeping
[  608.531440] CPU1 is up
[  608.761943] smpboot: Booting Node 0 Processor 2 APIC 0x4
[  608.806085]  cache: parent cpu2 should not be sleeping
[  608.972760] CPU2 is up
[  609.276436] smpboot: Booting Node 0 Processor 3 APIC 0x6
[  609.323774]  cache: parent cpu3 should not be sleeping
[  609.472503] CPU3 is up
[  609.824707] smpboot: Booting Node 0 Processor 4 APIC 0x1
[  609.828097]  cache: parent cpu4 should not be sleeping
[  609.828776] CPU4 is up
[  610.223620] smpboot: Booting Node 0 Processor 5 APIC 0x3
[  610.292411]  cache: parent cpu5 should not be sleeping
[  610.545823] CPU5 is up
[  610.981127] smpboot: Booting Node 0 Processor 6 APIC 0x5
[  611.049177]  cache: parent cpu6 should not be sleeping
[  611.334813] CPU6 is up
[  612.049163] smpboot: Booting Node 0 Processor 7 APIC 0x7
[  612.123333]  cache: parent cpu7 should not be sleeping
[  612.427630] CPU7 is up
[  612.436692] ACPI: Waking up from system sleep state S3


With your patch I'm back to 1 sec:

[   98.177541] ACPI: Low-level resume complete
[   98.177585] ACPI : EC: EC started
[   98.177585] PM: Restoring platform NVS memory
[   98.177951] Enabling non-boot CPUs ...
[   98.201608] x86: Booting SMP configuration:
[   98.201609] smpboot: Booting Node 0 Processor 1 APIC 0x2
[   98.210412]  cache: parent cpu1 should not be sleeping
[   98.238198] CPU1 is up
[   98.307550] smpboot: Booting Node 0 Processor 2 APIC 0x4
[   98.319788]  cache: parent cpu2 should not be sleeping
[   98.353269] CPU2 is up
[   98.428475] smpboot: Booting Node 0 Processor 3 APIC 0x6
[   98.439979]  cache: parent cpu3 should not be sleeping
[   98.478572] CPU3 is up
[   98.585208] smpboot: Booting Node 0 Processor 4 APIC 0x1
[   98.588040]  cache: parent cpu4 should not be sleeping
[   98.588412] CPU4 is up
[   98.702925] smpboot: Booting Node 0 Processor 5 APIC 0x3
[   98.719564]  cache: parent cpu5 should not be sleeping
[   98.786342] CPU5 is up
[   98.959777] smpboot: Booting Node 0 Processor 6 APIC 0x5
[   98.977120]  cache: parent cpu6 should not be sleeping
[   99.048355] CPU6 is up
[   99.241252] smpboot: Booting Node 0 Processor 7 APIC 0x7
[   99.261368]  cache: parent cpu7 should not be sleeping
[   99.339541] CPU7 is up
[   99.345700] ACPI: Waking up from system sleep state S3

Thanks,

Lukas

> 
> Workaround:
> So it turns out to be that if we can synchronize the APs with boot CPU
> ASAP, rather than waiting till all CPUS online, it might reduce the
> impact of the bogus BIOS who scribbled the mtrr/pat. So here is the
> hack patch to let the users to synchronize mtrr on APs earlier.
> With the following debug patch applied, the resume time for CPU1 and
> CPU3 have dropped a lot.
> 
> (Notice, the following result were tested with ftrace function_graph enabled
> during suspend/resume, by this tool:
> https://01.org/suspendresume
> 
> 
> Before patch applied:
> [  619.810899] Enabling non-boot CPUs ...
> [  619.825528] x86: Booting SMP configuration:
> [  619.825537] smpboot: Booting Node 0 Processor 1 APIC 0x2
> -------skip--------
> [  621.723809] CPU1 is up
> [  621.762843] smpboot: Booting Node 0 Processor 2 APIC 0x1
> -------skip--------
> [  621.766679] CPU2 is up
> [  621.840228] smpboot: Booting Node 0 Processor 3 APIC 0x3
> -------skip--------
> [  626.690900] CPU3 is up
> 
> So it took CPU1 621.723809 - 619.825537 = 1898.272 ms, and
> CPU3 626.690900 - 621.840228 = 4850.672 ms !
> 
> 
> After patch applied:
> [  106.931790] smpboot: Booting Node 0 Processor 1 APIC 0x2
> -------skip--------
> [  106.948360] CPU1 is up
> [  106.963975] smpboot: Booting Node 0 Processor 2 APIC 0x1
> -------skip--------
> [  106.968087] CPU2 is up
> [  106.986534] smpboot: Booting Node 0 Processor 3 APIC 0x3
> -------skip--------
> [  106.990702] CPU3 is up
> 
> It took CPU1 106.948360 - 106.931790 = 16.57 ms, and
> CPU3 106.990702 - 106.986534 = 4.16 ms
> 
> Question:
> So it turns out to be a BIOS issue, but Linux should also deal with
> this bogus BIOS, right? I studied the commit we delay the synchronization
> until all the APs are brought up, and according to:
> 
> Commit d0af9eed5aa9 ("x86, pat/mtrr: Rendezvous all the cpus
> for MTRR/PAT init")
> 
> It seems that there would be problem if we do not sync APs at the same
> time(some CPUs run with cache disabled will hang the system, because its
> sibling is trying to adjust the mtrr which might disable its cache) on
> some special platforms? But I have a question that, even in our current
> solution which defers the synchronization, the scenario mentioned above
> can not be avoided because at the time CPU3 is trying to restore mtrr,
> its sibling CPU1 might also be doing some kworker or ticking tasks,
> the CPU1 might also run with cache disabled?
> I'm not sure if I understand the code correctly, and it would be
> appreciated if people could give any comments/suggestions on how to deal
> with this situation found on MacProBook, and if you need me to do any test
> please feel free to let me know.
> 
> Thanks,
> Yu
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Len Brown <len.brown@intel.com>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Suresh Siddha <sbsiddha@gmail.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: "Brandt, Todd E" <todd.e.brandt@intel.com>
> Cc: Rui Zhang <rui.zhang@intel.com>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
>  arch/x86/kernel/cpu/mtrr/main.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
> index 24e87e7..eddaa89 100644
> --- a/arch/x86/kernel/cpu/mtrr/main.c
> +++ b/arch/x86/kernel/cpu/mtrr/main.c
> @@ -813,15 +813,28 @@ void mtrr_save_state(void)
>  	put_online_cpus();
>  }
>  
> +static bool __read_mostly no_aps_delay;
> +
> +static int __init no_aps_setup(char *str)
> +{
> +	no_aps_delay = true;
> +	pr_info("hack: do not delay aps mtrr/pat initialization.\n");
> +
> +	return 0;
> +}
> +
>  void set_mtrr_aps_delayed_init(void)
>  {
>  	if (!mtrr_enabled())
>  		return;
>  	if (!use_intel())
>  		return;
> +	if (no_aps_delay)
> +		return;
>  
>  	mtrr_aps_delayed_init = true;
>  }
> +early_param("no_aps_delay", no_aps_setup);
>  
>  /*
>   * Delayed MTRR initialization for all AP's
> -- 
> 2.7.4
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Dec. 21, 2016, 2:32 a.m. UTC | #2
On Tue, Dec 20, 2016 at 11:56:51AM +0100, Lukas Wunner wrote:
Hi Lukas,
> On Tue, Dec 20, 2016 at 06:21:28PM +0800, Chen Yu wrote:
> > This is a debug patch to descibe/workaround the issue
> > we encountered recently.
> > 
> > Problem and the cause:
> > Currently we are suffering from *extremely* slow CPU online
> > speed during system resuming from S3. Say, the MacBookPro 2015
> > has 4 CPUs, and it took more than 1 second each for both CPU1
> > and CPU3 to be brought back to idle thread again. Further ftrace
> > result showed that, *each* instruction the CPU1 and CPU3 execute
> > will take much longer time than it will take during normal cpu
> > online via sysfs(without S3 involved). And more interesting
> > thing was found that after resumed back, every instruction CPU1 and
> > CPU3 execute is back to its normal speed(unixbench has the same score
> > before/after S3). So it smells like there is something wrong with
> > the cache/tlb settings only during resuming back from S3.
> > Finally we have found this might be related to BIOS who has
> > scribbled the mtrr/pat before it resumed back to the OS, and every
> > instruction seems to be run in an uncached behavior, fortunately
> > later after all the APs have been brought up again, mtrr_aps_init()
> > will be invoked to synchronize the mtrr on these APs to the value
> > once saved by CPU0 before suspended, thus everything is back
> > to normal after resumed.
> 
> I'm seeing the same issue on a MacBookPro9,1 (Ivy Bridge, 2012,
> BIOS MBP91.88Z.00D3.B0C.1509111653 09/11/2015).
> 
> I was already wondering what's going on, so thanks a lot for looking
> into this.
> 
> For me this is a regression because with 4.7.0-rc7 resume was
> sufficiently quick.  The issue only started to appear after
> I rebased my working branch on 4.8.0-rc8.  I can confirm that
> your patch fixes the issue, so FWIW:
> 
> Tested-by: Lukas Wunner <lukas@wunner.de>
> 
Thanks for confirming this, and you have also provided an important
information that, this does not appear in 4.7.0-rc7, so I'll rebase to
this versin to check if things are better. Could you also send me your
kernel config when using 4.7.0-rc7? 

> 
> This is with 4.7.0-rc7, resume was consistently at or below 1 sec:
> 
> [  107.080920] ACPI: Low-level resume complete
> [  107.080965] ACPI : EC: EC started
> [  107.080966] PM: Restoring platform NVS memory
> [  107.081341] Enabling non-boot CPUs ...
> [  107.101059] x86: Booting SMP configuration:
> [  107.101060] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [  107.108619]  cache: parent cpu1 should not be sleeping
> [  107.135789] CPU1 is up
> [  107.199203] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [  107.209808]  cache: parent cpu2 should not be sleeping
> [  107.240896] CPU2 is up
> [  107.314086] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [  107.326220]  cache: parent cpu3 should not be sleeping
> [  107.357507] CPU3 is up
> [  107.449002] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [  107.451740]  cache: parent cpu4 should not be sleeping
> [  107.452106] CPU4 is up
> [  107.549902] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [  107.566487]  cache: parent cpu5 should not be sleeping
> [  107.624016] CPU5 is up
> [  107.787000] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [  107.804491]  cache: parent cpu6 should not be sleeping
> [  107.869075] CPU6 is up
> [  108.057328] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [  108.075840]  cache: parent cpu7 should not be sleeping
> [  108.149616] CPU7 is up
> [  108.155910] ACPI: Waking up from system sleep state S3
> 
> 
> This is with 4.8.0-rc8 without your patch, around 4 sec:
> 
> [  608.277228] ACPI: Low-level resume complete
> [  608.277273] ACPI : EC: EC started
> [  608.277274] PM: Restoring platform NVS memory
> [  608.277612] Enabling non-boot CPUs ...
> [  608.301165] x86: Booting SMP configuration:
> [  608.301165] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [  608.350399]  cache: parent cpu1 should not be sleeping
> [  608.531440] CPU1 is up
> [  608.761943] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [  608.806085]  cache: parent cpu2 should not be sleeping
> [  608.972760] CPU2 is up
> [  609.276436] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [  609.323774]  cache: parent cpu3 should not be sleeping
> [  609.472503] CPU3 is up
> [  609.824707] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [  609.828097]  cache: parent cpu4 should not be sleeping
> [  609.828776] CPU4 is up
> [  610.223620] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [  610.292411]  cache: parent cpu5 should not be sleeping
> [  610.545823] CPU5 is up
> [  610.981127] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [  611.049177]  cache: parent cpu6 should not be sleeping
> [  611.334813] CPU6 is up
> [  612.049163] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [  612.123333]  cache: parent cpu7 should not be sleeping
> [  612.427630] CPU7 is up
> [  612.436692] ACPI: Waking up from system sleep state S3
> 
> 
> With your patch I'm back to 1 sec:
> 
> [   98.177541] ACPI: Low-level resume complete
> [   98.177585] ACPI : EC: EC started
> [   98.177585] PM: Restoring platform NVS memory
> [   98.177951] Enabling non-boot CPUs ...
> [   98.201608] x86: Booting SMP configuration:
> [   98.201609] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [   98.210412]  cache: parent cpu1 should not be sleeping
> [   98.238198] CPU1 is up
> [   98.307550] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [   98.319788]  cache: parent cpu2 should not be sleeping
> [   98.353269] CPU2 is up
> [   98.428475] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [   98.439979]  cache: parent cpu3 should not be sleeping
> [   98.478572] CPU3 is up
> [   98.585208] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [   98.588040]  cache: parent cpu4 should not be sleeping
> [   98.588412] CPU4 is up
> [   98.702925] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [   98.719564]  cache: parent cpu5 should not be sleeping
> [   98.786342] CPU5 is up
> [   98.959777] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [   98.977120]  cache: parent cpu6 should not be sleeping
> [   99.048355] CPU6 is up
> [   99.241252] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [   99.261368]  cache: parent cpu7 should not be sleeping
> [   99.339541] CPU7 is up
> [   99.345700] ACPI: Waking up from system sleep state S3
> 
> Thanks,
> 
> Lukas
>
Thanks,
Yu
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Dec. 21, 2016, 6:51 a.m. UTC | #3
On Tue, Dec 20, 2016 at 11:56:51AM +0100, Lukas Wunner wrote:
> On Tue, Dec 20, 2016 at 06:21:28PM +0800, Chen Yu wrote:
> > This is a debug patch to descibe/workaround the issue
> > we encountered recently.
> > 
> > Problem and the cause:
> > Currently we are suffering from *extremely* slow CPU online
> > speed during system resuming from S3. Say, the MacBookPro 2015
> > has 4 CPUs, and it took more than 1 second each for both CPU1
> > and CPU3 to be brought back to idle thread again. Further ftrace
> > result showed that, *each* instruction the CPU1 and CPU3 execute
> > will take much longer time than it will take during normal cpu
> > online via sysfs(without S3 involved). And more interesting
> > thing was found that after resumed back, every instruction CPU1 and
> > CPU3 execute is back to its normal speed(unixbench has the same score
> > before/after S3). So it smells like there is something wrong with
> > the cache/tlb settings only during resuming back from S3.
> > Finally we have found this might be related to BIOS who has
> > scribbled the mtrr/pat before it resumed back to the OS, and every
> > instruction seems to be run in an uncached behavior, fortunately
> > later after all the APs have been brought up again, mtrr_aps_init()
> > will be invoked to synchronize the mtrr on these APs to the value
> > once saved by CPU0 before suspended, thus everything is back
> > to normal after resumed.
> 
> I'm seeing the same issue on a MacBookPro9,1 (Ivy Bridge, 2012,
> BIOS MBP91.88Z.00D3.B0C.1509111653 09/11/2015).
> 
> I was already wondering what's going on, so thanks a lot for looking
> into this.
> 
> For me this is a regression because with 4.7.0-rc7 resume was
> sufficiently quick.  The issue only started to appear after
> I rebased my working branch on 4.8.0-rc8.  I can confirm that
> your patch fixes the issue, so FWIW:
> 
> Tested-by: Lukas Wunner <lukas@wunner.de>
>
I've tested on 4.7.0-rc7, and the problem is still there,
it tooks nearly 2 second to bring up these cpus, after the
patch applied on top of 4.7.0-rc7, the time has dropped to 30ms.
Maybe because the mtrr after restored are random, thus we see different
result. Let's wait if other experts would give any clue on this :)

Original 4.7.0-rc7
[ 1895.937202] Enabling non-boot CPUs ...
[ 1895.960590] x86: Booting SMP configuration:
[ 1895.960599] smpboot: Booting Node 0 Processor 1 APIC 0x2
[ 1896.042237]  cache: parent cpu1 should not be sleeping
[ 1896.278915] CPU1 is up
[ 1896.326531] hrtimer: interrupt took 21233972 ns
[ 1896.497384] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 2.551 msecs
[ 1896.499428] perf: interrupt took too long (15541 > 2500), lowering kernel.perf_event_max_sample_rate to 12750
[ 1898.533643] smpboot: Booting Node 0 Processor 2 APIC 0x1
[ 1898.539348]  cache: parent cpu2 should not be sleeping
[ 1898.583167] CPU2 is up
[ 1900.938878] smpboot: Booting Node 0 Processor 3 APIC 0x3
[ 1901.002698]  cache: parent cpu3 should not be sleeping
[ 1901.584319] perf: interrupt took too long (20241 > 19426), lowering kernel.perf_event_max_sample_rate to 9750
[ 1901.605220] CPU3 is up


After patch applied on 4.7.0-rc7:
[   93.580823] Enabling non-boot CPUs ...
[   93.613049] x86: Booting SMP configuration:
[   93.613057] smpboot: Booting Node 0 Processor 1 APIC 0x2
[   93.620355]  cache: parent cpu1 should not be sleeping
[   93.620773] CPU1 is up
[   93.642325] smpboot: Booting Node 0 Processor 2 APIC 0x1
[   93.646009]  cache: parent cpu2 should not be sleeping
[   93.646293] CPU2 is up
[   93.675439] smpboot: Booting Node 0 Processor 3 APIC 0x3
[   93.679256]  cache: parent cpu3 should not be sleeping
[   93.679573] CPU3 is up
[   93.679582] ACPI: Waking up from system sleep state S3

> 
> This is with 4.7.0-rc7, resume was consistently at or below 1 sec:
> 
> [  107.080920] ACPI: Low-level resume complete
> [  107.080965] ACPI : EC: EC started
> [  107.080966] PM: Restoring platform NVS memory
> [  107.081341] Enabling non-boot CPUs ...
> [  107.101059] x86: Booting SMP configuration:
> [  107.101060] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [  107.108619]  cache: parent cpu1 should not be sleeping
> [  107.135789] CPU1 is up
> [  107.199203] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [  107.209808]  cache: parent cpu2 should not be sleeping
> [  107.240896] CPU2 is up
> [  107.314086] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [  107.326220]  cache: parent cpu3 should not be sleeping
> [  107.357507] CPU3 is up
> [  107.449002] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [  107.451740]  cache: parent cpu4 should not be sleeping
> [  107.452106] CPU4 is up
> [  107.549902] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [  107.566487]  cache: parent cpu5 should not be sleeping
> [  107.624016] CPU5 is up
> [  107.787000] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [  107.804491]  cache: parent cpu6 should not be sleeping
> [  107.869075] CPU6 is up
> [  108.057328] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [  108.075840]  cache: parent cpu7 should not be sleeping
> [  108.149616] CPU7 is up
> [  108.155910] ACPI: Waking up from system sleep state S3
> 
> 
> This is with 4.8.0-rc8 without your patch, around 4 sec:
> 
> [  608.277228] ACPI: Low-level resume complete
> [  608.277273] ACPI : EC: EC started
> [  608.277274] PM: Restoring platform NVS memory
> [  608.277612] Enabling non-boot CPUs ...
> [  608.301165] x86: Booting SMP configuration:
> [  608.301165] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [  608.350399]  cache: parent cpu1 should not be sleeping
> [  608.531440] CPU1 is up
> [  608.761943] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [  608.806085]  cache: parent cpu2 should not be sleeping
> [  608.972760] CPU2 is up
> [  609.276436] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [  609.323774]  cache: parent cpu3 should not be sleeping
> [  609.472503] CPU3 is up
> [  609.824707] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [  609.828097]  cache: parent cpu4 should not be sleeping
> [  609.828776] CPU4 is up
> [  610.223620] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [  610.292411]  cache: parent cpu5 should not be sleeping
> [  610.545823] CPU5 is up
> [  610.981127] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [  611.049177]  cache: parent cpu6 should not be sleeping
> [  611.334813] CPU6 is up
> [  612.049163] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [  612.123333]  cache: parent cpu7 should not be sleeping
> [  612.427630] CPU7 is up
> [  612.436692] ACPI: Waking up from system sleep state S3
> 
> 
> With your patch I'm back to 1 sec:
> 
> [   98.177541] ACPI: Low-level resume complete
> [   98.177585] ACPI : EC: EC started
> [   98.177585] PM: Restoring platform NVS memory
> [   98.177951] Enabling non-boot CPUs ...
> [   98.201608] x86: Booting SMP configuration:
> [   98.201609] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [   98.210412]  cache: parent cpu1 should not be sleeping
> [   98.238198] CPU1 is up
> [   98.307550] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [   98.319788]  cache: parent cpu2 should not be sleeping
> [   98.353269] CPU2 is up
> [   98.428475] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [   98.439979]  cache: parent cpu3 should not be sleeping
> [   98.478572] CPU3 is up
> [   98.585208] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [   98.588040]  cache: parent cpu4 should not be sleeping
> [   98.588412] CPU4 is up
> [   98.702925] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [   98.719564]  cache: parent cpu5 should not be sleeping
> [   98.786342] CPU5 is up
> [   98.959777] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [   98.977120]  cache: parent cpu6 should not be sleeping
> [   99.048355] CPU6 is up
> [   99.241252] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [   99.261368]  cache: parent cpu7 should not be sleeping
> [   99.339541] CPU7 is up
> [   99.345700] ACPI: Waking up from system sleep state S3
> 
> Thanks,
> 
> Lukas
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Wunner Dec. 21, 2016, 9:54 a.m. UTC | #4
On Wed, Dec 21, 2016 at 02:51:02PM +0800, Chen Yu wrote:
> On Tue, Dec 20, 2016 at 11:56:51AM +0100, Lukas Wunner wrote:
> > On Tue, Dec 20, 2016 at 06:21:28PM +0800, Chen Yu wrote:
> > > This is a debug patch to descibe/workaround the issue
> > > we encountered recently.
> > > 
> > > Problem and the cause:
> > > Currently we are suffering from *extremely* slow CPU online
> > > speed during system resuming from S3. Say, the MacBookPro 2015
> > > has 4 CPUs, and it took more than 1 second each for both CPU1
> > > and CPU3 to be brought back to idle thread again. Further ftrace
> > > result showed that, *each* instruction the CPU1 and CPU3 execute
> > > will take much longer time than it will take during normal cpu
> > > online via sysfs(without S3 involved). And more interesting
> > > thing was found that after resumed back, every instruction CPU1 and
> > > CPU3 execute is back to its normal speed(unixbench has the same score
> > > before/after S3). So it smells like there is something wrong with
> > > the cache/tlb settings only during resuming back from S3.
> > > Finally we have found this might be related to BIOS who has
> > > scribbled the mtrr/pat before it resumed back to the OS, and every
> > > instruction seems to be run in an uncached behavior, fortunately
> > > later after all the APs have been brought up again, mtrr_aps_init()
> > > will be invoked to synchronize the mtrr on these APs to the value
> > > once saved by CPU0 before suspended, thus everything is back
> > > to normal after resumed.
> > 
> > I'm seeing the same issue on a MacBookPro9,1 (Ivy Bridge, 2012,
> > BIOS MBP91.88Z.00D3.B0C.1509111653 09/11/2015).
> > 
> > I was already wondering what's going on, so thanks a lot for looking
> > into this.
> > 
> > For me this is a regression because with 4.7.0-rc7 resume was
> > sufficiently quick.  The issue only started to appear after
> > I rebased my working branch on 4.8.0-rc8.  I can confirm that
> > your patch fixes the issue, so FWIW:
> > 
> > Tested-by: Lukas Wunner <lukas@wunner.de>
>
> I've tested on 4.7.0-rc7, and the problem is still there,
> it tooks nearly 2 second to bring up these cpus, after the
> patch applied on top of 4.7.0-rc7, the time has dropped to 30ms.
> Maybe because the mtrr after restored are random, thus we see different
> result. Let's wait if other experts would give any clue on this :)

Feeding /var/log/kern.log into this quick perl hack results in the numbers
included below.  The 4 sec resume began with 4.8-rc8 for me, but it's
not 100% reproducible, sometimes it just takes 1 sec even with 4.8-rc8,
likely as you say because of random content in the relevant registers:

perl -n -e '
if (/Linux version ([^ ]+)/) {
	$ver = "$1";
} elsif (/\[ *([0-9.]+)\] ACPI: Low-level/) {
	$start = $1;
} elsif (/\[ *([0-9.]+)\] ACPI: Waking up/) {
	$end = $1;
	print $ver . "\t" . ($end - $start) . "\n";
}'

Thanks,

Lukas

-- snip --

4.5.0-rc5-amd64	0.791821999999996
4.5.0-rc5-amd64	0.784248000000048
4.5.0-rc5-amd64	0.755302000000029
4.5.0-rc5-amd64	0.872710000000041
4.5.0-rc5-amd64	0.811196999999993
4.5.0-rc5-amd64	0.619765999999998
4.5.0-rc5-amd64	0.847110000000001
4.5.0-rc5-amd64	0.583031000000005
4.5.0-rc5-amd64	0.619109999999999
4.5.0-rc5-amd64	0.833584999999999
4.5.0-rc5-amd64	0.899036000000024
4.5.0-rc5-amd64	0.912164000000004
4.5.0-rc5-amd64	0.886482999999998
4.5.0-rc5-amd64	0.724455999999996
4.5.0-rc5-amd64	0.900266000000002
4.5.0-rc5-amd64	0.848599000000007
4.5.0-rc5-amd64	0.805461000000008
4.5.0-rc5-amd64	0.711756999999992
4.5.0-rc5-amd64	0.546851000000004
4.5.0-rc5-amd64	0.779108000000008
4.5.0-rc5-amd64	0.904349999999795
4.5.0-rc5-amd64	0.8104629999998
4.5.0-rc5-amd64	0.810886000000004
4.5.0-rc5-amd64	0.806125999999999
4.5.0-rc5-amd64	0.83456000000001
4.5.0-rc5-amd64	0.809259999999995
4.5.0-rc5-amd64	0.657921000000002
4.5.0-rc5-amd64	0.710526999999956
4.5.0-rc5-amd64	0.827078000000029
4.5.0-rc5-amd64	0.82047
4.5.0-rc5-amd64	0.545153999999911
4.5.0-rc5-amd64	0.617734000000041
4.5.0-rc5-amd64	0.884256000000001
4.5.0-rc5-amd64	0.629543999999981
4.5.0-rc5-amd64	0.902484999999999
4.5.0-rc5-amd64	0.80895799999999
4.5.0-rc5-amd64	0.651888999999983
4.5.0-rc5-amd64	0.681414999999902
4.5.0-rc5-amd64	0.908006
4.5.0-rc5-amd64	0.810744
4.5.0-rc5-amd64	0.673935
4.5.0-rc5-amd64	0.921460999999994
4.5.0-rc5-amd64	0.875963999999996
4.5.0-rc5-amd64	0.904393999999996
4.5.0-rc5-amd64	0.771819999999998
4.5.0-rc5-amd64	0.829245999999998
4.5.0-rc5-amd64	0.884001000000126
4.5.0-rc5-amd64	0.925242999999995
4.5.0-rc5-amd64	0.807276999999999
4.5.0-rc5-amd64	0.77618600000001
4.5.0-rc5-amd64	0.891590000000008
4.5.0-rc5-amd64	0.832144999999997
4.5.0-rc5-amd64	0.664855000000017
4.5.0-rc5-amd64	0.743611000000016
4.5.0-rc5-amd64	0.572099999999978
4.5.0-rc5-amd64	0.968667000000039
4.5.0-rc5-amd64	0.835151999999994
4.5.0-rc5-amd64	0.835349000000008
4.5.0-rc5-amd64	0.698092000000003
4.5.0-rc5-amd64	0.743515000000002
4.5.0-rc5-amd64	0.916257999999999
4.5.0-rc5-amd64	0.801462000000001
4.5.0-rc5-amd64	0.847062999999999
4.5.0-rc5-amd64	0.837590999999989
4.5.0-rc5-amd64	0.878129999999999
4.5.0-rc5-amd64	0.911189000000007
4.5.0-rc5-amd64	0.841833000000008
4.5.0-rc5-amd64	0.69529
4.5.0-rc5-amd64	0.568078
4.5.0-rc5-amd64	0.813648999999998
4.5.0-rc5-amd64	0.801636999999999
4.5.0-rc5-amd64	0.790887999999995
4.5.0-rc5-amd64	0.828260999999998
4.5.0-rc5-amd64	0.950163000000003
4.5.0-rc5-amd64	0.916289999999996
4.5.0-rc5-amd64	0.644873000000004
4.5.0-rc5-amd64	0.61748499999976
4.5.0-rc5-amd64	0.846859000000009
4.5.0-rc5-amd64	0.902849000000003
4.5.0-rc5-amd64	0.903951000000006
4.5.0-rc5-amd64	0.929685000000006
4.5.0-rc5-amd64	0.595804000000044
4.5.0-rc5-amd64	0.807150999999976
4.5.0-rc5-amd64	0.918833000000006
4.5.0-rc5-amd64	0.621095000000004
4.5.0-rc5-amd64	0.797015000000002
4.5.0-rc5-amd64	0.798920000000066
4.5.0-rc5-amd64	0.844256999999971
4.5.0-rc5-amd64	0.545507999999927
4.5.0-rc5-amd64	0.558537999999999
4.5.0-rc5-amd64	0.523992000000135
4.5.0-rc5-amd64	0.862690000000001
4.5.0-rc5-amd64	0.854602
4.5.0-rc5-amd64	0.535631999999964
4.5.0-rc5-amd64	0.758905000000027
4.5.0-rc5-amd64	0.763448999999923
4.5.0-rc5-amd64	0.857717000000001
4.5.0-rc5-amd64	0.934520999999989
4.5.0-rc5-amd64	0.883544000000001
4.5.0-rc5-amd64	0.786261000000081
4.5.0-rc5-amd64	0.620818999999983
4.5.0-rc5-amd64	0.69337500000006
4.5.0-rc5-amd64	0.847121000000016
4.5.0-rc5-amd64	0.811431000000084
4.5.0-rc5-amd64	0.958880999999991
4.5.0-rc5-amd64	0.771730999999932
4.5.0-rc5-amd64	0.916927999999999
4.5.0-rc5-amd64	0.640732
4.5.0-rc5-amd64	0.770714999999996
4.5.0-rc5-amd64	0.822458999999981
4.5.0-rc5-amd64	0.922315999999995
4.5.0-rc5-amd64	0.849181999999985
4.5.0-rc5-amd64	0.898053999999945
4.5.0-rc5-amd64	0.889291000000014
4.5.0-rc5-amd64	0.816744999999969
4.5.0-rc5-amd64	0.603718000000001
4.5.0-rc5-amd64	0.809477000000001
4.5.0-rc5-amd64	0.647982999999996
4.5.0-rc5-amd64	0.569856999999999
4.5.0-rc5-amd64	0.836538000000004
4.5.0-rc5-amd64	0.503213000000017
4.5.0-rc5-amd64	0.921850999999997
4.5.0-rc5-amd64	0.819006000000002
4.5.0-rc5-amd64	0.690811000000053
4.5.0-rc5-amd64	0.844243000000006
4.5.0-rc5-amd64	0.908578999999918
4.5.0-rc5-amd64	0.617271000000073
4.5.0-rc5-amd64	0.916234000000031
4.5.0-rc5-amd64	0.808912999999961
4.5.0-rc5-amd64	0.774151999999958
4.5.0-rc5-amd64	0.905732000000057
4.5.0-rc5-amd64	0.806829999999991
4.5.0-rc5-amd64	0.847582999999986
4.5.0-rc5-amd64	0.898519999999962
4.5.0-rc5-amd64	0.888661000000013
4.5.0-rc5-amd64	1.05587100000002
4.5.0-rc5-amd64	0.831493000000023
4.5.0-rc5-amd64	0.783102999999983
4.5.0-rc5-amd64	0.859454000001278
4.5.0-rc5-amd64	0.903360000000248
4.5.0-rc5-amd64	0.590835999999399
4.5.0-rc5-amd64	0.810430999998061
4.5.0-rc5-amd64	0.868598000000929
4.6.0-rc6-amd64	0.858732000000032
4.6.0-rc6-amd64	0.768321000000014
4.6.0-rc6-amd64	0.999735999999984
4.6.0-rc6-amd64	0.935766000000058
4.6.0-rc6-amd64	1.074747
4.6.0-rc6-amd64	0.871172000000001
4.6.0-rc6-amd64	0.735118
4.6.0-rc6-amd64	0.688977000000023
4.6.0-rc6-amd64	0.98784999999998
4.6.0-rc6-amd64	0.902268000000007
4.6.0-rc6-amd64	0.687322000000002
4.6.0-rc6-amd64	0.966214999999977
4.6.0-rc6-amd64	0.945695999999998
4.6.0-rc6-amd64	0.984290999999985
4.6.0-rc6-amd64	1.00772600000002
4.6.0-rc6-amd64	0.832724999999982
4.6.0-rc6-amd64	1.007998
4.6.0-rc6-amd64	0.928820999999999
4.6.0-rc6-amd64	0.724267999999995
4.6.0-rc6-amd64	0.779960000000003
4.6.0-rc6-amd64	0.928005999999996
4.6.0-rc6-amd64	0.785260999999998
4.6.0-rc6-amd64	0.938457999999997
4.6.0-rc6-amd64	0.718059000000011
4.6.0-rc6-amd64	0.934480999999998
4.6.0-rc6-amd64	0.895643999999997
4.6.0-rc6-amd64	0.776068000000002
4.6.0-rc6-amd64	0.948469999999986
4.6.0-rc6-amd64	0.712186999999972
4.6.0-rc6-amd64	0.749055999999996
4.6.0-rc6-amd64	1.024166
4.6.0-rc6-amd64	0.742650000000026
4.6.0-rc6-amd64	1.108676
4.6.0-rc6-amd64	1.17285699999996
4.6.0-rc6-amd64	0.78110300000003
4.6.0-rc6-amd64	0.811090000000036
4.6.0-rc6-amd64	0.997298999999998
4.6.0-rc6-amd64	0.788564000000065
4.6.0-rc6-amd64	0.864753999999948
4.6.0-rc6-amd64	0.539875999999936
4.6.0-rc6-amd64	0.859683000000018
4.6.0-rc6-amd64	0.84412199999997
4.6.0-rc6-amd64	0.911157000000003
4.6.0-rc6-amd64	1.08518399999991
4.6.0-rc6-amd64	0.975603999999976
4.6.0-rc6-amd64	1.01276200000001
4.6.0-rc6-amd64	0.957143000000002
4.6.0-rc6-amd64	0.838584999999995
4.6.0-rc6-amd64	1.027795
4.6.0-rc6-amd64	0.837847999999994
4.6.0-rc6-amd64	1.00810899999988
4.6.0-rc6-amd64	1.03576699999999
4.6.0-rc6-amd64	0.825954000000024
4.6.0-rc6-amd64	0.915140000000008
4.6.0-rc6-amd64	0.875745999999999
4.6.0-rc6-amd64	1.068602
4.6.0-rc6-amd64	0.880808999999999
4.6.0-rc6-amd64	0.848484999999997
4.6.0-rc6-amd64	0.98555300000001
4.6.0-rc6-amd64	1.098177
4.6.0-rc6-amd64	0.969982999999999
4.6.0-rc6-amd64	1.06376
4.6.0-rc6-amd64	1.03072
4.6.0-rc6-amd64	0.957211999999998
4.6.0-rc6-amd64	0.725653000000001
4.6.0-rc6-amd64	0.739570000000001
4.6.0-rc6-amd64	0.861633000000005
4.6.0-rc6-amd64	0.877488
4.6.0-rc6-amd64	0.962841999999995
4.7.0-rc7-amd64	0.854735000000005
4.7.0-rc7-amd64	0.827500000000001
4.7.0-rc7-amd64	1.057783
4.7.0-rc7-amd64	0.884496999999982
4.7.0-rc7-amd64	0.523160000000018
4.7.0-rc7-amd64	1.05743199999995
4.7.0-rc7-amd64	1.04921400000001
4.7.0-rc7-amd64	1.080662
4.7.0-rc7-amd64	1.043806
4.7.0-rc7-amd64	1.01401000000001
4.7.0-rc7-amd64	0.513013999999998
4.7.0-rc7-amd64	1.042517
4.7.0-rc7-amd64	0.389202999999981
4.7.0-rc7-amd64	0.401685999999927
4.7.0-rc7-amd64	0.67577899999992
4.7.0-rc7-amd64	1.08775700000001
4.7.0-rc7-amd64	1.079895
4.7.0-rc7-amd64	0.929226999999997
4.7.0-rc7-amd64	1.048232
4.7.0-rc7-amd64	1.04880600000001
4.7.0-rc7-amd64	1.02780900000005
4.7.0-rc7-amd64	1.07356300000001
4.7.0-rc7-amd64	0.748491999999999
4.7.0-rc7-amd64	1.06909499999999
4.7.0-rc7-amd64	0.53069
4.7.0-rc7-amd64	0.645086000000049
4.7.0-rc7-amd64	0.772515999999996
4.7.0-rc7-amd64	1.0279589999999
4.7.0-rc7-amd64	1.07909899999993
4.7.0-rc7-amd64	0.989211999999995
4.7.0-rc7-amd64	0.655788000000001
4.7.0-rc7-amd64	0.680987999999957
4.7.0-rc7-amd64	0.530302000000006
4.7.0-rc7-amd64	1.06296600000002
4.7.0-rc7-amd64	1.06546299999991
4.7.0-rc7-amd64	1.09333399999991
4.7.0-rc7-amd64	1.016413
4.7.0-rc7-amd64	0.892668000000015
4.7.0-rc7-amd64	0.411213000000032
4.7.0-rc7-amd64	0.744064000000002
4.7.0-rc7-amd64	0.924797000000012
4.7.0-rc7-amd64	0.953418000000056
4.7.0-rc7-amd64	1.10205900000005
4.7.0-rc7-amd64	0.933835999999999
4.7.0-rc7-amd64	0.937736000000001
4.7.0-rc7-amd64	1.09499199999999
4.7.0-rc7-amd64	0.961829999999999
4.7.0-rc7-amd64	0.515383
4.7.0-rc7-amd64	0.524988000000008
4.7.0-rc7-amd64	1.090538
4.7.0-rc7-amd64	0.527456000000001
4.7.0-rc7-amd64	1.088252
4.7.0-rc7-amd64	0.970280000000002
4.7.0-rc7-amd64	1.50766499999997
4.7.0-rc7-amd64	0.666339999999991
4.7.0-rc7-amd64	1.090823
4.7.0-rc7-amd64	0.53107
4.7.0-rc7-amd64	1.07499
4.7.0-rc7-amd64	0.390456
4.7.0-rc7-amd64	0.991683999999992
4.7.0-rc7-amd64	1.10761199999999
4.7.0-rc7-amd64	0.691510999999998
4.7.0-rc7-amd64	0.543309999999991
4.7.0-rc7-amd64	1.033035
4.7.0-rc7-amd64	0.550754999999995
4.7.0-rc7-amd64	0.931038999999998
4.7.0-rc7-amd64	0.419938999999999
4.7.0-rc7-amd64	0.957441000000017
4.7.0-rc7-amd64	0.953374999999994
4.7.0-rc7-amd64	1.02400699999998
4.7.0-rc7-amd64	0.551457999999997
4.7.0-rc7-amd64	1.09606400000001
4.7.0-rc7-amd64	1.112664
4.7.0-rc7-amd64	0.545329000000038
4.8.0-rc8-amd64	4.15946399999996
4.8.0-rc8-amd64	1.95669999999996
4.8.0-rc8-amd64	4.32400200000006
4.8.0-rc8-amd64	2.02430000000004
4.8.0-rc8-amd64	1.90040800000008
4.8.0-rc8-amd64	4.37152000000003
4.8.0-rc8-amd64	4.383017
4.8.0-rc8-amd64	4.199254
4.8.0-rc8-amd64	4.44121400000017
4.8.0-rc8-amd64	4.14728700000001
4.8.0-rc8-amd64	4.33939000000009
4.8.0-rc8-amd64	4.203777
4.8.0-rc8-amd64	0.587456000000003
4.8.0-rc8-amd64	4.39352499999998
4.8.0-rc8-amd64	1.969683
4.8.0-rc8-amd64	4.31196299999999
4.8.0-rc8-amd64	4.23275299999997
4.8.0-rc8-amd64	1.95122600000001
4.8.0-rc8-amd64	4.515658
4.8.0-rc8-amd64	3.855763
4.8.0-rc8-amd64	4.33339400000114
4.8.0-rc8-amd64	4.10148699999991
4.8.0-rc8-amd64	1.11672299999998
4.8.0-rc8-amd64	1.10457399999996
4.8.0-rc8-amd64	2.56593599999997
4.8.0-rc8-amd64	1.99261300000035
4.8.0-rc8-amd64	1.96518300000025
4.8.0-rc8-amd64	4.36082399999941
4.8.0-rc8-amd64	4.36475400000018
4.8.0-rc8-amd64	1.94891699999971
4.8.0-rc8-amd64	4.06829799999923
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Jan. 16, 2017, 2:56 a.m. UTC | #5
Hi experts,
do you have any clue on this? thanks.

On Tue, Dec 20, 2016 at 06:21:28PM +0800, Chen Yu wrote:
> This is a debug patch to descibe/workaround the issue
> we encountered recently.
> 
> Problem and the cause:
> Currently we are suffering from *extremely* slow CPU online
> speed during system resuming from S3. Say, the MacBookPro 2015
> has 4 CPUs, and it took more than 1 second each for both CPU1
> and CPU3 to be brought back to idle thread again. Further ftrace
> result showed that, *each* instruction the CPU1 and CPU3 execute
> will take much longer time than it will take during normal cpu
> online via sysfs(without S3 involved). And more interesting
> thing was found that after resumed back, every instruction CPU1 and
> CPU3 execute is back to its normal speed(unixbench has the same score
> before/after S3). So it smells like there is something wrong with
> the cache/tlb settings only during resuming back from S3.
> Finally we have found this might be related to BIOS who has
> scribbled the mtrr/pat before it resumed back to the OS, and every
> instruction seems to be run in an uncached behavior, fortunately
> later after all the APs have been brought up again, mtrr_aps_init()
> will be invoked to synchronize the mtrr on these APs to the value
> once saved by CPU0 before suspended, thus everything is back
> to normal after resumed.
> 
> Workaround:
> So it turns out to be that if we can synchronize the APs with boot CPU
> ASAP, rather than waiting till all CPUS online, it might reduce the
> impact of the bogus BIOS who scribbled the mtrr/pat. So here is the
> hack patch to let the users to synchronize mtrr on APs earlier.
> With the following debug patch applied, the resume time for CPU1 and
> CPU3 have dropped a lot.
> 
> (Notice, the following result were tested with ftrace function_graph enabled
> during suspend/resume, by this tool:
> https://01.org/suspendresume
> 
> 
> Before patch applied:
> [  619.810899] Enabling non-boot CPUs ...
> [  619.825528] x86: Booting SMP configuration:
> [  619.825537] smpboot: Booting Node 0 Processor 1 APIC 0x2
> -------skip--------
> [  621.723809] CPU1 is up
> [  621.762843] smpboot: Booting Node 0 Processor 2 APIC 0x1
> -------skip--------
> [  621.766679] CPU2 is up
> [  621.840228] smpboot: Booting Node 0 Processor 3 APIC 0x3
> -------skip--------
> [  626.690900] CPU3 is up
> 
> So it took CPU1 621.723809 - 619.825537 = 1898.272 ms, and
> CPU3 626.690900 - 621.840228 = 4850.672 ms !
> 
> 
> After patch applied:
> [  106.931790] smpboot: Booting Node 0 Processor 1 APIC 0x2
> -------skip--------
> [  106.948360] CPU1 is up
> [  106.963975] smpboot: Booting Node 0 Processor 2 APIC 0x1
> -------skip--------
> [  106.968087] CPU2 is up
> [  106.986534] smpboot: Booting Node 0 Processor 3 APIC 0x3
> -------skip--------
> [  106.990702] CPU3 is up
> 
> It took CPU1 106.948360 - 106.931790 = 16.57 ms, and
> CPU3 106.990702 - 106.986534 = 4.16 ms
> 
> Question:
> So it turns out to be a BIOS issue, but Linux should also deal with
> this bogus BIOS, right? I studied the commit we delay the synchronization
> until all the APs are brought up, and according to:
> 
> Commit d0af9eed5aa9 ("x86, pat/mtrr: Rendezvous all the cpus
> for MTRR/PAT init")
> 
> It seems that there would be problem if we do not sync APs at the same
> time(some CPUs run with cache disabled will hang the system, because its
> sibling is trying to adjust the mtrr which might disable its cache) on
> some special platforms? But I have a question that, even in our current
> solution which defers the synchronization, the scenario mentioned above
> can not be avoided because at the time CPU3 is trying to restore mtrr,
> its sibling CPU1 might also be doing some kworker or ticking tasks,
> the CPU1 might also run with cache disabled?
> I'm not sure if I understand the code correctly, and it would be
> appreciated if people could give any comments/suggestions on how to deal
> with this situation found on MacProBook, and if you need me to do any test
> please feel free to let me know.
> 
> Thanks,
> Yu
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Len Brown <len.brown@intel.com>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Suresh Siddha <sbsiddha@gmail.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: "Brandt, Todd E" <todd.e.brandt@intel.com>
> Cc: Rui Zhang <rui.zhang@intel.com>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
>  arch/x86/kernel/cpu/mtrr/main.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
> index 24e87e7..eddaa89 100644
> --- a/arch/x86/kernel/cpu/mtrr/main.c
> +++ b/arch/x86/kernel/cpu/mtrr/main.c
> @@ -813,15 +813,28 @@ void mtrr_save_state(void)
>  	put_online_cpus();
>  }
>  
> +static bool __read_mostly no_aps_delay;
> +
> +static int __init no_aps_setup(char *str)
> +{
> +	no_aps_delay = true;
> +	pr_info("hack: do not delay aps mtrr/pat initialization.\n");
> +
> +	return 0;
> +}
> +
>  void set_mtrr_aps_delayed_init(void)
>  {
>  	if (!mtrr_enabled())
>  		return;
>  	if (!use_intel())
>  		return;
> +	if (no_aps_delay)
> +		return;
>  
>  	mtrr_aps_delayed_init = true;
>  }
> +early_param("no_aps_delay", no_aps_setup);
>  
>  /*
>   * Delayed MTRR initialization for all AP's
> -- 
> 2.7.4
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 24e87e7..eddaa89 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -813,15 +813,28 @@  void mtrr_save_state(void)
 	put_online_cpus();
 }
 
+static bool __read_mostly no_aps_delay;
+
+static int __init no_aps_setup(char *str)
+{
+	no_aps_delay = true;
+	pr_info("hack: do not delay aps mtrr/pat initialization.\n");
+
+	return 0;
+}
+
 void set_mtrr_aps_delayed_init(void)
 {
 	if (!mtrr_enabled())
 		return;
 	if (!use_intel())
 		return;
+	if (no_aps_delay)
+		return;
 
 	mtrr_aps_delayed_init = true;
 }
+early_param("no_aps_delay", no_aps_setup);
 
 /*
  * Delayed MTRR initialization for all AP's