Message ID: alpine.LNX.2.00.1410221125280.22681@pobox.suse.cz (mailing list archive)
State: Not Applicable, archived
On Wed, 22 Oct 2014, Jiri Kosina wrote:

> Still, the lockdep stacktrace is bogus and didn't really help
> understanding this. Any idea why it's wrong?
>
> > ======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 3.18.0-rc1-00069-gc2661b8 #1 Not tainted
> > -------------------------------------------------------
> > do_s2disk/2367 is trying to acquire lock:
> >  (cpuidle_lock){+.+.+.}, at: [<ffffffff814916c2>] cpuidle_pause_and_lock+0x12/0x20
> >
> > but task is already holding lock:
> >  (cpu_hotplug.lock#2){+.+.+.}, at: [<ffffffff810522ea>] cpu_hotplug_begin+0x4a/0x80
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 (cpu_hotplug.lock#2){+.+.+.}:
> >        [<ffffffff81099fac>] lock_acquire+0xac/0x130
> >        [<ffffffff815b9f2c>] mutex_lock_nested+0x5c/0x3b0
> >        [<ffffffff81491892>] cpuidle_pause+0x12/0x30

And the report is clearly bogus here. See:

$ addr2line -a ffffffff81491892 -e vmlinux-3.18.0-rc1-00068-gc2661b8
0xffffffff81491892
/dev/shm/jikos/BUILD/kernel-3.18.0_rc1_00068_gc2661b8/drivers/cpuidle/cpuidle.c:262

where the corresponding lines are

259 void cpuidle_pause(void)
260 {
261         mutex_lock(&cpuidle_lock);
262         cpuidle_uninstall_idle_handler();
263         mutex_unlock(&cpuidle_lock);
264 }

i.e. the RIP saved on the stack clearly indicates that we are currently
inside cpuidle_uninstall_idle_handler(). So it's clearly the case that part
of the stack (which would probably point to synchronize_rcu_expedited())
is missing for some reason.
On 10/22/2014 11:53 AM, Jiri Kosina wrote:
> On Tue, 21 Oct 2014, Jiri Kosina wrote:
>
>> Hi,
>>
>> I am seeing the lockdep report below when resuming from suspend-to-disk
>> with current Linus' tree (c2661b80609).
>>
>> The reason for CCing Ingo and Peter is that I can't make any sense of one
>> of the stacktraces lockdep is providing.
>>
>> Please have a look at the very first stacktrace in the dump, where lockdep
>> is trying to explain where cpu_hotplug.lock#2 has been acquired. It seems
>> to imply that cpuidle_pause() is taking cpu_hotplug.lock, but that's not
>> the case at all.
>>
>> What am I missing?
>
> Okay, reverting 442bf3aaf55a ("sched: Let the scheduler see CPU idle
> states") and followup 83a0a96a5f26 ("sched/fair: Leverage the idle state
> info when choosing the "idlest" cpu") which depends on it makes the splat
> go away.

Are you able to reproduce it by offlining the cpu and onlining it again?
On Wed, 22 Oct 2014, Daniel Lezcano wrote:

> > > I am seeing the lockdep report below when resuming from suspend-to-disk
> > > with current Linus' tree (c2661b80609).
> > >
> > > The reason for CCing Ingo and Peter is that I can't make any sense of one
> > > of the stacktraces lockdep is providing.
> > >
> > > Please have a look at the very first stacktrace in the dump, where lockdep
> > > is trying to explain where cpu_hotplug.lock#2 has been acquired. It seems
> > > to imply that cpuidle_pause() is taking cpu_hotplug.lock, but that's not
> > > the case at all.
> > >
> > > What am I missing?
> >
> > Okay, reverting 442bf3aaf55a ("sched: Let the scheduler see CPU idle
> > states") and followup 83a0a96a5f26 ("sched/fair: Leverage the idle state
> > info when choosing the "idlest" cpu") which depends on it makes the splat
> > go away.
>
> Are you able to reproduce it by offlining the cpu and onlining it again?

No, that doesn't trigger it (please note that all the relevant stacktraces
from lockdep are going through hibernation).
On Wed, Oct 22, 2014 at 11:53:49AM +0200, Jiri Kosina wrote:
> On Tue, 21 Oct 2014, Jiri Kosina wrote:
>
> > Hi,
> >
> > I am seeing the lockdep report below when resuming from suspend-to-disk
> > with current Linus' tree (c2661b80609).
> >
> > The reason for CCing Ingo and Peter is that I can't make any sense of one
> > of the stacktraces lockdep is providing.
> >
> > Please have a look at the very first stacktrace in the dump, where lockdep
> > is trying to explain where cpu_hotplug.lock#2 has been acquired. It seems
> > to imply that cpuidle_pause() is taking cpu_hotplug.lock, but that's not
> > the case at all.
> >
> > What am I missing?
>
> Okay, reverting 442bf3aaf55a ("sched: Let the scheduler see CPU idle
> states") and followup 83a0a96a5f26 ("sched/fair: Leverage the idle state
> info when choosing the "idlest" cpu") which depends on it makes the splat
> go away.
>
> Just for the sake of testing the hypothesis, I did just the minimal change
> below on top of current Linus' tree, and it also makes the splat go away
> (of course it's totally incorrect thing to do by itself alone):
>
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 125150d..d31e04c 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -225,12 +225,6 @@ void cpuidle_uninstall_idle_handler(void)
>  		initialized = 0;
>  		wake_up_all_idle_cpus();
>  	}
> -
> -	/*
> -	 * Make sure external observers (such as the scheduler)
> -	 * are done looking at pointed idle states.
> -	 */
> -	synchronize_rcu();
>  }
>
>  /**
>
> So indeed 442bf3aaf55a is guilty.
>
> Paul was stating yesterday that it can't be the try_get_online_cpus() in
> synchronize_sched_expedited(), as it's doing only trylock. There are
> however more places where synchronize_sched_expedited() is acquiring
> cpu_hotplug.lock unconditionally by calling put_online_cpus(), so the race
> seems real.

Gah!  So I only half-eliminated the deadlock between
synchronize_sched_expedited() and CPU hotplug.  Back to the drawing
board...

							Thanx, Paul

> Adding people involved in 442bf3aaf55a to CC.
>
> Still, the lockdep stacktrace is bogus and didn't really help
> understanding this. Any idea why it's wrong?
>
> > ======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 3.18.0-rc1-00069-gc2661b8 #1 Not tainted
> > -------------------------------------------------------
> > do_s2disk/2367 is trying to acquire lock:
> >  (cpuidle_lock){+.+.+.}, at: [<ffffffff814916c2>] cpuidle_pause_and_lock+0x12/0x20
> >
> > but task is already holding lock:
> >  (cpu_hotplug.lock#2){+.+.+.}, at: [<ffffffff810522ea>] cpu_hotplug_begin+0x4a/0x80
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 (cpu_hotplug.lock#2){+.+.+.}:
> >        [<ffffffff81099fac>] lock_acquire+0xac/0x130
> >        [<ffffffff815b9f2c>] mutex_lock_nested+0x5c/0x3b0
> >        [<ffffffff81491892>] cpuidle_pause+0x12/0x30
> >        [<ffffffff81402314>] dpm_suspend_noirq+0x44/0x340
> >        [<ffffffff81402958>] dpm_suspend_end+0x38/0x80
> >        [<ffffffff810a07bd>] hibernation_snapshot+0xcd/0x370
> >        [<ffffffff810a1248>] hibernate+0x168/0x210
> >        [<ffffffff8109e9b4>] state_store+0xe4/0xf0
> >        [<ffffffff813003ef>] kobj_attr_store+0xf/0x20
> >        [<ffffffff8121e9a3>] sysfs_kf_write+0x43/0x60
> >        [<ffffffff8121e287>] kernfs_fop_write+0xe7/0x170
> >        [<ffffffff811a7342>] vfs_write+0xb2/0x1f0
> >        [<ffffffff811a7da4>] SyS_write+0x44/0xb0
> >        [<ffffffff815be856>] system_call_fastpath+0x16/0x1b
> >
> > -> #0 (cpuidle_lock){+.+.+.}:
> >        [<ffffffff81099433>] __lock_acquire+0x1a03/0x1e30
> >        [<ffffffff81099fac>] lock_acquire+0xac/0x130
> >        [<ffffffff815b9f2c>] mutex_lock_nested+0x5c/0x3b0
> >        [<ffffffff814916c2>] cpuidle_pause_and_lock+0x12/0x20
> >        [<ffffffffc02e184c>] acpi_processor_hotplug+0x45/0x8a [processor]
> >        [<ffffffffc02df25a>] acpi_cpu_soft_notify+0xad/0xe3 [processor]
> >        [<ffffffff81071393>] notifier_call_chain+0x53/0xa0
> >        [<ffffffff810713e9>] __raw_notifier_call_chain+0x9/0x10
> >        [<ffffffff810521ce>] cpu_notify+0x1e/0x40
> >        [<ffffffff810524a8>] _cpu_up+0x148/0x160
> >        [<ffffffff815a7b99>] enable_nonboot_cpus+0xc9/0x1d0
> >        [<ffffffff810a0955>] hibernation_snapshot+0x265/0x370
> >        [<ffffffff810a1248>] hibernate+0x168/0x210
> >        [<ffffffff8109e9b4>] state_store+0xe4/0xf0
> >        [<ffffffff813003ef>] kobj_attr_store+0xf/0x20
> >        [<ffffffff8121e9a3>] sysfs_kf_write+0x43/0x60
> >        [<ffffffff8121e287>] kernfs_fop_write+0xe7/0x170
> >        [<ffffffff811a7342>] vfs_write+0xb2/0x1f0
> >        [<ffffffff811a7da4>] SyS_write+0x44/0xb0
> >        [<ffffffff815be856>] system_call_fastpath+0x16/0x1b
> >
> > other info that might help us debug this:
> >
> >  Possible unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(cpu_hotplug.lock#2);
> >                               lock(cpuidle_lock);
> >                               lock(cpu_hotplug.lock#2);
> >   lock(cpuidle_lock);
> >
> >  *** DEADLOCK ***
> >
> > 8 locks held by do_s2disk/2367:
> >  #0:  (sb_writers#6){.+.+.+}, at: [<ffffffff811a7443>] vfs_write+0x1b3/0x1f0
> >  #1:  (&of->mutex){+.+.+.}, at: [<ffffffff8121e25b>] kernfs_fop_write+0xbb/0x170
> >  #2:  (s_active#188){.+.+.+}, at: [<ffffffff8121e263>] kernfs_fop_write+0xc3/0x170
> >  #3:  (pm_mutex){+.+.+.}, at: [<ffffffff810a112e>] hibernate+0x4e/0x210
> >  #4:  (device_hotplug_lock){+.+.+.}, at: [<ffffffff813f1b52>] lock_device_hotplug+0x12/0x20
> >  #5:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a7aef>] enable_nonboot_cpus+0x1f/0x1d0
> >  #6:  (cpu_hotplug.lock){++++++}, at: [<ffffffff810522a0>] cpu_hotplug_begin+0x0/0x80
> >  #7:  (cpu_hotplug.lock#2){+.+.+.}, at: [<ffffffff810522ea>] cpu_hotplug_begin+0x4a/0x80
> >
> > stack backtrace:
> > CPU: 1 PID: 2367 Comm: do_s2disk Not tainted 3.18.0-rc1-00069-g4da0564 #1
> > Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008
> >  ffffffff823e4330 ffff8800789e7a48 ffffffff815b6754 0000000000001a69
> >  ffffffff823e4330 ffff8800789e7a98 ffffffff815b078b ffff8800741a5510
> >  ffff8800789e7af8 ffff8800741a5ea8 5a024e919538010b ffff8800741a5ea8
> > Call Trace:
> >  [<ffffffff815b6754>] dump_stack+0x4e/0x68
> >  [<ffffffff815b078b>] print_circular_bug+0x203/0x214
> >  [<ffffffff81099433>] __lock_acquire+0x1a03/0x1e30
> >  [<ffffffff8109766d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
> >  [<ffffffff81099fac>] lock_acquire+0xac/0x130
> >  [<ffffffff814916c2>] ? cpuidle_pause_and_lock+0x12/0x20
> >  [<ffffffff815b9f2c>] mutex_lock_nested+0x5c/0x3b0
> >  [<ffffffff814916c2>] ? cpuidle_pause_and_lock+0x12/0x20
> >  [<ffffffff814916c2>] cpuidle_pause_and_lock+0x12/0x20
> >  [<ffffffffc02e184c>] acpi_processor_hotplug+0x45/0x8a [processor]
> >  [<ffffffffc02df25a>] acpi_cpu_soft_notify+0xad/0xe3 [processor]
> >  [<ffffffff81071393>] notifier_call_chain+0x53/0xa0
> >  [<ffffffff810713e9>] __raw_notifier_call_chain+0x9/0x10
> >  [<ffffffff810521ce>] cpu_notify+0x1e/0x40
> >  [<ffffffff810524a8>] _cpu_up+0x148/0x160
> >  [<ffffffff815a7b99>] enable_nonboot_cpus+0xc9/0x1d0
> >  [<ffffffff810a0955>] hibernation_snapshot+0x265/0x370
> >  [<ffffffff810a1248>] hibernate+0x168/0x210
> >  [<ffffffff8109e9b4>] state_store+0xe4/0xf0
> >  [<ffffffff813003ef>] kobj_attr_store+0xf/0x20
> >  [<ffffffff8121e9a3>] sysfs_kf_write+0x43/0x60
> >  [<ffffffff8121e287>] kernfs_fop_write+0xe7/0x170
> >  [<ffffffff811a7342>] vfs_write+0xb2/0x1f0
> >  [<ffffffff815be87b>] ? sysret_check+0x1b/0x56
> >  [<ffffffff811a7da4>] SyS_write+0x44/0xb0
> >  [<ffffffff815be856>] system_call_fastpath+0x16/0x1b
>
> --
> Jiri Kosina
> SUSE Labs

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/22/2014 04:36 PM, Jiri Kosina wrote:
> On Wed, 22 Oct 2014, Daniel Lezcano wrote:
>
>>>> I am seeing the lockdep report below when resuming from suspend-to-disk
>>>> with current Linus' tree (c2661b80609).
>>>>
>>>> The reason for CCing Ingo and Peter is that I can't make any sense of one
>>>> of the stacktraces lockdep is providing.
>>>>
>>>> Please have a look at the very first stacktrace in the dump, where lockdep
>>>> is trying to explain where cpu_hotplug.lock#2 has been acquired. It seems
>>>> to imply that cpuidle_pause() is taking cpu_hotplug.lock, but that's not
>>>> the case at all.
>>>>
>>>> What am I missing?
>>>
>>> Okay, reverting 442bf3aaf55a ("sched: Let the scheduler see CPU idle
>>> states") and followup 83a0a96a5f26 ("sched/fair: Leverage the idle state
>>> info when choosing the "idlest" cpu") which depends on it makes the splat
>>> go away.
>>
>> Are you able to reproduce it by offlining the cpu and onlining it again?
>
> No, that doesn't trigger it (please note that all the relevant stacktraces
> from lockdep are going through hibernation).

Ok, thanks.
On Wed, 22 Oct 2014 11:53:49 +0200 (CEST)
Jiri Kosina <jkosina@suse.cz> wrote:

> Still, the lockdep stacktrace is bogus and didn't really help
> understanding this. Any idea why it's wrong?

Could possibly be from a tail call?

> > ======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 3.18.0-rc1-00069-gc2661b8 #1 Not tainted
> > -------------------------------------------------------
> > do_s2disk/2367 is trying to acquire lock:
> >  (cpuidle_lock){+.+.+.}, at: [<ffffffff814916c2>] cpuidle_pause_and_lock+0x12/0x20
> >
> > but task is already holding lock:
> >  (cpu_hotplug.lock#2){+.+.+.}, at: [<ffffffff810522ea>] cpu_hotplug_begin+0x4a/0x80
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 (cpu_hotplug.lock#2){+.+.+.}:
> >        [<ffffffff81099fac>] lock_acquire+0xac/0x130
> >        [<ffffffff815b9f2c>] mutex_lock_nested+0x5c/0x3b0
> >        [<ffffffff81491892>] cpuidle_pause+0x12/0x30

Where exactly does that address point to?

-- Steve

> >        [<ffffffff81402314>] dpm_suspend_noirq+0x44/0x340
> >        [<ffffffff81402958>] dpm_suspend_end+0x38/0x80
> >        [<ffffffff810a07bd>] hibernation_snapshot+0xcd/0x370
> >        [<ffffffff810a1248>] hibernate+0x168/0x210
> >        [<ffffffff8109e9b4>] state_store+0xe4/0xf0
> >        [<ffffffff813003ef>] kobj_attr_store+0xf/0x20
> >        [<ffffffff8121e9a3>] sysfs_kf_write+0x43/0x60
> >        [<ffffffff8121e287>] kernfs_fop_write+0xe7/0x170
> >        [<ffffffff811a7342>] vfs_write+0xb2/0x1f0
> >        [<ffffffff811a7da4>] SyS_write+0x44/0xb0
> >        [<ffffffff815be856>] system_call_fastpath+0x16/0x1b
> >
> > -> #0 (cpuidle_lock){+.+.+.}:
> >        [<ffffffff81099433>] __lock_acquire+0x1a03/0x1e30
> >        [<ffffffff81099fac>] lock_acquire+0xac/0x130
> >        [<ffffffff815b9f2c>] mutex_lock_nested+0x5c/0x3b0
> >        [<ffffffff814916c2>] cpuidle_pause_and_lock+0x12/0x20
> >        [<ffffffffc02e184c>] acpi_processor_hotplug+0x45/0x8a [processor]
> >        [<ffffffffc02df25a>] acpi_cpu_soft_notify+0xad/0xe3 [processor]
> >        [<ffffffff81071393>] notifier_call_chain+0x53/0xa0
> >        [<ffffffff810713e9>] __raw_notifier_call_chain+0x9/0x10
> >        [<ffffffff810521ce>] cpu_notify+0x1e/0x40
> >        [<ffffffff810524a8>] _cpu_up+0x148/0x160
> >        [<ffffffff815a7b99>] enable_nonboot_cpus+0xc9/0x1d0
> >        [<ffffffff810a0955>] hibernation_snapshot+0x265/0x370
> >        [<ffffffff810a1248>] hibernate+0x168/0x210
> >        [<ffffffff8109e9b4>] state_store+0xe4/0xf0
> >        [<ffffffff813003ef>] kobj_attr_store+0xf/0x20
> >        [<ffffffff8121e9a3>] sysfs_kf_write+0x43/0x60
> >        [<ffffffff8121e287>] kernfs_fop_write+0xe7/0x170
> >        [<ffffffff811a7342>] vfs_write+0xb2/0x1f0
> >        [<ffffffff811a7da4>] SyS_write+0x44/0xb0
> >        [<ffffffff815be856>] system_call_fastpath+0x16/0x1b
On Wed, 22 Oct 2014, Steven Rostedt wrote:

> > Still, the lockdep stacktrace is bogus and didn't really help
> > understanding this. Any idea why it's wrong?
>
> Could possibly be from a tail call?

Doesn't seem so:

(gdb) disassemble cpuidle_pause
Dump of assembler code for function cpuidle_pause:
   0xffffffff81491880 <+0>:	push   %rbp
   0xffffffff81491881 <+1>:	xor    %esi,%esi
   0xffffffff81491883 <+3>:	mov    $0xffffffff81a9eb20,%rdi
   0xffffffff8149188a <+10>:	mov    %rsp,%rbp
   0xffffffff8149188d <+13>:	callq  0xffffffff815b9ed0 <mutex_lock_nested>
   0xffffffff81491892 <+18>:	callq  0xffffffff81491680 <cpuidle_uninstall_idle_handler>
   0xffffffff81491897 <+23>:	mov    $0xffffffff81a9eb20,%rdi
   0xffffffff8149189e <+30>:	callq  0xffffffff815bbe60 <mutex_unlock>
   0xffffffff814918a3 <+35>:	pop    %rbp
   0xffffffff814918a4 <+36>:	retq
End of assembler dump.

(gdb) disassemble cpuidle_uninstall_idle_handler
Dump of assembler code for function cpuidle_uninstall_idle_handler:
   0xffffffff81491680 <+0>:	mov    0x159da32(%rip),%eax        # 0xffffffff82a2f0b8 <enabled_devices>
   0xffffffff81491686 <+6>:	push   %rbp
   0xffffffff81491687 <+7>:	mov    %rsp,%rbp
   0xffffffff8149168a <+10>:	test   %eax,%eax
   0xffffffff8149168c <+12>:	je     0xffffffff8149169d <cpuidle_uninstall_idle_handler+29>
   0xffffffff8149168e <+14>:	movl   $0x0,0x64794c(%rip)        # 0xffffffff81ad8fe4 <initialized>
   0xffffffff81491698 <+24>:	callq  0xffffffff810cf9b0 <wake_up_all_idle_cpus>
   0xffffffff8149169d <+29>:	callq  0xffffffff810b47b0 <synchronize_sched>
   0xffffffff814916a2 <+34>:	pop    %rbp
   0xffffffff814916a3 <+35>:	retq
End of assembler dump.

(gdb) disassemble synchronize_sched
Dump of assembler code for function synchronize_sched:
   0xffffffff810b47b0 <+0>:	push   %rbp
   0xffffffff810b47b1 <+1>:	xor    %edx,%edx
   0xffffffff810b47b3 <+3>:	mov    $0xad5,%esi
   0xffffffff810b47b8 <+8>:	mov    $0xffffffff817fad6d,%rdi
   0xffffffff810b47bf <+15>:	mov    %rsp,%rbp
   0xffffffff810b47c2 <+18>:	callq  0xffffffff81075900 <__might_sleep>
   0xffffffff810b47c7 <+23>:	incl   %gs:0xbaa0
   0xffffffff810b47cf <+31>:	mov    0x5587b2(%rip),%rdi        # 0xffffffff8160cf88 <cpu_online_mask>
   0xffffffff810b47d6 <+38>:	mov    $0x200,%esi
   0xffffffff810b47db <+43>:	callq  0xffffffff8130e710 <__bitmap_weight>
   0xffffffff810b47e0 <+48>:	decl   %gs:0xbaa0
   0xffffffff810b47e8 <+56>:	cmp    $0x1,%eax
   0xffffffff810b47eb <+59>:	jbe    0xffffffff810b4803 <synchronize_sched+83>
   0xffffffff810b47ed <+61>:	mov    0xbdf97d(%rip),%eax        # 0xffffffff81c94170 <rcu_expedited>
   0xffffffff810b47f3 <+67>:	test   %eax,%eax
   0xffffffff810b47f5 <+69>:	jne    0xffffffff810b4808 <synchronize_sched+88>
   0xffffffff810b47f7 <+71>:	mov    $0xffffffff810b3d80,%rdi
   0xffffffff810b47fe <+78>:	callq  0xffffffff810b1b00 <wait_rcu_gp>
   0xffffffff810b4803 <+83>:	pop    %rbp
   0xffffffff810b4804 <+84>:	retq
   0xffffffff810b4805 <+85>:	nopl   (%rax)
   0xffffffff810b4808 <+88>:	callq  0xffffffff810b4820 <synchronize_sched_expedited>
   0xffffffff810b480d <+93>:	pop    %rbp
   0xffffffff810b480e <+94>:	xchg   %ax,%ax
   0xffffffff810b4810 <+96>:	retq

> > > ======================================================
> > > [ INFO: possible circular locking dependency detected ]
> > > 3.18.0-rc1-00069-gc2661b8 #1 Not tainted
> > > -------------------------------------------------------
> > > do_s2disk/2367 is trying to acquire lock:
> > >  (cpuidle_lock){+.+.+.}, at: [<ffffffff814916c2>] cpuidle_pause_and_lock+0x12/0x20
> > >
> > > but task is already holding lock:
> > >  (cpu_hotplug.lock#2){+.+.+.}, at: [<ffffffff810522ea>] cpu_hotplug_begin+0x4a/0x80
> > >
> > > which lock already depends on the new lock.
> > >
> > > the existing dependency chain (in reverse order) is:
> > >
> > > -> #1 (cpu_hotplug.lock#2){+.+.+.}:
> > >        [<ffffffff81099fac>] lock_acquire+0xac/0x130
> > >        [<ffffffff815b9f2c>] mutex_lock_nested+0x5c/0x3b0
> > >        [<ffffffff81491892>] cpuidle_pause+0x12/0x30
>
> Where exactly does that address point to?

(gdb) disassemble cpuidle_pause
Dump of assembler code for function cpuidle_pause:
   0xffffffff81491880 <+0>:	push   %rbp
   0xffffffff81491881 <+1>:	xor    %esi,%esi
   0xffffffff81491883 <+3>:	mov    $0xffffffff81a9eb20,%rdi
   0xffffffff8149188a <+10>:	mov    %rsp,%rbp
   0xffffffff8149188d <+13>:	callq  0xffffffff815b9ed0 <mutex_lock_nested>
   0xffffffff81491892 <+18>:	callq  0xffffffff81491680 <cpuidle_uninstall_idle_handler>
   0xffffffff81491897 <+23>:	mov    $0xffffffff81a9eb20,%rdi
   0xffffffff8149189e <+30>:	callq  0xffffffff815bbe60 <mutex_unlock>
   0xffffffff814918a3 <+35>:	pop    %rbp
   0xffffffff814918a4 <+36>:	retq
End of assembler dump.
On Wed, Oct 22, 2014 at 11:53:49AM +0200, Jiri Kosina wrote:
> > The reason for CCing Ingo and Peter is that I can't make any sense of one
> > of the stacktraces lockdep is providing.
> >
> > Please have a look at the very first stacktrace in the dump, where lockdep
> > is trying to explain where cpu_hotplug.lock#2 has been acquired. It seems
> > to imply that cpuidle_pause() is taking cpu_hotplug.lock, but that's not
> > the case at all.
> >
> > What am I missing?

> Still, the lockdep stacktrace is bogus and didn't really help
> understanding this. Any idea why it's wrong?

> > -> #1 (cpu_hotplug.lock#2){+.+.+.}:
> >        [<ffffffff81099fac>] lock_acquire+0xac/0x130
> >        [<ffffffff815b9f2c>] mutex_lock_nested+0x5c/0x3b0
> >        [<ffffffff81491892>] cpuidle_pause+0x12/0x30
> >        [<ffffffff81402314>] dpm_suspend_noirq+0x44/0x340
> >        [<ffffffff81402958>] dpm_suspend_end+0x38/0x80
> >        [<ffffffff810a07bd>] hibernation_snapshot+0xcd/0x370
> >        [<ffffffff810a1248>] hibernate+0x168/0x210
> >        [<ffffffff8109e9b4>] state_store+0xe4/0xf0
> >        [<ffffffff813003ef>] kobj_attr_store+0xf/0x20
> >        [<ffffffff8121e9a3>] sysfs_kf_write+0x43/0x60
> >        [<ffffffff8121e287>] kernfs_fop_write+0xe7/0x170
> >        [<ffffffff811a7342>] vfs_write+0xb2/0x1f0
> >        [<ffffffff811a7da4>] SyS_write+0x44/0xb0
> >        [<ffffffff815be856>] system_call_fastpath+0x16/0x1b

Right, so I've seen it more often, and I'm not sure I can explain it
either. Lockdep uses save_stack_trace() with trace->skip=3; typically if
you get ->skip wrong you'd not even see the lock_acquire, so that can't
be it.

The only thing I can come up with is that for some reason the
intermediate entries are !reliable, and save_stack_trace() skips those
for CONFIG_FRAME_POINTER=y, see
arch/x86/kernel/stacktrace.c:__save_stack_address().
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 125150d..d31e04c 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -225,12 +225,6 @@ void cpuidle_uninstall_idle_handler(void)
 		initialized = 0;
 		wake_up_all_idle_cpus();
 	}
-
-	/*
-	 * Make sure external observers (such as the scheduler)
-	 * are done looking at pointed idle states.
-	 */
-	synchronize_rcu();
 }
 
 /**