Message ID: 1427382128-12541-1-git-send-email-daniel.wagner@bmw-carit.de (mailing list archive)
State: New, archived
On Thu, Mar 26, 2015 at 04:02:08PM +0100, Daniel Wagner wrote:
> @@ -67,9 +67,9 @@ void lg_global_lock(struct lglock *lg)
> 	preempt_disable();
> 	lock_acquire_exclusive(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_);
> 	for_each_possible_cpu(i) {
> -		arch_spinlock_t *lock;
> +		spinlock_t *lock;
> 		lock = per_cpu_ptr(lg->lock, i);
> -		arch_spin_lock(lock);
> +		spin_lock(lock);
> 	}
> }

Nope, that'll blow up in two separate places.

One: lockdep, it can only track a limited number of held locks, and it
will further report a recursion warning on the 2nd cpu.

Second: preempt_count_add(), spin_lock() does preempt_disable(), with
enough CPUs you'll overflow the preempt counter (255).

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
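[Editor's note] Peter's second objection can be illustrated with a small
userspace sketch. This is hypothetical code, not the kernel's: it only
mimics the fact that the preemption-disable depth lives in the low 8 bits
of the preempt counter, and that spin_lock() bumps it once per
acquisition, so sweeping one spinlock per possible CPU wraps the depth on
machines with more than 255 CPUs.

```c
#include <assert.h>

/*
 * Hypothetical model (not kernel code): the preemption-disable depth is
 * kept in the low 8 bits of the preempt counter, so at most 255 nested
 * preempt_disable() calls can be represented.
 */
#define PREEMPT_BITS 8
#define PREEMPT_MASK ((1u << PREEMPT_BITS) - 1)	/* 0xff */

static unsigned int preempt_count;

/* spin_lock() does this once per lock acquisition. */
static void fake_preempt_disable(void)
{
	preempt_count++;
}

/* The depth as generic code would read it back. */
static unsigned int preempt_depth(void)
{
	return preempt_count & PREEMPT_MASK;
}

/* Simulate lg_global_lock() taking one spinlock per possible CPU. */
static unsigned int depth_after_locking_cpus(int ncpus)
{
	preempt_count = 0;
	for (int i = 0; i < ncpus; i++)
		fake_preempt_disable();
	return preempt_depth();
}
```

With 256 or more CPUs the 8-bit depth wraps back to 0, i.e. the kernel
would wrongly believe preemption is enabled again while all the per-CPU
locks are still held.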
On 03/26/2015 05:03 PM, Peter Zijlstra wrote:
> On Thu, Mar 26, 2015 at 04:02:08PM +0100, Daniel Wagner wrote:
>> @@ -67,9 +67,9 @@ void lg_global_lock(struct lglock *lg)
>> 	preempt_disable();
>> 	lock_acquire_exclusive(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_);
>> 	for_each_possible_cpu(i) {
>> -		arch_spinlock_t *lock;
>> +		spinlock_t *lock;
>> 		lock = per_cpu_ptr(lg->lock, i);
>> -		arch_spin_lock(lock);
>> +		spin_lock(lock);
>> 	}
>> }
>
> Nope, that'll blow up in two separate places.
>
> One: lockdep, it can only track a limited number of held locks, and it
> will further report a recursion warning on the 2nd cpu.

I was wondering why I haven't seen it explode. As it turns out I haven't
looked closely enough at dmesg:

[ +0.001231] BUG: MAX_LOCK_DEPTH too low!
[ +0.000092] turning off the locking correctness validator.
[ +0.000092] Please attach the output of /proc/lock_stat to the bug report
[ +0.000094] depth: 48 max: 48!
[ +0.000087] 48 locks held by swapper/0/1:
[ +0.000090] #0:  (cpu_hotplug.lock){++++++}, at: [<ffffffff8109e767>] get_online_cpus+0x37/0x80
[ +0.000503] #1:  (stop_cpus_mutex){+.+...}, at: [<ffffffff8114b889>] stop_cpus+0x29/0x60
[ +0.000504] #2:  (stop_cpus_lock){+.+...}, at: [<ffffffff8114b2de>] queue_stop_cpus_work+0x7e/0xd0
[ +0.000582] #3:  (stop_cpus_lock_lock){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000496] #4:  (stop_cpus_lock_lock#2){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000577] #5:  (stop_cpus_lock_lock#3){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000581] #6:  (stop_cpus_lock_lock#4){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000576] #7:  (stop_cpus_lock_lock#5){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000576] #8:  (stop_cpus_lock_lock#6){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000575] #9:  (stop_cpus_lock_lock#7){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000817] #10: (stop_cpus_lock_lock#8){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001057] #11: (stop_cpus_lock_lock#9){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001057] #12: (stop_cpus_lock_lock#10){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #13: (stop_cpus_lock_lock#11){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #14: (stop_cpus_lock_lock#12){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001056] #15: (stop_cpus_lock_lock#13){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001054] #16: (stop_cpus_lock_lock#14){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001053] #17: (stop_cpus_lock_lock#15){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001052] #18: (stop_cpus_lock_lock#16){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #19: (stop_cpus_lock_lock#17){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001072] #20: (stop_cpus_lock_lock#18){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001057] #21: (stop_cpus_lock_lock#19){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001060] #22: (stop_cpus_lock_lock#20){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001060] #23: (stop_cpus_lock_lock#21){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001060] #24: (stop_cpus_lock_lock#22){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001065] #25: (stop_cpus_lock_lock#23){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #26: (stop_cpus_lock_lock#24){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001058] #27: (stop_cpus_lock_lock#25){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #28: (stop_cpus_lock_lock#26){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001054] #29: (stop_cpus_lock_lock#27){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001052] #30: (stop_cpus_lock_lock#28){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #31: (stop_cpus_lock_lock#29){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001056] #32: (stop_cpus_lock_lock#30){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001149] #33: (stop_cpus_lock_lock#31){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001060] #34: (stop_cpus_lock_lock#32){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001063] #35: (stop_cpus_lock_lock#33){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001064] #36: (stop_cpus_lock_lock#34){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001061] #37: (stop_cpus_lock_lock#35){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001054] #38: (stop_cpus_lock_lock#36){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #39: (stop_cpus_lock_lock#37){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #40: (stop_cpus_lock_lock#38){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001057] #41: (stop_cpus_lock_lock#39){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001056] #42: (stop_cpus_lock_lock#40){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #43: (stop_cpus_lock_lock#41){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #44: (stop_cpus_lock_lock#42){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #45: (stop_cpus_lock_lock#43){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001051] #46: (stop_cpus_lock_lock#44){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001123] #47: (stop_cpus_lock_lock#45){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001058] INFO: lockdep is turned off.
[ +0.000330] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.0.0-rc5-00001-g70ed1b1 #31
[ +0.000581] Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
[ +0.000582]  0000000000000000 000000002752ae20 ffff881fb119ba88 ffffffff817dcbc1
[ +0.000895]  0000000000000000 ffff885fb14f8000 ffff881fb119bb88 ffffffff810ed5aa
[ +0.000899]  ffffffff82885150 ffff885fb14f8000 0000000000000292 0000000000000000
[ +0.000893] Call Trace:
[ +0.000329]  [<ffffffff817dcbc1>] dump_stack+0x4c/0x65
[ +0.000333]  [<ffffffff810ed5aa>] __lock_acquire+0xfca/0x1f90
[ +0.000334]  [<ffffffff8110f72f>] ? rcu_irq_exit+0x7f/0xd0
[ +0.000333]  [<ffffffff817e766c>] ? retint_restore_args+0x13/0x13
[ +0.000335]  [<ffffffff810ef5a7>] lock_acquire+0xc7/0x160
[ +0.000334]  [<ffffffff810f2af6>] ? lg_global_lock+0x66/0x90
[ +0.000334]  [<ffffffff8114b130>] ? cpu_stop_should_run+0x50/0x50
[ +0.000335]  [<ffffffff817e59dd>] _raw_spin_lock+0x3d/0x80
[ +0.000336]  [<ffffffff810f2af6>] ? lg_global_lock+0x66/0x90
[ +0.000335]  [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000333]  [<ffffffff8114b2de>] ? queue_stop_cpus_work+0x7e/0xd0
[ +0.000338]  [<ffffffff8114b2de>] queue_stop_cpus_work+0x7e/0xd0
[ +0.000335]  [<ffffffff8114b130>] ? cpu_stop_should_run+0x50/0x50
[ +0.000337]  [<ffffffff8114b538>] __stop_cpus+0x58/0xa0
[ +0.000335]  [<ffffffff8114b130>] ? cpu_stop_should_run+0x50/0x50
[ +0.000334]  [<ffffffff8114b130>] ? cpu_stop_should_run+0x50/0x50
[ +0.000335]  [<ffffffff8114b897>] stop_cpus+0x37/0x60
[ +0.000339]  [<ffffffff810444e0>] ? mtrr_restore+0xb0/0xb0
[ +0.000337]  [<ffffffff8114ba15>] __stop_machine+0xf5/0x130
[ +0.000334]  [<ffffffff810444e0>] ? mtrr_restore+0xb0/0xb0
[ +0.000334]  [<ffffffff810444e0>] ? mtrr_restore+0xb0/0xb0
[ +0.000337]  [<ffffffff8114ba7e>] stop_machine+0x2e/0x50
[ +0.000330]  [<ffffffff81044efb>] mtrr_aps_init+0x7b/0x90
[ +0.000433]  [<ffffffff81f3faac>] native_smp_cpus_done+0x10b/0x113
[ +0.000335]  [<ffffffff81f51d74>] smp_init+0x78/0x80
[ +0.000332]  [<ffffffff81f2d1e1>] kernel_init_freeable+0x167/0x28d
[ +0.000336]  [<ffffffff817d2690>] ? rest_init+0xd0/0xd0
[ +0.000334]  [<ffffffff817d269e>] kernel_init+0xe/0xf0
[ +0.000336]  [<ffffffff817e6918>] ret_from_fork+0x58/0x90
[ +0.000333]  [<ffffffff817d2690>] ? rest_init+0xd0/0xd0

and after that lockdep is disabled.

/me feeling extremely stupid.

> Second: preempt_count_add(), spin_lock() does preempt_disable(), with
> enough CPUs you'll overflow the preempt counter (255).

Thanks for the review.

cheers,
daniel
* Daniel Wagner <daniel.wagner@bmw-carit.de> wrote:

> On 03/26/2015 05:03 PM, Peter Zijlstra wrote:
> > On Thu, Mar 26, 2015 at 04:02:08PM +0100, Daniel Wagner wrote:
> >> @@ -67,9 +67,9 @@ void lg_global_lock(struct lglock *lg)
> >> 	preempt_disable();
> >> 	lock_acquire_exclusive(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_);
> >> 	for_each_possible_cpu(i) {
> >> -		arch_spinlock_t *lock;
> >> +		spinlock_t *lock;
> >> 		lock = per_cpu_ptr(lg->lock, i);
> >> -		arch_spin_lock(lock);
> >> +		spin_lock(lock);
> >> 	}
> >> }
> >
> > Nope, that'll blow up in two separate places.
> >
> > One: lockdep, it can only track a limited number of held locks, and it
> > will further report a recursion warning on the 2nd cpu.
>
> I was wondering why I haven't seen it explode. As it turns out I haven't
> looked closely enough at dmesg:
>
> [ +0.001231] BUG: MAX_LOCK_DEPTH too low!
> [ +0.000092] turning off the locking correctness validator.

Yeah, we try really hard to not crash the kernel from debugging code,
whenever we can avoid it!

That sometimes creates a false sense of good kernel health.

Thanks,

	Ingo
diff --git a/include/linux/lglock.h b/include/linux/lglock.h
index 0081f00..ea97f74 100644
--- a/include/linux/lglock.h
+++ b/include/linux/lglock.h
@@ -34,7 +34,7 @@
 #endif
 
 struct lglock {
-	arch_spinlock_t __percpu *lock;
+	spinlock_t __percpu *lock;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lock_class_key lock_key;
 	struct lockdep_map lock_dep_map;
@@ -42,13 +42,13 @@ struct lglock {
 };
 
 #define DEFINE_LGLOCK(name)						\
-	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
-	= __ARCH_SPIN_LOCK_UNLOCKED;					\
+	static DEFINE_PER_CPU(spinlock_t, name ## _lock)		\
+	= __SPIN_LOCK_UNLOCKED(name ## _lock);				\
 	struct lglock name = { .lock = &name ## _lock }
 
 #define DEFINE_STATIC_LGLOCK(name)					\
-	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
-	= __ARCH_SPIN_LOCK_UNLOCKED;					\
+	static DEFINE_PER_CPU(spinlock_t, name ## _lock)		\
+	= __SPIN_LOCK_UNLOCKED(name ## _lock);				\
 	static struct lglock name = { .lock = &name ## _lock }
 
 void lg_lock_init(struct lglock *lg, char *name);
diff --git a/kernel/locking/lglock.c b/kernel/locking/lglock.c
index 86ae2ae..34077a7 100644
--- a/kernel/locking/lglock.c
+++ b/kernel/locking/lglock.c
@@ -18,44 +18,44 @@ EXPORT_SYMBOL(lg_lock_init);
 
 void lg_local_lock(struct lglock *lg)
 {
-	arch_spinlock_t *lock;
+	spinlock_t *lock;
 
 	preempt_disable();
 	lock_acquire_shared(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_);
 	lock = this_cpu_ptr(lg->lock);
-	arch_spin_lock(lock);
+	spin_lock(lock);
 }
 EXPORT_SYMBOL(lg_local_lock);
 
 void lg_local_unlock(struct lglock *lg)
 {
-	arch_spinlock_t *lock;
+	spinlock_t *lock;
 
 	lock_release(&lg->lock_dep_map, 1, _RET_IP_);
 	lock = this_cpu_ptr(lg->lock);
-	arch_spin_unlock(lock);
+	spin_unlock(lock);
 	preempt_enable();
 }
 EXPORT_SYMBOL(lg_local_unlock);
 
 void lg_local_lock_cpu(struct lglock *lg, int cpu)
 {
-	arch_spinlock_t *lock;
+	spinlock_t *lock;
 
 	preempt_disable();
 	lock_acquire_shared(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_);
 	lock = per_cpu_ptr(lg->lock, cpu);
-	arch_spin_lock(lock);
+	spin_lock(lock);
 }
 EXPORT_SYMBOL(lg_local_lock_cpu);
 
 void lg_local_unlock_cpu(struct lglock *lg, int cpu)
 {
-	arch_spinlock_t *lock;
+	spinlock_t *lock;
 
 	lock_release(&lg->lock_dep_map, 1, _RET_IP_);
 	lock = per_cpu_ptr(lg->lock, cpu);
-	arch_spin_unlock(lock);
+	spin_unlock(lock);
 	preempt_enable();
 }
 EXPORT_SYMBOL(lg_local_unlock_cpu);
 
@@ -67,9 +67,9 @@ void lg_global_lock(struct lglock *lg)
 	preempt_disable();
 	lock_acquire_exclusive(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_);
 	for_each_possible_cpu(i) {
-		arch_spinlock_t *lock;
+		spinlock_t *lock;
 		lock = per_cpu_ptr(lg->lock, i);
-		arch_spin_lock(lock);
+		spin_lock(lock);
 	}
 }
 EXPORT_SYMBOL(lg_global_lock);
 
@@ -80,9 +80,9 @@ void lg_global_unlock(struct lglock *lg)
 	lock_release(&lg->lock_dep_map, 1, _RET_IP_);
 	for_each_possible_cpu(i) {
-		arch_spinlock_t *lock;
+		spinlock_t *lock;
 		lock = per_cpu_ptr(lg->lock, i);
-		arch_spin_unlock(lock);
+		spin_unlock(lock);
 	}
 	preempt_enable();
 }
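[Editor's note] For readers unfamiliar with the lglock pattern the patch
touches, here is a hypothetical userspace analogue (not the kernel API):
pthread mutexes stand in for the per-CPU locks, an array index stands in
for the CPU id, and `NCPU` is an arbitrary constant. The real code pins
the running CPU with preempt_disable() instead of passing an index.

```c
#include <assert.h>
#include <pthread.h>

#define NCPU 4	/* stand-in for the number of possible CPUs */

/* One ordinary lock per "CPU"; local ops take one, global ops take all. */
struct lglock {
	pthread_mutex_t lock[NCPU];
};

static void lg_lock_init(struct lglock *lg)
{
	for (int i = 0; i < NCPU; i++)
		pthread_mutex_init(&lg->lock[i], NULL);
}

/* Fast path: protect only this "CPU"'s data. */
static void lg_local_lock_cpu(struct lglock *lg, int cpu)
{
	pthread_mutex_lock(&lg->lock[cpu]);
}

static void lg_local_unlock_cpu(struct lglock *lg, int cpu)
{
	pthread_mutex_unlock(&lg->lock[cpu]);
}

/*
 * Slow path: take every per-CPU lock.  Always locking in ascending index
 * order keeps concurrent global lockers from deadlocking against each
 * other.
 */
static void lg_global_lock(struct lglock *lg)
{
	for (int i = 0; i < NCPU; i++)
		pthread_mutex_lock(&lg->lock[i]);
}

static void lg_global_unlock(struct lglock *lg)
{
	for (int i = 0; i < NCPU; i++)
		pthread_mutex_unlock(&lg->lock[i]);
}
```

The sketch also shows why the global path is the problem case in the
thread above: its cost (and, in the kernel, its lock-tracking and
preempt-count footprint) grows with the number of possible CPUs.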
arch_spinlock_t is the lowest-level spinlock type. lglock does not depend
on arch_spinlock_t and works fine with a normal spinlock_t, so there is no
need to use arch_spinlock_t outside of the architecture code.

There are two users of lglock: fs/locks.c and kernel/stop_machine.c. The
latter does not depend on performance, so here are some numbers for
fs/locks.c only. All tests from lockperf were run 100 times on a 4 socket
machine, Intel(R) Xeon(R) CPU E5-4610 v2 @ 2.30GHz:

flock01 -n 128 -l 128
                            mean     variance      sigma        max        min
4.0.0-rc5               448.0287   15417.8359   124.1686   527.8083     0.0081
4.0.0-rc5-spinlocks_t   395.1646   28713.4347   169.4504   520.7507     0.0075

flock02 -n 256 -l 20480
                            mean     variance      sigma        max        min
4.0.0-rc5                 6.9185       0.8830     0.9397    10.6138     4.9707
4.0.0-rc5-spinlocks_t     6.2474       0.9234     0.9610     9.5478     4.3703

lease01 -n 128 -l 128
                            mean     variance      sigma        max        min
4.0.0-rc5                 7.7040       0.3930     0.6269     9.1874     5.4179
4.0.0-rc5-spinlocks_t     7.6862       0.7794     0.8828     9.0623     1.3639

lease02 -n 128 -l 512
                            mean     variance      sigma        max        min
4.0.0-rc5                16.3074       0.1418     0.3766    17.1600    15.0240
4.0.0-rc5-spinlocks_t    16.2698       0.2772     0.5265    17.2221    13.4127

posix01 -n 128 -l 128
                            mean     variance      sigma        max        min
4.0.0-rc5               531.5151     181.3078    13.4651   596.5883   501.2940
4.0.0-rc5-spinlocks_t   531.3600     209.0023    14.4569   600.7317   507.1767

posix02 -n 256 -l 20480
                            mean     variance      sigma        max        min
4.0.0-rc5                13.8395       2.9768     1.7253    22.0783     9.9136
4.0.0-rc5-spinlocks_t    12.6822       3.1645     1.7789    18.1258     9.0030

posix03 -n 512 -i 128
                            mean     variance      sigma        max        min
4.0.0-rc5                 0.8939       0.0006     0.0242     0.9392     0.8360
4.0.0-rc5-spinlocks_t     0.9050       0.0006     0.0254     0.9617     0.8454

posix04 -n 64 -i 32
                            mean     variance      sigma        max        min
4.0.0-rc5                 0.0122       0.0000     0.0023     0.0227     0.0083
4.0.0-rc5-spinlocks_t     0.0115       0.0000     0.0016     0.0165     0.0091

This also makes -rt a bit happier, since a normal spinlock_t can sleep
with PREEMPT_RT=y.
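[Editor's note] The statistics columns above can be reproduced from raw
run times along these lines. This is a sketch only: the use of sample
variance (n-1 divisor) and the hand-rolled square root (to keep the
example free of a libm dependency) are assumptions, not taken from
lockperf itself.

```c
#include <assert.h>

struct stats {
	double mean, variance, sigma, max, min;
};

/* Newton's method square root, so the sketch needs no libm. */
static double sqrt_newton(double v)
{
	double r = v > 1.0 ? v : 1.0;
	for (int i = 0; i < 60; i++)
		r = 0.5 * (r + v / r);
	return r;
}

/* Compute the five columns of the benchmark tables over n run times. */
static struct stats compute_stats(const double *x, int n)
{
	struct stats s = { .max = x[0], .min = x[0] };
	double sum = 0.0, sq = 0.0;

	for (int i = 0; i < n; i++) {
		sum += x[i];
		if (x[i] > s.max) s.max = x[i];
		if (x[i] < s.min) s.min = x[i];
	}
	s.mean = sum / n;
	for (int i = 0; i < n; i++)
		sq += (x[i] - s.mean) * (x[i] - s.mean);
	s.variance = sq / (n - 1);	/* sample variance (assumed) */
	s.sigma = sqrt_newton(s.variance);
	return s;
}
```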
Link: https://git.samba.org/jlayton/linux.git/?p=jlayton/lockperf.git;a=summary
Link: https://lwn.net/Articles/365863/
Link: http://marc.info/?l=linux-kernel&m=142737586415051&w=2
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/lglock.h  | 10 +++++-----
 kernel/locking/lglock.c | 24 ++++++++++++------------
 2 files changed, 17 insertions(+), 17 deletions(-)