Message ID | ZNep5EcYskP9HtGD@p100 (mailing list archive) |
---|---|
State | Superseded, archived |
Series | lockdep: Fix static memory detection even more |
Hello,

kernel test robot noticed "BUG:key#has_not_been_registered" on:

commit: 94d4413e506da48ea18f1cc982202874d35c76b1 ("[PATCH] lockdep: Fix static memory detection even more")
url: https://github.com/intel-lab-lkp/linux/commits/Helge-Deller/lockdep-Fix-static-memory-detection-even-more/20230812-235022
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/all/ZNep5EcYskP9HtGD@p100/
patch subject: [PATCH] lockdep: Fix static memory detection even more

in testcase: boot

compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202308141646.d3160c77-oliver.sang@intel.com

[ 0.575811][ T1] BUG: key b35c282c has not been registered!
[ 0.576580][ T1] ------------[ cut here ]------------
[ 0.577216][ T1] DEBUG_LOCKS_WARN_ON(1)
[ 0.577236][ T1] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:4888 lockdep_init_map_type+0x155/0x250
[ 0.578389][ T1] Modules linked in:
[ 0.578856][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc4-00587-g94d4413e506d #2
[ 0.579899][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 0.581143][ T1] EIP: lockdep_init_map_type+0x155/0x250
[ 0.581677][ T1] Code: 70 a2 51 b4 85 db 75 ab e8 78 79 80 00 85 c0 74 a2 8b 0d 98 0a d4 b3 85 c9 75 98 68 59 2a 3b b3 68 8a ce 36 b3 e8 2b 7a f8 ff <0f> 0b 58 5a eb 83 8d 74 26 00 90 e8 4b 79 80 00 85 c0 0f 84 71 ff
[ 0.584028][ T1] EAX: 00000016 EBX: 00000000 ECX: 00000000 EDX: 00000000
[ 0.584881][ T1] ESI: b3384ac4 EDI: b35c282c EBP: b67fbe58 ESP: b67fbe44
[ 0.585009][ T1] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010286
[ 0.585927][ T1] CR0: 80050033 CR2: ff999000 CR3: 03f25000 CR4: 000406d0
[ 0.586773][ T1] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.587629][ T1] DR6: fffe0ff0 DR7: 00000400
[ 0.588198][ T1] Call Trace:
[ 0.588344][ T1] ? show_regs+0x74/0x80
[ 0.588860][ T1] ? lockdep_init_map_type+0x155/0x250
[ 0.589521][ T1] ? __warn+0x71/0x160
[ 0.590011][ T1] ? lockdep_init_map_type+0x155/0x250
[ 0.590684][ T1] ? report_bug+0x158/0x180
[ 0.591242][ T1] ? exc_overflow+0x40/0x40
[ 0.591677][ T1] ? handle_bug+0x2a/0x50
[ 0.592202][ T1] ? exc_invalid_op+0x25/0x70
[ 0.592749][ T1] ? handle_exception+0x14d/0x14d
[ 0.593337][ T1] ? exc_overflow+0x40/0x40
[ 0.593881][ T1] ? lockdep_init_map_type+0x155/0x250
[ 0.594502][ T1] ? exc_overflow+0x40/0x40
[ 0.595011][ T1] ? lockdep_init_map_type+0x155/0x250
[ 0.595653][ T1] __kernfs_create_file+0x6a/0xe0
[ 0.596244][ T1] sysfs_add_bin_file_mode_ns+0x4e/0xc0
[ 0.596891][ T1] sysfs_create_bin_file+0x5c/0x90
[ 0.597496][ T1] ? nsproxy_cache_init+0x30/0x30
[ 0.598090][ T1] ksysfs_init+0x5c/0x90
[ 0.598350][ T1] ? nsproxy_cache_init+0x30/0x30
[ 0.598935][ T1] do_one_initcall+0x5b/0x2f0
[ 0.599480][ T1] ? parse_args+0x1f3/0x350
[ 0.600028][ T1] do_initcalls+0xe4/0x1e0
[ 0.600558][ T1] kernel_init_freeable+0x171/0x2a0
[ 0.601187][ T1] ? rest_init+0x1b0/0x1b0
[ 0.601677][ T1] kernel_init+0x17/0x1b0
[ 0.602193][ T1] ret_from_fork+0x35/0x50
[ 0.602714][ T1] ? rest_init+0x1b0/0x1b0
[ 0.603235][ T1] ret_from_fork_asm+0x12/0x20
[ 0.603812][ T1] entry_INT80_32+0x10d/0x10d
[ 0.604359][ T1] irq event stamp: 7609
[ 0.604842][ T1] hardirqs last enabled at (7609): [<b1114e0e>] __up_console_sem+0x5e/0x80
[ 0.605011][ T1] hardirqs last disabled at (7608): [<b1114df5>] __up_console_sem+0x45/0x80
[ 0.606012][ T1] softirqs last enabled at (7260): [<b27a3d23>] __do_softirq+0x2a3/0x3a8
[ 0.607008][ T1] softirqs last disabled at (7229): [<b102f2e6>] do_softirq_own_stack+0x26/0x30
[ 0.608086][ T1] ---[ end trace 0000000000000000 ]---

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230814/202308141646.d3160c77-oliver.sang@intel.com

On Mon, Aug 14, 2023 at 09:40:42PM +0800, kernel test robot wrote:
> kernel test robot noticed "BUG:key#has_not_been_registered" on:
>
> commit: 94d4413e506da48ea18f1cc982202874d35c76b1 ("[PATCH] lockdep: Fix static memory detection even more")
> [...]
> [ 0.575811][ T1] BUG: key b35c282c has not been registered!
> [ 0.576580][ T1] ------------[ cut here ]------------
> [ 0.577216][ T1] DEBUG_LOCKS_WARN_ON(1)
> [ 0.577236][ T1] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:4888 lockdep_init_map_type+0x155/0x250
> [ 0.578389][ T1] Modules linked in:
> [ 0.578856][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc4-00587-g94d4413e506d #2

> [ 0.595653][ T1] __kernfs_create_file+0x6a/0xe0
> [ 0.596244][ T1] sysfs_add_bin_file_mode_ns+0x4e/0xc0
> [ 0.596891][ T1] sysfs_create_bin_file+0x5c/0x90
> [ 0.597496][ T1] ? nsproxy_cache_init+0x30/0x30
> [ 0.598090][ T1] ksysfs_init+0x5c/0x90

Problem seems to be __ro_after_init. Your patch only considers
is_kernel_core_data(), which seems to not include these other fancy data
sections we have.

On 8/14/23 16:17, Peter Zijlstra wrote:
> On Mon, Aug 14, 2023 at 09:40:42PM +0800, kernel test robot wrote:
>> [...]
>> [ 0.575811][ T1] BUG: key b35c282c has not been registered!
>> [...]
>> [ 0.598090][ T1] ksysfs_init+0x5c/0x90
>
> Problem seems to be __ro_after_init. Your patch only considers
> is_kernel_core_data(), which seems to not include these other fancy data
> sections we have.

Thanks for that hint!
So, adding "if (is_kernel_rodata(addr))..."
I'll test and send a new patch.

Helge

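[Editor's note] The exchange above can be illustrated with a small user-space model, offered only as a sketch. It assumes that an object marked __ro_after_init ends up inside the read-only data range rather than in .data or .bss, which is what the reply implies. All names prefixed with model_ and all addresses are hypothetical; the real kernel helpers compare against linker symbols, not constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end), standing in for one linker section. */
struct section {
	uintptr_t start;
	uintptr_t end;
};

/* Hypothetical layout, chosen only for illustration. */
static const struct section model_rodata = { 0x2000, 0x3000 }; /* assumed home of __ro_after_init objects */
static const struct section model_data   = { 0x3000, 0x4000 };
static const struct section model_bss    = { 0x4000, 0x5000 };

bool in_section(const struct section *s, uintptr_t addr)
{
	return addr >= s->start && addr < s->end;
}

/* Models a check that only accepts addresses in .data or .bss. */
bool model_is_core_data(uintptr_t addr)
{
	return in_section(&model_data, addr) || in_section(&model_bss, addr);
}

/* Models the extended check proposed in the reply: also accept rodata. */
bool model_static_obj(uintptr_t addr)
{
	return model_is_core_data(addr) || in_section(&model_rodata, addr);
}
```

In this model a key at 0x2800 is rejected by model_is_core_data() but accepted by model_static_obj(), which mirrors why a lock key in a read-only-after-init object would be reported as unregistered until the rodata range is checked as well.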
Hi,

On Sat, Aug 12, 2023 at 05:48:52PM +0200, Helge Deller wrote:
> On the parisc architecture, lockdep reports for all static objects which
> are in the __initdata section (e.g. "setup_done" in devtmpfs,
> "kthreadd_done" in init/main.c) this warning:
>
> INFO: trying to register non-static key.
>
> The warning itself is wrong, because those objects are in the __initdata
> section, but the section itself is on parisc outside of range from
> _stext to _end, which is why the static_obj() functions returns a wrong
> answer.
>
> While fixing this issue, I noticed that the whole existing check can
> be simplified a lot.
> Instead of checking against the _stext and _end symbols (which include
> code areas too) just check for the .data and .bss segments (since we check a
> data object). This can be done with the existing is_kernel_core_data()
> macro.
>
> In addition objects in the __initdata section can be checked with
> init_section_contains().
>
> This partly reverts and simplifies commit bac59d18c701 ("x86/setup: Fix static
> memory detection").
>
> Tested on x86-64 and parisc.
>
> Signed-off-by: Helge Deller <deller@gmx.de>
> Fixes: bac59d18c701 ("x86/setup: Fix static memory detection")

On loongarch, this patch results in the following backtrace.

EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Exiting boot services
[ 0.000000] INFO: trying to register non-static key.
[ 0.000000] The code is fine but needs lockdep annotation, or maybe
[ 0.000000] you didn't initialize this object before use?
[ 0.000000] turning off the locking correctness validator.
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0+ #1
[ 0.000000] Stack : 0000000000000000 0000000000000000 9000000000223d6c 9000000001df0000
[ 0.000000]         9000000001df39a0 9000000001df39a8 0000000000000000 0000000000000000
[ 0.000000]         9000000001df39a8 0000000000000001 0000000000000000 900000000154b910
[ 0.000000]         fffffffffffffffe 9000000001df39a8 0000000000000000 0000000000000000
[ 0.000000]         0000000000000001 0000000000000003 0000000000000010 0000000000000030
[ 0.000000]         0000000000000063 0000000000000001 0000000000000000 0000000000000000
[ 0.000000]         0000000000000000 0000000000000000 9000000001c60650 9000000001e12000
[ 0.000000]         0000000000000000 9000000001560bc0 0000000000000000 9000000002ee6000
[ 0.000000]         0000000000000000 0000000000000000 9000000000223d84 0000000000000000
[ 0.000000]         00000000000000b0 0000000000000004 0000000000000000 0000000000000800
[ 0.000000]         ...
[ 0.000000] Call Trace:
[ 0.000000] [<9000000000223d84>] show_stack+0x5c/0x180
[ 0.000000] [<900000000153e0b4>] dump_stack_lvl+0x88/0xd0
[ 0.000000] [<90000000002bc548>] register_lock_class+0x768/0x770
[ 0.000000] [<90000000002bc710>] __lock_acquire+0xb0/0x2a18
[ 0.000000] [<90000000002bba1c>] lock_acquire+0x11c/0x328
[ 0.000000] [<9000000000b34a60>] __debug_object_init+0x60/0x244
[ 0.000000] [<9000000000337f94>] init_cgroup_housekeeping+0xe8/0x144
[ 0.000000] [<900000000033e364>] init_cgroup_root+0x38/0xa0
[ 0.000000] [<90000000017801ac>] cgroup_init_early+0x44/0x16c
[ 0.000000] [<9000000001770758>] start_kernel+0x50/0x624
[ 0.000000] [<90000000015410b4>] kernel_entry+0xb4/0xc4

Reverting it fixes the problem. Bisect log attached.

This is also seen in v6.5.y and v6.4.y since the patch has been applied
to those branches.

Guenter

---
# bad: [99d99825fc075fd24b60cc9cf0fb1e20b9c16b0f] Merge tag 'nfs-for-6.6-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
# good: [1c59d383390f970b891b503b7f79b63a02db2ec5] Merge tag 'linux-kselftest-nolibc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
git bisect start '99d99825fc07' '1c59d383390f'
# bad: [53ea7f624fb91074c2f9458832ed74975ee5d64c] Merge tag 'xfs-6.6-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
git bisect bad 53ea7f624fb91074c2f9458832ed74975ee5d64c
# good: [aa05346dad4b37d068cf9144f0a1236757816886] Merge branch 'iep-drver-timestamping-support'
git bisect good aa05346dad4b37d068cf9144f0a1236757816886
# good: [651a00bc56403161351090a9d7ddbd7095975324] Merge tag 'slab-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
git bisect good 651a00bc56403161351090a9d7ddbd7095975324
# good: [b96a3e9142fdf346b05b20e867b4f0dfca119e96] Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect good b96a3e9142fdf346b05b20e867b4f0dfca119e96
# bad: [3d3dfeb3aec7b612d266d500c82054f1fded4980] Merge tag 'for-6.6/block-2023-08-28' of git://git.kernel.dk/linux
git bisect bad 3d3dfeb3aec7b612d266d500c82054f1fded4980
# bad: [c1b7fcf3f6d94c2c3528bf77054bf174a5ef63d7] Merge tag 'for-6.6/io_uring-2023-08-28' of git://git.kernel.dk/linux
git bisect bad c1b7fcf3f6d94c2c3528bf77054bf174a5ef63d7
# bad: [dce8f8ed1de1d9d6d27c5ccd202ce4ec163b100c] document while_each_thread(), change first_tid() to use for_each_thread()
git bisect bad dce8f8ed1de1d9d6d27c5ccd202ce4ec163b100c
# good: [02d7f74a04b1299ab2a0a9095f0e2d5aa3aed553] lib/bch.c: use bitrev instead of internal logic
git bisect good 02d7f74a04b1299ab2a0a9095f0e2d5aa3aed553
# good: [79939c4a79bc643d399bd3fdd0f87100ea6b4362] scripts/gdb/slab: add slab support
git bisect good 79939c4a79bc643d399bd3fdd0f87100ea6b4362
# bad: [5ffd2c37cb7a53d52099e5ed1fd7ccbc9e358791] kill do_each_thread()
git bisect bad 5ffd2c37cb7a53d52099e5ed1fd7ccbc9e358791
# good: [665536092355f17f0e2ea291eec70f9787dccd32] lib/vsprintf: declare no_hash_pointers in sprintf.h
git bisect good 665536092355f17f0e2ea291eec70f9787dccd32
# bad: [ef815d2cba782e96b9aad9483523d474ed41c62a] treewide: drop CONFIG_EMBEDDED
git bisect bad ef815d2cba782e96b9aad9483523d474ed41c62a
# bad: [0a6b58c5cd0dfd7961e725212f0fc8dfc5d96195] lockdep: fix static memory detection even more
git bisect bad 0a6b58c5cd0dfd7961e725212f0fc8dfc5d96195
# first bad commit: [0a6b58c5cd0dfd7961e725212f0fc8dfc5d96195] lockdep: fix static memory detection even more

* Guenter Roeck <linux@roeck-us.net>:
> On Sat, Aug 12, 2023 at 05:48:52PM +0200, Helge Deller wrote:
> > [...]
> > Signed-off-by: Helge Deller <deller@gmx.de>
> > Fixes: bac59d18c701 ("x86/setup: Fix static memory detection")
>
> On loongarch, this patch results in the following backtrace.
>
> [ 0.000000] INFO: trying to register non-static key.
> [...]
> [ 0.000000] [<9000000000b34a60>] __debug_object_init+0x60/0x244
> [ 0.000000] [<9000000000337f94>] init_cgroup_housekeeping+0xe8/0x144
> [...]
>
> Reverting it fixes the problem. Bisect log attached.
>
> This is also seen in v6.5.y and v6.4.y since the patch has been applied
> to those branches.

Does this happen with CONFIG_SMP=n ?
If so, I think the untested patch below might fix the issue.

Helge

---

[PATCH] loongarch: Keep PERCPU section in init section even for !CONFIG_SMP

Signed-off-by: Helge Deller <deller@gmx.de>

diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
index b1686afcf876..32d61e931cdc 100644
--- a/arch/loongarch/kernel/vmlinux.lds.S
+++ b/arch/loongarch/kernel/vmlinux.lds.S
@@ -99,9 +99,7 @@ SECTIONS
 		EXIT_DATA
 	}
 
-#ifdef CONFIG_SMP
 	PERCPU_SECTION(1 << CONFIG_L1_CACHE_SHIFT)
-#endif
 
 	.init.bss : {
 		*(.init.bss)

On 9/3/23 14:11, Helge Deller wrote:
> * Guenter Roeck <linux@roeck-us.net>:
>> On loongarch, this patch results in the following backtrace.
>> [...]
>> Reverting it fixes the problem. Bisect log attached.
>>
>> This is also seen in v6.5.y and v6.4.y since the patch has been applied
>> to those branches.
>
> Does this happen with CONFIG_SMP=n ?
> If so, I think the untested patch below might fix the issue.
>

No, this is loongarch:defconfig with various debug options enabled.
That has CONFIG_SMP=y.

Guenter

> [...]

* Guenter Roeck <linux@roeck-us.net>:
> On 9/3/23 14:11, Helge Deller wrote:
> > [...]
> > Does this happens with CONFIG_SMP=n ?
> > If so, I think the untested patch below might fix the issue.
> >
>
> No, this is loongarch:defconfig with various debug options enabled.
> That has CONFIG_SMP=y.

Could you apply below patch and verify with the contents of the
System.map file where the lock is located ?

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e85b5ad3e206..db0a301f9740 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -969,7 +969,7 @@ static bool assign_lock_key(struct lockdep_map *lock)
 	else {
 		/* Debug-check: all keys must be persistent! */
 		debug_locks_off();
-		pr_err("INFO: trying to register non-static key.\n");
+		pr_err("INFO: trying to register non-static key at %08lx.\n", addr);
 		pr_err("The code is fine but needs lockdep annotation, or maybe\n");
 		pr_err("you didn't initialize this object before use?\n");
 		pr_err("turning off the locking correctness validator.\n");

On 9/6/23 00:18, Helge Deller wrote:
> * Guenter Roeck <linux@roeck-us.net>:
>> [...]
>> No, this is loongarch:defconfig with various debug options enabled.
>> That has CONFIG_SMP=y.
>
> Could you apply below patch and verify with the contents of the
> System.map file where the lock is located ?
>
> [...]
> -		pr_err("INFO: trying to register non-static key.\n");
> +		pr_err("INFO: trying to register non-static key at %08lx.\n", addr);
> [...]

90000000015602d0 D __la_abs_end
...
90000000016815c0 d fill_pool_map.3	<--- lock pointer
...
9000000001770000 T __init_begin

Guenter

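[Editor's note] Matching a printed address against System.map amounts to finding the last symbol whose address does not exceed it. The sketch below does that over the three entries quoted in the reply. The struct and function names are hypothetical, and the entries elided with "..." in the mail are not modelled, so this only illustrates the lookup and is not a substitute for the full map.

```c
#include <stddef.h>
#include <stdint.h>

/* One System.map line: address, type letter, symbol name. */
struct sym {
	uint64_t addr;
	char type;
	const char *name;
};

/* The three entries quoted in the mail, sorted by address. */
static const struct sym map_excerpt[] = {
	{ 0x90000000015602d0ULL, 'D', "__la_abs_end" },
	{ 0x90000000016815c0ULL, 'd', "fill_pool_map.3" },
	{ 0x9000000001770000ULL, 'T', "__init_begin" },
};

/*
 * Return the last entry whose address is <= addr, or NULL if addr lies
 * below the first entry. The table must be sorted by ascending address.
 */
const struct sym *lookup_symbol(const struct sym *tab, size_t n, uint64_t addr)
{
	const struct sym *best = NULL;

	for (size_t i = 0; i < n && tab[i].addr <= addr; i++)
		best = &tab[i];
	return best;
}
```

Looking up 0x90000000016815c0 in this excerpt yields fill_pool_map.3, an address that sits after __la_abs_end and before __init_begin.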
[side notes:
* mail has a reduced set of recipients
* fix is in the works, but add it to the tracking, as rc1 is due
* just in case it was missed: culprit made it to various stable series]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 03.09.23 21:36, Guenter Roeck wrote:
>
> On Sat, Aug 12, 2023 at 05:48:52PM +0200, Helge Deller wrote:
>> On the parisc architecture, lockdep reports for all static objects which
>> are in the __initdata section (e.g. "setup_done" in devtmpfs,
>> "kthreadd_done" in init/main.c) this warning:
>>
>> INFO: trying to register non-static key.
>>
> [...]
>> Signed-off-by: Helge Deller <deller@gmx.de>
>> Fixes: bac59d18c701 ("x86/setup: Fix static memory detection")
>
> On loongarch, this patch results in the following backtrace.
> [...]> # first bad commit: [0a6b58c5cd0dfd7961e725212f0fc8dfc5d96195] lockdep: fix static memory detection even more

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 0a6b58c5cd0dfd7961e725212f0fc8dfc5d961
#regzbot title lockdep: backtrace on loongarch
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

[TLDR: This mail is primarily relevant for Linux regression tracking. A
change or fix related to the regression discussed in this thread was
posted or applied, but it did not use a Closes: tag to point to the
report, as Linus and the documentation call for. Things happen, no
worries -- but now the regression tracking bot needs to be told manually
about the fix. See link in footer if these mails annoy you.]

On 10.09.23 13:20, Linux regression tracking #adding (Thorsten Leemhuis) wrote:
> On 03.09.23 21:36, Guenter Roeck wrote:
>>
>> On Sat, Aug 12, 2023 at 05:48:52PM +0200, Helge Deller wrote:
>>> On the parisc architecture, lockdep reports for all static objects which
>>> are in the __initdata section (e.g. "setup_done" in devtmpfs,
>>> "kthreadd_done" in init/main.c) this warning:
>>>
>>> INFO: trying to register non-static key.
>>>
>> [...]
>>> Signed-off-by: Helge Deller <deller@gmx.de>
>>> Fixes: bac59d18c701 ("x86/setup: Fix static memory detection")
>>
>> On loongarch, this patch results in the following backtrace.
>> [...]> # first bad commit: [0a6b58c5cd0dfd7961e725212f0fc8dfc5d96195]
> lockdep: fix static memory detection even more
>
> Thanks for the report. To be sure the issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> tracking bot:
>
> #regzbot ^introduced 0a6b58c5cd0dfd7961e725212f0fc8dfc5d961
> #regzbot title lockdep: backtrace on loongarch
> #regzbot ignore-activity

Fix now in next (as 68ffa230daa0d35b7cce476098433d763d5fd42f)

#regzbot fix: LoongArch: Fix lockdep static memory detection
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
index a6e8373a5170..3fa87e5e11ab 100644
--- a/arch/x86/include/asm/sections.h
+++ b/arch/x86/include/asm/sections.h
@@ -2,8 +2,6 @@
 #ifndef _ASM_X86_SECTIONS_H
 #define _ASM_X86_SECTIONS_H
 
-#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed
-
 #include <asm-generic/sections.h>
 #include <asm/extable.h>
 
@@ -18,20 +16,4 @@ extern char __end_of_kernel_reserve[];
 
 extern unsigned long _brk_start, _brk_end;
 
-static inline bool arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	/*
-	 * If _brk_start has not been cleared, brk allocation is incomplete,
-	 * and we can not make assumptions about its use.
-	 */
-	if (_brk_start)
-		return 0;
-
-	/*
-	 * After brk allocation is complete, space between _brk_end and _end
-	 * is available for allocation.
-	 */
-	return addr >= _brk_end && addr < (unsigned long)&_end;
-}
-
 #endif /* _ASM_X86_SECTIONS_H */
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 111607d91489..957b785d0b59 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -819,34 +819,20 @@ static int very_verbose(struct lock_class *class)
  * Is this the address of a static object:
  */
 #ifdef __KERNEL__
-/*
- * Check if an address is part of freed initmem. After initmem is freed,
- * memory can be allocated from it, and such allocations would then have
- * addresses within the range [_stext, _end].
- */
-#ifndef arch_is_kernel_initmem_freed
-static int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	if (system_state < SYSTEM_FREEING_INITMEM)
-		return 0;
-
-	return init_section_contains((void *)addr, 1);
-}
-#endif
-
 static int static_obj(const void *obj)
 {
-	unsigned long start = (unsigned long) &_stext,
-		      end = (unsigned long) &_end,
-		      addr = (unsigned long) obj;
+	unsigned long addr = (unsigned long) obj;
 
-	if (arch_is_kernel_initmem_freed(addr))
-		return 0;
+	if (is_kernel_core_data(addr))
+		return 1;
 
 	/*
-	 * static variable?
+	 * in initdata section and used during bootup only?
+	 * NOTE: On some platforms the initdata section is
+	 * outside of the _stext ... _end range.
 	 */
-	if ((addr >= start) && (addr < end))
+	if (system_state < SYSTEM_FREEING_INITMEM &&
+	    init_section_contains((void *)addr, 1))
 		return 1;
 
 	/*

On the parisc architecture, lockdep reports this warning for all static
objects in the __initdata section (e.g. "setup_done" in devtmpfs,
"kthreadd_done" in init/main.c):

	INFO: trying to register non-static key.

The warning itself is wrong: those objects are static and live in the
__initdata section, but on parisc that section lies outside the range
from _stext to _end, which is why the static_obj() function returns the
wrong answer.

While fixing this issue, I noticed that the whole existing check can be
simplified a lot. Instead of checking against the _stext and _end symbols
(which include code areas too), just check for the .data and .bss segments
(since we check a data object). This can be done with the existing
is_kernel_core_data() helper. In addition, objects in the __initdata
section can be checked with init_section_contains(), as long as init
memory has not been freed yet.

This partly reverts and simplifies commit bac59d18c701 ("x86/setup: Fix
static memory detection").

Tested on x86-64 and parisc.

Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: bac59d18c701 ("x86/setup: Fix static memory detection")
---
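
For readers who want to play with the decision logic outside the kernel,
the two checks described above can be modelled in plain userspace C. This
is only a rough sketch under stated assumptions, not the kernel code:
struct range, struct layout, enum boot_state and static_obj_sketch() are
hypothetical names, the address ranges stand in for the linker-provided
section symbols, and the further checks that follow in the real
static_obj() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open address range [start, end). */
struct range {
	uintptr_t start;
	uintptr_t end;
};

/* Hypothetical stand-in for the kernel's system_state ordering. */
enum boot_state { BOOTING, FREEING_INITMEM, RUNNING };

/* Hypothetical memory layout; real kernels get this from linker symbols. */
struct layout {
	struct range core_data;	/* models what is_kernel_core_data() covers */
	struct range init;	/* models what init_section_contains() covers */
};

static bool range_contains(struct range r, uintptr_t addr)
{
	assert(r.start <= r.end);
	return addr >= r.start && addr < r.end;
}

/*
 * Models only the two checks from the commit message:
 * 1. core data (.data/.bss) is always static;
 * 2. init data is static only until init memory is freed.
 */
static bool static_obj_sketch(const struct layout *l,
			      enum boot_state state, uintptr_t addr)
{
	if (range_contains(l->core_data, addr))
		return true;

	if (state < FREEING_INITMEM && range_contains(l->init, addr))
		return true;

	return false;
}
```

With a layout whose init range lies outside the core data range (as the
commit message describes for parisc), an address in the init range is
reported as static while the state is BOOTING and as non-static once the
state reaches FREEING_INITMEM or later. An address in neither range, such
as the lock pointer in the loongarch report if its section is not covered
by either check, is reported as non-static in this model.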