Message ID | 20250213152054.2785669-1-Ilia.Gavrilov@infotecs.ru (mailing list archive) |
---|---|
State | Accepted |
Commit | 07b598c0e6f06a0f254c88dafb4ad50f8a8c6eea |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v2] drop_monitor: fix incorrect initialization order | expand |
On Thu, Feb 13, 2025 at 03:20:55PM +0000, Gavrilov Ilia wrote: > Syzkaller reports the following bug: > > BUG: spinlock bad magic on CPU#1, syz-executor.0/7995 > lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 > CPU: 1 PID: 7995 Comm: syz-executor.0 Tainted: G E 5.10.209+ #1 > Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 > Call Trace: > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x119/0x179 lib/dump_stack.c:118 > debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline] > do_raw_spin_lock+0x1f6/0x270 kernel/locking/spinlock_debug.c:112 > __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline] > _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159 > reset_per_cpu_data+0xe6/0x240 [drop_monitor] > net_dm_cmd_trace+0x43d/0x17a0 [drop_monitor] > genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739 > genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] > genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800 > netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2497 > genl_rcv+0x29/0x40 net/netlink/genetlink.c:811 > netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline] > netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1348 > netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916 > sock_sendmsg_nosec net/socket.c:651 [inline] > __sock_sendmsg+0x157/0x190 net/socket.c:663 > ____sys_sendmsg+0x712/0x870 net/socket.c:2378 > ___sys_sendmsg+0xf8/0x170 net/socket.c:2432 > __sys_sendmsg+0xea/0x1b0 net/socket.c:2461 > do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46 > entry_SYSCALL_64_after_hwframe+0x62/0xc7 > RIP: 0033:0x7f3f9815aee9 > Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 > RSP: 002b:00007f3f972bf0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 00007f3f9826d050 RCX: 00007f3f9815aee9 > RDX: 0000000020000000 RSI: 0000000020001300 RDI: 0000000000000007 > RBP: 00007f3f981b63bd R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > R13: 000000000000006e R14: 00007f3f9826d050 R15: 00007ffe01ee6768 > > If drop_monitor is built as a kernel module, syzkaller may have time > to send a netlink NET_DM_CMD_START message during the module loading. > This will call the net_dm_monitor_start() function that uses > a spinlock that has not yet been initialized. > > To fix this, let's place resource initialization above the registration > of a generic netlink family. > > Found by InfoTeCS on behalf of Linux Verification Center > (linuxtesting.org) with Syzkaller. > > Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") > Cc: stable@vger.kernel.org > Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru> Reviewed-by: Ido Schimmel <idosch@nvidia.com> I wouldn't object if someone requested to remove these archaic {BUG,WARN}_ON()s, but figured this cleanup is more of a net-next material.
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Thu, 13 Feb 2025 15:20:55 +0000 you wrote: > Syzkaller reports the following bug: > > BUG: spinlock bad magic on CPU#1, syz-executor.0/7995 > lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 > CPU: 1 PID: 7995 Comm: syz-executor.0 Tainted: G E 5.10.209+ #1 > Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 > Call Trace: > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x119/0x179 lib/dump_stack.c:118 > debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline] > do_raw_spin_lock+0x1f6/0x270 kernel/locking/spinlock_debug.c:112 > __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline] > _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159 > reset_per_cpu_data+0xe6/0x240 [drop_monitor] > net_dm_cmd_trace+0x43d/0x17a0 [drop_monitor] > genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739 > genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] > genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800 > netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2497 > genl_rcv+0x29/0x40 net/netlink/genetlink.c:811 > netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline] > netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1348 > netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916 > sock_sendmsg_nosec net/socket.c:651 [inline] > __sock_sendmsg+0x157/0x190 net/socket.c:663 > ____sys_sendmsg+0x712/0x870 net/socket.c:2378 > ___sys_sendmsg+0xf8/0x170 net/socket.c:2432 > __sys_sendmsg+0xea/0x1b0 net/socket.c:2461 > do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46 > entry_SYSCALL_64_after_hwframe+0x62/0xc7 > RIP: 0033:0x7f3f9815aee9 > Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 > RSP: 002b:00007f3f972bf0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 00007f3f9826d050 RCX: 00007f3f9815aee9 > RDX: 0000000020000000 RSI: 0000000020001300 RDI: 0000000000000007 > RBP: 00007f3f981b63bd R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > R13: 000000000000006e R14: 00007f3f9826d050 R15: 00007ffe01ee6768 > > [...] Here is the summary with links: - [net,v2] drop_monitor: fix incorrect initialization order https://git.kernel.org/netdev/net/c/07b598c0e6f0 You are awesome, thank you!
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c index 6efd4cccc9dd..212f0a048cab 100644 --- a/net/core/drop_monitor.c +++ b/net/core/drop_monitor.c @@ -1734,30 +1734,30 @@ static int __init init_net_drop_monitor(void) return -ENOSPC; } - rc = genl_register_family(&net_drop_monitor_family); - if (rc) { - pr_err("Could not create drop monitor netlink family\n"); - return rc; + for_each_possible_cpu(cpu) { + net_dm_cpu_data_init(cpu); + net_dm_hw_cpu_data_init(cpu); } - WARN_ON(net_drop_monitor_family.mcgrp_offset != NET_DM_GRP_ALERT); rc = register_netdevice_notifier(&dropmon_net_notifier); if (rc < 0) { pr_crit("Failed to register netdevice notifier\n"); + return rc; + } + + rc = genl_register_family(&net_drop_monitor_family); + if (rc) { + pr_err("Could not create drop monitor netlink family\n"); goto out_unreg; } + WARN_ON(net_drop_monitor_family.mcgrp_offset != NET_DM_GRP_ALERT); rc = 0; - for_each_possible_cpu(cpu) { - net_dm_cpu_data_init(cpu); - net_dm_hw_cpu_data_init(cpu); - } - goto out; out_unreg: - genl_unregister_family(&net_drop_monitor_family); + WARN_ON(unregister_netdevice_notifier(&dropmon_net_notifier)); out: return rc; } @@ -1766,19 +1766,18 @@ static void exit_net_drop_monitor(void) { int cpu; - BUG_ON(unregister_netdevice_notifier(&dropmon_net_notifier)); - /* * Because of the module_get/put we do in the trace state change path * we are guaranteed not to have any current users when we get here */ + BUG_ON(genl_unregister_family(&net_drop_monitor_family)); + + BUG_ON(unregister_netdevice_notifier(&dropmon_net_notifier)); for_each_possible_cpu(cpu) { net_dm_hw_cpu_data_fini(cpu); net_dm_cpu_data_fini(cpu); } - - BUG_ON(genl_unregister_family(&net_drop_monitor_family)); } module_init(init_net_drop_monitor);
Syzkaller reports the following bug: BUG: spinlock bad magic on CPU#1, syz-executor.0/7995 lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 CPU: 1 PID: 7995 Comm: syz-executor.0 Tainted: G E 5.10.209+ #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x119/0x179 lib/dump_stack.c:118 debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline] do_raw_spin_lock+0x1f6/0x270 kernel/locking/spinlock_debug.c:112 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159 reset_per_cpu_data+0xe6/0x240 [drop_monitor] net_dm_cmd_trace+0x43d/0x17a0 [drop_monitor] genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800 netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2497 genl_rcv+0x29/0x40 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline] netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1348 netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916 sock_sendmsg_nosec net/socket.c:651 [inline] __sock_sendmsg+0x157/0x190 net/socket.c:663 ____sys_sendmsg+0x712/0x870 net/socket.c:2378 ___sys_sendmsg+0xf8/0x170 net/socket.c:2432 __sys_sendmsg+0xea/0x1b0 net/socket.c:2461 do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x62/0xc7 RIP: 0033:0x7f3f9815aee9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f3f972bf0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f3f9826d050 RCX: 00007f3f9815aee9 RDX: 0000000020000000 RSI: 0000000020001300 RDI: 0000000000000007 RBP: 00007f3f981b63bd R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000006e R14: 00007f3f9826d050 R15: 00007ffe01ee6768 If drop_monitor is built as a kernel module, syzkaller may have time to send a netlink NET_DM_CMD_START message during the module loading. This will call the net_dm_monitor_start() function that uses a spinlock that has not yet been initialized. To fix this, let's place resource initialization above the registration of a generic netlink family. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") Cc: stable@vger.kernel.org Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru> --- V2: - Move the registration of the netlink family to the end of the module initialization function net/core/drop_monitor.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-)