Message ID | 20241108145420.2445641-1-lizhi.xu@windriver.com (mailing list archive) |
---|---|
State | Superseded |
Series | mac802154: add a check for slave data list before delete |
Hello, On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote: > syzkaller reported a corrupted list in ieee802154_if_remove. [1] > > Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4 > hardware device from the system. > > CPU0 CPU1 > ==== ==== > genl_family_rcv_msg_doit ieee802154_unregister_hw > ieee802154_del_iface ieee802154_remove_interfaces > rdev_del_virtual_intf_deprecated list_del(&sdata->list) > ieee802154_if_remove > list_del_rcu FYI this is a "duplicate" but with a different approach than: https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077 Thanks, Miquèl > > Avoid this issue, by adding slave data state bit SDATA_STATE_LISTDONE, set > SDATA_STATE_LISTDONE when unregistering the hardware from the system, and > add state bit SDATA_STATE_LISTDONE judgment before removing the interface > to delete the list. > > [1] > kernel BUG at lib/list_debug.c:58! > Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI > CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 > RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56 > Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7 > RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246 > RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00 > RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 > RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d > R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000 > R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0 > FS: 0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 
0000000080050033 > CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <TASK> > __list_del_entry_valid include/linux/list.h:124 [inline] > __list_del_entry include/linux/list.h:215 [inline] > list_del_rcu include/linux/rculist.h:157 [inline] > ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687 > rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline] > ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323 > genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline] > genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] > genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210 > netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551 > genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 > netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline] > netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357 > netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901 > sock_sendmsg_nosec net/socket.c:729 [inline] > __sock_sendmsg+0x221/0x270 net/socket.c:744 > ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607 > ___sys_sendmsg net/socket.c:2661 [inline] > __sys_sendmsg+0x292/0x380 net/socket.c:2690 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x77/0x7f > > Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92 > Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> > --- > net/mac802154/ieee802154_i.h | 1 + > net/mac802154/iface.c | 4 ++++ > 2 files changed, 5 insertions(+) > > diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h > index 08dd521a51a5..6771c0569516 100644 > --- a/net/mac802154/ieee802154_i.h > +++ b/net/mac802154/ieee802154_i.h > @@ -101,6 +101,7 @@ enum { > 
> enum ieee802154_sdata_state_bits { > SDATA_STATE_RUNNING, > + SDATA_STATE_LISTDONE, > }; > > /* Slave interface definition. > diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c > index c0e2da5072be..aed2fc63395d 100644 > --- a/net/mac802154/iface.c > +++ b/net/mac802154/iface.c > @@ -683,6 +683,9 @@ void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata) > { > ASSERT_RTNL(); > > + if (test_bit(SDATA_STATE_LISTDONE, &sdata->state)) > + return; > + > mutex_lock(&sdata->local->iflist_mtx); > list_del_rcu(&sdata->list); > mutex_unlock(&sdata->local->iflist_mtx); > @@ -698,6 +701,7 @@ void ieee802154_remove_interfaces(struct ieee802154_local *local) > mutex_lock(&local->iflist_mtx); > list_for_each_entry_safe(sdata, tmp, &local->interfaces, list) { > list_del(&sdata->list); > + set_bit(SDATA_STATE_LISTDONE, &sdata->state); > > unregister_netdevice(sdata->dev); > }
On Mon, 11 Nov 2024 20:46:57 +0100, Miquel Raynal wrote: > On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote: > > > syzkaller reported a corrupted list in ieee802154_if_remove. [1] > > > > Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4 > > hardware device from the system. > > > > CPU0 CPU1 > > ==== ==== > > genl_family_rcv_msg_doit ieee802154_unregister_hw > > ieee802154_del_iface ieee802154_remove_interfaces > > rdev_del_virtual_intf_deprecated list_del(&sdata->list) > > ieee802154_if_remove > > list_del_rcu > > FYI this is a "duplicate" but with a different approach than: > https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077 No, my patch was the first to fix it, someone else copied my patch. Here is my patch: From: syzbot <syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com> To: linux-kernel@vger.kernel.org Subject: Re: [syzbot] Re: [syzbot] [wpan?] [usb?] BUG: corrupted list in ieee802154_if_remove Date: Fri, 08 Nov 2024 03:24:46 -0800 [thread overview] Message-ID: <672df4fe.050a0220.69fce.0011.GAE@google.com> (raw) In-Reply-To: <672b9f03.050a0220.350062.0276.GAE@google.com> For archival purposes, forwarding an incoming command email to linux-kernel@vger.kernel.org. *** Subject: Re: [syzbot] [wpan?] [usb?] BUG: corrupted list in ieee802154_if_remove Author: lizhi.xu@windriver.com net device has been unregistered ? #syz test diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h index 08dd521a51a5..6771c0569516 100644 --- a/net/mac802154/ieee802154_i.h +++ b/net/mac802154/ieee802154_i.h @@ -101,6 +101,7 @@ enum { enum ieee802154_sdata_state_bits { SDATA_STATE_RUNNING, + SDATA_STATE_LISTDONE, }; /* Slave interface definition. 
diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c index c0e2da5072be..95f11d377fd3 100644 --- a/net/mac802154/iface.c +++ b/net/mac802154/iface.c @@ -683,6 +683,10 @@ void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata) { ASSERT_RTNL(); + printk("sd: %p, sdl: %p, dev: %p, l: %p, if remove\n", sdata, sdata->list, sdata->dev, sdata->local); + if (test_bit(SDATA_STATE_LISTDONE, &sdata->state)) + return; + mutex_lock(&sdata->local->iflist_mtx); list_del_rcu(&sdata->list); mutex_unlock(&sdata->local->iflist_mtx); @@ -697,7 +701,9 @@ void ieee802154_remove_interfaces(struct ieee802154_local *local) mutex_lock(&local->iflist_mtx); list_for_each_entry_safe(sdata, tmp, &local->interfaces, list) { + printk("sd: %p, sdl: %p, dev: %p, l: %p, rmv interfaces\n", sdata, sdata->list, sdata->dev, sdata->local); list_del(&sdata->list); + set_bit(SDATA_STATE_LISTDONE, &sdata->state); unregister_netdevice(sdata->dev); } diff --git a/net/mac802154/main.c b/net/mac802154/main.c index 21b7c3b280b4..81289719584e 100644 --- a/net/mac802154/main.c +++ b/net/mac802154/main.c @@ -279,6 +279,7 @@ void ieee802154_unregister_hw(struct ieee802154_hw *hw) rtnl_lock(); + printk("l: %p unreg hw\n", local); ieee802154_remove_interfaces(local); rtnl_unlock(); > > Thanks, > Miquèl > > > > > Avoid this issue, by adding slave data state bit SDATA_STATE_LISTDONE, set > > SDATA_STATE_LISTDONE when unregistering the hardware from the system, and > > add state bit SDATA_STATE_LISTDONE judgment before removing the interface > > to delete the list. > > > > [1] > > kernel BUG at lib/list_debug.c:58! 
> > Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI > > CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 > > RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56 > > Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7 > > RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246 > > RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00 > > RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 > > RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d > > R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000 > > R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0 > > FS: 0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > <TASK> > > __list_del_entry_valid include/linux/list.h:124 [inline] > > __list_del_entry include/linux/list.h:215 [inline] > > list_del_rcu include/linux/rculist.h:157 [inline] > > ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687 > > rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline] > > ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323 > > genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline] > > genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] > > genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210 > > netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551 > > genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 > > 
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline] > > netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357 > > netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901 > > sock_sendmsg_nosec net/socket.c:729 [inline] > > __sock_sendmsg+0x221/0x270 net/socket.c:744 > > ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607 > > ___sys_sendmsg net/socket.c:2661 [inline] > > __sys_sendmsg+0x292/0x380 net/socket.c:2690 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x77/0x7f > > > > Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com > > Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92 > > Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> > > --- > > net/mac802154/ieee802154_i.h | 1 + > > net/mac802154/iface.c | 4 ++++ > > 2 files changed, 5 insertions(+) > > > > diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h > > index 08dd521a51a5..6771c0569516 100644 > > --- a/net/mac802154/ieee802154_i.h > > +++ b/net/mac802154/ieee802154_i.h > > @@ -101,6 +101,7 @@ enum { > > > > enum ieee802154_sdata_state_bits { > > SDATA_STATE_RUNNING, > > + SDATA_STATE_LISTDONE, > > }; > > > > /* Slave interface definition. 
> > diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c > > index c0e2da5072be..aed2fc63395d 100644 > > --- a/net/mac802154/iface.c > > +++ b/net/mac802154/iface.c > > @@ -683,6 +683,9 @@ void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata) > > { > > ASSERT_RTNL(); > > > > + if (test_bit(SDATA_STATE_LISTDONE, &sdata->state)) > > + return; > > + > > mutex_lock(&sdata->local->iflist_mtx); > > list_del_rcu(&sdata->list); > > mutex_unlock(&sdata->local->iflist_mtx); > > @@ -698,6 +701,7 @@ void ieee802154_remove_interfaces(struct ieee802154_local *local) > > mutex_lock(&local->iflist_mtx); > > list_for_each_entry_safe(sdata, tmp, &local->interfaces, list) { > > list_del(&sdata->list); > > + set_bit(SDATA_STATE_LISTDONE, &sdata->state); > > > > unregister_netdevice(sdata->dev); > > } BR, Lizhi
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
Tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
Tested on:
commit: 2d5404ca Linux 6.12-rc7
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1608335f980000
kernel config: https://syzkaller.appspot.com/x/.config?x=1503500c6f615d24
dashboard link: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=109ed35f980000
Note: testing is done by a robot and is best-effort only.
On 12/11/2024 at 08:21:33 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:

> On Mon, 11 Nov 2024 20:46:57 +0100, Miquel Raynal wrote:
>> On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>>
>> > syzkaller reported a corrupted list in ieee802154_if_remove. [1]
>> >
>> > Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4
>> > hardware device from the system.
>> >
>> > CPU0                                 CPU1
>> > ====                                 ====
>> > genl_family_rcv_msg_doit             ieee802154_unregister_hw
>> > ieee802154_del_iface                 ieee802154_remove_interfaces
>> > rdev_del_virtual_intf_deprecated     list_del(&sdata->list)
>> > ieee802154_if_remove
>> > list_del_rcu
>>
>> FYI this is a "duplicate" but with a different approach than:
>> https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077
> No, my patch was the first to fix it, someone else copied my
> patch. Here is my patch:

Ok, so same question as to the other contributor, why not enclosing the
remaining list_del_rcu() within mutex protection? Can we avoid the
creation of the LISTDONE state bit?

Thanks,
Miquèl
On Tue, 12 Nov 2024 12:01:21 +0100, Miquel Raynal wrote:
>On 12/11/2024 at 08:21:33 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>
>> On Mon, 11 Nov 2024 20:46:57 +0100, Miquel Raynal wrote:
>>> On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>>>
>>> > syzkaller reported a corrupted list in ieee802154_if_remove. [1]
>>> >
>>> > Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4
>>> > hardware device from the system.
>>> >
>>> > CPU0                                 CPU1
>>> > ====                                 ====
>>> > genl_family_rcv_msg_doit             ieee802154_unregister_hw
>>> > ieee802154_del_iface                 ieee802154_remove_interfaces
>>> > rdev_del_virtual_intf_deprecated     list_del(&sdata->list)
>>> > ieee802154_if_remove
>>> > list_del_rcu
>>>
>>> FYI this is a "duplicate" but with a different approach than:
>>> https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077
>> No, my patch was the first to fix it, someone else copied my
>> patch. Here is my patch:
>
>Ok, so same question as to the other contributor, why not enclosing the
>remaining list_del_rcu() within mutex protection? Can we avoid the
>creation of the LISTDONE state bit?
From the analysis of the list itself, we cannot rely on the newly added state bit.
The net device has been unregistered, since the rcu grace period,
unregistration must be run before ieee802154_if_remove.

Following is my V2 patch; it has been tested and works well.

From: Lizhi Xu <lizhi.xu@windriver.com>
Date: Tue, 12 Nov 2024 20:59:34 +0800
Subject: [PATCH V2] mac802154: check local interfaces before deleting sdata list

syzkaller reported a corrupted list in ieee802154_if_remove. [1]

Remove an IEEE 802.15.4 network interface after unregistering an IEEE 802.15.4
hardware device from the system.

CPU0                                 CPU1
====                                 ====
genl_family_rcv_msg_doit             ieee802154_unregister_hw
ieee802154_del_iface                 ieee802154_remove_interfaces
rdev_del_virtual_intf_deprecated     list_del(&sdata->list)
ieee802154_if_remove
list_del_rcu

The net device has been unregistered, since the rcu grace period,
unregistration must be run before ieee802154_if_remove.

To avoid this issue, add a check for local->interfaces before deleting
sdata list.

[1]
kernel BUG at lib/list_debug.c:58!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56
Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7
RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246
RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d
R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000
R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0
FS: 0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __list_del_entry_valid include/linux/list.h:124 [inline]
 __list_del_entry include/linux/list.h:215 [inline]
 list_del_rcu include/linux/rculist.h:157 [inline]
 ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687
 rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline]
 ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323
 genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
 genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
 netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
 netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
 sock_sendmsg_nosec net/socket.c:729 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:744
 ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607
 ___sys_sendmsg net/socket.c:2661 [inline]
 __sys_sendmsg+0x292/0x380 net/socket.c:2690
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
V1 -> V2: remove state bit and add a check for local interfaces before deleting sdata list

 net/mac802154/iface.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c
index c0e2da5072be..9e4631fade90 100644
--- a/net/mac802154/iface.c
+++ b/net/mac802154/iface.c
@@ -684,6 +684,10 @@ void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata)
 	ASSERT_RTNL();
 
 	mutex_lock(&sdata->local->iflist_mtx);
+	if (list_empty(&sdata->local->interfaces)) {
+		mutex_unlock(&sdata->local->iflist_mtx);
+		return;
+	}
 	list_del_rcu(&sdata->list);
 	mutex_unlock(&sdata->local->iflist_mtx);
On 12/11/2024 at 21:41:45 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:

> On Tue, 12 Nov 2024 12:01:21 +0100, Miquel Raynal wrote:
>>On 12/11/2024 at 08:21:33 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>>
>>> On Mon, 11 Nov 2024 20:46:57 +0100, Miquel Raynal wrote:
>>>> On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>>>>
>>>> > syzkaller reported a corrupted list in ieee802154_if_remove. [1]
>>>> >
>>>> > Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4
>>>> > hardware device from the system.
>>>> >
>>>> > CPU0                                 CPU1
>>>> > ====                                 ====
>>>> > genl_family_rcv_msg_doit             ieee802154_unregister_hw
>>>> > ieee802154_del_iface                 ieee802154_remove_interfaces
>>>> > rdev_del_virtual_intf_deprecated     list_del(&sdata->list)
>>>> > ieee802154_if_remove
>>>> > list_del_rcu
>>>>
>>>> FYI this is a "duplicate" but with a different approach than:
>>>> https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077
>>> No, my patch was the first to fix it, someone else copied my
>>> patch. Here is my patch:
>>
>>Ok, so same question as to the other contributor, why not enclosing the
>>remaining list_del_rcu() within mutex protection? Can we avoid the
>>creation of the LISTDONE state bit?
> From the analysis of the list itself, we can not rely on the newly added state bit.
> The net device has been unregistered, since the rcu grace period,
> unregistration must be run before ieee802154_if_remove.
>
> Following is my V2 patch, it has been tested and works well.

Please send a proper v2, not an inline v2.

However the new approach looks better to me, so you can add my

Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>

Thanks,
Miquèl
On 11/12/24 4:41 PM, Lizhi Xu wrote:

>  	mutex_lock(&sdata->local->iflist_mtx);
> +	if (list_empty(&sdata->local->interfaces)) {
> +		mutex_unlock(&sdata->local->iflist_mtx);
> +		return;
> +	}
>  	list_del_rcu(&sdata->list);
>  	mutex_unlock(&sdata->local->iflist_mtx);

Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
attempt to connect the only device. How is this expected to work if there is
more than one device?

Dmitry
On 13/11/2024 at 13:29:55 +03, Dmitry Antipov <dmantipov@yandex.ru> wrote:

> On 11/12/24 4:41 PM, Lizhi Xu wrote:
>
>>  	mutex_lock(&sdata->local->iflist_mtx);
>> +	if (list_empty(&sdata->local->interfaces)) {
>> +		mutex_unlock(&sdata->local->iflist_mtx);
>> +		return;
>> +	}
>>  	list_del_rcu(&sdata->list);
>>  	mutex_unlock(&sdata->local->iflist_mtx);
>
> Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
> attempt to connect the only device. How is this expected to work if there is
> more than one device?

Isn't sdata already specific enough? What do you mean by "device"?

Thanks,
Miquèl
On 11/13/24 1:58 PM, Miquel Raynal wrote:

>> Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
>> attempt to connect the only device. How is this expected to work if there is
>> more than one device?
>
> Isn't sdata already specific enough? What do you mean by "device"?

Well, syzbot's reproducer triggers this issue via the USB Raw Gadget API. IIUC this
is a debugging feature and it is possible to have the only raw gadget device. So
when running syzbot's reproducer, 'list_count_nodes(&sdata->local->interfaces)' is
always <= 1. But how is this expected to work for the >1 case?

Dmitry
On Wed, 13 Nov 2024 13:29:55 +0300, Dmitry Antipov wrote:
> On 11/12/24 4:41 PM, Lizhi Xu wrote:
>
> >  	mutex_lock(&sdata->local->iflist_mtx);
> > +	if (list_empty(&sdata->local->interfaces)) {
> > +		mutex_unlock(&sdata->local->iflist_mtx);
> > +		return;
> > +	}
> >  	list_del_rcu(&sdata->list);
> >  	mutex_unlock(&sdata->local->iflist_mtx);
>
> Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
> attempt to connect the only device. How is this expected to work if there is
> more than one device?
There are two locks (rtnl and iflist_mtx) that protect and synchronize
local->interfaces, so there is no need to worry about multiple devices.

Lizhi
On Thu, 14 Nov 2024 09:00:25 +0800, Lizhi Xu wrote:
> On Wed, 13 Nov 2024 13:29:55 +0300, Dmitry Antipov wrote:
> > On 11/12/24 4:41 PM, Lizhi Xu wrote:
> >
> > >  	mutex_lock(&sdata->local->iflist_mtx);
> > > +	if (list_empty(&sdata->local->interfaces)) {
> > > +		mutex_unlock(&sdata->local->iflist_mtx);
> > > +		return;
> > > +	}
> > >  	list_del_rcu(&sdata->list);
> > >  	mutex_unlock(&sdata->local->iflist_mtx);
> >
> > Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
> > attempt to connect the only device. How is this expected to work if there is
> > more than one device?
> There are two locks (rtnl and iflist_mtx) that protect and synchronize
> local->interfaces, so there is no need to worry about multiple devices.
In other words, this case is a race between removing the 802154 master and
the user sendmsg actively deleting the slave. Once the master has been
removed, there is no need to run the latter to remove the slave, because
all the slave devices were already deleted when the master device was removed.

Lizhi
diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
index 08dd521a51a5..6771c0569516 100644
--- a/net/mac802154/ieee802154_i.h
+++ b/net/mac802154/ieee802154_i.h
@@ -101,6 +101,7 @@ enum {
 
 enum ieee802154_sdata_state_bits {
 	SDATA_STATE_RUNNING,
+	SDATA_STATE_LISTDONE,
 };
 
 /* Slave interface definition.
diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c
index c0e2da5072be..aed2fc63395d 100644
--- a/net/mac802154/iface.c
+++ b/net/mac802154/iface.c
@@ -683,6 +683,9 @@ void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata)
 {
 	ASSERT_RTNL();
 
+	if (test_bit(SDATA_STATE_LISTDONE, &sdata->state))
+		return;
+
 	mutex_lock(&sdata->local->iflist_mtx);
 	list_del_rcu(&sdata->list);
 	mutex_unlock(&sdata->local->iflist_mtx);
@@ -698,6 +701,7 @@ void ieee802154_remove_interfaces(struct ieee802154_local *local)
 	mutex_lock(&local->iflist_mtx);
 	list_for_each_entry_safe(sdata, tmp, &local->interfaces, list) {
 		list_del(&sdata->list);
+		set_bit(SDATA_STATE_LISTDONE, &sdata->state);
 
 		unregister_netdevice(sdata->dev);
 	}
syzkaller reported a corrupted list in ieee802154_if_remove. [1]

Remove an IEEE 802.15.4 network interface after unregistering an IEEE 802.15.4
hardware device from the system.

CPU0                                 CPU1
====                                 ====
genl_family_rcv_msg_doit             ieee802154_unregister_hw
ieee802154_del_iface                 ieee802154_remove_interfaces
rdev_del_virtual_intf_deprecated     list_del(&sdata->list)
ieee802154_if_remove
list_del_rcu

Avoid this issue by adding a slave data state bit, SDATA_STATE_LISTDONE:
set SDATA_STATE_LISTDONE when unregistering the hardware from the system,
and check SDATA_STATE_LISTDONE before removing the interface, so that the
list entry is not deleted a second time.

[1]
kernel BUG at lib/list_debug.c:58!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56
Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7
RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246
RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d
R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000
R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0
FS: 0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __list_del_entry_valid include/linux/list.h:124 [inline]
 __list_del_entry include/linux/list.h:215 [inline]
 list_del_rcu include/linux/rculist.h:157 [inline]
 ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687
 rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline]
 ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323
 genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
 genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
 netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
 netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
 sock_sendmsg_nosec net/socket.c:729 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:744
 ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607
 ___sys_sendmsg net/socket.c:2661 [inline]
 __sys_sendmsg+0x292/0x380 net/socket.c:2690
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
 net/mac802154/ieee802154_i.h | 1 +
 net/mac802154/iface.c        | 4 ++++
 2 files changed, 5 insertions(+)