mac802154: add a check for slave data list before delete

Message ID 20241108145420.2445641-1-lizhi.xu@windriver.com (mailing list archive)
State New

Commit Message

Lizhi Xu Nov. 8, 2024, 2:54 p.m. UTC
syzkaller reported a corrupted list in ieee802154_if_remove. [1]

This happens when an IEEE 802.15.4 network interface is removed after the
IEEE 802.15.4 hardware device has been unregistered from the system:

CPU0					CPU1
====					====
genl_family_rcv_msg_doit		ieee802154_unregister_hw
ieee802154_del_iface			ieee802154_remove_interfaces
rdev_del_virtual_intf_deprecated	list_del(&sdata->list)
ieee802154_if_remove
list_del_rcu

Avoid this issue by adding a slave data state bit, SDATA_STATE_LISTDONE.
Set the bit when unregistering the hardware from the system, and test it
in ieee802154_if_remove() before deleting the sdata from the list.

[1]
kernel BUG at lib/list_debug.c:58!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56
Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7
RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246
RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d
R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000
R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0
FS:  0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __list_del_entry_valid include/linux/list.h:124 [inline]
 __list_del_entry include/linux/list.h:215 [inline]
 list_del_rcu include/linux/rculist.h:157 [inline]
 ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687
 rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline]
 ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323
 genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
 genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
 netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
 netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
 sock_sendmsg_nosec net/socket.c:729 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:744
 ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607
 ___sys_sendmsg net/socket.c:2661 [inline]
 __sys_sendmsg+0x292/0x380 net/socket.c:2690
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
 net/mac802154/ieee802154_i.h | 1 +
 net/mac802154/iface.c        | 4 ++++
 2 files changed, 5 insertions(+)
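
The double delete described in the commit message, and the effect of the proposed guard, can be sketched in userspace. This is only a rough model under stated assumptions, not the kernel code: `struct sdata_model`, `list_del_checked()`, `model_remove_interfaces()`, `model_if_remove()` and `run_sequence()` are hypothetical names, the poison markers are local stand-ins for the kernel's LIST_POISON1/LIST_POISON2, and locking and RCU are not modeled. The RBX and R14 values in the register dump (dead000000000122 and dead000000000100) appear consistent with an entry that had already been poisoned by an earlier list_del().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace model of the kernel's doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

/* Local stand-ins for the kernel's LIST_POISON1 / LIST_POISON2 markers. */
static struct list_head poison1, poison2;

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

/* Returns false, instead of hitting BUG(), when the entry is already poisoned. */
static bool list_del_checked(struct list_head *e)
{
	if (e->next == &poison1 || e->prev == &poison2)
		return false;
	e->next->prev = e->prev;
	e->prev->next = e->next;
	e->next = &poison1;
	e->prev = &poison2;
	return true;
}

/* Hypothetical, simplified stand-in for struct ieee802154_sub_if_data. */
struct sdata_model {
	struct list_head list;
	bool listdone;	/* models the SDATA_STATE_LISTDONE bit */
};

/* Models the CPU1 path: ieee802154_remove_interfaces() for one entry. */
static void model_remove_interfaces(struct sdata_model *s)
{
	bool unlinked = list_del_checked(&s->list);

	assert(unlinked);	/* the first delete succeeds in this model */
	(void)unlinked;
	s->listdone = true;
}

/* Models the CPU0 path: ieee802154_if_remove(); false means a double delete. */
static bool model_if_remove(struct sdata_model *s, bool use_guard)
{
	if (use_guard && s->listdone)
		return true;	/* already unlinked, nothing left to do */
	return list_del_checked(&s->list);
}

/* Runs CPU1's path, then CPU0's; true means no corruption was detected. */
static bool run_sequence(bool use_guard)
{
	struct list_head interfaces;
	struct sdata_model s = { .listdone = false };

	list_init(&interfaces);
	list_add(&s.list, &interfaces);
	model_remove_interfaces(&s);
	return model_if_remove(&s, use_guard);
}
```

In this model, `run_sequence(false)` reports the second delete as a failure, while `run_sequence(true)` returns early because the flag is set.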

Comments

Miquel Raynal Nov. 11, 2024, 7:46 p.m. UTC | #1
Hello,

On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:

> syzkaller reported a corrupted list in ieee802154_if_remove. [1]
>
> Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4
> hardware device from the system.
>
> CPU0					CPU1
> ====					====
> genl_family_rcv_msg_doit		ieee802154_unregister_hw
> ieee802154_del_iface			ieee802154_remove_interfaces
> rdev_del_virtual_intf_deprecated	list_del(&sdata->list)
> ieee802154_if_remove
> list_del_rcu

FYI this is a "duplicate" but with a different approach than:
https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077

Thanks,
Miquèl

>
> Avoid this issue, by adding slave data state bit SDATA_STATE_LISTDONE, set
> SDATA_STATE_LISTDONE when unregistering the hardware from the system, and
> add state bit SDATA_STATE_LISTDONE judgment before removing the interface
> to delete the list. 
>
> [1]
> kernel BUG at lib/list_debug.c:58!
> Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
> RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56
> Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7
> RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246
> RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00
> RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
> RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d
> R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000
> R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0
> FS:  0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  __list_del_entry_valid include/linux/list.h:124 [inline]
>  __list_del_entry include/linux/list.h:215 [inline]
>  list_del_rcu include/linux/rculist.h:157 [inline]
>  ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687
>  rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline]
>  ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323
>  genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
>  genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
>  genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
>  netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551
>  genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
>  netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
>  netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
>  netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
>  sock_sendmsg_nosec net/socket.c:729 [inline]
>  __sock_sendmsg+0x221/0x270 net/socket.c:744
>  ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607
>  ___sys_sendmsg net/socket.c:2661 [inline]
>  __sys_sendmsg+0x292/0x380 net/socket.c:2690
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92
> Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
> ---
>  net/mac802154/ieee802154_i.h | 1 +
>  net/mac802154/iface.c        | 4 ++++
>  2 files changed, 5 insertions(+)
>
> diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> index 08dd521a51a5..6771c0569516 100644
> --- a/net/mac802154/ieee802154_i.h
> +++ b/net/mac802154/ieee802154_i.h
> @@ -101,6 +101,7 @@ enum {
>  
>  enum ieee802154_sdata_state_bits {
>  	SDATA_STATE_RUNNING,
> +	SDATA_STATE_LISTDONE,
>  };
>  
>  /* Slave interface definition.
> diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c
> index c0e2da5072be..aed2fc63395d 100644
> --- a/net/mac802154/iface.c
> +++ b/net/mac802154/iface.c
> @@ -683,6 +683,9 @@ void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata)
>  {
>  	ASSERT_RTNL();
>  
> +	if (test_bit(SDATA_STATE_LISTDONE, &sdata->state))
> +		return;
> +
>  	mutex_lock(&sdata->local->iflist_mtx);
>  	list_del_rcu(&sdata->list);
>  	mutex_unlock(&sdata->local->iflist_mtx);
> @@ -698,6 +701,7 @@ void ieee802154_remove_interfaces(struct ieee802154_local *local)
>  	mutex_lock(&local->iflist_mtx);
>  	list_for_each_entry_safe(sdata, tmp, &local->interfaces, list) {
>  		list_del(&sdata->list);
> +		set_bit(SDATA_STATE_LISTDONE, &sdata->state);
>  
>  		unregister_netdevice(sdata->dev);
>  	}
Lizhi Xu Nov. 12, 2024, 12:21 a.m. UTC | #2
On Mon, 11 Nov 2024 20:46:57 +0100, Miquel Raynal wrote:
> On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
> 
> > syzkaller reported a corrupted list in ieee802154_if_remove. [1]
> >
> > Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4
> > hardware device from the system.
> >
> > CPU0					CPU1
> > ====					====
> > genl_family_rcv_msg_doit		ieee802154_unregister_hw
> > ieee802154_del_iface			ieee802154_remove_interfaces
> > rdev_del_virtual_intf_deprecated	list_del(&sdata->list)
> > ieee802154_if_remove
> > list_del_rcu
> 
> FYI this is a "duplicate" but with a different approach than:
> https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077
No, my patch was the first to fix it; someone else copied my patch. Here is my patch:

From: syzbot <syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com>
To: linux-kernel@vger.kernel.org
Subject: Re: [syzbot] Re: [syzbot] [wpan?] [usb?] BUG: corrupted list in ieee802154_if_remove
Date: Fri, 08 Nov 2024 03:24:46 -0800	[thread overview]
Message-ID: <672df4fe.050a0220.69fce.0011.GAE@google.com> (raw)
In-Reply-To: <672b9f03.050a0220.350062.0276.GAE@google.com>

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org.

***

Subject: Re: [syzbot] [wpan?] [usb?] BUG: corrupted list in ieee802154_if_remove
Author: lizhi.xu@windriver.com

net device has been unregistered ?

#syz test

diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
index 08dd521a51a5..6771c0569516 100644
--- a/net/mac802154/ieee802154_i.h
+++ b/net/mac802154/ieee802154_i.h
@@ -101,6 +101,7 @@ enum {
 
 enum ieee802154_sdata_state_bits {
 	SDATA_STATE_RUNNING,
+	SDATA_STATE_LISTDONE,
 };
 
 /* Slave interface definition.
diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c
index c0e2da5072be..95f11d377fd3 100644
--- a/net/mac802154/iface.c
+++ b/net/mac802154/iface.c
@@ -683,6 +683,10 @@ void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata)
 {
 	ASSERT_RTNL();
 
+	printk("sd: %p, sdl: %p, dev: %p, l: %p, if remove\n", sdata, sdata->list, sdata->dev, sdata->local);
+	if (test_bit(SDATA_STATE_LISTDONE, &sdata->state))
+		return;
+
 	mutex_lock(&sdata->local->iflist_mtx);
 	list_del_rcu(&sdata->list);
 	mutex_unlock(&sdata->local->iflist_mtx);
@@ -697,7 +701,9 @@ void ieee802154_remove_interfaces(struct ieee802154_local *local)
 
 	mutex_lock(&local->iflist_mtx);
 	list_for_each_entry_safe(sdata, tmp, &local->interfaces, list) {
+		printk("sd: %p, sdl: %p, dev: %p, l: %p, rmv interfaces\n", sdata, sdata->list, sdata->dev, sdata->local);
 		list_del(&sdata->list);
+		set_bit(SDATA_STATE_LISTDONE, &sdata->state);
 
 		unregister_netdevice(sdata->dev);
 	}
diff --git a/net/mac802154/main.c b/net/mac802154/main.c
index 21b7c3b280b4..81289719584e 100644
--- a/net/mac802154/main.c
+++ b/net/mac802154/main.c
@@ -279,6 +279,7 @@ void ieee802154_unregister_hw(struct ieee802154_hw *hw)
 
 	rtnl_lock();
 
+	printk("l: %p unreg hw\n", local);
 	ieee802154_remove_interfaces(local);
 
 	rtnl_unlock();

> 
> Thanks,
> Miquèl
> 
> >
> > Avoid this issue, by adding slave data state bit SDATA_STATE_LISTDONE, set
> > SDATA_STATE_LISTDONE when unregistering the hardware from the system, and
> > add state bit SDATA_STATE_LISTDONE judgment before removing the interface
> > to delete the list.
> >
> > [1]
> > kernel BUG at lib/list_debug.c:58!
> > Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> > CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
> > RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56
> > Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7
> > RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246
> > RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00
> > RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
> > RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d
> > R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000
> > R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0
> > FS:  0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  <TASK>
> >  __list_del_entry_valid include/linux/list.h:124 [inline]
> >  __list_del_entry include/linux/list.h:215 [inline]
> >  list_del_rcu include/linux/rculist.h:157 [inline]
> >  ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687
> >  rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline]
> >  ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323
> >  genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
> >  genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
> >  genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
> >  netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551
> >  genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
> >  netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
> >  netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
> >  netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
> >  sock_sendmsg_nosec net/socket.c:729 [inline]
> >  __sock_sendmsg+0x221/0x270 net/socket.c:744
> >  ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607
> >  ___sys_sendmsg net/socket.c:2661 [inline]
> >  __sys_sendmsg+0x292/0x380 net/socket.c:2690
> >  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> >  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> >  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >
> > Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92
> > Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
> > ---
> >  net/mac802154/ieee802154_i.h | 1 +
> >  net/mac802154/iface.c        | 4 ++++
> >  2 files changed, 5 insertions(+)
> >
> > diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> > index 08dd521a51a5..6771c0569516 100644
> > --- a/net/mac802154/ieee802154_i.h
> > +++ b/net/mac802154/ieee802154_i.h
> > @@ -101,6 +101,7 @@ enum {
> >
> >  enum ieee802154_sdata_state_bits {
> >  	SDATA_STATE_RUNNING,
> > +	SDATA_STATE_LISTDONE,
> >  };
> >
> >  /* Slave interface definition.
> > diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c
> > index c0e2da5072be..aed2fc63395d 100644
> > --- a/net/mac802154/iface.c
> > +++ b/net/mac802154/iface.c
> > @@ -683,6 +683,9 @@ void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata)
> >  {
> >  	ASSERT_RTNL();
> >
> > +	if (test_bit(SDATA_STATE_LISTDONE, &sdata->state))
> > +		return;
> > +
> >  	mutex_lock(&sdata->local->iflist_mtx);
> >  	list_del_rcu(&sdata->list);
> >  	mutex_unlock(&sdata->local->iflist_mtx);
> > @@ -698,6 +701,7 @@ void ieee802154_remove_interfaces(struct ieee802154_local *local)
> >  	mutex_lock(&local->iflist_mtx);
> >  	list_for_each_entry_safe(sdata, tmp, &local->interfaces, list) {
> >  		list_del(&sdata->list);
> > +		set_bit(SDATA_STATE_LISTDONE, &sdata->state);
> >
> >  		unregister_netdevice(sdata->dev);
> >  	}

BR,
Lizhi
syzbot Nov. 12, 2024, 4:31 a.m. UTC | #3
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
Tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com

Tested on:

commit:         2d5404ca Linux 6.12-rc7
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1608335f980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1503500c6f615d24
dashboard link: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=109ed35f980000

Note: testing is done by a robot and is best-effort only.
Miquel Raynal Nov. 12, 2024, 11:01 a.m. UTC | #4
On 12/11/2024 at 08:21:33 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:

> On Mon, 11 Nov 2024 20:46:57 +0100, Miquel Raynal wrote:
>> On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>> 
>> > syzkaller reported a corrupted list in ieee802154_if_remove. [1]
>> >
>> > Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4
>> > hardware device from the system.
>> >
>> > CPU0					CPU1
>> > ====					====
>> > genl_family_rcv_msg_doit		ieee802154_unregister_hw
>> > ieee802154_del_iface			ieee802154_remove_interfaces
>> > rdev_del_virtual_intf_deprecated	list_del(&sdata->list)
>> > ieee802154_if_remove
>> > list_del_rcu
>> 
>> FYI this is a "duplicate" but with a different approach than:
>> https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077
> No, my patch was the first to fix it, someone else copied my
> patch. Here is my patch:

Ok, so same question as to the other contributor: why not enclose the
remaining list_del_rcu() within mutex protection? Can we avoid the
creation of the LISTDONE state bit?

Thanks,
Miquèl
Lizhi Xu Nov. 12, 2024, 1:41 p.m. UTC | #5
On Tue, 12 Nov 2024 12:01:21 +0100, Miquel Raynal wrote:
>On 12/11/2024 at 08:21:33 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>
>> On Mon, 11 Nov 2024 20:46:57 +0100, Miquel Raynal wrote:
>>> On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>>>
>>> > syzkaller reported a corrupted list in ieee802154_if_remove. [1]
>>> >
>>> > Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4
>>> > hardware device from the system.
>>> >
>>> > CPU0					CPU1
>>> > ====					====
>>> > genl_family_rcv_msg_doit		ieee802154_unregister_hw
>>> > ieee802154_del_iface			ieee802154_remove_interfaces
>>> > rdev_del_virtual_intf_deprecated	list_del(&sdata->list)
>>> > ieee802154_if_remove
>>> > list_del_rcu
>>>
>>> FYI this is a "duplicate" but with a different approach than:
>>> https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077
>> No, my patch was the first to fix it, someone else copied my
>> patch. Here is my patch:
>
>Ok, so same question as to the other contributor, why not enclosing the
>remaining list_del_rcu() within mutex protection? Can we avoid the
>creation of the LISTDONE state bit?
Judging from the list itself, we do not need to rely on the newly added state bit.
The net device has already been unregistered; because of the RCU grace period,
unregistration must have run before ieee802154_if_remove.

Following is my V2 patch, it has been tested and works well.

From: Lizhi Xu <lizhi.xu@windriver.com>
Date: Tue, 12 Nov 2024 20:59:34 +0800
Subject: [PATCH V2] mac802154: check local interfaces before deleting sdata list

syzkaller reported a corrupted list in ieee802154_if_remove. [1]

This happens when an IEEE 802.15.4 network interface is removed after the
IEEE 802.15.4 hardware device has been unregistered from the system:

CPU0					CPU1
====					====
genl_family_rcv_msg_doit		ieee802154_unregister_hw
ieee802154_del_iface			ieee802154_remove_interfaces
rdev_del_virtual_intf_deprecated	list_del(&sdata->list)
ieee802154_if_remove
list_del_rcu

The net device has already been unregistered; because of the RCU grace
period, unregistration must have run before ieee802154_if_remove.

To avoid this issue, check whether local->interfaces is empty before
deleting sdata from the list.

[1]
kernel BUG at lib/list_debug.c:58!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56
Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7
RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246
RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d
R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000
R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0
FS:  0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __list_del_entry_valid include/linux/list.h:124 [inline]
 __list_del_entry include/linux/list.h:215 [inline]
 list_del_rcu include/linux/rculist.h:157 [inline]
 ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687
 rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline]
 ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323
 genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
 genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
 netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
 netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
 sock_sendmsg_nosec net/socket.c:729 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:744
 ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607
 ___sys_sendmsg net/socket.c:2661 [inline]
 __sys_sendmsg+0x292/0x380 net/socket.c:2690
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
V1 -> V2: remove state bit and add a check for local interfaces before
          deleting sdata list

 net/mac802154/iface.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c
index c0e2da5072be..9e4631fade90 100644
--- a/net/mac802154/iface.c
+++ b/net/mac802154/iface.c
@@ -684,6 +684,10 @@ void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata)
 	ASSERT_RTNL();
 
 	mutex_lock(&sdata->local->iflist_mtx);
+	if (list_empty(&sdata->local->interfaces)) {
+		mutex_unlock(&sdata->local->iflist_mtx);
+		return;
+	}
 	list_del_rcu(&sdata->list);
 	mutex_unlock(&sdata->local->iflist_mtx);
Miquel Raynal Nov. 13, 2024, 8:26 a.m. UTC | #6
On 12/11/2024 at 21:41:45 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:

> On Tue, 12 Nov 2024 12:01:21 +0100, Miquel Raynal wrote:
>>On 12/11/2024 at 08:21:33 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>>
>>> On Mon, 11 Nov 2024 20:46:57 +0100, Miquel Raynal wrote:
>>>> On 08/11/2024 at 22:54:20 +08, Lizhi Xu <lizhi.xu@windriver.com> wrote:
>>>>
>>>> > syzkaller reported a corrupted list in ieee802154_if_remove. [1]
>>>> >
>>>> > Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4
>>>> > hardware device from the system.
>>>> >
>>>> > CPU0					CPU1
>>>> > ====					====
>>>> > genl_family_rcv_msg_doit		ieee802154_unregister_hw
>>>> > ieee802154_del_iface			ieee802154_remove_interfaces
>>>> > rdev_del_virtual_intf_deprecated	list_del(&sdata->list)
>>>> > ieee802154_if_remove
>>>> > list_del_rcu
>>>>
>>>> FYI this is a "duplicate" but with a different approach than:
>>>> https://lore.kernel.org/linux-wpan/87v7wtpngj.fsf@bootlin.com/T/#m02cebe86ec0171fc4d3350676bbdd4a7e3827077
>>> No, my patch was the first to fix it, someone else copied my
>>> patch. Here is my patch:
>>
>>Ok, so same question as to the other contributor, why not enclosing the
>>remaining list_del_rcu() within mutex protection? Can we avoid the
>>creation of the LISTDONE state bit?
> From the analysis of the list itself, we can not rely on the newly added state bit. 
> The net device has been unregistered, since the rcu grace period,
> unregistration must be run before ieee802154_if_remove.
>
> Following is my V2 patch, it has been tested and works well.

Please send a proper v2, not an inline v2.

However the new approach looks better to me, so you can add my

Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>

Thanks,
Miquèl
Dmitry Antipov Nov. 13, 2024, 10:29 a.m. UTC | #7
On 11/12/24 4:41 PM, Lizhi Xu wrote:

>   	mutex_lock(&sdata->local->iflist_mtx);
> +	if (list_empty(&sdata->local->interfaces)) {
> +		mutex_unlock(&sdata->local->iflist_mtx);
> +		return;
> +	}
>   	list_del_rcu(&sdata->list);
>   	mutex_unlock(&sdata->local->iflist_mtx);

Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
attempt to connect only a single device. How is this expected to work if there
is more than one device?

Dmitry
Miquel Raynal Nov. 13, 2024, 10:58 a.m. UTC | #8
On 13/11/2024 at 13:29:55 +03, Dmitry Antipov <dmantipov@yandex.ru> wrote:

> On 11/12/24 4:41 PM, Lizhi Xu wrote:
>
>>   	mutex_lock(&sdata->local->iflist_mtx);
>> +	if (list_empty(&sdata->local->interfaces)) {
>> +		mutex_unlock(&sdata->local->iflist_mtx);
>> +		return;
>> +	}
>>   	list_del_rcu(&sdata->list);
>>   	mutex_unlock(&sdata->local->iflist_mtx);
>
> Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
> attempt to connect the only device. How this is expected to work if there are
> more than one device?

Isn't sdata already specific enough? What do you mean by "device"?

Thanks,
Miquèl
Dmitry Antipov Nov. 13, 2024, 12:45 p.m. UTC | #9
On 11/13/24 1:58 PM, Miquel Raynal wrote:

>> Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
>> attempt to connect the only device. How this is expected to work if there are
>> more than one device?
> 
> Isn't sdata already specific enough? What do you mean by "device"?

Well, syzbot's reproducer triggers this issue via the USB Raw Gadget API. IIUC this
is a debugging feature and it is possible to have only one raw gadget device.
So when running syzbot's reproducer, 'list_count_nodes(&sdata->local->interfaces)'
is always <= 1. But how is this expected to work for the >1 case?

Dmitry
Lizhi Xu Nov. 14, 2024, 1 a.m. UTC | #10
On Wed, 13 Nov 2024 13:29:55 +0300, Dmitry Antipov wrote:
> On 11/12/24 4:41 PM, Lizhi Xu wrote:
> 
> >   	mutex_lock(&sdata->local->iflist_mtx);
> > +	if (list_empty(&sdata->local->interfaces)) {
> > +		mutex_unlock(&sdata->local->iflist_mtx);
> > +		return;
> > +	}
> >   	list_del_rcu(&sdata->list);
> >   	mutex_unlock(&sdata->local->iflist_mtx);
> 
> Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
> attempt to connect the only device. How this is expected to work if there are
> more than one device?
There are two locks (rtnl and iflist_mtx) that protect and synchronize
local->interfaces, so there is no need to worry about multiple devices.

Lizhi
Lizhi Xu Nov. 14, 2024, 1:17 a.m. UTC | #11
On Thu, 14 Nov 2024 09:00:25 +0800, Lizhi Xu wrote:
> On Wed, 13 Nov 2024 13:29:55 +0300, Dmitry Antipov wrote:
> > On 11/12/24 4:41 PM, Lizhi Xu wrote:
> >
> > >   	mutex_lock(&sdata->local->iflist_mtx);
> > > +	if (list_empty(&sdata->local->interfaces)) {
> > > +		mutex_unlock(&sdata->local->iflist_mtx);
> > > +		return;
> > > +	}
> > >   	list_del_rcu(&sdata->list);
> > >   	mutex_unlock(&sdata->local->iflist_mtx);
> >
> > Note https://syzkaller.appspot.com/text?tag=ReproC&x=12a9f740580000 makes an
> > attempt to connect the only device. How this is expected to work if there are
> > more than one device?
> There are two locks (rtnl and iflist_mtx) to protection and synchronization
> local->interfaces, so no need to worry about multiple devices.
In other words, this case is a race between removing the 802154 master
and a user sendmsg actively deleting the slave.
Once the master has been removed, there is no need to run the latter to
remove the slave, because all the slave devices have already been deleted
when the master device was removed.
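
The argument above can be illustrated with a small userspace model that registers several interfaces on one device. It is a sketch under stated assumptions only: `struct local_model`, `struct sdata_model`, `model_remove_interfaces()`, `model_if_remove_v2()` and `skipped_after_unregister()` are hypothetical names, and the list helpers are simplified stand-ins for the kernel ones. The rtnl and iflist_mtx locking and RCU are not modeled, and the sketch assumes the only path that unlinks an interface without freeing it is the master removal, which empties the whole list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace model of the kernel's doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *e)
{
	e->next->prev = e->prev;
	e->prev->next = e->next;
	e->next = NULL;	/* stand-in for the kernel's poison values */
	e->prev = NULL;
}

/* Hypothetical stand-ins for ieee802154_local and ieee802154_sub_if_data. */
struct local_model {
	struct list_head interfaces;
};

struct sdata_model {
	struct list_head list;
	struct local_model *local;
};

/* Models ieee802154_remove_interfaces(): unlinks every interface of the device. */
static void model_remove_interfaces(struct local_model *local)
{
	while (!list_empty(&local->interfaces))
		list_del(local->interfaces.next);
}

/* Models the V2 ieee802154_if_remove(); returns true if it unlinked sdata. */
static bool model_if_remove_v2(struct sdata_model *s)
{
	if (list_empty(&s->local->interfaces))
		return false;	/* master already removed, nothing to unlink */
	list_del(&s->list);
	return true;
}

/*
 * Registers n interfaces (capped at 8), removes the master, then attempts a
 * late removal of each interface. Returns how many late removals were skipped.
 */
static int skipped_after_unregister(int n)
{
	struct local_model local;
	struct sdata_model s[8];
	int i, skipped = 0;

	if (n > 8)
		n = 8;
	list_init(&local.interfaces);
	for (i = 0; i < n; i++) {
		s[i].local = &local;
		list_add_tail(&s[i].list, &local.interfaces);
	}
	model_remove_interfaces(&local);
	for (i = 0; i < n; i++)
		if (!model_if_remove_v2(&s[i]))
			skipped++;
	return skipped;
}
```

Under these assumptions every late removal is skipped regardless of how many interfaces were registered, because removing the master leaves local->interfaces empty.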

Lizhi

Patch

diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
index 08dd521a51a5..6771c0569516 100644
--- a/net/mac802154/ieee802154_i.h
+++ b/net/mac802154/ieee802154_i.h
@@ -101,6 +101,7 @@  enum {
 
 enum ieee802154_sdata_state_bits {
 	SDATA_STATE_RUNNING,
+	SDATA_STATE_LISTDONE,
 };
 
 /* Slave interface definition.
diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c
index c0e2da5072be..aed2fc63395d 100644
--- a/net/mac802154/iface.c
+++ b/net/mac802154/iface.c
@@ -683,6 +683,9 @@  void ieee802154_if_remove(struct ieee802154_sub_if_data *sdata)
 {
 	ASSERT_RTNL();
 
+	if (test_bit(SDATA_STATE_LISTDONE, &sdata->state))
+		return;
+
 	mutex_lock(&sdata->local->iflist_mtx);
 	list_del_rcu(&sdata->list);
 	mutex_unlock(&sdata->local->iflist_mtx);
@@ -698,6 +701,7 @@  void ieee802154_remove_interfaces(struct ieee802154_local *local)
 	mutex_lock(&local->iflist_mtx);
 	list_for_each_entry_safe(sdata, tmp, &local->interfaces, list) {
 		list_del(&sdata->list);
+		set_bit(SDATA_STATE_LISTDONE, &sdata->state);
 
 		unregister_netdevice(sdata->dev);
 	}