Message ID | 20240515132339.3346267-1-edumazet@google.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Series | [net] netfilter: nfnetlink_queue: acquire rcu_read_lock() in instance_destroy_rcu() |
Eric Dumazet <edumazet@google.com> wrote:
> diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
> index 00f4bd21c59b419e96794127693c21ccb05e45b0..f1c31757e4969e8f975c7a1ebbc3b96148ec9724 100644
> --- a/net/netfilter/nfnetlink_queue.c
> +++ b/net/netfilter/nfnetlink_queue.c
> @@ -169,7 +169,9 @@ instance_destroy_rcu(struct rcu_head *head)
>  	struct nfqnl_instance *inst = container_of(head, struct nfqnl_instance,
>  						   rcu);
>
> +	rcu_read_lock();
>  	nfqnl_flush(inst, NULL, 0);
> +	rcu_read_unlock();

That works too. I sent a different patch for the same issue yesterday:

https://patchwork.ozlabs.org/project/netfilter-devel/patch/20240514103133.2784-1-fw@strlen.de/

If you prefer Erics patch thats absolutely fine with me, I'll rebase in
that case to keep the selftest around.

On Wed, May 15, 2024 at 3:27 PM Florian Westphal <fw@strlen.de> wrote:
>
> Eric Dumazet <edumazet@google.com> wrote:
> > diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
> > index 00f4bd21c59b419e96794127693c21ccb05e45b0..f1c31757e4969e8f975c7a1ebbc3b96148ec9724 100644
> > --- a/net/netfilter/nfnetlink_queue.c
> > +++ b/net/netfilter/nfnetlink_queue.c
> > @@ -169,7 +169,9 @@ instance_destroy_rcu(struct rcu_head *head)
> >  	struct nfqnl_instance *inst = container_of(head, struct nfqnl_instance,
> >  						   rcu);
> >
> > +	rcu_read_lock();
> >  	nfqnl_flush(inst, NULL, 0);
> > +	rcu_read_unlock();
>
> That works too. I sent a different patch for the same issue yesterday:
>
> https://patchwork.ozlabs.org/project/netfilter-devel/patch/20240514103133.2784-1-fw@strlen.de/
>
> If you prefer Erics patch thats absolutely fine with me, I'll rebase in
> that case to keep the selftest around.

I missed your patch, otherwise I would have done nothing ;)

I saw the recent changes about nf_reinject() and tried to have a patch
that would be easily backported without conflicts.

Do you think the splat is caused by recent changes, or is it simply
syzbot getting smarter ?

Thanks !

On Wed, May 15, 2024 at 3:39 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, May 15, 2024 at 3:27 PM Florian Westphal <fw@strlen.de> wrote:
> >
> > Eric Dumazet <edumazet@google.com> wrote:
> > > diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
> > > index 00f4bd21c59b419e96794127693c21ccb05e45b0..f1c31757e4969e8f975c7a1ebbc3b96148ec9724 100644
> > > --- a/net/netfilter/nfnetlink_queue.c
> > > +++ b/net/netfilter/nfnetlink_queue.c
> > > @@ -169,7 +169,9 @@ instance_destroy_rcu(struct rcu_head *head)
> > >  	struct nfqnl_instance *inst = container_of(head, struct nfqnl_instance,
> > >  						   rcu);
> > >
> > > +	rcu_read_lock();
> > >  	nfqnl_flush(inst, NULL, 0);
> > > +	rcu_read_unlock();
> >
> > That works too. I sent a different patch for the same issue yesterday:
> >
> > https://patchwork.ozlabs.org/project/netfilter-devel/patch/20240514103133.2784-1-fw@strlen.de/
> >
> > If you prefer Erics patch thats absolutely fine with me, I'll rebase in
> > that case to keep the selftest around.
>
> I missed your patch, otherwise I would have done nothing ;)
>
> I saw the recent changes about nf_reinject() and tried to have a patch
> that would be easily backported without conflicts.
>
> Do you think the splat is caused by recent changes, or is it simply
> syzbot getting smarter ?

(It took me a fair amount of time to find a Fixes: tag, this is why I
am asking)

Eric Dumazet <edumazet@google.com> wrote:
> > If you prefer Erics patch thats absolutely fine with me, I'll rebase in
> > that case to keep the selftest around.
>
> I missed your patch, otherwise I would have done nothing ;)
>
> I saw the recent changes about nf_reinject() and tried to have a patch
> that would be easily backported without conflicts.

Right, makes sense from that pov.

I think it's fine to apply the patch in this case, I'll follow up later.

Thus:
Acked-by: Florian Westphal <fw@strlen.de>

> Do you think the splat is caused by recent changes, or is it simply
> syzbot getting smarter ?

It's an old bug, AFAICS your Fixes tag is correct.

1. Userspace prog needs to subscribe to queue x
2. iptables/nftables rule needs to send packets to queue x
3. actual packets that match that have to be sent
4. Userspace program needs to exit while at least one packet is queued

Amazing that syzbot managed to hit all 4 checkboxes :)

diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index 00f4bd21c59b419e96794127693c21ccb05e45b0..f1c31757e4969e8f975c7a1ebbc3b96148ec9724 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -169,7 +169,9 @@ instance_destroy_rcu(struct rcu_head *head)
 	struct nfqnl_instance *inst = container_of(head, struct nfqnl_instance,
 						   rcu);
 
+	rcu_read_lock();
 	nfqnl_flush(inst, NULL, 0);
+	rcu_read_unlock();
 	kfree(inst);
 	module_put(THIS_MODULE);
 }

syzbot reported that nf_reinject() could be called without rcu_read_lock() :

WARNING: suspicious RCU usage
6.9.0-rc7-syzkaller-02060-g5c1672705a1a #0 Not tainted

net/netfilter/nfnetlink_queue.c:263 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by syz-executor.4/13427:
  #0: ffffffff8e334f60 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline]
  #0: ffffffff8e334f60 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2190 [inline]
  #0: ffffffff8e334f60 (rcu_callback){....}-{0:0}, at: rcu_core+0xa86/0x1830 kernel/rcu/tree.c:2471
  #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
  #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: nfqnl_flush net/netfilter/nfnetlink_queue.c:405 [inline]
  #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: instance_destroy_rcu+0x30/0x220 net/netfilter/nfnetlink_queue.c:172

stack backtrace:
CPU: 0 PID: 13427 Comm: syz-executor.4 Not tainted 6.9.0-rc7-syzkaller-02060-g5c1672705a1a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
Call Trace:
 <IRQ>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
  lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712
  nf_reinject net/netfilter/nfnetlink_queue.c:323 [inline]
  nfqnl_reinject+0x6ec/0x1120 net/netfilter/nfnetlink_queue.c:397
  nfqnl_flush net/netfilter/nfnetlink_queue.c:410 [inline]
  instance_destroy_rcu+0x1ae/0x220 net/netfilter/nfnetlink_queue.c:172
  rcu_do_batch kernel/rcu/tree.c:2196 [inline]
  rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2471
  handle_softirqs+0x2d6/0x990 kernel/softirq.c:554
  __do_softirq kernel/softirq.c:588 [inline]
  invoke_softirq kernel/softirq.c:428 [inline]
  __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
  irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
  instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
  sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
 </IRQ>
 <TASK>

Fixes: 9872bec773c2 ("[NETFILTER]: nfnetlink: use RCU for queue instances hash")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/netfilter/nfnetlink_queue.c | 2 ++
 1 file changed, 2 insertions(+)
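
For readers who want to see why the splat needs a queued packet at teardown, and why wrapping the flush is enough, here is a small userspace sketch in C. It is a toy model only and not kernel code: every toy_* name is hypothetical, the "read lock" is just a nesting counter, and the check counts one warning per reinjected packet, which is a simplification and not a claim about how lockdep reports in a real kernel. It assumes, as the backtrace suggests, that the RCU dereference is reached only when the flush actually reinjects a packet.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model of the bug discussed in this thread.
 * All toy_* names are hypothetical stand-ins, not kernel APIs.
 */

int toy_rcu_nesting;   /* depth of simulated read-side critical sections */
int toy_rcu_warnings;  /* simulated "suspicious RCU usage" reports */

void toy_rcu_read_lock(void)
{
	toy_rcu_nesting++;
}

void toy_rcu_read_unlock(void)
{
	assert(toy_rcu_nesting > 0);  /* unbalanced unlock is a bug */
	toy_rcu_nesting--;
}

/* Stand-in for rcu_dereference_check(): complain outside a read section. */
void toy_rcu_dereference_check(void)
{
	if (toy_rcu_nesting == 0)
		toy_rcu_warnings++;
}

struct toy_instance {
	int queued;  /* packets still waiting for a userspace verdict */
};

/* Stand-in for nf_reinject(): touches RCU-protected data once per packet. */
void toy_reinject(void)
{
	toy_rcu_dereference_check();
}

/* Stand-in for nfqnl_flush(): reinjects every packet still queued. */
void toy_flush(struct toy_instance *inst)
{
	while (inst->queued > 0) {
		inst->queued--;
		toy_reinject();
	}
}

/* Models the callback before the patch: flush with no read-side section. */
void toy_destroy_unpatched(struct toy_instance *inst)
{
	toy_flush(inst);
}

/* Models the callback after the patch: flush inside a read-side section. */
void toy_destroy_patched(struct toy_instance *inst)
{
	toy_rcu_read_lock();
	toy_flush(inst);
	toy_rcu_read_unlock();
}

/* Destroy an instance holding `queued` packets; return warnings seen. */
int toy_run(int queued, bool patched)
{
	struct toy_instance inst = { .queued = queued };

	toy_rcu_warnings = 0;
	if (patched)
		toy_destroy_patched(&inst);
	else
		toy_destroy_unpatched(&inst);
	return toy_rcu_warnings;
}
```

In this model, toy_run(0, false) reports nothing because an empty queue never reaches the check, which matches the observation that the userspace program has to exit while at least one packet is queued. toy_run(1, false) reports a warning, and toy_run(1, true) reports none.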