Message ID | 20240402134133.2352776-1-edumazet@google.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 7eb322360b0266481e560d1807ee79e0cef5742b |
Delegated to: | Netdev Maintainers |
Series | [net] net/sched: fix lockdep splat in qdisc_tree_reduce_backlog() |
Tue, Apr 02, 2024 at 03:41:33PM CEST, edumazet@google.com wrote:
>qdisc_tree_reduce_backlog() is called with the qdisc lock held,
>not RTNL.
>
>We must use qdisc_lookup_rcu() instead of qdisc_lookup()
>
>syzbot reported:
>
>[...]
>
>Fixes: d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
>Reported-by: syzbot <syzkaller@googlegroups.com>
>Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Jiri Pirko <jiri@nvidia.com>

On Tue, Apr 2, 2024 at 9:41 AM Eric Dumazet <edumazet@google.com> wrote:
>
> qdisc_tree_reduce_backlog() is called with the qdisc lock held,
> not RTNL.
>
> We must use qdisc_lookup_rcu() instead of qdisc_lookup()
>
> syzbot reported:
>
> [...]
>
> Fixes: d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

LGTM.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 2 Apr 2024 13:41:33 +0000 you wrote:
> qdisc_tree_reduce_backlog() is called with the qdisc lock held,
> not RTNL.
>
> We must use qdisc_lookup_rcu() instead of qdisc_lookup()
>
> syzbot reported:
>
> [...]

Here is the summary with links:
  - [net] net/sched: fix lockdep splat in qdisc_tree_reduce_backlog()
    https://git.kernel.org/netdev/net/c/7eb322360b02

You are awesome, thank you!

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 65e05b0c98e461953aa8d98020142f0abe3ad8a7..60239378d43fb7adfe3926f927f3883f09673c16 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -809,7 +809,7 @@ void qdisc_tree_reduce_backlog(struct Qdisc *sch, int n, int len)
 		notify = !sch->q.qlen && !WARN_ON_ONCE(!n &&
 						       !qdisc_is_offloaded);
 		/* TODO: perform the search on a per txq basis */
-		sch = qdisc_lookup(qdisc_dev(sch), TC_H_MAJ(parentid));
+		sch = qdisc_lookup_rcu(qdisc_dev(sch), TC_H_MAJ(parentid));
 		if (sch == NULL) {
 			WARN_ON_ONCE(parentid != TC_H_ROOT);
 			break;

qdisc_tree_reduce_backlog() is called with the qdisc lock held,
not RTNL.

We must use qdisc_lookup_rcu() instead of qdisc_lookup()

syzbot reported:

WARNING: suspicious RCU usage
6.1.74-syzkaller #0 Not tainted
-----------------------------
net/sched/sch_api.c:305 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
3 locks held by udevd/1142:
 #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
 #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
 #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: net_tx_action+0x64a/0x970 net/core/dev.c:5282
 #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:350 [inline]
 #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: net_tx_action+0x754/0x970 net/core/dev.c:5297
 #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
 #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
 #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: qdisc_tree_reduce_backlog+0x84/0x580 net/sched/sch_api.c:792

stack backtrace:
CPU: 1 PID: 1142 Comm: udevd Not tainted 6.1.74-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
 <TASK>
 [<ffffffff85b85f14>] __dump_stack lib/dump_stack.c:88 [inline]
 [<ffffffff85b85f14>] dump_stack_lvl+0x1b1/0x28f lib/dump_stack.c:106
 [<ffffffff85b86007>] dump_stack+0x15/0x1e lib/dump_stack.c:113
 [<ffffffff81802299>] lockdep_rcu_suspicious+0x1b9/0x260 kernel/locking/lockdep.c:6592
 [<ffffffff84f0054c>] qdisc_lookup+0xac/0x6f0 net/sched/sch_api.c:305
 [<ffffffff84f037c3>] qdisc_tree_reduce_backlog+0x243/0x580 net/sched/sch_api.c:811
 [<ffffffff84f5b78c>] pfifo_tail_enqueue+0x32c/0x4b0 net/sched/sch_fifo.c:51
 [<ffffffff84fbcf63>] qdisc_enqueue include/net/sch_generic.h:833 [inline]
 [<ffffffff84fbcf63>] netem_dequeue+0xeb3/0x15d0 net/sched/sch_netem.c:723
 [<ffffffff84eecab9>] dequeue_skb net/sched/sch_generic.c:292 [inline]
 [<ffffffff84eecab9>] qdisc_restart net/sched/sch_generic.c:397 [inline]
 [<ffffffff84eecab9>] __qdisc_run+0x249/0x1e60 net/sched/sch_generic.c:415
 [<ffffffff84d7aa96>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
 [<ffffffff84d85d29>] net_tx_action+0x7c9/0x970 net/core/dev.c:5313
 [<ffffffff85e002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:616
 [<ffffffff81568bca>] invoke_softirq kernel/softirq.c:447 [inline]
 [<ffffffff81568bca>] __irq_exit_rcu+0xca/0x230 kernel/softirq.c:700
 [<ffffffff81568ae9>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:712
 [<ffffffff85b89f52>] sysvec_apic_timer_interrupt+0x42/0x90 arch/x86/kernel/apic/apic.c:1107
 [<ffffffff85c00ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:656

Fixes: d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/sched/sch_api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
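
For readers less familiar with this class of warning, here is a rough userspace sketch of why the lookup variant matters. It is a toy model and not kernel code: every model_* name is hypothetical, the two flags only stand in for what lockdep tracks, and the assumption that qdisc_lookup() expects RTNL is inferred from the changelog and the rcu_dereference_protected() report above.

```c
/* Toy userspace model, NOT kernel code. All model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct model_qdisc {
	unsigned int handle;
};

/* Stand-ins for the context that lockdep tracks in the kernel. */
static bool model_rtnl_held;	/* would be true on the control path */
static int model_rcu_depth;	/* nesting of RCU read-side sections */
static int model_splats;	/* count of "suspicious RCU usage" reports */

static struct model_qdisc model_table[] = { { 0x10000u }, { 0x20000u } };

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);
	model_rcu_depth--;
}

/* Plain table walk shared by both lookup variants. */
static struct model_qdisc *model_find(unsigned int handle)
{
	size_t i;

	for (i = 0; i < sizeof(model_table) / sizeof(model_table[0]); i++)
		if (model_table[i].handle == handle)
			return &model_table[i];
	return NULL;
}

/* Models qdisc_lookup(): assumed to require RTNL (see lead-in). */
static struct model_qdisc *model_lookup(unsigned int handle)
{
	if (!model_rtnl_held) {
		model_splats++;
		fprintf(stderr, "model: suspicious protected dereference\n");
	}
	return model_find(handle);
}

/* Models qdisc_lookup_rcu(): only needs an RCU read-side section. */
static struct model_qdisc *model_lookup_rcu(unsigned int handle)
{
	if (model_rcu_depth == 0) {
		model_splats++;
		fprintf(stderr, "model: lookup outside RCU read section\n");
	}
	return model_find(handle);
}
```

In this model, calling model_lookup() between model_rcu_read_lock() and model_rcu_read_unlock() with model_rtnl_held left false still finds the entry but counts one report, which mirrors the splat: the lookup works, yet the protection it claims is not the one actually held. Calling model_lookup_rcu() in the same context counts none, which is the situation the one-line change restores.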