Message ID | 20210621144417.694367-1-eric.dumazet@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 85e8b032d6ebb0f698a34dd22c2f13443d905888 |
Delegated to: | Netdev Maintainers |
Series | [net] vxlan: add missing rcu_read_lock() in neigh_reduce() |

Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | fail | 1 blamed authors not CCed: amwang@redhat.com; 1 maintainers not CCed: amwang@redhat.com |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 1 this patch: 1 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | warning | WARNING: Possible repeated word: 'Google' |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 1 this patch: 1 |
netdev/header_inline | success | Link |

On 6/21/21 4:44 PM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> syzbot complained in neigh_reduce(), because rcu_read_lock_bh()
> is treated differently than rcu_read_lock()
>
> WARNING: suspicious RCU usage
> 5.13.0-rc6-syzkaller #0 Not tainted
> -----------------------------
> include/net/addrconf.h:313 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> 3 locks held by kworker/0:0/5:
>  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
>  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
>  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:617 [inline]
>  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
>  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x871/0x1600 kernel/workqueue.c:2247
>  #1: ffffc90000ca7da8 ((work_completion)(&port->wq)){+.+.}-{0:0}, at: process_one_work+0x8a5/0x1600 kernel/workqueue.c:2251
>  #2: ffffffff8bf795c0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1da/0x3130 net/core/dev.c:4180
>
> stack backtrace:
> CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.13.0-rc6-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Workqueue: events ipvlan_process_multicast
> Call Trace:
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
>  __in6_dev_get include/net/addrconf.h:313 [inline]
>  __in6_dev_get include/net/addrconf.h:311 [inline]
>  neigh_reduce drivers/net/vxlan.c:2167 [inline]
>  vxlan_xmit+0x34d5/0x4c30 drivers/net/vxlan.c:2919
>  __netdev_start_xmit include/linux/netdevice.h:4944 [inline]
>  netdev_start_xmit include/linux/netdevice.h:4958 [inline]
>  xmit_one net/core/dev.c:3654 [inline]
>  dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3670
>  __dev_queue_xmit+0x2133/0x3130 net/core/dev.c:4246
>  ipvlan_process_multicast+0xa99/0xd70 drivers/net/ipvlan/ipvlan_core.c:287
>  process_one_work+0x98d/0x1600 kernel/workqueue.c:2276
>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2422
>  kthread+0x3b1/0x4a0 kernel/kthread.c:313
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>
> Fixes: f564f45c4518 ("vxlan: add ipv6 proxy support")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>

[ +Paul/Toke ]

Only a side comment on this fix given the series under [0] where we remove the
rcu_read_lock() given covered by rcu_read_lock_bh():

  [...] It seems [1] that back in the early days of XDP, local_bh_disable() did
  not provide RCU protection, which is why the rcu_read_lock() calls were added
  to drivers in the first place. But according to Paul [2], in recent kernels
  a local_bh_disable()/local_bh_enable() pair functions as one big RCU
  read-side section, so no further protection is needed. This even applies to
  -rt kernels, which has an explicit rcu_read_lock() in place as part of the
  local_bh_disable() [3]. [...]

Paul/Toke, with regards to related questions under [1], I presume there should
additionally be a fixup for lockdep /in general/ to silence warning like these ?

  [0] https://lore.kernel.org/bpf/20210617212748.32456-1-toke@redhat.com/
  [1] https://lore.kernel.org/bpf/1881ecbe-06ec-6b0a-836c-033c31fabef4@iogearbox.net/

> ---
>  drivers/net/vxlan.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index 02a14f1b938ad50fc28044b7670ba5f6bf924345..5a8df5a195cb5700c45b4785355ef8ed84866052 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -2164,6 +2164,7 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb, __be32 vni)
>  	struct neighbour *n;
>  	struct nd_msg *msg;
>
> +	rcu_read_lock();
>  	in6_dev = __in6_dev_get(dev);
>  	if (!in6_dev)
>  		goto out;
> @@ -2215,6 +2216,7 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb, __be32 vni)
>  	}
>
>  out:
> +	rcu_read_unlock();
>  	consume_skb(skb);
>  	return NETDEV_TX_OK;
>  }
>

On Mon, Jun 21, 2021 at 06:04:46PM +0200, Daniel Borkmann wrote:
> On 6/21/21 4:44 PM, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > syzbot complained in neigh_reduce(), because rcu_read_lock_bh()
> > is treated differently than rcu_read_lock()
> >
> > WARNING: suspicious RCU usage
> > 5.13.0-rc6-syzkaller #0 Not tainted
> > -----------------------------
> > include/net/addrconf.h:313 suspicious rcu_dereference_check() usage!
> >
> > other info that might help us debug this:
> >
> > rcu_scheduler_active = 2, debug_locks = 1

This "debug_locks = 1" often indicates that there was some other lockdep
complaint that happened at about the same time.

> > 3 locks held by kworker/0:0/5:
> >  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
> >  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
> >  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
> >  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:617 [inline]
> >  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
> >  #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x871/0x1600 kernel/workqueue.c:2247
> >  #1: ffffc90000ca7da8 ((work_completion)(&port->wq)){+.+.}-{0:0}, at: process_one_work+0x8a5/0x1600 kernel/workqueue.c:2251
> >  #2: ffffffff8bf795c0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1da/0x3130 net/core/dev.c:4180
> >
> > stack backtrace:
> > CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.13.0-rc6-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Workqueue: events ipvlan_process_multicast
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:79 [inline]
> >  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
> >  __in6_dev_get include/net/addrconf.h:313 [inline]
> >  __in6_dev_get include/net/addrconf.h:311 [inline]
> >  neigh_reduce drivers/net/vxlan.c:2167 [inline]
> >  vxlan_xmit+0x34d5/0x4c30 drivers/net/vxlan.c:2919
> >  __netdev_start_xmit include/linux/netdevice.h:4944 [inline]
> >  netdev_start_xmit include/linux/netdevice.h:4958 [inline]
> >  xmit_one net/core/dev.c:3654 [inline]
> >  dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3670
> >  __dev_queue_xmit+0x2133/0x3130 net/core/dev.c:4246
> >  ipvlan_process_multicast+0xa99/0xd70 drivers/net/ipvlan/ipvlan_core.c:287
> >  process_one_work+0x98d/0x1600 kernel/workqueue.c:2276
> >  worker_thread+0x64c/0x1120 kernel/workqueue.c:2422
> >  kthread+0x3b1/0x4a0 kernel/kthread.c:313
> >  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> >
> > Fixes: f564f45c4518 ("vxlan: add ipv6 proxy support")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Reported-by: syzbot <syzkaller@googlegroups.com>
>
> [ +Paul/Toke ]
>
> Only a side comment on this fix given the series under [0] where we remove the
> rcu_read_lock() given covered by rcu_read_lock_bh():
>
>   [...] It seems [1] that back in the early days of XDP, local_bh_disable() did
>   not provide RCU protection, which is why the rcu_read_lock() calls were added
>   to drivers in the first place. But according to Paul [2], in recent kernels
>   a local_bh_disable()/local_bh_enable() pair functions as one big RCU
>   read-side section, so no further protection is needed. This even applies to
>   -rt kernels, which has an explicit rcu_read_lock() in place as part of the
>   local_bh_disable() [3]. [...]
>
> Paul/Toke, with regards to related questions under [1], I presume there should
> additionally be a fixup for lockdep /in general/ to silence warning like these ?

Do these commits help?

1feb2cc8db48 ("lockdep: Explicitly flag likely false-positive report")
3066820034b5 ("rcu: Reject RCU_LOCKDEP_WARN() false positives")

With a bit of luck, these will go into the upcoming merge window.

							Thanx, Paul

>   [0] https://lore.kernel.org/bpf/20210617212748.32456-1-toke@redhat.com/
>   [1] https://lore.kernel.org/bpf/1881ecbe-06ec-6b0a-836c-033c31fabef4@iogearbox.net/
>
> > ---
> >  drivers/net/vxlan.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> > index 02a14f1b938ad50fc28044b7670ba5f6bf924345..5a8df5a195cb5700c45b4785355ef8ed84866052 100644
> > --- a/drivers/net/vxlan.c
> > +++ b/drivers/net/vxlan.c
> > @@ -2164,6 +2164,7 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb, __be32 vni)
> >  	struct neighbour *n;
> >  	struct nd_msg *msg;
> >
> > +	rcu_read_lock();
> >  	in6_dev = __in6_dev_get(dev);
> >  	if (!in6_dev)
> >  		goto out;
> > @@ -2215,6 +2216,7 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb, __be32 vni)
> >  	}
> >
> >  out:
> > +	rcu_read_unlock();
> >  	consume_skb(skb);
> >  	return NETDEV_TX_OK;
> >  }
> >
>

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Mon, 21 Jun 2021 07:44:17 -0700 you wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> syzbot complained in neigh_reduce(), because rcu_read_lock_bh()
> is treated differently than rcu_read_lock()
>
> WARNING: suspicious RCU usage
> 5.13.0-rc6-syzkaller #0 Not tainted
>
> [...]

Here is the summary with links:
  - [net] vxlan: add missing rcu_read_lock() in neigh_reduce()
    https://git.kernel.org/netdev/net/c/85e8b032d6eb

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 02a14f1b938ad50fc28044b7670ba5f6bf924345..5a8df5a195cb5700c45b4785355ef8ed84866052 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2164,6 +2164,7 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb, __be32 vni)
 	struct neighbour *n;
 	struct nd_msg *msg;

+	rcu_read_lock();
 	in6_dev = __in6_dev_get(dev);
 	if (!in6_dev)
 		goto out;
@@ -2215,6 +2216,7 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb, __be32 vni)
 	}

 out:
+	rcu_read_unlock();
 	consume_skb(skb);
 	return NETDEV_TX_OK;
 }
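
Editor's sketch (not part of the thread): the splat above can be pictured with a small userspace model. It assumes, as the commit message says, that the debug check behind __in6_dev_get() is satisfied by rcu_read_lock() (or RTNL) but not by rcu_read_lock_bh(), which is all the transmit path holds. Every toy_* name below is hypothetical and exists only for illustration; none of it is a kernel API, and the real lockdep bookkeeping is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for per-flavor read-side tracking. NOT a kernel structure. */
struct toy_lockdep {
	int rcu_depth;    /* nesting depth of toy_rcu_read_lock() */
	int rcu_bh_depth; /* nesting depth of toy_rcu_read_lock_bh() */
	bool rtnl_held;   /* whether the (modelled) RTNL mutex is held */
};

static void toy_rcu_read_lock(struct toy_lockdep *ld)
{
	ld->rcu_depth++;
}

static void toy_rcu_read_unlock(struct toy_lockdep *ld)
{
	assert(ld->rcu_depth > 0);
	ld->rcu_depth--;
}

static void toy_rcu_read_lock_bh(struct toy_lockdep *ld)
{
	ld->rcu_bh_depth++;
}

static void toy_rcu_read_unlock_bh(struct toy_lockdep *ld)
{
	assert(ld->rcu_bh_depth > 0);
	ld->rcu_bh_depth--;
}

/*
 * Assumed shape of the debug check: only the plain flavor or RTNL
 * satisfies it. The _bh depth is deliberately ignored, which is what
 * "treated differently" means in this model.
 */
static bool toy_deref_check_ok(const struct toy_lockdep *ld)
{
	return ld->rcu_depth > 0 || ld->rtnl_held;
}

/* Models neigh_reduce(): with_fix adds the lock/unlock pair from the patch. */
static bool toy_neigh_reduce(struct toy_lockdep *ld, bool with_fix)
{
	bool ok;

	if (with_fix)
		toy_rcu_read_lock(ld);
	ok = toy_deref_check_ok(ld); /* stands in for __in6_dev_get(dev) */
	if (with_fix)
		toy_rcu_read_unlock(ld);
	return ok;
}

/* Models the xmit path, where the caller holds only the _bh flavor. */
static bool toy_xmit(bool with_fix)
{
	struct toy_lockdep ld = {0};
	bool ok;

	toy_rcu_read_lock_bh(&ld);
	ok = toy_neigh_reduce(&ld, with_fix);
	toy_rcu_read_unlock_bh(&ld);
	return ok;
}
```

In this model toy_xmit(false) reports a failed check and toy_xmit(true) passes, mirroring the before/after of the two-line patch. It says nothing about whether the unpatched code was actually unsafe, which is the separate question raised in the replies.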