Message ID | 20221012133412.519394-1-edumazet@google.com (mailing list archive) |
---|---|
State | Accepted |
Commit | ec7eede369fe5b0d085ac51fdbb95184f87bfc6c |
Delegated to: | Netdev Maintainers |
Series | [net] kcm: avoid potential race in kcm_tx_work |
On Wed, 12 Oct 2022 at 15:34, 'Eric Dumazet' via syzkaller
<syzkaller@googlegroups.com> wrote:
>
> syzbot found that kcm_tx_work() could crash [1] in:
>
>	/* Primarily for SOCK_SEQPACKET sockets */
>	if (likely(sk->sk_socket) &&
>	    test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
> <<*>>	clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
>		sk->sk_write_space(sk);
>	}
>
> I think the reason is that another thread might concurrently
> run in kcm_release() and call sock_orphan(sk) while sk is not
> locked. kcm_tx_work() finds sk->sk_socket being NULL.

Does it make sense to add some lockdep annotations to sock_orphan()
and maybe some other similar functions to catch such cases earlier?

> [1]
> BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:86 [inline]
> BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
> BUG: KASAN: null-ptr-deref in kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742
> Write of size 8 at addr 0000000000000008 by task kworker/u4:3/53
>
> CPU: 0 PID: 53 Comm: kworker/u4:3 Not tainted 5.19.0-rc3-next-20220621-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Workqueue: kkcmd kcm_tx_work
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
> kasan_report+0xbe/0x1f0 mm/kasan/report.c:495
> check_region_inline mm/kasan/generic.c:183 [inline]
> kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
> instrument_atomic_write include/linux/instrumented.h:86 [inline]
> clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
> kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742
> process_one_work+0x996/0x1610 kernel/workqueue.c:2289
> worker_thread+0x665/0x1080 kernel/workqueue.c:2436
> kthread+0x2e9/0x3a0 kernel/kthread.c:376
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
> </TASK>
>
> Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Tom Herbert <tom@herbertland.com>
> ---
>  net/kcm/kcmsock.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
> index 1215c863e1c410fa9ba5b9c3706152decfb3ebac..27725464ec08fe2b5f2e86202636cbc895568098 100644
> --- a/net/kcm/kcmsock.c
> +++ b/net/kcm/kcmsock.c
> @@ -1838,10 +1838,10 @@ static int kcm_release(struct socket *sock)
>  	kcm = kcm_sk(sk);
>  	mux = kcm->mux;
>
> +	lock_sock(sk);
>  	sock_orphan(sk);
>  	kfree_skb(kcm->seq_skb);
>
> -	lock_sock(sk);
>  	/* Purge queue under lock to avoid race condition with tx_work trying
>  	 * to act when queue is nonempty. If tx_work runs after this point
>  	 * it will just return.
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
On Wed, Oct 12, 2022 at 7:00 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Wed, 12 Oct 2022 at 15:34, 'Eric Dumazet' via syzkaller
> <syzkaller@googlegroups.com> wrote:
> >
> > syzbot found that kcm_tx_work() could crash [1] in:
> >
> >	/* Primarily for SOCK_SEQPACKET sockets */
> >	if (likely(sk->sk_socket) &&
> >	    test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
> > <<*>>	clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> >		sk->sk_write_space(sk);
> >	}
> >
> > I think the reason is that another thread might concurrently
> > run in kcm_release() and call sock_orphan(sk) while sk is not
> > locked. kcm_tx_work() finds sk->sk_socket being NULL.
>
> Does it make sense to add some lockdep annotations to sock_orphan()
> and maybe some other similar functions to catch such cases earlier?

I thought about that, but this seems net-next material.

> >
> > [1]
> > BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:86 [inline]
> > BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
> > BUG: KASAN: null-ptr-deref in kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742
> > Write of size 8 at addr 0000000000000008 by task kworker/u4:3/53
> >
> > CPU: 0 PID: 53 Comm: kworker/u4:3 Not tainted 5.19.0-rc3-next-20220621-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Workqueue: kkcmd kcm_tx_work
> > Call Trace:
> > <TASK>
> > __dump_stack lib/dump_stack.c:88 [inline]
> > dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
> > kasan_report+0xbe/0x1f0 mm/kasan/report.c:495
> > check_region_inline mm/kasan/generic.c:183 [inline]
> > kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
> > instrument_atomic_write include/linux/instrumented.h:86 [inline]
> > clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
> > kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742
> > process_one_work+0x996/0x1610 kernel/workqueue.c:2289
> > worker_thread+0x665/0x1080 kernel/workqueue.c:2436
> > kthread+0x2e9/0x3a0 kernel/kthread.c:376
> > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
> > </TASK>
> >
> > Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
> > Reported-by: syzbot <syzkaller@googlegroups.com>
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Tom Herbert <tom@herbertland.com>
> > ---
> >  net/kcm/kcmsock.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
> > index 1215c863e1c410fa9ba5b9c3706152decfb3ebac..27725464ec08fe2b5f2e86202636cbc895568098 100644
> > --- a/net/kcm/kcmsock.c
> > +++ b/net/kcm/kcmsock.c
> > @@ -1838,10 +1838,10 @@ static int kcm_release(struct socket *sock)
> >  	kcm = kcm_sk(sk);
> >  	mux = kcm->mux;
> >
> > +	lock_sock(sk);
> >  	sock_orphan(sk);
> >  	kfree_skb(kcm->seq_skb);
> >
> > -	lock_sock(sk);
> >  	/* Purge queue under lock to avoid race condition with tx_work trying
> >  	 * to act when queue is nonempty. If tx_work runs after this point
> >  	 * it will just return.
> > --
> > 2.38.0.rc1.362.ged0d419d3c-goog
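To make the annotation idea discussed in this exchange concrete, here is a minimal userspace sketch in plain C, not kernel code. It assumes a hypothetical ownership flag in place of the socket lock, and every name in it (`chk_sock`, `chk_lock_sock()`, `chk_release_sock()`, `chk_sock_orphan()`, `chk_release_old()`, `chk_release_new()`) is invented for illustration. The orphan helper only counts calls made without the lock; a kernel-side check would presumably warn through lockdep instead, and the thread does not settle whether sock_orphan() could require the lock for all of its callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct chk_sock {
	void *sk_socket;		/* cleared by the orphan helper */
	bool owned;			/* true while the model lock is held */
	unsigned int unlocked_orphans;	/* orphan calls made without the lock */
};

void chk_lock_sock(struct chk_sock *sk)
{
	assert(!sk->owned);	/* single-threaded model: no contention */
	sk->owned = true;
}

void chk_release_sock(struct chk_sock *sk)
{
	assert(sk->owned);
	sk->owned = false;
}

/* Orphan helper with an ownership check. Violations are counted so that
 * the caller can inspect them afterwards.
 */
void chk_sock_orphan(struct chk_sock *sk)
{
	if (!sk->owned)
		sk->unlocked_orphans++;
	sk->sk_socket = NULL;
}

/* Release path in the pre-patch order: orphan first, then lock. */
void chk_release_old(struct chk_sock *sk)
{
	chk_sock_orphan(sk);
	chk_lock_sock(sk);
	/* ... queue purge would happen here ... */
	chk_release_sock(sk);
}

/* Release path in the patched order: lock first, then orphan. */
void chk_release_new(struct chk_sock *sk)
{
	chk_lock_sock(sk);
	chk_sock_orphan(sk);
	/* ... queue purge would happen here ... */
	chk_release_sock(sk);
}
```

In this model the pre-patch call order is flagged once, on the first run and without needing a second thread to hit the window, while the patched order is not flagged.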
Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 12 Oct 2022 13:34:12 +0000 you wrote:
> syzbot found that kcm_tx_work() could crash [1] in:
>
>	/* Primarily for SOCK_SEQPACKET sockets */
>	if (likely(sk->sk_socket) &&
>	    test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
> <<*>>	clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
>		sk->sk_write_space(sk);
>	}
>
> [...]

Here is the summary with links:
  - [net] kcm: avoid potential race in kcm_tx_work
    https://git.kernel.org/netdev/net/c/ec7eede369fe

You are awesome, thank you!
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 1215c863e1c410fa9ba5b9c3706152decfb3ebac..27725464ec08fe2b5f2e86202636cbc895568098 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1838,10 +1838,10 @@ static int kcm_release(struct socket *sock)
 	kcm = kcm_sk(sk);
 	mux = kcm->mux;
 
+	lock_sock(sk);
 	sock_orphan(sk);
 	kfree_skb(kcm->seq_skb);
 
-	lock_sock(sk);
 	/* Purge queue under lock to avoid race condition with tx_work trying
 	 * to act when queue is nonempty. If tx_work runs after this point
 	 * it will just return.
syzbot found that kcm_tx_work() could crash [1] in:

	/* Primarily for SOCK_SEQPACKET sockets */
	if (likely(sk->sk_socket) &&
	    test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
<<*>>	clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
		sk->sk_write_space(sk);
	}

I think the reason is that another thread might concurrently
run in kcm_release() and call sock_orphan(sk) while sk is not
locked. kcm_tx_work() finds sk->sk_socket being NULL.

[1]
BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:86 [inline]
BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
BUG: KASAN: null-ptr-deref in kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742
Write of size 8 at addr 0000000000000008 by task kworker/u4:3/53

CPU: 0 PID: 53 Comm: kworker/u4:3 Not tainted 5.19.0-rc3-next-20220621-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: kkcmd kcm_tx_work
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
kasan_report+0xbe/0x1f0 mm/kasan/report.c:495
check_region_inline mm/kasan/generic.c:183 [inline]
kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
instrument_atomic_write include/linux/instrumented.h:86 [inline]
clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742
process_one_work+0x996/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
</TASK>

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
---
 net/kcm/kcmsock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
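
The window described in the commit message can be modeled deterministically in userspace. The sketch below is plain C rather than kernel code. It assumes that kcm_tx_work() performs its SOCK_NOSPACE check with the socket lock held, which the thread implies but does not show, and all names in it (`model_sock`, `model_socket`, `model_release_orphan()`, `model_tx_work()`, `MODEL_SOCK_NOSPACE`) are hypothetical stand-ins. The racing sock_orphan() is injected by hand between the sk_socket check and the second dereference, so no real threads are involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
#define MODEL_SOCK_NOSPACE 0x1UL

struct model_socket {
	unsigned long flags;
};

struct model_sock {
	struct model_socket *sk_socket;	/* cleared by the orphan step */
	bool locked;			/* stands in for socket lock ownership */
};

enum tx_result { TX_SKIPPED, TX_CLEARED, TX_NULL_DEREF };

/* The release path's orphan step, as another thread would attempt it while
 * the tx work item is running. Returns true if the orphan took effect.
 */
bool model_release_orphan(struct model_sock *sk, bool lock_first)
{
	if (lock_first && sk->locked)
		return false;	/* patched order: would wait for the lock */
	sk->sk_socket = NULL;	/* the one effect of orphaning modeled here */
	return true;
}

/* Tail of the tx work item, with the racing release step injected between
 * the sk_socket check and the second dereference.
 */
enum tx_result model_tx_work(struct model_sock *sk, bool release_locks_first)
{
	enum tx_result res = TX_SKIPPED;

	assert(!sk->locked);
	sk->locked = true;	/* take the socket lock */

	if (sk->sk_socket && (sk->sk_socket->flags & MODEL_SOCK_NOSPACE)) {
		/* Simulated preemption point: release runs on another CPU. */
		model_release_orphan(sk, release_locks_first);

		if (!sk->sk_socket) {
			/* The kernel would fault here in clear_bit(). */
			res = TX_NULL_DEREF;
		} else {
			sk->sk_socket->flags &= ~MODEL_SOCK_NOSPACE;
			res = TX_CLEARED;
		}
	}

	sk->locked = false;	/* drop the socket lock */
	return res;
}
```

Calling `model_tx_work(&sk, false)` on a socket with the flag set models the pre-patch order and returns `TX_NULL_DEREF`. With `true`, the orphan step cannot run until the lock is dropped, so the call returns `TX_CLEARED`.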