Message ID | 20240518000148.27947-1-kuniyu@amazon.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v1,net] af_unix: Annotate data-race around unix_sk(sk)->addr. | expand |
On Sat, 2024-05-18 at 09:01 +0900, Kuniyuki Iwashima wrote: > Once unix_sk(sk)->addr is assigned under net->unx.table.locks, > *(unix_sk(sk)->addr) and unix_sk(sk)->path are fully set up, and > unix_sk(sk)->addr is never changed. > > unix_getname() and unix_copy_addr() access the two fields locklessly, > and commit ae3b564179bf ("missing barriers in some of unix_sock ->addr > and ->path accesses") added smp_store_release() and smp_load_acquire() > pairs. > > In other functions, we still read unix_sk(sk)->addr locklessly to check > if the socket is bound, and KCSAN complains about it. [0] > > Given these functions have no dependency for *(unix_sk(sk)->addr) and > unix_sk(sk)->path, READ_ONCE() is enough to annotate the data-race. > > [0]: > BUG: KCSAN: data-race in unix_bind / unix_listen > > write (marked) to 0xffff88805f8d1840 of 8 bytes by task 13723 on cpu 0: > __unix_set_addr_hash net/unix/af_unix.c:329 [inline] > unix_bind_bsd net/unix/af_unix.c:1241 [inline] > unix_bind+0x881/0x1000 net/unix/af_unix.c:1319 > __sys_bind+0x194/0x1e0 net/socket.c:1847 > __do_sys_bind net/socket.c:1858 [inline] > __se_sys_bind net/socket.c:1856 [inline] > __x64_sys_bind+0x40/0x50 net/socket.c:1856 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x46/0x4e > > read to 0xffff88805f8d1840 of 8 bytes by task 13724 on cpu 1: > unix_listen+0x72/0x180 net/unix/af_unix.c:734 > __sys_listen+0xdc/0x160 net/socket.c:1881 > __do_sys_listen net/socket.c:1890 [inline] > __se_sys_listen net/socket.c:1888 [inline] > __x64_sys_listen+0x2e/0x40 net/socket.c:1888 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x46/0x4e > > value changed: 0x0000000000000000 -> 0xffff88807b5b1b40 > > Reported by Kernel Concurrency Sanitizer on: > CPU: 1 PID: 13724 Comm: syz-executor.4 Not tainted 6.8.0-12822-gcd51db110a7e #12 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > Reported-by: syzkaller <syzkaller@googlegroups.com> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> > --- > net/unix/af_unix.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > index ca101690e740..92a88ac070ca 100644 > --- a/net/unix/af_unix.c > +++ b/net/unix/af_unix.c > @@ -731,7 +731,7 @@ static int unix_listen(struct socket *sock, int backlog) > if (sock->type != SOCK_STREAM && sock->type != SOCK_SEQPACKET) > goto out; /* Only stream/seqpacket sockets accept */ > err = -EINVAL; > - if (!u->addr) > + if (!READ_ONCE(u->addr)) > goto out; /* No listens on an unbound socket */ > unix_state_lock(sk); > if (sk->sk_state != TCP_CLOSE && sk->sk_state != TCP_LISTEN) > @@ -1369,7 +1369,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > test_bit(SOCK_PASSPIDFD, &sock->flags)) && > - !unix_sk(sk)->addr) { > + !READ_ONCE(unix_sk(sk)->addr)) { > err = unix_autobind(sk); > if (err) > goto out; > @@ -1481,7 +1481,8 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, > goto out; > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { > + test_bit(SOCK_PASSPIDFD, &sock->flags)) && > + !READ_ONCE(u->addr)) { > err = unix_autobind(sk); > if (err) > goto out; > @@ -1951,7 +1952,8 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, > } > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { > + test_bit(SOCK_PASSPIDFD, &sock->flags)) && > + !READ_ONCE(u->addr)) { > err = unix_autobind(sk); > if (err) > goto out; There are a few other places where ->addr is accessed lockless (under the bindlock, but prior to acquiring the table spinlock, e.g. unix_bind_* and unix_autobind. The latter is suspect as it's called right after the touched code. Why are such spots not relevant here? Also the newu->addr/otheru->addr handling in unix_stream_connect() looks suspect. Thanks, Paolo
From: Paolo Abeni <pabeni@redhat.com> Date: Wed, 22 May 2024 10:52:19 +0200 > On Sat, 2024-05-18 at 09:01 +0900, Kuniyuki Iwashima wrote: > > Once unix_sk(sk)->addr is assigned under net->unx.table.locks, > > *(unix_sk(sk)->addr) and unix_sk(sk)->path are fully set up, and > > unix_sk(sk)->addr is never changed. > > > > unix_getname() and unix_copy_addr() access the two fields locklessly, > > and commit ae3b564179bf ("missing barriers in some of unix_sock ->addr > > and ->path accesses") added smp_store_release() and smp_load_acquire() > > pairs. > > > > In other functions, we still read unix_sk(sk)->addr locklessly to check > > if the socket is bound, and KCSAN complains about it. [0] > > > > Given these functions have no dependency for *(unix_sk(sk)->addr) and > > unix_sk(sk)->path, READ_ONCE() is enough to annotate the data-race. > > > > [0]: > > BUG: KCSAN: data-race in unix_bind / unix_listen > > > > write (marked) to 0xffff88805f8d1840 of 8 bytes by task 13723 on cpu 0: > > __unix_set_addr_hash net/unix/af_unix.c:329 [inline] > > unix_bind_bsd net/unix/af_unix.c:1241 [inline] > > unix_bind+0x881/0x1000 net/unix/af_unix.c:1319 > > __sys_bind+0x194/0x1e0 net/socket.c:1847 > > __do_sys_bind net/socket.c:1858 [inline] > > __se_sys_bind net/socket.c:1856 [inline] > > __x64_sys_bind+0x40/0x50 net/socket.c:1856 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x46/0x4e > > > > read to 0xffff88805f8d1840 of 8 bytes by task 13724 on cpu 1: > > unix_listen+0x72/0x180 net/unix/af_unix.c:734 > > __sys_listen+0xdc/0x160 net/socket.c:1881 > > __do_sys_listen net/socket.c:1890 [inline] > > __se_sys_listen net/socket.c:1888 [inline] > > __x64_sys_listen+0x2e/0x40 net/socket.c:1888 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x46/0x4e > > > > value changed: 0x0000000000000000 -> 0xffff88807b5b1b40 > > > > Reported by Kernel Concurrency Sanitizer on: > > CPU: 1 PID: 13724 Comm: syz-executor.4 Not tainted 6.8.0-12822-gcd51db110a7e #12 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > Reported-by: syzkaller <syzkaller@googlegroups.com> > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> > > --- > > net/unix/af_unix.c | 10 ++++++---- > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > > index ca101690e740..92a88ac070ca 100644 > > --- a/net/unix/af_unix.c > > +++ b/net/unix/af_unix.c > > @@ -731,7 +731,7 @@ static int unix_listen(struct socket *sock, int backlog) > > if (sock->type != SOCK_STREAM && sock->type != SOCK_SEQPACKET) > > goto out; /* Only stream/seqpacket sockets accept */ > > err = -EINVAL; > > - if (!u->addr) > > + if (!READ_ONCE(u->addr)) > > goto out; /* No listens on an unbound socket */ > > unix_state_lock(sk); > > if (sk->sk_state != TCP_CLOSE && sk->sk_state != TCP_LISTEN) > > @@ -1369,7 +1369,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, > > > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > > test_bit(SOCK_PASSPIDFD, &sock->flags)) && > > - !unix_sk(sk)->addr) { > > + !READ_ONCE(unix_sk(sk)->addr)) { > > err = unix_autobind(sk); > > if (err) > > goto out; > > @@ -1481,7 +1481,8 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, > > goto out; > > > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > > - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { > > + test_bit(SOCK_PASSPIDFD, &sock->flags)) && > > + !READ_ONCE(u->addr)) { > > err = unix_autobind(sk); > > if (err) > > goto out; > > @@ -1951,7 +1952,8 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, > > } > > > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > > - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { > > + test_bit(SOCK_PASSPIDFD, &sock->flags)) && > > + !READ_ONCE(u->addr)) { > > err = unix_autobind(sk); > > if (err) > > goto out; > > There are a few other places where ->addr is accessed lockless (under > the bindlock, but prior to acquiring the table spinlock, e.g. > unix_bind_* and unix_autobind. The latter is suspect as it's called > right after the touched code. Why are such spots not relevant here? When u->addr is set, bindlock and the hash table lock are held. unix_bind_(abstract|bsd)() and unix_autobind() are all serialised by bindlock, so ->addr check after acquiring bindlock is not lockless actually. > > Also the newu->addr/otheru->addr handling in unix_stream_connect() > looks suspect. u->addr is set before the socket is put into the hash table, and it never changes after bind(). otheru is found by unix_find_other() from the hash table, and then we access ->addr in unix_stream_connect(). So, there is no race. Also, newu is a child socket created by connect() and does not exist in the hash table but has ->addr set. At this moment, no one can read its ->addr, and smp_store_release() publishes it for the later ->getname() in case the child socket is accept()ed. Thanks!
On Wed, 2024-05-22 at 23:53 +0900, Kuniyuki Iwashima wrote: > From: Paolo Abeni <pabeni@redhat.com> > Date: Wed, 22 May 2024 10:52:19 +0200 > > On Sat, 2024-05-18 at 09:01 +0900, Kuniyuki Iwashima wrote: > > > Once unix_sk(sk)->addr is assigned under net->unx.table.locks, > > > *(unix_sk(sk)->addr) and unix_sk(sk)->path are fully set up, and > > > unix_sk(sk)->addr is never changed. > > > > > > unix_getname() and unix_copy_addr() access the two fields locklessly, > > > and commit ae3b564179bf ("missing barriers in some of unix_sock ->addr > > > and ->path accesses") added smp_store_release() and smp_load_acquire() > > > pairs. > > > > > > In other functions, we still read unix_sk(sk)->addr locklessly to check > > > if the socket is bound, and KCSAN complains about it. [0] > > > > > > Given these functions have no dependency for *(unix_sk(sk)->addr) and > > > unix_sk(sk)->path, READ_ONCE() is enough to annotate the data-race. > > > > > > [0]: > > > BUG: KCSAN: data-race in unix_bind / unix_listen > > > > > > write (marked) to 0xffff88805f8d1840 of 8 bytes by task 13723 on cpu 0: > > > __unix_set_addr_hash net/unix/af_unix.c:329 [inline] > > > unix_bind_bsd net/unix/af_unix.c:1241 [inline] > > > unix_bind+0x881/0x1000 net/unix/af_unix.c:1319 > > > __sys_bind+0x194/0x1e0 net/socket.c:1847 > > > __do_sys_bind net/socket.c:1858 [inline] > > > __se_sys_bind net/socket.c:1856 [inline] > > > __x64_sys_bind+0x40/0x50 net/socket.c:1856 > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > > do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 > > > entry_SYSCALL_64_after_hwframe+0x46/0x4e > > > > > > read to 0xffff88805f8d1840 of 8 bytes by task 13724 on cpu 1: > > > unix_listen+0x72/0x180 net/unix/af_unix.c:734 > > > __sys_listen+0xdc/0x160 net/socket.c:1881 > > > __do_sys_listen net/socket.c:1890 [inline] > > > __se_sys_listen net/socket.c:1888 [inline] > > > __x64_sys_listen+0x2e/0x40 net/socket.c:1888 > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > > do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 > > > entry_SYSCALL_64_after_hwframe+0x46/0x4e > > > > > > value changed: 0x0000000000000000 -> 0xffff88807b5b1b40 > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > CPU: 1 PID: 13724 Comm: syz-executor.4 Not tainted 6.8.0-12822-gcd51db110a7e #12 > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > > Reported-by: syzkaller <syzkaller@googlegroups.com> > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> > > > --- > > > net/unix/af_unix.c | 10 ++++++---- > > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > > > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > > > index ca101690e740..92a88ac070ca 100644 > > > --- a/net/unix/af_unix.c > > > +++ b/net/unix/af_unix.c > > > @@ -731,7 +731,7 @@ static int unix_listen(struct socket *sock, int backlog) > > > if (sock->type != SOCK_STREAM && sock->type != SOCK_SEQPACKET) > > > goto out; /* Only stream/seqpacket sockets accept */ > > > err = -EINVAL; > > > - if (!u->addr) > > > + if (!READ_ONCE(u->addr)) > > > goto out; /* No listens on an unbound socket */ > > > unix_state_lock(sk); > > > if (sk->sk_state != TCP_CLOSE && sk->sk_state != TCP_LISTEN) > > > @@ -1369,7 +1369,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, > > > > > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > > > test_bit(SOCK_PASSPIDFD, &sock->flags)) && > > > - !unix_sk(sk)->addr) { > > > + !READ_ONCE(unix_sk(sk)->addr)) { > > > err = unix_autobind(sk); > > > if (err) > > > goto out; > > > @@ -1481,7 +1481,8 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, > > > goto out; > > > > > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > > > - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { > > > + test_bit(SOCK_PASSPIDFD, &sock->flags)) && > > > + !READ_ONCE(u->addr)) { > > > err = unix_autobind(sk); > > > if (err) > > > goto out; > > > @@ -1951,7 +1952,8 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, > > > } > > > > > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > > > - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { > > > + test_bit(SOCK_PASSPIDFD, &sock->flags)) && > > > + !READ_ONCE(u->addr)) { > > > err = unix_autobind(sk); > > > if (err) > > > goto out; > > > > There are a few other places where ->addr is accessed lockless (under > > the bindlock, but prior to acquiring the table spinlock, e.g. > > unix_bind_* and unix_autobind. The latter is suspect as it's called > > right after the touched code. Why are such spots not relevant here? > > When u->addr is set, bindlock and the hash table lock are held. > unix_bind_(abstract|bsd)() and unix_autobind() are all serialised > by bindlock, so ->addr check after acquiring bindlock is not > lockless actually. > > > > > > Also the newu->addr/otheru->addr handling in unix_stream_connect() > > looks suspect. > > u->addr is set before the socket is put into the hash table, and > it never changes after bind(). > > otheru is found by unix_find_other() from the hash table, and then > we access ->addr in unix_stream_connect(). So, there is no race. I see. I think the serialization chain is not trivial, I think it would help future memory to at least mention it in the commit message or in a code comment. Thanks, Paolo
From: Paolo Abeni <pabeni@redhat.com> Date: Wed, 22 May 2024 17:25:23 +0200 > On Wed, 2024-05-22 at 23:53 +0900, Kuniyuki Iwashima wrote: > > From: Paolo Abeni <pabeni@redhat.com> > > Date: Wed, 22 May 2024 10:52:19 +0200 > > > On Sat, 2024-05-18 at 09:01 +0900, Kuniyuki Iwashima wrote: > > > > Once unix_sk(sk)->addr is assigned under net->unx.table.locks, > > > > *(unix_sk(sk)->addr) and unix_sk(sk)->path are fully set up, and > > > > unix_sk(sk)->addr is never changed. > > > > > > > > unix_getname() and unix_copy_addr() access the two fields locklessly, > > > > and commit ae3b564179bf ("missing barriers in some of unix_sock ->addr > > > > and ->path accesses") added smp_store_release() and smp_load_acquire() > > > > pairs. > > > > > > > > In other functions, we still read unix_sk(sk)->addr locklessly to check > > > > if the socket is bound, and KCSAN complains about it. [0] > > > > > > > > Given these functions have no dependency for *(unix_sk(sk)->addr) and > > > > unix_sk(sk)->path, READ_ONCE() is enough to annotate the data-race. > > > > > > > > [0]: > > > > BUG: KCSAN: data-race in unix_bind / unix_listen > > > > > > > > write (marked) to 0xffff88805f8d1840 of 8 bytes by task 13723 on cpu 0: > > > > __unix_set_addr_hash net/unix/af_unix.c:329 [inline] > > > > unix_bind_bsd net/unix/af_unix.c:1241 [inline] > > > > unix_bind+0x881/0x1000 net/unix/af_unix.c:1319 > > > > __sys_bind+0x194/0x1e0 net/socket.c:1847 > > > > __do_sys_bind net/socket.c:1858 [inline] > > > > __se_sys_bind net/socket.c:1856 [inline] > > > > __x64_sys_bind+0x40/0x50 net/socket.c:1856 > > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > > > do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 > > > > entry_SYSCALL_64_after_hwframe+0x46/0x4e > > > > > > > > read to 0xffff88805f8d1840 of 8 bytes by task 13724 on cpu 1: > > > > unix_listen+0x72/0x180 net/unix/af_unix.c:734 > > > > __sys_listen+0xdc/0x160 net/socket.c:1881 > > > > __do_sys_listen net/socket.c:1890 [inline] > > > > __se_sys_listen net/socket.c:1888 [inline] > > > > __x64_sys_listen+0x2e/0x40 net/socket.c:1888 > > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > > > do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 > > > > entry_SYSCALL_64_after_hwframe+0x46/0x4e > > > > > > > > value changed: 0x0000000000000000 -> 0xffff88807b5b1b40 > > > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > > CPU: 1 PID: 13724 Comm: syz-executor.4 Not tainted 6.8.0-12822-gcd51db110a7e #12 > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > > > > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > > > Reported-by: syzkaller <syzkaller@googlegroups.com> > > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> > > > > --- > > > > net/unix/af_unix.c | 10 ++++++---- > > > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > > > > index ca101690e740..92a88ac070ca 100644 > > > > --- a/net/unix/af_unix.c > > > > +++ b/net/unix/af_unix.c > > > > @@ -731,7 +731,7 @@ static int unix_listen(struct socket *sock, int backlog) > > > > if (sock->type != SOCK_STREAM && sock->type != SOCK_SEQPACKET) > > > > goto out; /* Only stream/seqpacket sockets accept */ > > > > err = -EINVAL; > > > > - if (!u->addr) > > > > + if (!READ_ONCE(u->addr)) > > > > goto out; /* No listens on an unbound socket */ > > > > unix_state_lock(sk); > > > > if (sk->sk_state != TCP_CLOSE && sk->sk_state != TCP_LISTEN) > > > > @@ -1369,7 +1369,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, > > > > > > > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > > > > test_bit(SOCK_PASSPIDFD, &sock->flags)) && > > > > - !unix_sk(sk)->addr) { > > > > + !READ_ONCE(unix_sk(sk)->addr)) { > > > > err = unix_autobind(sk); > > > > if (err) > > > > goto out; > > > > @@ -1481,7 +1481,8 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, > > > > goto out; > > > > > > > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > > > > - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { > > > > + test_bit(SOCK_PASSPIDFD, &sock->flags)) && > > > > + !READ_ONCE(u->addr)) { > > > > err = unix_autobind(sk); > > > > if (err) > > > > goto out; > > > > @@ -1951,7 +1952,8 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, > > > > } > > > > > > > > if ((test_bit(SOCK_PASSCRED, &sock->flags) || > > > > - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { > > > > + test_bit(SOCK_PASSPIDFD, &sock->flags)) && > > > > + !READ_ONCE(u->addr)) { > > > > err = unix_autobind(sk); > > > > if (err) > > > > goto out; > > > > > > There are a few other places where ->addr is accessed lockless (under > > > the bindlock, but prior to acquiring the table spinlock, e.g. > > > unix_bind_* and unix_autobind. The latter is suspect as it's called > > > right after the touched code. Why are such spots not relevant here? > > > > When u->addr is set, bindlock and the hash table lock are held. > > unix_bind_(abstract|bsd)() and unix_autobind() are all serialised > > by bindlock, so ->addr check after acquiring bindlock is not > > lockless actually. > > > > > > > > > > Also the newu->addr/otheru->addr handling in unix_stream_connect() > > > looks suspect. > > > > u->addr is set before the socket is put into the hash table, and > > it never changes after bind(). > > > > otheru is found by unix_find_other() from the hash table, and then > > we access ->addr in unix_stream_connect(). So, there is no race. > > I see. > > I think the serialization chain is not trivial, I think it would help > future memory to at least mention it in the commit message or in a code > comment. Will add more explanation in the commit message in v2. Thanks!
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index ca101690e740..92a88ac070ca 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -731,7 +731,7 @@ static int unix_listen(struct socket *sock, int backlog) if (sock->type != SOCK_STREAM && sock->type != SOCK_SEQPACKET) goto out; /* Only stream/seqpacket sockets accept */ err = -EINVAL; - if (!u->addr) + if (!READ_ONCE(u->addr)) goto out; /* No listens on an unbound socket */ unix_state_lock(sk); if (sk->sk_state != TCP_CLOSE && sk->sk_state != TCP_LISTEN) @@ -1369,7 +1369,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, if ((test_bit(SOCK_PASSCRED, &sock->flags) || test_bit(SOCK_PASSPIDFD, &sock->flags)) && - !unix_sk(sk)->addr) { + !READ_ONCE(unix_sk(sk)->addr)) { err = unix_autobind(sk); if (err) goto out; @@ -1481,7 +1481,8 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, goto out; if ((test_bit(SOCK_PASSCRED, &sock->flags) || - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { + test_bit(SOCK_PASSPIDFD, &sock->flags)) && + !READ_ONCE(u->addr)) { err = unix_autobind(sk); if (err) goto out; @@ -1951,7 +1952,8 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, } if ((test_bit(SOCK_PASSCRED, &sock->flags) || - test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) { + test_bit(SOCK_PASSPIDFD, &sock->flags)) && + !READ_ONCE(u->addr)) { err = unix_autobind(sk); if (err) goto out;
Once unix_sk(sk)->addr is assigned under net->unx.table.locks, *(unix_sk(sk)->addr) and unix_sk(sk)->path are fully set up, and unix_sk(sk)->addr is never changed. unix_getname() and unix_copy_addr() access the two fields locklessly, and commit ae3b564179bf ("missing barriers in some of unix_sock ->addr and ->path accesses") added smp_store_release() and smp_load_acquire() pairs. In other functions, we still read unix_sk(sk)->addr locklessly to check if the socket is bound, and KCSAN complains about it. [0] Given these functions have no dependency for *(unix_sk(sk)->addr) and unix_sk(sk)->path, READ_ONCE() is enough to annotate the data-race. [0]: BUG: KCSAN: data-race in unix_bind / unix_listen write (marked) to 0xffff88805f8d1840 of 8 bytes by task 13723 on cpu 0: __unix_set_addr_hash net/unix/af_unix.c:329 [inline] unix_bind_bsd net/unix/af_unix.c:1241 [inline] unix_bind+0x881/0x1000 net/unix/af_unix.c:1319 __sys_bind+0x194/0x1e0 net/socket.c:1847 __do_sys_bind net/socket.c:1858 [inline] __se_sys_bind net/socket.c:1856 [inline] __x64_sys_bind+0x40/0x50 net/socket.c:1856 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e read to 0xffff88805f8d1840 of 8 bytes by task 13724 on cpu 1: unix_listen+0x72/0x180 net/unix/af_unix.c:734 __sys_listen+0xdc/0x160 net/socket.c:1881 __do_sys_listen net/socket.c:1890 [inline] __se_sys_listen net/socket.c:1888 [inline] __x64_sys_listen+0x2e/0x40 net/socket.c:1888 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e value changed: 0x0000000000000000 -> 0xffff88807b5b1b40 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 13724 Comm: syz-executor.4 Not tainted 6.8.0-12822-gcd51db110a7e #12 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> --- net/unix/af_unix.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)