[net] net-timestamp: convert sk->sk_tskey to atomic_t

Message ID 20220217170502.641160-1-eric.dumazet@gmail.com (mailing list archive)
State Accepted
Commit a1cdec57e03a1352e92fbbe7974039dda4efcec0
Delegated to: Netdev Maintainers
Series [net] net-timestamp: convert sk->sk_tskey to atomic_t

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 2955 this patch: 2955
netdev/cc_maintainers           warning  8 maintainers not CCed: mkl@pengutronix.de linux@rempel-privat.de kernel@pengutronix.de robin@protonic.nl dsahern@kernel.org linux-can@vger.kernel.org socketcan@hartkopp.net yoshfuji@linux-ipv6.org
netdev/build_clang              success  Errors and warnings before: 352 this patch: 352
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 3093 this patch: 3093
netdev/checkpatch               warning  WARNING: Possible repeated word: 'Google'
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Eric Dumazet Feb. 17, 2022, 5:05 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

UDP sendmsg() can be lockless, which causes all kinds
of data races.

This patch converts sk->sk_tskey to atomic_t to remove one of these races.

BUG: KCSAN: data-race in __ip_append_data / __ip_append_data

read to 0xffff8881035d4b6c of 4 bytes by task 8877 on cpu 1:
 __ip_append_data+0x1c1/0x1de0 net/ipv4/ip_output.c:994
 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636
 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249
 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
 sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg net/socket.c:725 [inline]
 ____sys_sendmsg+0x39a/0x510 net/socket.c:2413
 ___sys_sendmsg net/socket.c:2467 [inline]
 __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
 __do_sys_sendmmsg net/socket.c:2582 [inline]
 __se_sys_sendmmsg net/socket.c:2579 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

write to 0xffff8881035d4b6c of 4 bytes by task 8880 on cpu 0:
 __ip_append_data+0x1d8/0x1de0 net/ipv4/ip_output.c:994
 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636
 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249
 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
 sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg net/socket.c:725 [inline]
 ____sys_sendmsg+0x39a/0x510 net/socket.c:2413
 ___sys_sendmsg net/socket.c:2467 [inline]
 __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
 __do_sys_sendmmsg net/socket.c:2582 [inline]
 __se_sys_sendmmsg net/socket.c:2579 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000054d -> 0x0000054e

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 8880 Comm: syz-executor.5 Not tainted 5.17.0-rc2-syzkaller-00167-gdcb85f85fa6f-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 09c2d251b707 ("net-timestamp: add key to disambiguate concurrent datagrams")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
---
 include/net/sock.h        | 4 ++--
 net/can/j1939/transport.c | 2 +-
 net/core/skbuff.c         | 2 +-
 net/core/sock.c           | 4 ++--
 net/ipv4/ip_output.c      | 2 +-
 net/ipv6/ip6_output.c     | 2 +-
 6 files changed, 8 insertions(+), 8 deletions(-)
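
The report above shows two lockless senders running the plain `sk->sk_tskey++` concurrently, in which case both can read the same value and hand out the same key. Below is a minimal userspace sketch of the fixed behaviour, not kernel code. It uses C11 `<stdatomic.h>` and pthreads as a stand-in for the kernel's `atomic_t`; `struct fake_sock`, `sender()`, `keys_are_unique()`, `run_tskey_demo()` and the two constants are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define NUM_SENDERS     4      /* illustrative value */
#define KEYS_PER_SENDER 10000  /* illustrative value */

/* Hypothetical stand-in for struct sock: only the timestamp key is modelled. */
struct fake_sock {
	atomic_uint tskey;
};

struct sender_args {
	struct fake_sock *sk;
	unsigned int *keys;  /* slice where this sender records its keys */
};

/* Each sender draws one key per "datagram" with a single atomic
 * read-modify-write, so no two senders can observe the same value. */
static void *sender(void *arg)
{
	struct sender_args *a = arg;

	for (int i = 0; i < KEYS_PER_SENDER; i++)
		a->keys[i] = atomic_fetch_add(&a->sk->tskey, 1);
	return NULL;
}

/* Returns true if every key in [0, total) was handed out exactly once. */
static bool keys_are_unique(void)
{
	const unsigned int total = NUM_SENDERS * KEYS_PER_SENDER;
	struct fake_sock sk;
	pthread_t threads[NUM_SENDERS];
	struct sender_args args[NUM_SENDERS];
	unsigned int *keys = calloc(total, sizeof(*keys));
	unsigned char *seen = calloc(total, sizeof(*seen));
	bool ok = keys != NULL && seen != NULL;
	int started = 0;

	atomic_init(&sk.tskey, 0);

	for (int t = 0; ok && t < NUM_SENDERS; t++) {
		args[t].sk = &sk;
		args[t].keys = keys + t * KEYS_PER_SENDER;
		if (pthread_create(&threads[t], NULL, sender, &args[t]) != 0)
			ok = false;
		else
			started++;
	}
	for (int t = 0; t < started; t++)
		pthread_join(threads[t], NULL);

	for (unsigned int i = 0; ok && i < total; i++) {
		if (keys[i] >= total || seen[keys[i]])
			ok = false;  /* out of range or duplicate key */
		else
			seen[keys[i]] = 1;
	}
	ok = ok && atomic_load(&sk.tskey) == total;

	free(keys);
	free(seen);
	return ok;
}

/* Example driver: call this from main() to run the check. */
static void run_tskey_demo(void)
{
	assert(keys_are_unique());
}
```

With a typical toolchain this should build with something like `cc -std=c11 -pthread`. Replacing `atomic_fetch_add()` with an unsynchronized `tskey++` on a plain `unsigned int` would reintroduce the kind of race KCSAN flagged, and in C11 it would also be undefined behaviour.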

Comments

Marco Elver Feb. 17, 2022, 5:18 p.m. UTC | #1
On Thu, 17 Feb 2022 at 18:05, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> diff --git a/include/net/sock.h b/include/net/sock.h
> index ff9b508d9c5ffcb9a30deb730b27046e463bda37..50aecd28b355082bce495a89a8a871b15e3e7e2c 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -507,7 +507,7 @@ struct sock {
>  #endif
>         u16                     sk_tsflags;
>         u8                      sk_shutdown;
> -       u32                     sk_tskey;
> +       atomic_t                sk_tskey;
>         atomic_t                sk_zckey;
>
>         u8                      sk_clockid;
> @@ -2667,7 +2667,7 @@ static inline void _sock_tx_timestamp(struct sock *sk, __u16 tsflags,
>                 __sock_tx_timestamp(tsflags, tx_flags);
>                 if (tsflags & SOF_TIMESTAMPING_OPT_ID && tskey &&
>                     tsflags & SOF_TIMESTAMPING_TX_RECORD_MASK)
> -                       *tskey = sk->sk_tskey++;
> +                       *tskey = atomic_inc_return(&sk->sk_tskey) - 1;
>         }
>         if (unlikely(sock_flag(sk, SOCK_WIFI_STATUS)))
>                 *tx_flags |= SKBTX_WIFI_STATUS;
> diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
> index a271688780a2c1a3bff6c2578502f972da34a30b..307ee1174a6e2e3d8cb9edd2c7485ddd22014ce6 100644
> --- a/net/can/j1939/transport.c
> +++ b/net/can/j1939/transport.c
> @@ -2006,7 +2006,7 @@ struct j1939_session *j1939_tp_send(struct j1939_priv *priv,
>                 /* set the end-packet for broadcast */
>                 session->pkt.last = session->pkt.total;
>
> -       skcb->tskey = session->sk->sk_tskey++;
> +       skcb->tskey = atomic_inc_return(&session->sk->sk_tskey) - 1;

We also have atomic_fetch_inc() in case it'd make this simpler.

Thanks,
-- Marco
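
To illustrate the suggestion above: `atomic_inc_return(v) - 1` and `atomic_fetch_inc(v)` both yield the counter's value from before the increment, so either form hands out the same key. The sketch below models the two kernel helpers in userspace with C11 atomics. The `model_*` functions, `next_key_*` wrappers, `forms_agree()` and `run_equivalence_demo()` are hypothetical names, and an unsigned counter is assumed so that wraparound is well defined in portable C.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Userspace model of the kernel's atomic_inc_return(): value AFTER the add. */
static uint32_t model_atomic_inc_return(atomic_uint *v)
{
	return (uint32_t)(atomic_fetch_add(v, 1) + 1u);
}

/* Userspace model of the kernel's atomic_fetch_inc(): value BEFORE the add. */
static uint32_t model_atomic_fetch_inc(atomic_uint *v)
{
	return (uint32_t)atomic_fetch_add(v, 1);
}

/* Key allocation in the form used by the patch. */
static uint32_t next_key_patch(atomic_uint *tskey)
{
	return model_atomic_inc_return(tskey) - 1u;
}

/* Key allocation in the form suggested in the review comment. */
static uint32_t next_key_fetch(atomic_uint *tskey)
{
	return model_atomic_fetch_inc(tskey);
}

/* Returns 1 if both forms produce the same key sequence from `start`. */
static int forms_agree(uint32_t start, unsigned int draws)
{
	atomic_uint a, b;

	atomic_init(&a, start);
	atomic_init(&b, start);
	for (unsigned int i = 0; i < draws; i++) {
		if (next_key_patch(&a) != next_key_fetch(&b))
			return 0;
	}
	return atomic_load(&a) == atomic_load(&b);
}

/* Example driver: call this from main() to run the checks. */
static void run_equivalence_demo(void)
{
	assert(forms_agree(0u, 1000u));
	assert(forms_agree(0xFFFFFFFEu, 8u));  /* crosses the 32-bit wrap */
}
```

In this model both forms are a single atomic read-modify-write, so the choice between them is about readability and about which helpers exist in the trees that must carry the fix.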

Eric Dumazet Feb. 17, 2022, 5:23 p.m. UTC | #2
On Thu, Feb 17, 2022 at 9:18 AM Marco Elver <elver@google.com> wrote:
>
> We also have atomic_fetch_inc() in case it'd make this simpler.

This was not the case back in 2014, and I did not want to add more work
for stable teams.

Thanks!

patchwork-bot+netdevbpf@kernel.org Feb. 18, 2022, 11:50 a.m. UTC | #3
Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Thu, 17 Feb 2022 09:05:02 -0800 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> UDP sendmsg() can be lockless, this is causing all kinds
> of data races.
> 
> This patch converts sk->sk_tskey to remove one of these races.
> 
> [...]

Here is the summary with links:
  - [net] net-timestamp: convert sk->sk_tskey to atomic_t
    https://git.kernel.org/netdev/net/c/a1cdec57e03a

You are awesome, thank you!

Patch

diff --git a/include/net/sock.h b/include/net/sock.h
index ff9b508d9c5ffcb9a30deb730b27046e463bda37..50aecd28b355082bce495a89a8a871b15e3e7e2c 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -507,7 +507,7 @@  struct sock {
 #endif
 	u16			sk_tsflags;
 	u8			sk_shutdown;
-	u32			sk_tskey;
+	atomic_t		sk_tskey;
 	atomic_t		sk_zckey;
 
 	u8			sk_clockid;
@@ -2667,7 +2667,7 @@  static inline void _sock_tx_timestamp(struct sock *sk, __u16 tsflags,
 		__sock_tx_timestamp(tsflags, tx_flags);
 		if (tsflags & SOF_TIMESTAMPING_OPT_ID && tskey &&
 		    tsflags & SOF_TIMESTAMPING_TX_RECORD_MASK)
-			*tskey = sk->sk_tskey++;
+			*tskey = atomic_inc_return(&sk->sk_tskey) - 1;
 	}
 	if (unlikely(sock_flag(sk, SOCK_WIFI_STATUS)))
 		*tx_flags |= SKBTX_WIFI_STATUS;
diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index a271688780a2c1a3bff6c2578502f972da34a30b..307ee1174a6e2e3d8cb9edd2c7485ddd22014ce6 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -2006,7 +2006,7 @@  struct j1939_session *j1939_tp_send(struct j1939_priv *priv,
 		/* set the end-packet for broadcast */
 		session->pkt.last = session->pkt.total;
 
-	skcb->tskey = session->sk->sk_tskey++;
+	skcb->tskey = atomic_inc_return(&session->sk->sk_tskey) - 1;
 	session->tskey = skcb->tskey;
 
 	return session;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9d0388bed0c1d2166214c95081f4778afe9f50ed..6a15ce3eb1d338616d6ab52d6c5c21baa9db993b 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4730,7 +4730,7 @@  static void __skb_complete_tx_timestamp(struct sk_buff *skb,
 	if (sk->sk_tsflags & SOF_TIMESTAMPING_OPT_ID) {
 		serr->ee.ee_data = skb_shinfo(skb)->tskey;
 		if (sk_is_tcp(sk))
-			serr->ee.ee_data -= sk->sk_tskey;
+			serr->ee.ee_data -= atomic_read(&sk->sk_tskey);
 	}
 
 	err = sock_queue_err_skb(sk, skb);
diff --git a/net/core/sock.c b/net/core/sock.c
index 4ff806d71921618e2dbf7d5bb041040cbc72b674..6eb174805bf022f2e143c52b68aec3684a8d0956 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -879,9 +879,9 @@  int sock_set_timestamping(struct sock *sk, int optname,
 			if ((1 << sk->sk_state) &
 			    (TCPF_CLOSE | TCPF_LISTEN))
 				return -EINVAL;
-			sk->sk_tskey = tcp_sk(sk)->snd_una;
+			atomic_set(&sk->sk_tskey, tcp_sk(sk)->snd_una);
 		} else {
-			sk->sk_tskey = 0;
+			atomic_set(&sk->sk_tskey, 0);
 		}
 	}
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 139cec29ed06cd092ebdfd2bf0d13aaf67c5359d..7911916a480bd9a8ef0e17414a6dcf6def6475dd 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -991,7 +991,7 @@  static int __ip_append_data(struct sock *sk,
 
 	if (cork->tx_flags & SKBTX_ANY_SW_TSTAMP &&
 	    sk->sk_tsflags & SOF_TIMESTAMPING_OPT_ID)
-		tskey = sk->sk_tskey++;
+		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
 
 	hh_len = LL_RESERVED_SPACE(rt->dst.dev);
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 2995f8d89e7e923203be2dfe8674be4ab424d323..304a295de84f9c05bbc99fc1db5f4507004c0769 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1465,7 +1465,7 @@  static int __ip6_append_data(struct sock *sk,
 
 	if (cork->tx_flags & SKBTX_ANY_SW_TSTAMP &&
 	    sk->sk_tsflags & SOF_TIMESTAMPING_OPT_ID)
-		tskey = sk->sk_tskey++;
+		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
 
 	hh_len = LL_RESERVED_SPACE(rt->dst.dev);
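
The `net/core/sock.c` and `net/core/skbuff.c` hunks above keep the existing TCP behaviour: the counter is seeded from `snd_una` when timestamping is enabled, and that base is subtracted again when the timestamp is reported. The sketch below is a userspace model of that arithmetic, assuming the per-skb key for TCP is a 32-bit sequence-based value (how that key is produced is not shown in this patch). `struct fake_sock`, `set_timestamping()`, `reported_id()` and `demo_reported_id()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sock. */
struct fake_sock {
	atomic_uint tskey;  /* models sk->sk_tskey after the patch */
	int is_tcp;         /* models sk_is_tcp(sk) */
};

/* Models the sock_set_timestamping() hunk: seed the counter. */
static void set_timestamping(struct fake_sock *sk, uint32_t snd_una)
{
	atomic_store(&sk->tskey, sk->is_tcp ? snd_una : 0u);
}

/* Models the __skb_complete_tx_timestamp() hunk: for TCP the reported
 * value is relative to the base stored in the counter. */
static uint32_t reported_id(struct fake_sock *sk, uint32_t skb_tskey)
{
	uint32_t id = skb_tskey;

	if (sk->is_tcp)
		id -= (uint32_t)atomic_load(&sk->tskey);
	return id;
}

/* Convenience wrapper that sets up a socket and reports one key. */
static uint32_t demo_reported_id(int is_tcp, uint32_t snd_una,
				 uint32_t skb_tskey)
{
	struct fake_sock sk;

	atomic_init(&sk.tskey, 0u);
	sk.is_tcp = is_tcp;
	set_timestamping(&sk, snd_una);
	return reported_id(&sk, skb_tskey);
}

/* Example driver: call this from main() to run the checks. */
static void run_reported_id_demo(void)
{
	/* Unsigned subtraction stays correct across sequence wraparound. */
	assert(demo_reported_id(1, 0xFFFFFFF0u, 0x0000000Fu) == 0x1Fu);
	/* Non-TCP sockets report the key unchanged. */
	assert(demo_reported_id(0, 12345u, 5u) == 5u);
}
```

Because the subtraction is done on unsigned 32-bit values, the relative offset remains correct in this model even when the sequence-based key has wrapped past the stored base.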