[PATCH,bpf,v2] tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function

Message ID: 20211012052019.184398-1-liujian56@huawei.com
State: Accepted
Delegated to: BPF

Checks

Context                         Check    Description
netdev/cover_letter             success  Single patches do not need cover letters
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for bpf
netdev/subject_prefix           success  Link
netdev/cc_maintainers           success  CCed 17 of 17 maintainers
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 1 this patch: 1
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Fixes tag looks correct
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 31 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1 this patch: 1
netdev/header_inline            success  No static functions without inline keyword in header files
bpf/vmtest-bpf-PR               success  PR summary
bpf/vmtest-bpf                  success  VM_Test

Commit Message

liujian (CE) Oct. 12, 2021, 5:20 a.m. UTC
With two messages, msgA and msgB, and a user issuing nonblocking sendmsg
calls (or sending from multiple cores) on a single socket 'sk', the
following flow is possible.

 msgA, sk                               msgB, sk
 -----------                            ---------------
 tcp_bpf_sendmsg()
 lock(sk)
 psock = sk->psock
                                        tcp_bpf_sendmsg()
                                        lock(sk) ... blocking
 tcp_bpf_send_verdict
 if (psock->eval == NONE)
    psock->eval = sk_psock_msg_verdict
 ..
 < handle SK_REDIRECT case >
   release_sock(sk)                     < lock dropped so grab here >
   ret = tcp_bpf_sendmsg_redir
                                        psock = sk->psock
                                        tcp_bpf_send_verdict
 lock_sock(sk) ... blocking on B
                                        if (psock->eval == NONE) <- boom.
                                         psock->eval will have msgA state

The problem is that the sock lock is dropped while msgA is still being
redirected, and msgB grabs it. The psock then still holds msgA's state;
importantly, psock->eval has not been cleared. So msgB runs whatever action
was chosen for msgA, and the verdict program may never see msgB.

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
v2: change commit message, and add the fixes tag
 net/ipv4/tcp_bpf.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
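
The interleaving in the commit message can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: `model_psock`, `run_verdict`, `send_begin` and `replay_race` are hypothetical stand-ins invented for illustration, not kernel code, and the lock hand-off is represented by call order rather than by real locking.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
enum verdict { V_NONE, V_PASS, V_REDIRECT };

struct model_psock {
	enum verdict eval;	/* cached verdict, playing the role of psock->eval */
	int verdict_runs;	/* how often the stand-in verdict program ran */
};

/* Stand-in verdict program: redirect message 0 (msgA), pass anything else. */
static enum verdict run_verdict(struct model_psock *p, int msg_id)
{
	p->verdict_runs++;
	return msg_id == 0 ? V_REDIRECT : V_PASS;
}

/* Work done under the lock: evaluate only if no verdict is cached. */
static enum verdict send_begin(struct model_psock *p, int msg_id)
{
	if (p->eval == V_NONE)
		p->eval = run_verdict(p, msg_id);
	return p->eval;
}

/* Replays the flow from the commit message; returns the action msgB took. */
static enum verdict replay_race(struct model_psock *p)
{
	enum verdict a = send_begin(p, 0);	/* msgA caches REDIRECT */
	/* msgA drops the lock to redirect; msgB takes it. */
	enum verdict b = send_begin(p, 1);	/* msgB finds msgA's verdict */
	p->eval = V_NONE;	/* msgA re-takes the lock and clears, too late */

	assert(a == V_REDIRECT);
	return b;
}
```

In this model, `replay_race()` returns `V_REDIRECT` for msgB even though the stand-in program would have returned `V_PASS` for it, and `verdict_runs` stays at 1, which mirrors the "verdict program may never see it" symptom.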

Comments

John Fastabend Oct. 26, 2021, 2:26 p.m. UTC | #1
Liu Jian wrote:
> With two Msgs, msgA and msgB and a user doing nonblocking sendmsg calls (or
> multiple cores) on a single socket 'sk' we could get the following flow.
> 
>  msgA, sk                               msgB, sk
>  -----------                            ---------------
>  tcp_bpf_sendmsg()
>  lock(sk)
>  psock = sk->psock
>                                         tcp_bpf_sendmsg()
>                                         lock(sk) ... blocking
> tcp_bpf_send_verdict
> if (psock->eval == NONE)
>    psock->eval = sk_psock_msg_verdict
>  ..
>  < handle SK_REDIRECT case >
>    release_sock(sk)                     < lock dropped so grab here >
>    ret = tcp_bpf_sendmsg_redir
>                                         psock = sk->psock
>                                         tcp_bpf_send_verdict
>  lock_sock(sk) ... blocking on B
>                                         if (psock->eval == NONE) <- boom.
>                                          psock->eval will have msgA state
> 
> The problem here is we dropped the lock on msgA and grabbed it with msgB.
> Now we have old state in psock and importantly psock->eval has not been
> cleared. So msgB will run whatever action was done on A and the verdict
> program may never see it.
> 
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: Liu Jian <liujian56@huawei.com>

Yep thanks for digging into this. Nice catch. And commit looks good now.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Patch

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index d3e9386b493e..9d068153c316 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -232,6 +232,7 @@  static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 	bool cork = false, enospc = sk_msg_full(msg);
 	struct sock *sk_redir;
 	u32 tosend, delta = 0;
+	u32 eval = __SK_NONE;
 	int ret;
 
 more_data:
@@ -275,13 +276,24 @@  static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 	case __SK_REDIRECT:
 		sk_redir = psock->sk_redir;
 		sk_msg_apply_bytes(psock, tosend);
+		if (!psock->apply_bytes) {
+			/* Clean up before releasing the sock lock. */
+			eval = psock->eval;
+			psock->eval = __SK_NONE;
+			psock->sk_redir = NULL;
+		}
 		if (psock->cork) {
 			cork = true;
 			psock->cork = NULL;
 		}
 		sk_msg_return(sk, msg, tosend);
 		release_sock(sk);
+
 		ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
+
+		if (eval == __SK_REDIRECT)
+			sock_put(sk_redir);
+
 		lock_sock(sk);
 		if (unlikely(ret < 0)) {
 			int free = sk_msg_free_nocharge(sk, msg);
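
The pattern the hunk applies, copying the verdict to a local and resetting the shared field before the lock is released, can be sketched in a userspace model. All names here (`model_psock`, `run_verdict`, `send_locked`, `replay_fixed`) are hypothetical stand-ins rather than kernel code; the lock hand-off is represented by call order, and the `apply_bytes` condition and the `sock_put()` reference handling from the patch are deliberately left out.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
enum verdict { V_NONE, V_PASS, V_REDIRECT };

struct model_psock {
	enum verdict eval;	/* cached verdict, playing the role of psock->eval */
	int verdict_runs;	/* how often the stand-in verdict program ran */
};

/* Stand-in verdict program: redirect message 0 (msgA), pass anything else. */
static enum verdict run_verdict(struct model_psock *p, int msg_id)
{
	p->verdict_runs++;
	return msg_id == 0 ? V_REDIRECT : V_PASS;
}

/*
 * Work done under the lock. For a redirect, the verdict is kept in a local
 * and the shared field is reset before the lock would be dropped.
 */
static enum verdict send_locked(struct model_psock *p, int msg_id)
{
	enum verdict action;

	if (p->eval == V_NONE)
		p->eval = run_verdict(p, msg_id);
	action = p->eval;
	if (action == V_REDIRECT)
		p->eval = V_NONE;	/* clean up before releasing the lock */
	return action;
}

/* Same interleaving as in the commit message; returns the action msgB took. */
static enum verdict replay_fixed(struct model_psock *p)
{
	enum verdict a = send_locked(p, 0);	/* msgA: REDIRECT, state reset */
	/* msgA drops the lock to redirect; msgB takes it. */
	enum verdict b = send_locked(p, 1);	/* msgB gets a fresh verdict */
	p->eval = V_NONE;	/* msgB's PASS path completes under the lock */

	assert(a == V_REDIRECT);
	return b;
}
```

With the reset moved ahead of the lock drop, msgB in this model gets its own `V_PASS` verdict and `verdict_runs` reaches 2.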