Message ID | 1667000674-13237-1-git-send-email-wangyufen@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Series | [net,v2] bpf, sockmap: fix the sk->sk_forward_alloc warning of sk_stream_kill_queues() |
On Sat, Oct 29, 2022 at 07:44 AM +08, Wang Yufen wrote:
> When running `test_sockmap` selftests, got the following warning:
>
> WARNING: CPU: 2 PID: 197 at net/core/stream.c:205 sk_stream_kill_queues+0xd3/0xf0
> Call Trace:
>   <TASK>
>   inet_csk_destroy_sock+0x55/0x110
>   tcp_rcv_state_process+0xd28/0x1380
>   ? tcp_v4_do_rcv+0x77/0x2c0
>   tcp_v4_do_rcv+0x77/0x2c0
>   __release_sock+0x106/0x130
>   __tcp_close+0x1a7/0x4e0
>   tcp_close+0x20/0x70
>   inet_release+0x3c/0x80
>   __sock_release+0x3a/0xb0
>   sock_close+0x14/0x20
>   __fput+0xa3/0x260
>   task_work_run+0x59/0xb0
>   exit_to_user_mode_prepare+0x1b3/0x1c0
>   syscall_exit_to_user_mode+0x19/0x50
>   do_syscall_64+0x48/0x90
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> The root case is: In commit 84472b436e76 ("bpf, sockmap: Fix more
> uncharged while msg has more_data") , I used msg->sg.size replace
> tosend rudely, which break the
>    if (msg->apply_bytes && msg->apply_bytes < send)
> scene.
>
> Fixes: 84472b436e76 ("bpf, sockmap: Fix more uncharged while msg has more_data")
> Reported-by: Jakub Sitnicki <jakub@cloudflare.com>
> Signed-off-by: Wang Yufen <wangyufen@huawei.com>
> Acked-by: John Fastabend <john.fastabend@gmail.com>
> ---
> v1 -> v2: typo fixup
>  net/ipv4/tcp_bpf.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index a1626af..774d481 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -278,7 +278,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>  {
>  	bool cork = false, enospc = sk_msg_full(msg);
>  	struct sock *sk_redir;
> -	u32 tosend, delta = 0;
> +	u32 tosend, orgsize, sent, delta = 0;
>  	u32 eval = __SK_NONE;
>  	int ret;
>
> @@ -333,10 +333,12 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>  			cork = true;
>  			psock->cork = NULL;
>  		}
> -		sk_msg_return(sk, msg, msg->sg.size);
> +		sk_msg_return(sk, msg, tosend);
>  		release_sock(sk);
>
> +		orgsize = msg->sg.size;
>  		ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
> +		sent = orgsize - msg->sg.size;

If I'm reading the code right, it's the same as:

		sent = tosend - msg->sg.size;

If so, no need for orgsize.

>
>  		if (eval == __SK_REDIRECT)
>  			sock_put(sk_redir);
> @@ -375,7 +377,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>  		    msg->sg.data[msg->sg.start].page_link &&
>  		    msg->sg.data[msg->sg.start].length) {
>  			if (eval == __SK_REDIRECT)
> -				sk_mem_charge(sk, msg->sg.size);
> +				sk_mem_charge(sk, tosend - sent);
>  			goto more_data;
>  		}
>  	}
On Mon, Oct 31, 2022 at 06:56 PM +01, Jakub Sitnicki wrote:
> On Sat, Oct 29, 2022 at 07:44 AM +08, Wang Yufen wrote:
>> When running `test_sockmap` selftests, got the following warning:
>>
>> WARNING: CPU: 2 PID: 197 at net/core/stream.c:205 sk_stream_kill_queues+0xd3/0xf0
>> Call Trace:
>>   <TASK>
>>   inet_csk_destroy_sock+0x55/0x110
>>   tcp_rcv_state_process+0xd28/0x1380
>>   ? tcp_v4_do_rcv+0x77/0x2c0
>>   tcp_v4_do_rcv+0x77/0x2c0
>>   __release_sock+0x106/0x130
>>   __tcp_close+0x1a7/0x4e0
>>   tcp_close+0x20/0x70
>>   inet_release+0x3c/0x80
>>   __sock_release+0x3a/0xb0
>>   sock_close+0x14/0x20
>>   __fput+0xa3/0x260
>>   task_work_run+0x59/0xb0
>>   exit_to_user_mode_prepare+0x1b3/0x1c0
>>   syscall_exit_to_user_mode+0x19/0x50
>>   do_syscall_64+0x48/0x90
>>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> The root case is: In commit 84472b436e76 ("bpf, sockmap: Fix more
>> uncharged while msg has more_data") , I used msg->sg.size replace
>> tosend rudely, which break the
>>    if (msg->apply_bytes && msg->apply_bytes < send)
>> scene.
>>
>> Fixes: 84472b436e76 ("bpf, sockmap: Fix more uncharged while msg has more_data")
>> Reported-by: Jakub Sitnicki <jakub@cloudflare.com>
>> Signed-off-by: Wang Yufen <wangyufen@huawei.com>
>> Acked-by: John Fastabend <john.fastabend@gmail.com>
>> ---
>> v1 -> v2: typo fixup
>>  net/ipv4/tcp_bpf.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>> index a1626af..774d481 100644
>> --- a/net/ipv4/tcp_bpf.c
>> +++ b/net/ipv4/tcp_bpf.c
>> @@ -278,7 +278,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>>  {
>>  	bool cork = false, enospc = sk_msg_full(msg);
>>  	struct sock *sk_redir;
>> -	u32 tosend, delta = 0;
>> +	u32 tosend, orgsize, sent, delta = 0;
>>  	u32 eval = __SK_NONE;
>>  	int ret;
>>
>> @@ -333,10 +333,12 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>>  			cork = true;
>>  			psock->cork = NULL;
>>  		}
>> -		sk_msg_return(sk, msg, msg->sg.size);
>> +		sk_msg_return(sk, msg, tosend);
>>  		release_sock(sk);
>>
>> +		orgsize = msg->sg.size;
>>  		ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
>> +		sent = orgsize - msg->sg.size;
>
> If I'm reading the code right, it's the same as:
>
> 		sent = tosend - msg->sg.size;
>
> If so, no need for orgsize.

Sorry, that doesn't make any sense. I misread the code.

The fix is correct.

If I can have a small ask to rename `orgsize` to something more common.

We have `orig_size` or `origsize` in use today, but no `orgsize`:

$ git grep -c '\<orig_size\>' -- net
net/core/sysctl_net_core.c:3
net/psample/psample.c:1
net/tls/tls_device.c:5
net/tls/tls_sw.c:7
$ git grep -c '\<origsize\>' -- net
net/bridge/netfilter/ebtables.c:5
net/ipv4/netfilter/arp_tables.c:10
net/ipv4/netfilter/ip_tables.c:10
net/ipv6/netfilter/ip6_tables.c:10

It reads a bit better, IMHO.

Thanks for fixing it so quickly.

Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
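For readers following the exchange above, the difference between the two expressions for `sent` can be checked with a small user-space model. This is only a sketch, assuming that the redirect reduces `msg->sg.size` by the number of bytes it consumes, which is what the patch's `orgsize - msg->sg.size` computation relies on. `struct msg_model`, `model_redirect()` and the two `sent_from_*()` helpers are hypothetical names used for illustration, not kernel code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the one sk_msg field that matters here. */
struct msg_model {
	uint32_t sg_size;	/* models msg->sg.size: bytes still in the msg */
};

/* Models a redirect that consumes up to 'tosend' bytes from the message. */
static void model_redirect(struct msg_model *msg, uint32_t tosend)
{
	uint32_t n = tosend < msg->sg_size ? tosend : msg->sg_size;

	msg->sg_size -= n;
}

/* The patch's computation: size before the send minus size after it. */
static uint32_t sent_from_origsize(uint32_t origsize,
				   const struct msg_model *msg)
{
	return origsize - msg->sg_size;
}

/* The suggested shortcut: only equal when tosend was the full size. */
static uint32_t sent_from_tosend(uint32_t tosend,
				 const struct msg_model *msg)
{
	return tosend - msg->sg_size;	/* unsigned, wraps if sg_size > tosend */
}
```

With a 100-byte message and `tosend` limited to 40 (the `apply_bytes` case), 60 bytes remain after the redirect. `sent_from_origsize()` then gives 40, while `sent_from_tosend()` computes 40 - 60 and wraps around as an unsigned value. The two expressions agree only when `tosend` equals the full message size.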
On 2022/11/1 6:26, Jakub Sitnicki wrote:
> On Mon, Oct 31, 2022 at 06:56 PM +01, Jakub Sitnicki wrote:
>> On Sat, Oct 29, 2022 at 07:44 AM +08, Wang Yufen wrote:
>>> When running `test_sockmap` selftests, got the following warning:
>>>
>>> WARNING: CPU: 2 PID: 197 at net/core/stream.c:205 sk_stream_kill_queues+0xd3/0xf0
>>> Call Trace:
>>>   <TASK>
>>>   inet_csk_destroy_sock+0x55/0x110
>>>   tcp_rcv_state_process+0xd28/0x1380
>>>   ? tcp_v4_do_rcv+0x77/0x2c0
>>>   tcp_v4_do_rcv+0x77/0x2c0
>>>   __release_sock+0x106/0x130
>>>   __tcp_close+0x1a7/0x4e0
>>>   tcp_close+0x20/0x70
>>>   inet_release+0x3c/0x80
>>>   __sock_release+0x3a/0xb0
>>>   sock_close+0x14/0x20
>>>   __fput+0xa3/0x260
>>>   task_work_run+0x59/0xb0
>>>   exit_to_user_mode_prepare+0x1b3/0x1c0
>>>   syscall_exit_to_user_mode+0x19/0x50
>>>   do_syscall_64+0x48/0x90
>>>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>>>
>>> The root case is: In commit 84472b436e76 ("bpf, sockmap: Fix more
>>> uncharged while msg has more_data") , I used msg->sg.size replace
>>> tosend rudely, which break the
>>>    if (msg->apply_bytes && msg->apply_bytes < send)
>>> scene.

Sorry, I made a mistake here: `send` should be `tosend`. I will also change this in v3.

>>>
>>> Fixes: 84472b436e76 ("bpf, sockmap: Fix more uncharged while msg has more_data")
>>> Reported-by: Jakub Sitnicki <jakub@cloudflare.com>
>>> Signed-off-by: Wang Yufen <wangyufen@huawei.com>
>>> Acked-by: John Fastabend <john.fastabend@gmail.com>
>>> ---
>>> v1 -> v2: typo fixup
>>>  net/ipv4/tcp_bpf.c | 8 +++++---
>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>>> index a1626af..774d481 100644
>>> --- a/net/ipv4/tcp_bpf.c
>>> +++ b/net/ipv4/tcp_bpf.c
>>> @@ -278,7 +278,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>>>  {
>>>  	bool cork = false, enospc = sk_msg_full(msg);
>>>  	struct sock *sk_redir;
>>> -	u32 tosend, delta = 0;
>>> +	u32 tosend, orgsize, sent, delta = 0;
>>>  	u32 eval = __SK_NONE;
>>>  	int ret;
>>>
>>> @@ -333,10 +333,12 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>>>  			cork = true;
>>>  			psock->cork = NULL;
>>>  		}
>>> -		sk_msg_return(sk, msg, msg->sg.size);
>>> +		sk_msg_return(sk, msg, tosend);
>>>  		release_sock(sk);
>>>
>>> +		orgsize = msg->sg.size;
>>>  		ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
>>> +		sent = orgsize - msg->sg.size;
>> If I'm reading the code right, it's the same as:
>>
>> 		sent = tosend - msg->sg.size;
>>
>> If so, no need for orgsize.
> Sorry, that doesn't make any sense. I misread the code.
>
> The fix is correct.
>
> If I can have a small ask to rename `orgsize` to something more common.
>
> We have `orig_size` or `origsize` in use today, but no `orgsize`:

OK, I will change it in v3, thanks.

>
> $ git grep -c '\<orig_size\>' -- net
> net/core/sysctl_net_core.c:3
> net/psample/psample.c:1
> net/tls/tls_device.c:5
> net/tls/tls_sw.c:7
> $ git grep -c '\<origsize\>' -- net
> net/bridge/netfilter/ebtables.c:5
> net/ipv4/netfilter/arp_tables.c:10
> net/ipv4/netfilter/ip_tables.c:10
> net/ipv6/netfilter/ip6_tables.c:10
>
> It reads a bit better, IMHO.
>
> Thanks for fixing it so quickly.
>
> Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index a1626af..774d481 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -278,7 +278,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 {
 	bool cork = false, enospc = sk_msg_full(msg);
 	struct sock *sk_redir;
-	u32 tosend, delta = 0;
+	u32 tosend, orgsize, sent, delta = 0;
 	u32 eval = __SK_NONE;
 	int ret;
 
@@ -333,10 +333,12 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 			cork = true;
 			psock->cork = NULL;
 		}
-		sk_msg_return(sk, msg, msg->sg.size);
+		sk_msg_return(sk, msg, tosend);
 		release_sock(sk);
 
+		orgsize = msg->sg.size;
 		ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
+		sent = orgsize - msg->sg.size;
 
 		if (eval == __SK_REDIRECT)
 			sock_put(sk_redir);
@@ -375,7 +377,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 		    msg->sg.data[msg->sg.start].page_link &&
 		    msg->sg.data[msg->sg.start].length) {
 			if (eval == __SK_REDIRECT)
-				sk_mem_charge(sk, msg->sg.size);
+				sk_mem_charge(sk, tosend - sent);
 			goto more_data;
 		}
 	}
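The accounting change in this diff can be illustrated with a small user-space model. This is a minimal sketch, not kernel code: it assumes the simplified convention that charging subtracts from `sk_forward_alloc` and uncharging adds to it, and it ignores memory reclaim and the per-element walk done by `sk_msg_return()`. `struct sock_model` and all `model_*` names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for the one socket field of interest. */
struct sock_model {
	int64_t forward_alloc;	/* models sk->sk_forward_alloc */
};

/* Simplified sk_mem_charge(): charging consumes forward-allocated bytes. */
static void model_charge(struct sock_model *sk, uint32_t bytes)
{
	sk->forward_alloc -= bytes;
}

/* Simplified sk_mem_uncharge(): uncharging gives the bytes back. */
static void model_uncharge(struct sock_model *sk, uint32_t bytes)
{
	sk->forward_alloc += bytes;
}

/*
 * Patched sequence: return 'tosend' before the redirect, then charge back
 * the part of 'tosend' that was not sent. Returns the net bytes released.
 */
static int64_t model_patched_release(struct sock_model *sk, uint32_t tosend,
				     uint32_t sent)
{
	int64_t before = sk->forward_alloc;

	model_uncharge(sk, tosend);		/* sk_msg_return(sk, msg, tosend) */
	model_charge(sk, tosend - sent);	/* sk_mem_charge(sk, tosend - sent) */
	return sk->forward_alloc - before;
}
```

On the path that reaches the recharge, the net release in this model equals `sent`, so only bytes that actually left the message stay uncharged. Before the patch, `sk_msg_return()` released the whole `msg->sg.size` up front, which is larger than `tosend` whenever `apply_bytes` limits the send.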