diff mbox series

[bpf-next,2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full

Message ID 20220225014929.942444-3-wangyufen@huawei.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series bpf, sockmap: Fix memleaks and issues of mem charge/uncharge | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers success CCed 17 of 17 maintainers
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 7 this patch: 7
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next success VM_Test

Commit Message

wangyufen Feb. 25, 2022, 1:49 a.m. UTC
If tcp_bpf_sendmsg() is running while sk msg is full, sk_msg_alloc()
returns -ENOSPC, tcp_bpf_sendmsg() goto wait for memory. If partial memory
has been alloced by sk_msg_alloc(), that is, msg_tx->sg.size is greater
than osize after sk_msg_alloc(), memleak occurs. To fix we use
sk_msg_trim() to release the allocated memory, then goto wait for memory.

This issue can cause the following info:
WARNING: CPU: 3 PID: 7950 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
Call Trace:
 <TASK>
 inet_csk_destroy_sock+0x55/0x110
 __tcp_close+0x279/0x470
 tcp_close+0x1f/0x60
 inet_release+0x3f/0x80
 __sock_release+0x3d/0xb0
 sock_close+0x11/0x20
 __fput+0x92/0x250
 task_work_run+0x6a/0xa0
 do_exit+0x33b/0xb60
 do_group_exit+0x2f/0xa0
 get_signal+0xb6/0x950
 arch_do_signal_or_restart+0xac/0x2a0
 exit_to_user_mode_prepare+0xa9/0x200
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x46/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
 </TASK>

WARNING: CPU: 3 PID: 2094 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 kthread+0xe6/0x110
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
---
 net/ipv4/tcp_bpf.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

John Fastabend March 1, 2022, 4:02 a.m. UTC | #1
Wang Yufen wrote:
> If tcp_bpf_sendmsg() is running while sk msg is full, sk_msg_alloc()
> returns -ENOSPC, tcp_bpf_sendmsg() goto wait for memory. If partial memory
> has been alloced by sk_msg_alloc(), that is, msg_tx->sg.size is greater
> than osize after sk_msg_alloc(), memleak occurs. To fix we use
> sk_msg_trim() to release the allocated memory, then goto wait for memory.

Small nit, "sk_msg_alloc() returns -ENOSPC" should be something like, "when
sk_msg_alloc() returns -ENOMEM error,..." That error path is from ENOMEM not
the ENOSPC.

But nice find thanks! I think we might have seen this in a couple cases on
our side as well.

> 
> This issue can cause the following info:
> WARNING: CPU: 3 PID: 7950 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
> Call Trace:
>  <TASK>
>  inet_csk_destroy_sock+0x55/0x110
>  __tcp_close+0x279/0x470
>  tcp_close+0x1f/0x60
>  inet_release+0x3f/0x80
>  __sock_release+0x3d/0xb0
>  sock_close+0x11/0x20
>  __fput+0x92/0x250
>  task_work_run+0x6a/0xa0
>  do_exit+0x33b/0xb60
>  do_group_exit+0x2f/0xa0
>  get_signal+0xb6/0x950
>  arch_do_signal_or_restart+0xac/0x2a0
>  exit_to_user_mode_prepare+0xa9/0x200
>  syscall_exit_to_user_mode+0x12/0x30
>  do_syscall_64+0x46/0x80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>  </TASK>
> 
> WARNING: CPU: 3 PID: 2094 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
> Call Trace:
>  <TASK>
>  __sk_destruct+0x24/0x1f0
>  sk_psock_destroy+0x19b/0x1c0
>  process_one_work+0x1b3/0x3c0
>  kthread+0xe6/0x110
>  ret_from_fork+0x22/0x30
>  </TASK>
> 
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: Wang Yufen <wangyufen@huawei.com>
> ---
>  net/ipv4/tcp_bpf.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index 9b9b02052fd3..ac9f491cc139 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -421,8 +421,10 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>  		osize = msg_tx->sg.size;
>  		err = sk_msg_alloc(sk, msg_tx, msg_tx->sg.size + copy, msg_tx->sg.end - 1);
>  		if (err) {
> -			if (err != -ENOSPC)
> +			if (err != -ENOSPC) {
> +				sk_msg_trim(sk, msg_tx, osize);
>  				goto wait_for_memory;
> +			}
>  			enospc = true;
>  			copy = msg_tx->sg.size - osize;
>  		}
> -- 
> 2.25.1
> 

Acked-by: John Fastabend <john.fastabend@gmail.com>
wangyufen March 1, 2022, 7:05 a.m. UTC | #2
在 2022/3/1 12:02, John Fastabend 写道:
> Wang Yufen wrote:
>> If tcp_bpf_sendmsg() is running while sk msg is full, sk_msg_alloc()
>> returns -ENOSPC, tcp_bpf_sendmsg() goto wait for memory. If partial memory
>> has been alloced by sk_msg_alloc(), that is, msg_tx->sg.size is greater
>> than osize after sk_msg_alloc(), memleak occurs. To fix we use
>> sk_msg_trim() to release the allocated memory, then goto wait for memory.
> Small nit, "sk_msg_alloc() returns -ENOSPC" should be something like, "when
> sk_msg_alloc() returns -ENOMEM error,..." That error path is from ENOMEM not
> the ENOSPC.
Thanks, I will fix in v2.
>
> But nice find thanks! I think we might have seen this in a couple cases on
> our side as well.
>
>> This issue can cause the following info:
>> WARNING: CPU: 3 PID: 7950 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
>> Call Trace:
>>   <TASK>
>>   inet_csk_destroy_sock+0x55/0x110
>>   __tcp_close+0x279/0x470
>>   tcp_close+0x1f/0x60
>>   inet_release+0x3f/0x80
>>   __sock_release+0x3d/0xb0
>>   sock_close+0x11/0x20
>>   __fput+0x92/0x250
>>   task_work_run+0x6a/0xa0
>>   do_exit+0x33b/0xb60
>>   do_group_exit+0x2f/0xa0
>>   get_signal+0xb6/0x950
>>   arch_do_signal_or_restart+0xac/0x2a0
>>   exit_to_user_mode_prepare+0xa9/0x200
>>   syscall_exit_to_user_mode+0x12/0x30
>>   do_syscall_64+0x46/0x80
>>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>>   </TASK>
>>
>> WARNING: CPU: 3 PID: 2094 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
>> Call Trace:
>>   <TASK>
>>   __sk_destruct+0x24/0x1f0
>>   sk_psock_destroy+0x19b/0x1c0
>>   process_one_work+0x1b3/0x3c0
>>   kthread+0xe6/0x110
>>   ret_from_fork+0x22/0x30
>>   </TASK>
>>
>> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
>> Signed-off-by: Wang Yufen <wangyufen@huawei.com>
>> ---
>>   net/ipv4/tcp_bpf.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>> index 9b9b02052fd3..ac9f491cc139 100644
>> --- a/net/ipv4/tcp_bpf.c
>> +++ b/net/ipv4/tcp_bpf.c
>> @@ -421,8 +421,10 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>>   		osize = msg_tx->sg.size;
>>   		err = sk_msg_alloc(sk, msg_tx, msg_tx->sg.size + copy, msg_tx->sg.end - 1);
>>   		if (err) {
>> -			if (err != -ENOSPC)
>> +			if (err != -ENOSPC) {
>> +				sk_msg_trim(sk, msg_tx, osize);
>>   				goto wait_for_memory;
>> +			}
>>   			enospc = true;
>>   			copy = msg_tx->sg.size - osize;
>>   		}
>> -- 
>> 2.25.1
>>
> Acked-by: John Fastabend <john.fastabend@gmail.com>
> .
diff mbox series

Patch

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 9b9b02052fd3..ac9f491cc139 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -421,8 +421,10 @@  static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 		osize = msg_tx->sg.size;
 		err = sk_msg_alloc(sk, msg_tx, msg_tx->sg.size + copy, msg_tx->sg.end - 1);
 		if (err) {
-			if (err != -ENOSPC)
+			if (err != -ENOSPC) {
+				sk_msg_trim(sk, msg_tx, osize);
 				goto wait_for_memory;
+			}
 			enospc = true;
 			copy = msg_tx->sg.size - osize;
 		}