diff mbox series

[net] net: stream: purge sk_error_queue in sk_stream_kill_queues()

Message ID 20221216162917.119406-1-edumazet@google.com (mailing list archive)
State Accepted
Commit e0c8bccd40fc1c19e1d246c39bcf79e357e1ada3
Delegated to: Netdev Maintainers
Headers show
Series [net] net: stream: purge sk_error_queue in sk_stream_kill_queues() | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers warning 4 maintainers not CCed: gregkh@linuxfoundation.org keescook@chromium.org Jason@zx2c4.com djwong@kernel.org
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 12 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Eric Dumazet Dec. 16, 2022, 4:29 p.m. UTC
Changheon Lee reported TCP socket leaks, with a nice repro.

It seems we leak TCP sockets with the following sequence:

1) SOF_TIMESTAMPING_TX_ACK is enabled on the socket.

   Each ACK will cook an skb put in error queue, from __skb_tstamp_tx().
   __skb_tstamp_tx() is using skb_clone(), unless
   SOF_TIMESTAMPING_OPT_TSONLY was also requested.

2) If the application is also using MSG_ZEROCOPY, then we put in the
   error queue cloned skbs that had a struct ubuf_info attached to them.

   Whenever an struct ubuf_info is allocated, sock_zerocopy_alloc()
   does a sock_hold().

   As long as the cloned skbs are still in sk_error_queue,
   socket refcount is kept elevated.

3) Application closes the socket, while error queue is not empty.

Since tcp_close() no longer purges the socket error queue,
we might end up with a TCP socket with at least one skb in
error queue keeping the socket alive forever.

This bug can be (ab)used to consume all kernel memory
and freeze the host.

We need to purge the error queue, with proper synchronization
against concurrent writers.

Fixes: 24bcbe1cc69f ("net: stream: don't purge sk_error_queue in sk_stream_kill_queues()")
Reported-by: Changheon Lee <darklight2357@icloud.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/stream.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

patchwork-bot+netdevbpf@kernel.org Dec. 19, 2022, 12:40 p.m. UTC | #1
Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Fri, 16 Dec 2022 16:29:17 +0000 you wrote:
> Changheon Lee reported TCP socket leaks, with a nice repro.
> 
> It seems we leak TCP sockets with the following sequence:
> 
> 1) SOF_TIMESTAMPING_TX_ACK is enabled on the socket.
> 
>    Each ACK will cook an skb put in error queue, from __skb_tstamp_tx().
>    __skb_tstamp_tx() is using skb_clone(), unless
>    SOF_TIMESTAMPING_OPT_TSONLY was also requested.
> 
> [...]

Here is the summary with links:
  - [net] net: stream: purge sk_error_queue in sk_stream_kill_queues()
    https://git.kernel.org/netdev/net/c/e0c8bccd40fc

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/core/stream.c b/net/core/stream.c
index 5b1fe2b82eac753bc8e18c02db04c5906b3a2d97..cd06750dd3297cd0e0f073057a4d85d4078f87c3 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -196,6 +196,12 @@  void sk_stream_kill_queues(struct sock *sk)
 	/* First the read buffer. */
 	__skb_queue_purge(&sk->sk_receive_queue);
 
+	/* Next, the error queue.
+	 * We need to use queue lock, because other threads might
+	 * add packets to the queue without socket lock being held.
+	 */
+	skb_queue_purge(&sk->sk_error_queue);
+
 	/* Next, the write queue. */
 	WARN_ON_ONCE(!skb_queue_empty(&sk->sk_write_queue));