Message ID | 20220609063412.2205738-1-eric.dumazet@gmail.com (mailing list archive) |
---|---|
Series | net: reduce tcp_memory_allocated inflation |
On Thu, Jun 9, 2022 at 2:34 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> Hosts with a lot of sockets tend to hit so called TCP memory pressure,
> leading to very bad TCP performance and/or OOM.
>
> The problem is that some TCP sockets can hold up to 2MB of 'forward
> allocations' in their per-socket cache (sk->sk_forward_alloc),
> and there is no mechanism to make them relinquish their share
> under mem pressure.
> Only under some potentially rare events their share is reclaimed,
> one socket at a time.
>
> In this series, I implemented a per-cpu cache instead of a per-socket one.
>
> Each CPU has a +1/-1 MB (256 pages on x86) forward alloc cache, in order
> to not dirty tcp_memory_allocated shared cache line too often.
>
> We keep sk->sk_forward_alloc values as small as possible, to meet
> memcg page granularity constraint.
>
> Note that memcg already has a per-cpu cache, although MEMCG_CHARGE_BATCH
> is defined to 32 pages, which seems a bit small.
>
> Note that while this cover letter mentions TCP, this work is generic
> and supports TCP, UDP, DECNET, SCTP.
>
> Eric Dumazet (7):
>   Revert "net: set SK_MEM_QUANTUM to 4096"
>   net: remove SK_MEM_QUANTUM and SK_MEM_QUANTUM_SHIFT
>   net: add per_cpu_fw_alloc field to struct proto
>   net: implement per-cpu reserves for memory_allocated
>   net: fix sk_wmem_schedule() and sk_rmem_schedule() errors
>   net: keep sk->sk_forward_alloc as small as possible
>   net: unexport __sk_mem_{raise|reduce}_allocated

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

Very nice work! Thank you for scaling up TCP again!

>  include/net/sock.h           | 100 +++++++++++++++--------------------
>  include/net/tcp.h            |   2 +
>  include/net/udp.h            |   1 +
>  net/core/datagram.c          |   3 --
>  net/core/sock.c              |  22 ++++----
>  net/decnet/af_decnet.c       |   4 ++
>  net/ipv4/tcp.c               |  13 ++---
>  net/ipv4/tcp_input.c         |   6 +--
>  net/ipv4/tcp_ipv4.c          |   3 ++
>  net/ipv4/tcp_output.c        |   2 +-
>  net/ipv4/tcp_timer.c         |  19 ++-----
>  net/ipv4/udp.c               |  14 +++--
>  net/ipv4/udplite.c           |   3 ++
>  net/ipv6/tcp_ipv6.c          |   3 ++
>  net/ipv6/udp.c               |   3 ++
>  net/ipv6/udplite.c           |   3 ++
>  net/iucv/af_iucv.c           |   2 -
>  net/mptcp/protocol.c         |  13 +++--
>  net/sctp/protocol.c          |   4 +-
>  net/sctp/sm_statefuns.c      |   2 -
>  net/sctp/socket.c            |  12 +++--
>  net/sctp/stream_interleave.c |   2 -
>  net/sctp/ulpqueue.c          |   4 --
>  23 files changed, 114 insertions(+), 126 deletions(-)
>
> --
> 2.36.1.255.ge46751e96f-goog
>
Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 8 Jun 2022 23:34:05 -0700 you wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Hosts with a lot of sockets tend to hit so called TCP memory pressure,
> leading to very bad TCP performance and/or OOM.
>
> The problem is that some TCP sockets can hold up to 2MB of 'forward
> allocations' in their per-socket cache (sk->sk_forward_alloc),
> and there is no mechanism to make them relinquish their share
> under mem pressure.
> Only under some potentially rare events their share is reclaimed,
> one socket at a time.
>
> [...]

Here is the summary with links:
  - [net-next,1/7] Revert "net: set SK_MEM_QUANTUM to 4096"
    https://git.kernel.org/netdev/net-next/c/e70f3c701276
  - [net-next,2/7] net: remove SK_MEM_QUANTUM and SK_MEM_QUANTUM_SHIFT
    https://git.kernel.org/netdev/net-next/c/100fdd1faf50
  - [net-next,3/7] net: add per_cpu_fw_alloc field to struct proto
    https://git.kernel.org/netdev/net-next/c/0defbb0af775
  - [net-next,4/7] net: implement per-cpu reserves for memory_allocated
    https://git.kernel.org/netdev/net-next/c/3cd3399dd7a8
  - [net-next,5/7] net: fix sk_wmem_schedule() and sk_rmem_schedule() errors
    https://git.kernel.org/netdev/net-next/c/7c80b038d23e
  - [net-next,6/7] net: keep sk->sk_forward_alloc as small as possible
    https://git.kernel.org/netdev/net-next/c/4890b686f408
  - [net-next,7/7] net: unexport __sk_mem_{raise|reduce}_allocated
    https://git.kernel.org/netdev/net-next/c/0f2c2693988a

You are awesome, thank you!
From: Eric Dumazet <edumazet@google.com>

Hosts with a lot of sockets tend to hit so called TCP memory pressure,
leading to very bad TCP performance and/or OOM.

The problem is that some TCP sockets can hold up to 2MB of 'forward
allocations' in their per-socket cache (sk->sk_forward_alloc),
and there is no mechanism to make them relinquish their share
under mem pressure.
Only under some potentially rare events their share is reclaimed,
one socket at a time.

In this series, I implemented a per-cpu cache instead of a per-socket one.

Each CPU has a +1/-1 MB (256 pages on x86) forward alloc cache, in order
to not dirty tcp_memory_allocated shared cache line too often.

We keep sk->sk_forward_alloc values as small as possible, to meet
memcg page granularity constraint.

Note that memcg already has a per-cpu cache, although MEMCG_CHARGE_BATCH
is defined to 32 pages, which seems a bit small.

Note that while this cover letter mentions TCP, this work is generic
and supports TCP, UDP, DECNET, SCTP.

Eric Dumazet (7):
  Revert "net: set SK_MEM_QUANTUM to 4096"
  net: remove SK_MEM_QUANTUM and SK_MEM_QUANTUM_SHIFT
  net: add per_cpu_fw_alloc field to struct proto
  net: implement per-cpu reserves for memory_allocated
  net: fix sk_wmem_schedule() and sk_rmem_schedule() errors
  net: keep sk->sk_forward_alloc as small as possible
  net: unexport __sk_mem_{raise|reduce}_allocated

 include/net/sock.h           | 100 +++++++++++++++--------------------
 include/net/tcp.h            |   2 +
 include/net/udp.h            |   1 +
 net/core/datagram.c          |   3 --
 net/core/sock.c              |  22 ++++----
 net/decnet/af_decnet.c       |   4 ++
 net/ipv4/tcp.c               |  13 ++---
 net/ipv4/tcp_input.c         |   6 +--
 net/ipv4/tcp_ipv4.c          |   3 ++
 net/ipv4/tcp_output.c        |   2 +-
 net/ipv4/tcp_timer.c         |  19 ++-----
 net/ipv4/udp.c               |  14 +++--
 net/ipv4/udplite.c           |   3 ++
 net/ipv6/tcp_ipv6.c          |   3 ++
 net/ipv6/udp.c               |   3 ++
 net/ipv6/udplite.c           |   3 ++
 net/iucv/af_iucv.c           |   2 -
 net/mptcp/protocol.c         |  13 +++--
 net/sctp/protocol.c          |   4 +-
 net/sctp/sm_statefuns.c      |   2 -
 net/sctp/socket.c            |  12 +++--
 net/sctp/stream_interleave.c |   2 -
 net/sctp/ulpqueue.c          |   4 --
 23 files changed, 114 insertions(+), 126 deletions(-)