Message ID | 20220125024511.27480-1-dsahern@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Commit | ab14f1802cfb2d7ca120bbf48e3ba6712314ffc3 |
Delegated to: | Netdev Maintainers |
Series | [net-next] net: Adjust sk_gso_max_size once when set |
On Mon, Jan 24, 2022 at 6:45 PM David Ahern <dsahern@kernel.org> wrote:
>
> sk_gso_max_size is set based on the dst dev. Both users of it
> adjust the value by the same offset - (MAX_TCP_HEADER + 1). Rather
> than compute the same adjusted value on each call do the adjustment
> once when set.
>
> Signed-off-by: David Ahern <dsahern@kernel.org>
> Cc: Eric Dumazet <edumazet@google.com>

SGTM, thanks.

Reviewed-by: Eric Dumazet <edumazet@google.com>

On 1/25/22 9:46 AM, Eric Dumazet wrote:
> On Mon, Jan 24, 2022 at 6:45 PM David Ahern <dsahern@kernel.org> wrote:
>>
>> sk_gso_max_size is set based on the dst dev. Both users of it
>> adjust the value by the same offset - (MAX_TCP_HEADER + 1). Rather
>> than compute the same adjusted value on each call do the adjustment
>> once when set.
>>
>> Signed-off-by: David Ahern <dsahern@kernel.org>
>> Cc: Eric Dumazet <edumazet@google.com>
>
>
> SGTM, thanks.
>
> Reviewed-by: Eric Dumazet <edumazet@google.com>

The git history does not explain why MAX_TCP_HEADER is used to lower
sk_gso_max_size. Do you recall the history on it?

On Tue, Jan 25, 2022 at 9:16 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 1/25/22 9:46 AM, Eric Dumazet wrote:
> > On Mon, Jan 24, 2022 at 6:45 PM David Ahern <dsahern@kernel.org> wrote:
> >>
> >> sk_gso_max_size is set based on the dst dev. Both users of it
> >> adjust the value by the same offset - (MAX_TCP_HEADER + 1). Rather
> >> than compute the same adjusted value on each call do the adjustment
> >> once when set.
> >>
> >> Signed-off-by: David Ahern <dsahern@kernel.org>
> >> Cc: Eric Dumazet <edumazet@google.com>
> >
> >
> > SGTM, thanks.
> >
> > Reviewed-by: Eric Dumazet <edumazet@google.com>
>
> The git history does not explain why MAX_TCP_HEADER is used to lower
> sk_gso_max_size. Do you recall the history on it?

Simply that max IP datagram size is 64K

And TCP is sizing its payload size there (eg in tcp_tso_autosize()),
when skb only contains payload.

Headers are added later in various xmit layers.

MAX_TCP_HEADER is chosen to avoid re-allocs of skb->head in typical workload.

Hello:

This patch was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 24 Jan 2022 19:45:11 -0700 you wrote:
> sk_gso_max_size is set based on the dst dev. Both users of it
> adjust the value by the same offset - (MAX_TCP_HEADER + 1). Rather
> than compute the same adjusted value on each call do the adjustment
> once when set.
>
> Signed-off-by: David Ahern <dsahern@kernel.org>
> Cc: Eric Dumazet <edumazet@google.com>
>
> [...]

Here is the summary with links:
  - [net-next] net: Adjust sk_gso_max_size once when set
    https://git.kernel.org/netdev/net-next/c/ab14f1802cfb

You are awesome, thank you!

On 1/25/22 10:20 AM, Eric Dumazet wrote:
>> The git history does not explain why MAX_TCP_HEADER is used to lower
>> sk_gso_max_size. Do you recall the history on it?
>
> Simply that max IP datagram size is 64K
>
> And TCP is sizing its payload size there (eg in tcp_tso_autosize()),
> when skb only contains payload.
>
> Headers are added later in various xmit layers.
>
> MAX_TCP_HEADER is chosen to avoid re-allocs of skb->head in typical workload.

From what I can tell skb->head is allocated based on MAX_TCP_HEADER, and
payload is added as frags for TSO.

I was just curious because I noticed a few MTUs (I only looked multiples
of 100 from 1500 to 9000) can get an extra segment in a TSO packet and
stay under the 64kB limit if that offset had better information of the
actual header size needed (if any beyond network + tcp).

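The MTU observation above can be checked with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `MAX_TCP_HEADER` depends on the kernel configuration, so the value 320 here is an assumed example, the device `gso_max_size` is assumed to be 65536, and the "actual" header size is assumed to be 40 bytes (IPv4 plus TCP without options). All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; none of them come from the thread. */
#define GSO_MAX_SIZE        65536u /* assumed device gso_max_size */
#define ASSUMED_MAX_TCP_HDR 320u   /* MAX_TCP_HEADER is config dependent */
#define IP_MAX_TOTAL_LEN    65535u /* limit of the 16-bit IP length field */
#define ASSUMED_REAL_HDRS   40u    /* IPv4 (20) + TCP (20), no options */

/* Whole segments that fit when (MAX_TCP_HEADER + 1) is reserved. */
static uint32_t segs_conservative(uint32_t mss)
{
	assert(mss > 0);
	return (GSO_MAX_SIZE - (ASSUMED_MAX_TCP_HDR + 1)) / mss;
}

/* Whole segments that fit when only the assumed real headers are reserved. */
static uint32_t segs_exact(uint32_t mss)
{
	assert(mss > 0);
	return (IP_MAX_TOTAL_LEN - ASSUMED_REAL_HDRS) / mss;
}

/* Count MTUs in [lo, hi], stepping by step, that would gain a segment. */
static unsigned int count_mtus_gaining_segment(uint32_t lo, uint32_t hi,
					       uint32_t step)
{
	unsigned int n = 0;

	assert(lo > ASSUMED_REAL_HDRS && step > 0);
	for (uint32_t mtu = lo; mtu <= hi; mtu += step) {
		uint32_t mss = mtu - ASSUMED_REAL_HDRS;

		if (segs_exact(mss) > segs_conservative(mss))
			n++;
	}
	return n;
}
```

Under these assumptions, an MTU of 4700 (MSS 4660) is one case where the tighter offset would allow one more segment, while an MTU of 1500 (MSS 1460) gains nothing.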
On Tue, Jan 25, 2022 at 3:49 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 1/25/22 10:20 AM, Eric Dumazet wrote:
> >> The git history does not explain why MAX_TCP_HEADER is used to lower
> >> sk_gso_max_size. Do you recall the history on it?
> >
> > Simply that max IP datagram size is 64K
> >
> > And TCP is sizing its payload size there (eg in tcp_tso_autosize()),
> > when skb only contains payload.
> >
> > Headers are added later in various xmit layers.
> >
> > MAX_TCP_HEADER is chosen to avoid re-allocs of skb->head in typical workload.
>
> From what I can tell skb->head is allocated based on MAX_TCP_HEADER, and
> payload is added as frags for TSO.

Sure, but at the end, ip packet length field is 16bit wide, so

sizeof(network+tcp headers) + tcp_payload <= 65535

-> tcp_payload =< 65535 - sizeof(headers)
-> tcp_payload_max_per_skb = 65536 - ( MAX_TCP_HEADER + 1)

(This would not include Ethernet header)

>
> I was just curious because I noticed a few MTUs (I only looked multiples
> of 100 from 1500 to 9000) can get an extra segment in a TSO packet and
> stay under the 64kB limit if that offset had better information of the
> actual header size needed (if any beyond network + tcp).

TCP does not care about the extra sub-mss bytes that _could_ be added
to a TSO packet

So if I have 4K MTU (4096 bytes of payload), max TSO size would be 15*4k = 60K

Application writing 60*1024+100 bytes in one sendmsg() would send one
TSO packet of 15 segments, plus one extra tiny skb with 100 bytes of payload.

I have played in the past trying to cover this case, but adding tests
in the fast path gave no noticeable difference for common workloads.

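The 4K example in the reply above can be reproduced with a simplified model of the size goal. This is a sketch, not the kernel logic: it ignores pacing, the half-window bound and gso_max_segs, it assumes a device `gso_max_size` of 65536 and an example `MAX_TCP_HEADER` of 320 (the real value is config dependent), and `struct tso_split` and `split_write()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define GSO_MAX_SIZE        65536u /* assumed device gso_max_size */
#define ASSUMED_MAX_TCP_HDR 320u   /* MAX_TCP_HEADER is config dependent */

struct tso_split {
	uint32_t full_segs;   /* MSS-sized segments in the first TSO skb */
	uint32_t first_bytes; /* payload bytes carried by the first TSO skb */
	uint32_t left_bytes;  /* payload left over for following skbs */
};

/* Model how one write is cut at a size goal of whole MSS-sized segments. */
static struct tso_split split_write(uint32_t write_len, uint32_t mss)
{
	struct tso_split s;
	uint32_t budget, goal;

	assert(mss > 0);
	/* tcp_payload_max_per_skb = 65536 - (MAX_TCP_HEADER + 1) */
	budget = GSO_MAX_SIZE - (ASSUMED_MAX_TCP_HDR + 1);
	/* Round the budget down to a whole number of segments. */
	goal = (budget / mss) * mss;

	s.first_bytes = write_len < goal ? write_len : goal;
	s.full_segs = s.first_bytes / mss;
	s.left_bytes = write_len - s.first_bytes;
	return s;
}
```

With these assumed values, `split_write(60 * 1024 + 100, 4096)` yields a first skb of 15 segments (61440 bytes) and 100 bytes left over, which matches the case described in the reply.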
diff --git a/net/core/sock.c b/net/core/sock.c
index e21485ab285d..114a6e220ba9 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2261,6 +2261,7 @@ void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
 			sk->sk_route_caps |= NETIF_F_SG | NETIF_F_HW_CSUM;
 			/* pairs with the WRITE_ONCE() in netif_set_gso_max_size() */
 			sk->sk_gso_max_size = READ_ONCE(dst->dev->gso_max_size);
+			sk->sk_gso_max_size -= (MAX_TCP_HEADER + 1);
 			/* pairs with the WRITE_ONCE() in netif_set_gso_max_segs() */
 			max_segs = max_t(u32, READ_ONCE(dst->dev->gso_max_segs), 1);
 		}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3b75836db19b..1afa3f2f9a6d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -893,8 +893,7 @@ static unsigned int tcp_xmit_size_goal(struct sock *sk, u32 mss_now,
 		return mss_now;
 
 	/* Note : tcp_tso_autosize() will eventually split this later */
-	new_size_goal = sk->sk_gso_max_size - 1 - MAX_TCP_HEADER;
-	new_size_goal = tcp_bound_to_half_wnd(tp, new_size_goal);
+	new_size_goal = tcp_bound_to_half_wnd(tp, sk->sk_gso_max_size);
 
 	/* We try hard to avoid divides here */
 	size_goal = tp->gso_segs * mss_now;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 5079832af5c1..11c06b9db801 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1960,7 +1960,7 @@ static u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
 
 	bytes = min_t(unsigned long,
 		      sk->sk_pacing_rate >> READ_ONCE(sk->sk_pacing_shift),
-		      sk->sk_gso_max_size - 1 - MAX_TCP_HEADER);
+		      sk->sk_gso_max_size);
 
 	/* Goal is to send at least one packet per ms,
 	 * not one big TSO packet every 100 ms.

sk_gso_max_size is set based on the dst dev. Both users of it
adjust the value by the same offset - (MAX_TCP_HEADER + 1). Rather
than compute the same adjusted value on each call do the adjustment
once when set.

Signed-off-by: David Ahern <dsahern@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
---
 net/core/sock.c       | 1 +
 net/ipv4/tcp.c        | 3 +--
 net/ipv4/tcp_output.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)
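
The change described in the commit message can be modelled in user space to show that both users see the same limit before and after. This is a sketch only: `struct sock_model` and the four helpers are hypothetical stand-ins rather than kernel APIs, and 320 is an assumed example value for the config-dependent `MAX_TCP_HEADER`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed example value; the real MAX_TCP_HEADER is config dependent. */
#define ASSUMED_MAX_TCP_HDR 320u

/* Hypothetical stand-in for the one struct sock field involved. */
struct sock_model {
	uint32_t gso_max_size;
};

/* Before: the raw device value is stored at setup time. */
static void setup_before(struct sock_model *sk, uint32_t dev_gso_max_size)
{
	sk->gso_max_size = dev_gso_max_size;
}

/* Before: every user subtracts the same offset on each call. */
static uint32_t user_limit_before(const struct sock_model *sk)
{
	return sk->gso_max_size - 1 - ASSUMED_MAX_TCP_HDR;
}

/* After: the offset is applied once, when the field is set. */
static void setup_after(struct sock_model *sk, uint32_t dev_gso_max_size)
{
	assert(dev_gso_max_size > ASSUMED_MAX_TCP_HDR + 1);
	sk->gso_max_size = dev_gso_max_size - (ASSUMED_MAX_TCP_HDR + 1);
}

/* After: users read the already adjusted field directly. */
static uint32_t user_limit_after(const struct sock_model *sk)
{
	return sk->gso_max_size;
}
```

In this model the stored field changes meaning (it now holds the adjusted limit), so any additional reader of the field would need to expect the adjusted value.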