Message ID | 20220518210548.2296546-1-zenczykowski@gmail.com (mailing list archive) |
---|---|
State | Accepted, archived |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | xfrm: do not set IPv4 DF flag when encapsulating IPv6 frames <= 1280 bytes. | expand |
On Wed, May 18, 2022 at 02:05:48PM -0700, Maciej Żenczykowski wrote: > From: Maciej Żenczykowski <maze@google.com> > > One may want to have DF set on large packets to support discovering > path mtu and limiting the size of generated packets (hence not > setting the XFRM_STATE_NOPMTUDISC tunnel flag), while still > supporting networks that are incapable of carrying even minimal > sized IPv6 frames (post encapsulation). > > Having IPv4 Don't Frag bit set on encapsulated IPv6 frames that > are not larger than the minimum IPv6 mtu of 1280 isn't useful, > because the resulting ICMP Fragmentation Required error isn't > actionable (even assuming you receive it) because IPv6 will not > drop it's path mtu below 1280 anyway. While the IPv4 stack > could prefrag the packets post encap, this requires the ICMP > error to be successfully delivered and causes a loss of the > original IPv6 frame (thus requiring a retransmit and latency > hit). Luckily with IPv4 if we simply don't set the DF flag, > we'll just make further fragmenting the packets some other > router's problems. > > We'll still learn the correct IPv4 path mtu through encapsulation > of larger IPv6 frames. > > I'm still not convinced this patch is entirely sufficient to make > everything happy... but I don't see how it could possibly > make things worse. > > See also recent: > 4ff2980b6bd2 'xfrm: fix tunnel model fragmentation behavior' > and friends > > Bug: 203183943 To what does this bug number refer to? Bugzilla? Please make that clear if you want to have this number in the commit message. Thanks!
On Wed, May 18, 2022 at 02:05:48PM -0700, Maciej Żenczykowski wrote: > From: Maciej Żenczykowski <maze@google.com> > > One may want to have DF set on large packets to support discovering > path mtu and limiting the size of generated packets (hence not > setting the XFRM_STATE_NOPMTUDISC tunnel flag), while still > supporting networks that are incapable of carrying even minimal > sized IPv6 frames (post encapsulation). > > Having IPv4 Don't Frag bit set on encapsulated IPv6 frames that > are not larger than the minimum IPv6 mtu of 1280 isn't useful, > because the resulting ICMP Fragmentation Required error isn't > actionable (even assuming you receive it) because IPv6 will not > drop it's path mtu below 1280 anyway. While the IPv4 stack > could prefrag the packets post encap, this requires the ICMP > error to be successfully delivered and causes a loss of the > original IPv6 frame (thus requiring a retransmit and latency > hit). Luckily with IPv4 if we simply don't set the DF flag, > we'll just make further fragmenting the packets some other > router's problems. > > We'll still learn the correct IPv4 path mtu through encapsulation > of larger IPv6 frames. > > I'm still not convinced this patch is entirely sufficient to make > everything happy... but I don't see how it could possibly > make things worse. > > See also recent: > 4ff2980b6bd2 'xfrm: fix tunnel model fragmentation behavior' > and friends > > Bug: 203183943 > Cc: Lorenzo Colitti <lorenzo@google.com> > Cc: Eric Dumazet <edumazet@google.com> > Cc: Lina Wang <lina.wang@mediatek.com> > Cc: Steffen Klassert <steffen.klassert@secunet.com> > Signed-off-by: Maciej Zenczykowski <maze@google.com> Applied, thanks a lot!
On Wed, May 25, 2022 at 11:51 PM Steffen Klassert <steffen.klassert@secunet.com> wrote: > > On Wed, May 18, 2022 at 02:05:48PM -0700, Maciej Żenczykowski wrote: > > From: Maciej Żenczykowski <maze@google.com> > > > > One may want to have DF set on large packets to support discovering > > path mtu and limiting the size of generated packets (hence not > > setting the XFRM_STATE_NOPMTUDISC tunnel flag), while still > > supporting networks that are incapable of carrying even minimal > > sized IPv6 frames (post encapsulation). > > > > Having IPv4 Don't Frag bit set on encapsulated IPv6 frames that > > are not larger than the minimum IPv6 mtu of 1280 isn't useful, > > because the resulting ICMP Fragmentation Required error isn't > > actionable (even assuming you receive it) because IPv6 will not > > drop it's path mtu below 1280 anyway. While the IPv4 stack > > could prefrag the packets post encap, this requires the ICMP > > error to be successfully delivered and causes a loss of the > > original IPv6 frame (thus requiring a retransmit and latency > > hit). Luckily with IPv4 if we simply don't set the DF flag, > > we'll just make further fragmenting the packets some other > > router's problems. > > > > We'll still learn the correct IPv4 path mtu through encapsulation > > of larger IPv6 frames. > > > > I'm still not convinced this patch is entirely sufficient to make > > everything happy... but I don't see how it could possibly > > make things worse. > > > > See also recent: > > 4ff2980b6bd2 'xfrm: fix tunnel model fragmentation behavior' > > and friends > > > > Bug: 203183943 > > Cc: Lorenzo Colitti <lorenzo@google.com> > > Cc: Eric Dumazet <edumazet@google.com> > > Cc: Lina Wang <lina.wang@mediatek.com> > > Cc: Steffen Klassert <steffen.klassert@secunet.com> > > Signed-off-by: Maciej Zenczykowski <maze@google.com> > > Applied, thanks a lot! Thanks. Is this published somewhere, since I'd lack to backport it to Android Common Kernel 5.10+, but can't find a sha1 (yet?)
On Sat, May 28, 2022 at 02:25:59AM -0700, Maciej Żenczykowski wrote: > On Wed, May 25, 2022 at 11:51 PM Steffen Klassert > <steffen.klassert@secunet.com> wrote: > > > > On Wed, May 18, 2022 at 02:05:48PM -0700, Maciej Żenczykowski wrote: > > > From: Maciej Żenczykowski <maze@google.com> > > > > > > One may want to have DF set on large packets to support discovering > > > path mtu and limiting the size of generated packets (hence not > > > setting the XFRM_STATE_NOPMTUDISC tunnel flag), while still > > > supporting networks that are incapable of carrying even minimal > > > sized IPv6 frames (post encapsulation). > > > > > > Having IPv4 Don't Frag bit set on encapsulated IPv6 frames that > > > are not larger than the minimum IPv6 mtu of 1280 isn't useful, > > > because the resulting ICMP Fragmentation Required error isn't > > > actionable (even assuming you receive it) because IPv6 will not > > > drop it's path mtu below 1280 anyway. While the IPv4 stack > > > could prefrag the packets post encap, this requires the ICMP > > > error to be successfully delivered and causes a loss of the > > > original IPv6 frame (thus requiring a retransmit and latency > > > hit). Luckily with IPv4 if we simply don't set the DF flag, > > > we'll just make further fragmenting the packets some other > > > router's problems. > > > > > > We'll still learn the correct IPv4 path mtu through encapsulation > > > of larger IPv6 frames. > > > > > > I'm still not convinced this patch is entirely sufficient to make > > > everything happy... but I don't see how it could possibly > > > make things worse. > > > > > > See also recent: > > > 4ff2980b6bd2 'xfrm: fix tunnel model fragmentation behavior' > > > and friends > > > > > > Bug: 203183943 > > > Cc: Lorenzo Colitti <lorenzo@google.com> > > > Cc: Eric Dumazet <edumazet@google.com> > > > Cc: Lina Wang <lina.wang@mediatek.com> > > > Cc: Steffen Klassert <steffen.klassert@secunet.com> > > > Signed-off-by: Maciej Zenczykowski <maze@google.com> > > > > Applied, thanks a lot! > > Thanks. > > Is this published somewhere, since I'd lack to backport it to Android > Common Kernel 5.10+, but can't find a sha1 (yet?) Actually I applied it, but forgot to push it out. It is now in the ipsec tree: git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec.git
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c index d4935b3b9983..555ab35cd119 100644 --- a/net/xfrm/xfrm_output.c +++ b/net/xfrm/xfrm_output.c @@ -273,6 +273,7 @@ static int xfrm4_beet_encap_add(struct xfrm_state *x, struct sk_buff *skb) */ static int xfrm4_tunnel_encap_add(struct xfrm_state *x, struct sk_buff *skb) { + bool small_ipv6 = (skb->protocol == htons(ETH_P_IPV6)) && (skb->len <= IPV6_MIN_MTU); struct dst_entry *dst = skb_dst(skb); struct iphdr *top_iph; int flags; @@ -303,7 +304,7 @@ static int xfrm4_tunnel_encap_add(struct xfrm_state *x, struct sk_buff *skb) if (flags & XFRM_STATE_NOECN) IP_ECN_clear(top_iph); - top_iph->frag_off = (flags & XFRM_STATE_NOPMTUDISC) ? + top_iph->frag_off = (flags & XFRM_STATE_NOPMTUDISC) || small_ipv6 ? 0 : (XFRM_MODE_SKB_CB(skb)->frag_off & htons(IP_DF)); top_iph->ttl = ip4_dst_hoplimit(xfrm_dst_child(dst));