diff mbox series

[net] tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies

Message ID 4f5fc2f33bfdf8409549fafd4f952b008bf04d63.1604681709.git.sbrivio@redhat.com (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers
Headers show
Series [net] tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies | expand

Commit Message

Stefano Brivio Nov. 6, 2020, 4:59 p.m. UTC
Jianlin reports that a bridged IPv6 VXLAN endpoint, carrying IPv6
packets over a link with a PMTU estimation of exactly 1350 bytes,
won't trigger ICMPv6 Packet Too Big replies when the encapsulated
datagrams exceed said PMTU value. VXLAN over IPv6 adds 70 bytes of
overhead, so an ICMPv6 reply indicating 1280 bytes as inner MTU
would be legitimate and expected.

This comes from an off-by-one error I introduced in checks added
as part of commit 4cb47a8644cc ("tunnels: PMTU discovery support
for directly bridged IP packets"), whose purpose was to prevent
sending ICMPv6 Packet Too Big messages with an MTU lower than the
smallest permissible IPv6 link MTU, i.e. 1280 bytes.

In iptunnel_pmtud_check_icmpv6(), avoid triggering a reply only if
the advertised MTU would be less than, and not equal to, 1280 bytes.

Also fix the analogous comparison for IPv4, that is, skip the ICMP
reply only if the resulting MTU is strictly less than 576 bytes.

This becomes apparent while running the net/pmtu.sh bridged VXLAN
or GENEVE selftests with adjusted lower-link MTU values. Using
e.g. GENEVE, setting ll_mtu to the values reported below, in the
test_pmtu_ipvX_over_bridged_vxlanY_or_geneveY_exception() test
function, we can see failures on the following tests:

             test                | ll_mtu
  -------------------------------|--------
  pmtu_ipv4_br_geneve4_exception |   626
  pmtu_ipv6_br_geneve4_exception |  1330
  pmtu_ipv6_br_geneve6_exception |  1350

owing to the different tunneling overheads implied by the
corresponding configurations.

Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/ipv4/ip_tunnel_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Jakub Kicinski Nov. 9, 2020, 11:59 p.m. UTC | #1
On Fri,  6 Nov 2020 17:59:52 +0100 Stefano Brivio wrote:
> Jianlin reports that a bridged IPv6 VXLAN endpoint, carrying IPv6
> packets over a link with a PMTU estimation of exactly 1350 bytes,
> won't trigger ICMPv6 Packet Too Big replies when the encapsulated
> datagrams exceed said PMTU value. VXLAN over IPv6 adds 70 bytes of
> overhead, so an ICMPv6 reply indicating 1280 bytes as inner MTU
> would be legitimate and expected.
> 
> This comes from an off-by-one error I introduced in checks added
> as part of commit 4cb47a8644cc ("tunnels: PMTU discovery support
> for directly bridged IP packets"), whose purpose was to prevent
> sending ICMPv6 Packet Too Big messages with an MTU lower than the
> smallest permissible IPv6 link MTU, i.e. 1280 bytes.
> 
> In iptunnel_pmtud_check_icmpv6(), avoid triggering a reply only if
> the advertised MTU would be less than, and not equal to, 1280 bytes.
> 
> Also fix the analogous comparison for IPv4, that is, skip the ICMP
> reply only if the resulting MTU is strictly less than 576 bytes.
> 
> This becomes apparent while running the net/pmtu.sh bridged VXLAN
> or GENEVE selftests with adjusted lower-link MTU values. Using
> e.g. GENEVE, setting ll_mtu to the values reported below, in the
> test_pmtu_ipvX_over_bridged_vxlanY_or_geneveY_exception() test
> function, we can see failures on the following tests:
> 
>              test                | ll_mtu
>   -------------------------------|--------
>   pmtu_ipv4_br_geneve4_exception |   626
>   pmtu_ipv6_br_geneve4_exception |  1330
>   pmtu_ipv6_br_geneve6_exception |  1350
> 
> owing to the different tunneling overheads implied by the
> corresponding configurations.
> 
> Reported-by: Jianlin Shi <jishi@redhat.com>
> Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Applied, thanks.
diff mbox series

Patch

diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 25f1caf5abf9..e25be2d01a7a 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -263,7 +263,7 @@  static int iptunnel_pmtud_check_icmp(struct sk_buff *skb, int mtu)
 	const struct icmphdr *icmph = icmp_hdr(skb);
 	const struct iphdr *iph = ip_hdr(skb);
 
-	if (mtu <= 576 || iph->frag_off != htons(IP_DF))
+	if (mtu < 576 || iph->frag_off != htons(IP_DF))
 		return 0;
 
 	if (ipv4_is_lbcast(iph->daddr)  || ipv4_is_multicast(iph->daddr) ||
@@ -359,7 +359,7 @@  static int iptunnel_pmtud_check_icmpv6(struct sk_buff *skb, int mtu)
 	__be16 frag_off;
 	int offset;
 
-	if (mtu <= IPV6_MIN_MTU)
+	if (mtu < IPV6_MIN_MTU)
 		return 0;
 
 	if (stype == IPV6_ADDR_ANY || stype == IPV6_ADDR_MULTICAST ||