diff mbox series

[net-next,v3,1/8] udp: fixup csum for GSO receive slow path

Message ID 6430002178b54a389ea50413c7074ba9b48d6212.1617099959.git.pabeni@redhat.com (mailing list archive)
State Accepted
Commit 000ac44da7d0adfc5e62e6c019246a4afeeffd04
Delegated to: Netdev Maintainers
Headers show
Series udp: GRO L4 improvements | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 2 maintainers not CCed: yoshfuji@linux-ipv6.org dsahern@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 185 this patch: 185
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch warning WARNING: line length of 83 exceeds 80 columns
netdev/build_allmodconfig_warn success Errors and warnings before: 293 this patch: 293
netdev/header_inline success Link

Commit Message

Paolo Abeni March 30, 2021, 10:28 a.m. UTC
When UDP packets generated locally by a socket with UDP_SEGMENT
traverse the following path:

UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
	UDP tunnel (rx) -> UDP socket (no UDP_GRO)

ip_summed will be set to CHECKSUM_PARTIAL at creation time and
such checksum mode will be preserved in the above path up to the
UDP tunnel receive code where we have:

 __iptunnel_pull_header() -> skb_pull_rcsum() ->
skb_postpull_rcsum() -> __skb_postpull_rcsum()

The latter will convert the skb to CHECKSUM_NONE.

The UDP GSO packet will be later segmented as part of the rx socket
receive operation, and will present a CHECKSUM_NONE after segmentation.

Additionally the segmented packets UDP CB still refers to the original
GSO packet len. Overall that causes unexpected/wrong csum validation
errors later in the UDP receive path.

We could possibly address the issue with some additional checks and
csum mangling in the UDP tunnel code. Since the issue affects only
this UDP receive slow path, let's set a suitable csum status there.

Note that SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets lacking an UDP
encapsulation present a valid checksum when landing to udp_queue_rcv_skb(),
as the UDP checksum has been validated by the GRO engine.

v2 -> v3:
 - even more verbose commit message and comments

v1 -> v2:
 - restrict the csum update to the packets strictly needing them
 - hopefully clarify the commit message and code comments

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/udp.h | 23 +++++++++++++++++++++++
 net/ipv4/udp.c    |  2 ++
 net/ipv6/udp.c    |  1 +
 3 files changed, 26 insertions(+)

Comments

Willem de Bruijn March 30, 2021, 3:15 p.m. UTC | #1
On Tue, Mar 30, 2021 at 6:29 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> When UDP packets generated locally by a socket with UDP_SEGMENT
> traverse the following path:
>
> UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
>         UDP tunnel (rx) -> UDP socket (no UDP_GRO)
>
> ip_summed will be set to CHECKSUM_PARTIAL at creation time and
> such checksum mode will be preserved in the above path up to the
> UDP tunnel receive code where we have:
>
>  __iptunnel_pull_header() -> skb_pull_rcsum() ->
> skb_postpull_rcsum() -> __skb_postpull_rcsum()
>
> The latter will convert the skb to CHECKSUM_NONE.
>
> The UDP GSO packet will be later segmented as part of the rx socket
> receive operation, and will present a CHECKSUM_NONE after segmentation.
>
> Additionally the segmented packets UDP CB still refers to the original
> GSO packet len. Overall that causes unexpected/wrong csum validation
> errors later in the UDP receive path.
>
> We could possibly address the issue with some additional checks and
> csum mangling in the UDP tunnel code. Since the issue affects only
> this UDP receive slow path, let's set a suitable csum status there.
>
> Note that SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets lacking an UDP
> encapsulation present a valid checksum when landing to udp_queue_rcv_skb(),
> as the UDP checksum has been validated by the GRO engine.
>
> v2 -> v3:
>  - even more verbose commit message and comments
>
> v1 -> v2:
>  - restrict the csum update to the packets strictly needing them
>  - hopefully clarify the commit message and code comments
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>
diff mbox series

Patch

diff --git a/include/net/udp.h b/include/net/udp.h
index d4d064c592328..adf2ff8ac87c9 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -515,6 +515,29 @@  static inline struct sk_buff *udp_rcv_segment(struct sock *sk,
 	return segs;
 }
 
+static inline void udp_post_segment_fix_csum(struct sk_buff *skb)
+{
+	/* UDP-lite can't land here - no GRO */
+	WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
+
+	/* UDP packets generated with UDP_SEGMENT and traversing:
+	 *
+	 * UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) -> UDP tunnel (rx)
+	 *
+	 * can reach an UDP socket with CHECKSUM_NONE, because
+	 * __iptunnel_pull_header() converts CHECKSUM_PARTIAL into NONE.
+	 * SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets with no UDP tunnel will
+	 * have a valid checksum, as the GRO engine validates the UDP csum
+	 * before the aggregation and nobody strips such info in between.
+	 * Instead of adding another check in the tunnel fastpath, we can force
+	 * a valid csum after the segmentation.
+	 * Additionally fixup the UDP CB.
+	 */
+	UDP_SKB_CB(skb)->cscov = skb->len;
+	if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
+		skb->csum_valid = 1;
+}
+
 #ifdef CONFIG_BPF_SYSCALL
 struct sk_psock;
 struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 4a0478b17243a..fe85dcf8c0087 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2178,6 +2178,8 @@  static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	segs = udp_rcv_segment(sk, skb, true);
 	skb_list_walk_safe(segs, skb, next) {
 		__skb_pull(skb, skb_transport_offset(skb));
+
+		udp_post_segment_fix_csum(skb);
 		ret = udp_queue_rcv_one_skb(sk, skb);
 		if (ret > 0)
 			ip_protocol_deliver_rcu(dev_net(skb->dev), skb, ret);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d25e5a9252fdb..fa2f547383925 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -749,6 +749,7 @@  static int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	skb_list_walk_safe(segs, skb, next) {
 		__skb_pull(skb, skb_transport_offset(skb));
 
+		udp_post_segment_fix_csum(skb);
 		ret = udpv6_queue_rcv_one_skb(sk, skb);
 		if (ret > 0)
 			ip6_protocol_deliver_rcu(dev_net(skb->dev), skb, ret,