
[net-next,v6,10/25] ovpn: implement basic TX path (UDP)

Message ID: 20240827120805.13681-11-antonio@openvpn.net (mailing list archive)
State: Changes Requested
Delegated to: Netdev Maintainers
Series: Introducing OpenVPN Data Channel Offload

Checks

Context Check Description
netdev/series_format fail Series longer than 15 patches
netdev/tree_selection success Clearly marked for net-next, async
netdev/ynl success Generated files up to date; no warnings/errors; GEN HAS DIFF 2 files changed, 2619 insertions(+);
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 16 this patch: 16
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 1 maintainers not CCed: openvpn-devel@lists.sourceforge.net
netdev/build_clang success Errors and warnings before: 17 this patch: 17
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 22 this patch: 22
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-08-30--06-00 (tests: 714)

Commit Message

Antonio Quartulli Aug. 27, 2024, 12:07 p.m. UTC
Packets sent over the ovpn interface are processed and transmitted to the
connected peer, if any.

Implementation is UDP only. TCP will be added by a later patch.

Note: no crypto/encapsulation exists yet. Packets are just captured and
sent.

Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
 drivers/net/ovpn/io.c   | 143 ++++++++++++++++++++++++-
 drivers/net/ovpn/peer.c |  37 ++++++-
 drivers/net/ovpn/peer.h |   5 +
 drivers/net/ovpn/skb.h  |  54 ++++++++++
 drivers/net/ovpn/udp.c  | 230 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/ovpn/udp.h  |   8 ++
 6 files changed, 474 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/ovpn/skb.h

Comments

Sabrina Dubroca Aug. 30, 2024, 5:02 p.m. UTC | #1
Hi Antonio,

Thanks for the updated patchset. I'm going through it again.

2024-08-27, 14:07:50 +0200, Antonio Quartulli wrote:
> +/* send skb to connected peer, if any */
> +static void ovpn_send(struct ovpn_struct *ovpn, struct sk_buff *skb,
> +		      struct ovpn_peer *peer)
> +{
> +	struct sk_buff *curr, *next;
> +
> +	if (likely(!peer))
> +		/* retrieve peer serving the destination IP of this packet */
> +		peer = ovpn_peer_get_by_dst(ovpn, skb);
> +	if (unlikely(!peer)) {
> +		net_dbg_ratelimited("%s: no peer to send data to\n",
> +				    ovpn->dev->name);
> +		dev_core_stats_tx_dropped_inc(ovpn->dev);
> +		goto drop;
> +	}
> +
> +	/* this might be a GSO-segmented skb list: process each skb
> +	 * independently
> +	 */
> +	skb_list_walk_safe(skb, curr, next)
> +		if (unlikely(!ovpn_encrypt_one(peer, curr))) {
> +			dev_core_stats_tx_dropped_inc(ovpn->dev);
> +			kfree_skb(curr);

Is this a bit inconsistent with ovpn_net_xmit's behavior? There we
drop the full list if we fail one skb_share_check, and here we only
drop the single packet that failed and handle the rest? Or am I
misreading this?

> +		}
> +
> +	/* skb passed over, no need to free */
> +	skb = NULL;
> +drop:
> +	if (likely(peer))
> +		ovpn_peer_put(peer);
> +	kfree_skb_list(skb);
> +}
>  
>  /* Send user data to the network
>   */
>  netdev_tx_t ovpn_net_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
> +	struct ovpn_struct *ovpn = netdev_priv(dev);
> +	struct sk_buff *segments, *tmp, *curr, *next;
> +	struct sk_buff_head skb_list;
> +	__be16 proto;
> +	int ret;
> +
> +	/* reset netfilter state */
> +	nf_reset_ct(skb);
> +
> +	/* verify IP header size in network packet */
> +	proto = ovpn_ip_check_protocol(skb);
> +	if (unlikely(!proto || skb->protocol != proto)) {
> +		net_err_ratelimited("%s: dropping malformed payload packet\n",
> +				    dev->name);
> +		dev_core_stats_tx_dropped_inc(ovpn->dev);
> +		goto drop;
> +	}
> +
> +	if (skb_is_gso(skb)) {
> +		segments = skb_gso_segment(skb, 0);
> +		if (IS_ERR(segments)) {
> +			ret = PTR_ERR(segments);
> +			net_err_ratelimited("%s: cannot segment packet: %d\n",
> +					    dev->name, ret);
> +			dev_core_stats_tx_dropped_inc(ovpn->dev);
> +			goto drop;
> +		}
> +
> +		consume_skb(skb);
> +		skb = segments;
> +	}
> +
> +	/* from this moment on, "skb" might be a list */
> +
> +	__skb_queue_head_init(&skb_list);
> +	skb_list_walk_safe(skb, curr, next) {
> +		skb_mark_not_on_list(curr);
> +
> +		tmp = skb_share_check(curr, GFP_ATOMIC);
> +		if (unlikely(!tmp)) {
> +			kfree_skb_list(next);

Those don't get counted as dropped, but the ones we've already handled
(and put on skb_list) will be counted as dev_core_stats_tx_dropped_inc?
(it probably doesn't matter that much, since if we'd dropped before/at
skb_gso_segment we'd only count one drop)

> +			net_err_ratelimited("%s: skb_share_check failed\n",
> +					    dev->name);
> +			goto drop_list;
> +		}
> +
> +		__skb_queue_tail(&skb_list, tmp);
> +	}
> +	skb_list.prev->next = NULL;
> +
> +	ovpn_send(ovpn, skb_list.next, NULL);
> +
> +	return NETDEV_TX_OK;
> +
> +drop_list:
> +	skb_queue_walk_safe(&skb_list, curr, next) {
> +		dev_core_stats_tx_dropped_inc(ovpn->dev);
> +		kfree_skb(curr);
> +	}
> +drop:
>  	skb_tx_error(skb);
> -	kfree_skb(skb);
> +	kfree_skb_list(skb);
>  	return NET_XMIT_DROP;
>  }


[...]
> +void ovpn_udp_send_skb(struct ovpn_struct *ovpn, struct ovpn_peer *peer,
> +		       struct sk_buff *skb)
> +{
> +	struct ovpn_bind *bind;
> +	struct socket *sock;
> +	int ret = -1;
> +
> +	skb->dev = ovpn->dev;
> +	/* no checksum performed at this layer */
> +	skb->ip_summed = CHECKSUM_NONE;
> +
> +	/* get socket info */
> +	sock = peer->sock->sock;
> +	if (unlikely(!sock)) {
> +		net_warn_ratelimited("%s: no sock for remote peer\n", __func__);
> +		goto out;
> +	}
> +
> +	rcu_read_lock();
> +	/* get binding */
> +	bind = rcu_dereference(peer->bind);
> +	if (unlikely(!bind)) {
> +		net_warn_ratelimited("%s: no bind for remote peer\n", __func__);
> +		goto out_unlock;
> +	}
> +
> +	/* crypto layer -> transport (UDP) */
> +	ret = ovpn_udp_output(ovpn, bind, &peer->dst_cache, sock->sk, skb);
> +
> +out_unlock:
> +	rcu_read_unlock();
> +out:
> +	if (unlikely(ret < 0)) {
> +		dev_core_stats_tx_dropped_inc(ovpn->dev);
> +		kfree_skb(skb);
> +		return;
> +	}
> +
> +	dev_sw_netstats_tx_add(ovpn->dev, 1, skb->len);

I don't think it's safe to access skb->len after calling
udp_tunnel(6)_xmit_skb.

For example, vxlan_xmit_one (drivers/net/vxlan/vxlan_core.c) has a
similar counter and saves skb->len into pkt_len.
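
For illustration, a minimal sketch of that approach (hypothetical code
following the vxlan pattern; the actual follow-up patch may differ):

void ovpn_udp_send_skb(struct ovpn_struct *ovpn, struct ovpn_peer *peer,
		       struct sk_buff *skb)
{
	/* save the length up front: the UDP tunnel xmit path consumes the
	 * skb, so skb->len must not be read after ovpn_udp_output()
	 */
	unsigned int pkt_len = skb->len;
	int ret = -1;

	/* ... socket/bind lookup and ovpn_udp_output() as before ... */

	if (unlikely(ret < 0)) {
		dev_core_stats_tx_dropped_inc(ovpn->dev);
		kfree_skb(skb);
		return;
	}

	/* account the saved length instead of touching the skb */
	dev_sw_netstats_tx_add(ovpn->dev, 1, pkt_len);
}
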
Antonio Quartulli Sept. 2, 2024, 12:03 p.m. UTC | #2
On 30/08/2024 19:02, Sabrina Dubroca wrote:
> Hi Antonio,
> 
> Thanks for the updated patchset. I'm going through it again.
> 
> 2024-08-27, 14:07:50 +0200, Antonio Quartulli wrote:
>> +/* send skb to connected peer, if any */
>> +static void ovpn_send(struct ovpn_struct *ovpn, struct sk_buff *skb,
>> +		      struct ovpn_peer *peer)
>> +{
>> +	struct sk_buff *curr, *next;
>> +
>> +	if (likely(!peer))
>> +		/* retrieve peer serving the destination IP of this packet */
>> +		peer = ovpn_peer_get_by_dst(ovpn, skb);
>> +	if (unlikely(!peer)) {
>> +		net_dbg_ratelimited("%s: no peer to send data to\n",
>> +				    ovpn->dev->name);
>> +		dev_core_stats_tx_dropped_inc(ovpn->dev);
>> +		goto drop;
>> +	}
>> +
>> +	/* this might be a GSO-segmented skb list: process each skb
>> +	 * independently
>> +	 */
>> +	skb_list_walk_safe(skb, curr, next)
>> +		if (unlikely(!ovpn_encrypt_one(peer, curr))) {
>> +			dev_core_stats_tx_dropped_inc(ovpn->dev);
>> +			kfree_skb(curr);
> 
> Is this a bit inconsistent with ovpn_net_xmit's behavior? There we
> drop the full list if we fail one skb_share_check, and here we only
> drop the single packet that failed and handle the rest? Or am I
> misreading this?

You're right, it's inconsistent.

In ovpn_send(), each call to ovpn_encrypt_one() results in that skb
being sent, so we could, if we wanted, free only the skbs that haven't
been sent yet.

Maybe it makes sense to always try sending the rest and assume that the 
upper layer will deal with the missing data.

I'll adjust ovpn_net_xmit() to drop only the failing skb and move on.
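
A rough sketch of that adjustment, for illustration (hypothetical code;
the actual rework may differ):

/* drop only the segment that failed skb_share_check() and keep
 * processing the rest of the GSO list; skb_share_check() already
 * freed the failing skb, so only account the drop here
 */
skb_list_walk_safe(skb, curr, next) {
	skb_mark_not_on_list(curr);

	tmp = skb_share_check(curr, GFP_ATOMIC);
	if (unlikely(!tmp)) {
		dev_core_stats_tx_dropped_inc(ovpn->dev);
		net_err_ratelimited("%s: skb_share_check failed\n",
				    dev->name);
		continue;
	}

	__skb_queue_tail(&skb_list, tmp);
}

/* every segment may have been dropped */
if (unlikely(skb_queue_empty(&skb_list)))
	return NETDEV_TX_OK;

skb_list.prev->next = NULL;
ovpn_send(ovpn, skb_list.next, NULL);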

> 
>> +		}
>> +
>> +	/* skb passed over, no need to free */
>> +	skb = NULL;
>> +drop:
>> +	if (likely(peer))
>> +		ovpn_peer_put(peer);
>> +	kfree_skb_list(skb);
>> +}
>>   
>>   /* Send user data to the network
>>    */
>>   netdev_tx_t ovpn_net_xmit(struct sk_buff *skb, struct net_device *dev)
>>   {
>> +	struct ovpn_struct *ovpn = netdev_priv(dev);
>> +	struct sk_buff *segments, *tmp, *curr, *next;
>> +	struct sk_buff_head skb_list;
>> +	__be16 proto;
>> +	int ret;
>> +
>> +	/* reset netfilter state */
>> +	nf_reset_ct(skb);
>> +
>> +	/* verify IP header size in network packet */
>> +	proto = ovpn_ip_check_protocol(skb);
>> +	if (unlikely(!proto || skb->protocol != proto)) {
>> +		net_err_ratelimited("%s: dropping malformed payload packet\n",
>> +				    dev->name);
>> +		dev_core_stats_tx_dropped_inc(ovpn->dev);
>> +		goto drop;
>> +	}
>> +
>> +	if (skb_is_gso(skb)) {
>> +		segments = skb_gso_segment(skb, 0);
>> +		if (IS_ERR(segments)) {
>> +			ret = PTR_ERR(segments);
>> +			net_err_ratelimited("%s: cannot segment packet: %d\n",
>> +					    dev->name, ret);
>> +			dev_core_stats_tx_dropped_inc(ovpn->dev);
>> +			goto drop;
>> +		}
>> +
>> +		consume_skb(skb);
>> +		skb = segments;
>> +	}
>> +
>> +	/* from this moment on, "skb" might be a list */
>> +
>> +	__skb_queue_head_init(&skb_list);
>> +	skb_list_walk_safe(skb, curr, next) {
>> +		skb_mark_not_on_list(curr);
>> +
>> +		tmp = skb_share_check(curr, GFP_ATOMIC);
>> +		if (unlikely(!tmp)) {
>> +			kfree_skb_list(next);
> 
> Those don't get counted as dropped, but the ones we've already handled
> (and put on skb_list) will be counted as dev_core_stats_tx_dropped_inc?
> (it probably doesn't matter that much, since if we'd dropped before/at
> skb_gso_segment we'd only count one drop)

Once I change this part, I'll make sure we count each individual dropped
packet. The downside is that any failure before skb_gso_segment() is
counted as one drop, while anything later than that is counted per
segment. I don't think we can do anything about it, though.

> 
>> +			net_err_ratelimited("%s: skb_share_check failed\n",
>> +					    dev->name);
>> +			goto drop_list;
>> +		}
>> +
>> +		__skb_queue_tail(&skb_list, tmp);
>> +	}
>> +	skb_list.prev->next = NULL;
>> +
>> +	ovpn_send(ovpn, skb_list.next, NULL);
>> +
>> +	return NETDEV_TX_OK;
>> +
>> +drop_list:
>> +	skb_queue_walk_safe(&skb_list, curr, next) {
>> +		dev_core_stats_tx_dropped_inc(ovpn->dev);
>> +		kfree_skb(curr);
>> +	}
>> +drop:
>>   	skb_tx_error(skb);
>> -	kfree_skb(skb);
>> +	kfree_skb_list(skb);
>>   	return NET_XMIT_DROP;
>>   }
> 
> 
> [...]
>> +void ovpn_udp_send_skb(struct ovpn_struct *ovpn, struct ovpn_peer *peer,
>> +		       struct sk_buff *skb)
>> +{
>> +	struct ovpn_bind *bind;
>> +	struct socket *sock;
>> +	int ret = -1;
>> +
>> +	skb->dev = ovpn->dev;
>> +	/* no checksum performed at this layer */
>> +	skb->ip_summed = CHECKSUM_NONE;
>> +
>> +	/* get socket info */
>> +	sock = peer->sock->sock;
>> +	if (unlikely(!sock)) {
>> +		net_warn_ratelimited("%s: no sock for remote peer\n", __func__);
>> +		goto out;
>> +	}
>> +
>> +	rcu_read_lock();
>> +	/* get binding */
>> +	bind = rcu_dereference(peer->bind);
>> +	if (unlikely(!bind)) {
>> +		net_warn_ratelimited("%s: no bind for remote peer\n", __func__);
>> +		goto out_unlock;
>> +	}
>> +
>> +	/* crypto layer -> transport (UDP) */
>> +	ret = ovpn_udp_output(ovpn, bind, &peer->dst_cache, sock->sk, skb);
>> +
>> +out_unlock:
>> +	rcu_read_unlock();
>> +out:
>> +	if (unlikely(ret < 0)) {
>> +		dev_core_stats_tx_dropped_inc(ovpn->dev);
>> +		kfree_skb(skb);
>> +		return;
>> +	}
>> +
>> +	dev_sw_netstats_tx_add(ovpn->dev, 1, skb->len);
> 
> I don't think it's safe to access skb->len after calling
> udp_tunnel(6)_xmit_skb.

Absolutely right! Thanks!

> 
> For example, vxlan_xmit_one (drivers/net/vxlan/vxlan_core.c) has a
> similar counter and saves skb->len into pkt_len.

Yap, will do the same.

Cheers,

Patch

diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
index ad3813419c33..cbb42086f1fc 100644
--- a/drivers/net/ovpn/io.c
+++ b/drivers/net/ovpn/io.c
@@ -9,14 +9,155 @@ 
 
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
+#include <net/gso.h>
 
 #include "io.h"
+#include "ovpnstruct.h"
+#include "peer.h"
+#include "udp.h"
+#include "skb.h"
+
+static void ovpn_encrypt_post(struct sk_buff *skb, int ret)
+{
+	struct ovpn_peer *peer = ovpn_skb_cb(skb)->peer;
+
+	if (unlikely(ret < 0))
+		goto err;
+
+	skb_mark_not_on_list(skb);
+
+	switch (peer->sock->sock->sk->sk_protocol) {
+	case IPPROTO_UDP:
+		ovpn_udp_send_skb(peer->ovpn, peer, skb);
+		break;
+	default:
+		/* no transport configured yet */
+		goto err;
+	}
+	/* skb passed down the stack - don't free it */
+	skb = NULL;
+err:
+	if (unlikely(skb)) {
+		dev_core_stats_tx_dropped_inc(peer->ovpn->dev);
+		kfree_skb(skb);
+	}
+	ovpn_peer_put(peer);
+}
+
+static bool ovpn_encrypt_one(struct ovpn_peer *peer, struct sk_buff *skb)
+{
+	ovpn_skb_cb(skb)->peer = peer;
+
+	/* take a reference to the peer because the crypto code may run async.
+	 * ovpn_encrypt_post() will release it upon completion
+	 */
+	if (unlikely(!ovpn_peer_hold(peer))) {
+		DEBUG_NET_WARN_ON_ONCE(1);
+		return false;
+	}
+
+	ovpn_encrypt_post(skb, 0);
+	return true;
+}
+
+/* send skb to connected peer, if any */
+static void ovpn_send(struct ovpn_struct *ovpn, struct sk_buff *skb,
+		      struct ovpn_peer *peer)
+{
+	struct sk_buff *curr, *next;
+
+	if (likely(!peer))
+		/* retrieve peer serving the destination IP of this packet */
+		peer = ovpn_peer_get_by_dst(ovpn, skb);
+	if (unlikely(!peer)) {
+		net_dbg_ratelimited("%s: no peer to send data to\n",
+				    ovpn->dev->name);
+		dev_core_stats_tx_dropped_inc(ovpn->dev);
+		goto drop;
+	}
+
+	/* this might be a GSO-segmented skb list: process each skb
+	 * independently
+	 */
+	skb_list_walk_safe(skb, curr, next)
+		if (unlikely(!ovpn_encrypt_one(peer, curr))) {
+			dev_core_stats_tx_dropped_inc(ovpn->dev);
+			kfree_skb(curr);
+		}
+
+	/* skb passed over, no need to free */
+	skb = NULL;
+drop:
+	if (likely(peer))
+		ovpn_peer_put(peer);
+	kfree_skb_list(skb);
+}
 
 /* Send user data to the network
  */
 netdev_tx_t ovpn_net_xmit(struct sk_buff *skb, struct net_device *dev)
 {
+	struct ovpn_struct *ovpn = netdev_priv(dev);
+	struct sk_buff *segments, *tmp, *curr, *next;
+	struct sk_buff_head skb_list;
+	__be16 proto;
+	int ret;
+
+	/* reset netfilter state */
+	nf_reset_ct(skb);
+
+	/* verify IP header size in network packet */
+	proto = ovpn_ip_check_protocol(skb);
+	if (unlikely(!proto || skb->protocol != proto)) {
+		net_err_ratelimited("%s: dropping malformed payload packet\n",
+				    dev->name);
+		dev_core_stats_tx_dropped_inc(ovpn->dev);
+		goto drop;
+	}
+
+	if (skb_is_gso(skb)) {
+		segments = skb_gso_segment(skb, 0);
+		if (IS_ERR(segments)) {
+			ret = PTR_ERR(segments);
+			net_err_ratelimited("%s: cannot segment packet: %d\n",
+					    dev->name, ret);
+			dev_core_stats_tx_dropped_inc(ovpn->dev);
+			goto drop;
+		}
+
+		consume_skb(skb);
+		skb = segments;
+	}
+
+	/* from this moment on, "skb" might be a list */
+
+	__skb_queue_head_init(&skb_list);
+	skb_list_walk_safe(skb, curr, next) {
+		skb_mark_not_on_list(curr);
+
+		tmp = skb_share_check(curr, GFP_ATOMIC);
+		if (unlikely(!tmp)) {
+			kfree_skb_list(next);
+			net_err_ratelimited("%s: skb_share_check failed\n",
+					    dev->name);
+			goto drop_list;
+		}
+
+		__skb_queue_tail(&skb_list, tmp);
+	}
+	skb_list.prev->next = NULL;
+
+	ovpn_send(ovpn, skb_list.next, NULL);
+
+	return NETDEV_TX_OK;
+
+drop_list:
+	skb_queue_walk_safe(&skb_list, curr, next) {
+		dev_core_stats_tx_dropped_inc(ovpn->dev);
+		kfree_skb(curr);
+	}
+drop:
 	skb_tx_error(skb);
-	kfree_skb(skb);
+	kfree_skb_list(skb);
 	return NET_XMIT_DROP;
 }
diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c
index 23ded79eb774..6bde4bd9395b 100644
--- a/drivers/net/ovpn/peer.c
+++ b/drivers/net/ovpn/peer.c
@@ -9,7 +9,6 @@ 
 
 #include <linux/skbuff.h>
 #include <linux/list.h>
-#include <linux/workqueue.h>
 
 #include "ovpnstruct.h"
 #include "bind.h"
@@ -65,8 +64,10 @@  struct ovpn_peer *ovpn_peer_new(struct ovpn_struct *ovpn, u32 id)
  */
 static void ovpn_peer_release(struct ovpn_peer *peer)
 {
-	ovpn_bind_reset(peer, NULL);
+	if (peer->sock)
+		ovpn_socket_put(peer->sock);
 
+	ovpn_bind_reset(peer, NULL);
 	dst_cache_destroy(&peer->dst_cache);
 }
 
@@ -244,6 +245,38 @@  struct ovpn_peer *ovpn_peer_get_by_id(struct ovpn_struct *ovpn, u32 peer_id)
 	return peer;
 }
 
+/**
+ * ovpn_peer_get_by_dst - Lookup peer to send skb to
+ * @ovpn: the private data representing the current VPN session
+ * @skb: the skb to extract the destination address from
+ *
+ * This function takes a tunnel packet and looks up the peer to send it to
+ * after encapsulation. The skb is expected to be the in-tunnel packet, without
+ * any OpenVPN related header.
+ *
+ * Assume that the IP header is accessible in the skb data.
+ *
+ * Return: the peer if found or NULL otherwise.
+ */
+struct ovpn_peer *ovpn_peer_get_by_dst(struct ovpn_struct *ovpn,
+				       struct sk_buff *skb)
+{
+	struct ovpn_peer *peer = NULL;
+
+	/* in P2P mode, no matter the destination, packets are always sent to
+	 * the single peer listening on the other side
+	 */
+	if (ovpn->mode == OVPN_MODE_P2P) {
+		rcu_read_lock();
+		peer = rcu_dereference(ovpn->peer);
+		if (unlikely(peer && !ovpn_peer_hold(peer)))
+			peer = NULL;
+		rcu_read_unlock();
+	}
+
+	return peer;
+}
+
 /**
  * ovpn_peer_add_p2p - add peer to related tables in a P2P instance
  * @ovpn: the instance to add the peer to
diff --git a/drivers/net/ovpn/peer.h b/drivers/net/ovpn/peer.h
index 6c51959363c7..37de5aff54a8 100644
--- a/drivers/net/ovpn/peer.h
+++ b/drivers/net/ovpn/peer.h
@@ -11,6 +11,7 @@ 
 #define _NET_OVPN_OVPNPEER_H_
 
 #include "bind.h"
+#include "socket.h"
 
 #include <net/dst_cache.h>
 #include <uapi/linux/ovpn.h>
@@ -22,6 +23,7 @@ 
  * @vpn_addrs: IP addresses assigned over the tunnel
  * @vpn_addrs.ipv4: IPv4 assigned to peer on the tunnel
  * @vpn_addrs.ipv6: IPv6 assigned to peer on the tunnel
+ * @sock: the socket being used to talk to this peer
  * @dst_cache: cache for dst_entry used to send to peer
  * @bind: remote peer binding
  * @halt: true if ovpn_peer_mark_delete was called
@@ -38,6 +40,7 @@  struct ovpn_peer {
 		struct in_addr ipv4;
 		struct in6_addr ipv6;
 	} vpn_addrs;
+	struct ovpn_socket *sock;
 	struct dst_cache dst_cache;
 	struct ovpn_bind __rcu *bind;
 	bool halt;
@@ -78,5 +81,7 @@  void ovpn_peer_release_p2p(struct ovpn_struct *ovpn);
 struct ovpn_peer *ovpn_peer_get_by_transp_addr(struct ovpn_struct *ovpn,
 					       struct sk_buff *skb);
 struct ovpn_peer *ovpn_peer_get_by_id(struct ovpn_struct *ovpn, u32 peer_id);
+struct ovpn_peer *ovpn_peer_get_by_dst(struct ovpn_struct *ovpn,
+				       struct sk_buff *skb);
 
 #endif /* _NET_OVPN_OVPNPEER_H_ */
diff --git a/drivers/net/ovpn/skb.h b/drivers/net/ovpn/skb.h
new file mode 100644
index 000000000000..7966a10d915f
--- /dev/null
+++ b/drivers/net/ovpn/skb.h
@@ -0,0 +1,54 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*  OpenVPN data channel offload
+ *
+ *  Copyright (C) 2020-2024 OpenVPN, Inc.
+ *
+ *  Author:	Antonio Quartulli <antonio@openvpn.net>
+ *		James Yonan <james@openvpn.net>
+ */
+
+#ifndef _NET_OVPN_SKB_H_
+#define _NET_OVPN_SKB_H_
+
+#include <linux/in.h>
+#include <linux/in6.h>
+#include <linux/ip.h>
+#include <linux/skbuff.h>
+#include <linux/socket.h>
+#include <linux/types.h>
+
+struct ovpn_cb {
+	struct aead_request *req;
+	struct ovpn_peer *peer;
+	struct ovpn_crypto_key_slot *ks;
+	unsigned int payload_offset;
+};
+
+static inline struct ovpn_cb *ovpn_skb_cb(struct sk_buff *skb)
+{
+	BUILD_BUG_ON(sizeof(struct ovpn_cb) > sizeof(skb->cb));
+	return (struct ovpn_cb *)skb->cb;
+}
+
+/* Return IP protocol version from skb header.
+ * Return 0 if protocol is not IPv4/IPv6 or cannot be read.
+ */
+static inline __be16 ovpn_ip_check_protocol(struct sk_buff *skb)
+{
+	__be16 proto = 0;
+
+	/* skb could be non-linear,
+	 * make sure IP header is in non-fragmented part
+	 */
+	if (!pskb_network_may_pull(skb, sizeof(struct iphdr)))
+		return 0;
+
+	if (ip_hdr(skb)->version == 4)
+		proto = htons(ETH_P_IP);
+	else if (ip_hdr(skb)->version == 6)
+		proto = htons(ETH_P_IPV6);
+
+	return proto;
+}
+
+#endif /* _NET_OVPN_SKB_H_ */
diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c
index c10474d252e1..0b1a541732a2 100644
--- a/drivers/net/ovpn/udp.c
+++ b/drivers/net/ovpn/udp.c
@@ -7,14 +7,244 @@ 
  */
 
 #include <linux/netdevice.h>
+#include <linux/inetdevice.h>
 #include <linux/socket.h>
+#include <net/addrconf.h>
+#include <net/dst_cache.h>
+#include <net/route.h>
+#include <net/ipv6_stubs.h>
 #include <net/udp.h>
+#include <net/udp_tunnel.h>
 
 #include "ovpnstruct.h"
 #include "main.h"
+#include "bind.h"
+#include "io.h"
+#include "peer.h"
 #include "socket.h"
 #include "udp.h"
 
+/**
+ * ovpn_udp4_output - send IPv4 packet over udp socket
+ * @ovpn: the openvpn instance
+ * @bind: the binding related to the destination peer
+ * @cache: dst cache
+ * @sk: the socket to send the packet over
+ * @skb: the packet to send
+ *
+ * Return: 0 on success or a negative error code otherwise
+ */
+static int ovpn_udp4_output(struct ovpn_struct *ovpn, struct ovpn_bind *bind,
+			    struct dst_cache *cache, struct sock *sk,
+			    struct sk_buff *skb)
+{
+	struct rtable *rt;
+	struct flowi4 fl = {
+		.saddr = bind->local.ipv4.s_addr,
+		.daddr = bind->remote.in4.sin_addr.s_addr,
+		.fl4_sport = inet_sk(sk)->inet_sport,
+		.fl4_dport = bind->remote.in4.sin_port,
+		.flowi4_proto = sk->sk_protocol,
+		.flowi4_mark = sk->sk_mark,
+	};
+	int ret;
+
+	local_bh_disable();
+	rt = dst_cache_get_ip4(cache, &fl.saddr);
+	if (rt)
+		goto transmit;
+
+	if (unlikely(!inet_confirm_addr(sock_net(sk), NULL, 0, fl.saddr,
+					RT_SCOPE_HOST))) {
+		/* we may end up here when the cached address is not usable
+		 * anymore. In this case we reset address/cache and perform a
+		 * new look up
+		 */
+		fl.saddr = 0;
+		bind->local.ipv4.s_addr = 0;
+		dst_cache_reset(cache);
+	}
+
+	rt = ip_route_output_flow(sock_net(sk), &fl, sk);
+	if (IS_ERR(rt) && PTR_ERR(rt) == -EINVAL) {
+		fl.saddr = 0;
+		bind->local.ipv4.s_addr = 0;
+		dst_cache_reset(cache);
+
+		rt = ip_route_output_flow(sock_net(sk), &fl, sk);
+	}
+
+	if (IS_ERR(rt)) {
+		ret = PTR_ERR(rt);
+		net_dbg_ratelimited("%s: no route to host %pISpc: %d\n",
+				    ovpn->dev->name, &bind->remote.in4, ret);
+		goto err;
+	}
+	dst_cache_set_ip4(cache, &rt->dst, fl.saddr);
+
+transmit:
+	udp_tunnel_xmit_skb(rt, sk, skb, fl.saddr, fl.daddr, 0,
+			    ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport,
+			    fl.fl4_dport, false, sk->sk_no_check_tx);
+	ret = 0;
+err:
+	local_bh_enable();
+	return ret;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+/**
+ * ovpn_udp6_output - send IPv6 packet over udp socket
+ * @ovpn: the openvpn instance
+ * @bind: the binding related to the destination peer
+ * @cache: dst cache
+ * @sk: the socket to send the packet over
+ * @skb: the packet to send
+ *
+ * Return: 0 on success or a negative error code otherwise
+ */
+static int ovpn_udp6_output(struct ovpn_struct *ovpn, struct ovpn_bind *bind,
+			    struct dst_cache *cache, struct sock *sk,
+			    struct sk_buff *skb)
+{
+	struct dst_entry *dst;
+	int ret;
+
+	struct flowi6 fl = {
+		.saddr = bind->local.ipv6,
+		.daddr = bind->remote.in6.sin6_addr,
+		.fl6_sport = inet_sk(sk)->inet_sport,
+		.fl6_dport = bind->remote.in6.sin6_port,
+		.flowi6_proto = sk->sk_protocol,
+		.flowi6_mark = sk->sk_mark,
+		.flowi6_oif = bind->remote.in6.sin6_scope_id,
+	};
+
+	local_bh_disable();
+	dst = dst_cache_get_ip6(cache, &fl.saddr);
+	if (dst)
+		goto transmit;
+
+	if (unlikely(!ipv6_chk_addr(sock_net(sk), &fl.saddr, NULL, 0))) {
+		/* we may end up here when the cached address is not usable
+		 * anymore. In this case we reset address/cache and perform a
+		 * new look up
+		 */
+		fl.saddr = in6addr_any;
+		bind->local.ipv6 = in6addr_any;
+		dst_cache_reset(cache);
+	}
+
+	dst = ipv6_stub->ipv6_dst_lookup_flow(sock_net(sk), sk, &fl, NULL);
+	if (IS_ERR(dst)) {
+		ret = PTR_ERR(dst);
+		net_dbg_ratelimited("%s: no route to host %pISpc: %d\n",
+				    ovpn->dev->name, &bind->remote.in6, ret);
+		goto err;
+	}
+	dst_cache_set_ip6(cache, dst, &fl.saddr);
+
+transmit:
+	udp_tunnel6_xmit_skb(dst, sk, skb, skb->dev, &fl.saddr, &fl.daddr, 0,
+			     ip6_dst_hoplimit(dst), 0, fl.fl6_sport,
+			     fl.fl6_dport, udp_get_no_check6_tx(sk));
+	ret = 0;
+err:
+	local_bh_enable();
+	return ret;
+}
+#endif
+
+/**
+ * ovpn_udp_output - transmit skb using udp-tunnel
+ * @ovpn: the openvpn instance
+ * @bind: the binding related to the destination peer
+ * @cache: dst cache
+ * @sk: the socket to send the packet over
+ * @skb: the packet to send
+ *
+ * rcu_read_lock should be held on entry.
+ * On return, the skb is consumed.
+ *
+ * Return: 0 on success or a negative error code otherwise
+ */
+static int ovpn_udp_output(struct ovpn_struct *ovpn, struct ovpn_bind *bind,
+			   struct dst_cache *cache, struct sock *sk,
+			   struct sk_buff *skb)
+{
+	int ret;
+
+	/* set sk to null if skb is already orphaned */
+	if (!skb->destructor)
+		skb->sk = NULL;
+
+	/* always permit openvpn-created packets to be (outside) fragmented */
+	skb->ignore_df = 1;
+
+	switch (bind->remote.in4.sin_family) {
+	case AF_INET:
+		ret = ovpn_udp4_output(ovpn, bind, cache, sk, skb);
+		break;
+#if IS_ENABLED(CONFIG_IPV6)
+	case AF_INET6:
+		ret = ovpn_udp6_output(ovpn, bind, cache, sk, skb);
+		break;
+#endif
+	default:
+		ret = -EAFNOSUPPORT;
+		break;
+	}
+
+	return ret;
+}
+
+/**
+ * ovpn_udp_send_skb - prepare skb and send it over via UDP
+ * @ovpn: the openvpn instance
+ * @peer: the destination peer
+ * @skb: the packet to send
+ */
+void ovpn_udp_send_skb(struct ovpn_struct *ovpn, struct ovpn_peer *peer,
+		       struct sk_buff *skb)
+{
+	struct ovpn_bind *bind;
+	struct socket *sock;
+	int ret = -1;
+
+	skb->dev = ovpn->dev;
+	/* no checksum performed at this layer */
+	skb->ip_summed = CHECKSUM_NONE;
+
+	/* get socket info */
+	sock = peer->sock->sock;
+	if (unlikely(!sock)) {
+		net_warn_ratelimited("%s: no sock for remote peer\n", __func__);
+		goto out;
+	}
+
+	rcu_read_lock();
+	/* get binding */
+	bind = rcu_dereference(peer->bind);
+	if (unlikely(!bind)) {
+		net_warn_ratelimited("%s: no bind for remote peer\n", __func__);
+		goto out_unlock;
+	}
+
+	/* crypto layer -> transport (UDP) */
+	ret = ovpn_udp_output(ovpn, bind, &peer->dst_cache, sock->sk, skb);
+
+out_unlock:
+	rcu_read_unlock();
+out:
+	if (unlikely(ret < 0)) {
+		dev_core_stats_tx_dropped_inc(ovpn->dev);
+		kfree_skb(skb);
+		return;
+	}
+
+	dev_sw_netstats_tx_add(ovpn->dev, 1, skb->len);
+}
+
 /**
  * ovpn_udp_socket_attach - set udp-tunnel CBs on socket and link it to ovpn
  * @sock: socket to configure
diff --git a/drivers/net/ovpn/udp.h b/drivers/net/ovpn/udp.h
index f2507f8f2c71..e60f8cd2b4ac 100644
--- a/drivers/net/ovpn/udp.h
+++ b/drivers/net/ovpn/udp.h
@@ -9,9 +9,17 @@ 
 #ifndef _NET_OVPN_UDP_H_
 #define _NET_OVPN_UDP_H_
 
+#include <linux/skbuff.h>
+#include <net/sock.h>
+
+struct ovpn_peer;
 struct ovpn_struct;
+struct sk_buff;
 struct socket;
 
 int ovpn_udp_socket_attach(struct socket *sock, struct ovpn_struct *ovpn);
 
+void ovpn_udp_send_skb(struct ovpn_struct *ovpn, struct ovpn_peer *peer,
+		       struct sk_buff *skb);
+
 #endif /* _NET_OVPN_UDP_H_ */