Message ID | 20241219-b4-ovpn-v16-7-3e3001153683@openvpn.net (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Introducing OpenVPN Data Channel Offload | expand |
Hello Antonio, 2024-12-19, 02:42:01 +0100, Antonio Quartulli wrote: > +static void ovpn_socket_release_kref(struct kref *kref) > + __releases(sock->sock->sk) > +{ > + struct ovpn_socket *sock = container_of(kref, struct ovpn_socket, > + refcount); > + [extend with bits of patch 9] > /* UDP sockets are detached in this kref callback because > * we now know for sure that all concurrent users have > * finally gone (refcounter dropped to 0). > * > * Moreover, detachment is performed under lock to prevent > * a concurrent ovpn_socket_new() call with the same socket > * to find the socket still attached but with refcounter 0. I'm not convinced this really works, because ovpn_socket_new() doesn't use the same lock. lock_sock and bh_lock_sock both "lock the socket" in some sense, but they're not mutually exclusive (we talked about that around the TCP patch). Are you fundamentally opposed to making attach permanent? ie, once a UDP or TCP socket is assigned to an ovpn instance, it can't be detached and reused. I think it would be safer, simpler, and likely sufficient (I don't know openvpn much, but I don't see a use case for moving a socket from one ovpn instance to another, or using it without encap). Rough idea: - ovpn_socket_new is pretty much unchanged (locking still needed to protect against another simultaneous attach attempt, EALREADY case becomes a bit easier) - ovpn_peer_remove doesn't do anything socket-related - use ->encap_destroy/ovpn_tcp_close() to clean up sk_user_data - no more refcounting on ovpn_socket (since the encap can't be removed, the lifetime to ovpn_socket is tied to its socket) What do you think? I'm trying to poke holes into this idea now. close() vs attach worries me a bit. > */ > if (sock->sock->sk->sk_protocol == IPPROTO_UDP) > ovpn_udp_socket_detach(sock->sock); > + bh_unlock_sock(sock->sock->sk); > + sockfd_put(sock->sock); > + kfree_rcu(sock, rcu); > +} [...] > +struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer) > +{ > + struct ovpn_socket *ovpn_sock; > + int ret; > + > + lock_sock(sock->sk); > + > + ret = ovpn_socket_attach(sock, peer); > + if (ret < 0 && ret != -EALREADY) > + goto err_release; > + > + /* if this socket is already owned by this interface, just increase the > + * refcounter and use it as expected. > + * > + * Since UDP sockets can be used to talk to multiple remote endpoints, > + * openvpn normally instantiates only one socket and shares it among all > + * its peers. For this reason, when we find out that a socket is already > + * used for some other peer in *this* instance, we can happily increase > + * its refcounter and use it normally. > + */ > + if (ret == -EALREADY) { > + /* caller is expected to increase the sock refcounter before > + * passing it to this function. For this reason we drop it if > + * not needed, like when this socket is already owned. > + */ > + ovpn_sock = ovpn_socket_get(sock); > + release_sock(sock->sk); > + sockfd_put(sock); > + return ovpn_sock; > + } > + > + ovpn_sock = kzalloc(sizeof(*ovpn_sock), GFP_KERNEL); > + if (!ovpn_sock) { > + ret = -ENOMEM; > + goto err_detach; > + } > + > + ovpn_sock->ovpn = peer->ovpn; > + ovpn_sock->sock = sock; > + kref_init(&ovpn_sock->refcount); > + > + rcu_assign_sk_user_data(sock->sk, ovpn_sock); > + release_sock(sock->sk); > + > + return ovpn_sock; > +err_detach: > + if (sock->sk->sk_protocol == IPPROTO_UDP) > + ovpn_udp_socket_detach(sock); This would leave the TCP socket half-attached, and if userspace tries to attach the same socket again (I don't think the ovpn module would prevent that since sk_user_data is still unset), both ->sk_data_ready and tcp.sk_cb.sk_data_ready will be set to ovpn's (same for sk_write_space with ovpn_tcp_write_space which will recurse into itself when called). I think it'd be easier to do the alloc first, then attach. Handling a failure to attach would be a simple kfree, while handling a failure to alloc is a detach (or part of a detach) which is not as easy. > +int ovpn_udp_socket_attach(struct socket *sock, struct ovpn_priv *ovpn) > +{ > + struct ovpn_socket *old_data; > + int ret = 0; > + > + /* make sure no pre-existing encapsulation handler exists */ > + rcu_read_lock(); > + old_data = rcu_dereference_sk_user_data(sock->sk); > + if (!old_data) { > + /* socket is currently unused - we can take it */ > + rcu_read_unlock(); > + return 0; > + } > + > + /* socket is in use. We need to understand if it's owned by this ovpn > + * instance or by something else. > + * In the former case, we can increase the refcounter and happily > + * use it, because the same UDP socket is expected to be shared among > + * different peers. > + * > + * Unlikely TCP, a single UDP socket can be used to talk to many remote nit: s/Unlikely/Unlike/ > + * hosts and therefore openvpn instantiates one only for all its peers > + */
Hi Sabrina, On 03/01/2025 18:00, Sabrina Dubroca wrote: > Hello Antonio, > > 2024-12-19, 02:42:01 +0100, Antonio Quartulli wrote: >> +static void ovpn_socket_release_kref(struct kref *kref) >> + __releases(sock->sock->sk) >> +{ >> + struct ovpn_socket *sock = container_of(kref, struct ovpn_socket, >> + refcount); >> + > > [extend with bits of patch 9] >> /* UDP sockets are detached in this kref callback because >> * we now know for sure that all concurrent users have >> * finally gone (refcounter dropped to 0). >> * >> * Moreover, detachment is performed under lock to prevent >> * a concurrent ovpn_socket_new() call with the same socket >> * to find the socket still attached but with refcounter 0. > > I'm not convinced this really works, because ovpn_socket_new() doesn't > use the same lock. lock_sock and bh_lock_sock both "lock the socket" > in some sense, but they're not mutually exclusive (we talked about > that around the TCP patch). You're right - but what prevents us from always using bh_lock_sock? > > Are you fundamentally opposed to making attach permanent? ie, once > a UDP or TCP socket is assigned to an ovpn instance, it can't be > detached and reused. I think it would be safer, simpler, and likely > sufficient (I don't know openvpn much, but I don't see a use case for > moving a socket from one ovpn instance to another, or using it without > encap). I hardly believe a socket will ever be moved to a different instance. There is no use case (and no userspace support) for that at the moment. > > Rough idea: > - ovpn_socket_new is pretty much unchanged (locking still needed to > protect against another simultaneous attach attempt, EALREADY case > becomes a bit easier) > - ovpn_peer_remove doesn't do anything socket-related > - use ->encap_destroy/ovpn_tcp_close() to clean up sk_user_data > - no more refcounting on ovpn_socket (since the encap can't be > removed, the lifetime to ovpn_socket is tied to its socket) > > What do you think? hmm how would that work with UDP? On a server all clients may disconnect, but the UDP socket is expected to still survive and be re-used for new clients (userspace will keep it alive and keep listening for new clients). Or you're saying that the socket will remain "attached" (i.e. sk_user_data set to the ovpn_priv*) even when no more clients are connected? > > I'm trying to poke holes into this idea now. close() vs attach worries > me a bit. Can that truly happen? If a socket is going through close(), there should be some way to mark it as "non-attachable". Actually, do we even need to clean up sk_user_data? The socket is being destroyed - why clean that up at all? > > >> */ >> if (sock->sock->sk->sk_protocol == IPPROTO_UDP) >> ovpn_udp_socket_detach(sock->sock); > > >> + bh_unlock_sock(sock->sock->sk); >> + sockfd_put(sock->sock); >> + kfree_rcu(sock, rcu); >> +} > > [...] >> +struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer) >> +{ >> + struct ovpn_socket *ovpn_sock; >> + int ret; >> + >> + lock_sock(sock->sk); >> + >> + ret = ovpn_socket_attach(sock, peer); >> + if (ret < 0 && ret != -EALREADY) >> + goto err_release; >> + >> + /* if this socket is already owned by this interface, just increase the >> + * refcounter and use it as expected. >> + * >> + * Since UDP sockets can be used to talk to multiple remote endpoints, >> + * openvpn normally instantiates only one socket and shares it among all >> + * its peers. For this reason, when we find out that a socket is already >> + * used for some other peer in *this* instance, we can happily increase >> + * its refcounter and use it normally. >> + */ >> + if (ret == -EALREADY) { >> + /* caller is expected to increase the sock refcounter before >> + * passing it to this function. For this reason we drop it if >> + * not needed, like when this socket is already owned. >> + */ >> + ovpn_sock = ovpn_socket_get(sock); >> + release_sock(sock->sk); >> + sockfd_put(sock); >> + return ovpn_sock; >> + } >> + >> + ovpn_sock = kzalloc(sizeof(*ovpn_sock), GFP_KERNEL); >> + if (!ovpn_sock) { >> + ret = -ENOMEM; >> + goto err_detach; >> + } >> + >> + ovpn_sock->ovpn = peer->ovpn; >> + ovpn_sock->sock = sock; >> + kref_init(&ovpn_sock->refcount); >> + >> + rcu_assign_sk_user_data(sock->sk, ovpn_sock); >> + release_sock(sock->sk); >> + >> + return ovpn_sock; >> +err_detach: >> + if (sock->sk->sk_protocol == IPPROTO_UDP) >> + ovpn_udp_socket_detach(sock); > > This would leave the TCP socket half-attached, and if userspace tries > to attach the same socket again (I don't think the ovpn module would > prevent that since sk_user_data is still unset), both ->sk_data_ready > and tcp.sk_cb.sk_data_ready will be set to ovpn's (same for > sk_write_space with ovpn_tcp_write_space which will recurse into > itself when called). > > I think it'd be easier to do the alloc first, then attach. Handling a > failure to attach would be a simple kfree, while handling a failure to > alloc is a detach (or part of a detach) which is not as easy. Yap, makes sense! > > > >> +int ovpn_udp_socket_attach(struct socket *sock, struct ovpn_priv *ovpn) >> +{ >> + struct ovpn_socket *old_data; >> + int ret = 0; >> + >> + /* make sure no pre-existing encapsulation handler exists */ >> + rcu_read_lock(); >> + old_data = rcu_dereference_sk_user_data(sock->sk); >> + if (!old_data) { >> + /* socket is currently unused - we can take it */ >> + rcu_read_unlock(); >> + return 0; >> + } >> + >> + /* socket is in use. We need to understand if it's owned by this ovpn >> + * instance or by something else. >> + * In the former case, we can increase the refcounter and happily >> + * use it, because the same UDP socket is expected to be shared among >> + * different peers. >> + * >> + * Unlikely TCP, a single UDP socket can be used to talk to many remote > > nit: s/Unlikely/Unlike/ ACK > >> + * hosts and therefore openvpn instantiates one only for all its peers >> + */ Thanks a lot! Regards, >
diff --git a/drivers/net/ovpn/Makefile b/drivers/net/ovpn/Makefile index ce13499b3e1775a7f2a9ce16c6cb0aa088f93685..56bddc9bef83e0befde6af3c3565bb91731d7b22 100644 --- a/drivers/net/ovpn/Makefile +++ b/drivers/net/ovpn/Makefile @@ -13,3 +13,5 @@ ovpn-y += io.o ovpn-y += netlink.o ovpn-y += netlink-gen.o ovpn-y += peer.o +ovpn-y += socket.o +ovpn-y += udp.o diff --git a/drivers/net/ovpn/socket.c b/drivers/net/ovpn/socket.c new file mode 100644 index 0000000000000000000000000000000000000000..b41a95cc154ad50f86e67178a95f16e323880800 --- /dev/null +++ b/drivers/net/ovpn/socket.c @@ -0,0 +1,163 @@ +// SPDX-License-Identifier: GPL-2.0 +/* OpenVPN data channel offload + * + * Copyright (C) 2020-2024 OpenVPN, Inc. + * + * Author: James Yonan <james@openvpn.net> + * Antonio Quartulli <antonio@openvpn.net> + */ + +#include <linux/net.h> +#include <linux/netdevice.h> +#include <linux/udp.h> + +#include "ovpnstruct.h" +#include "main.h" +#include "io.h" +#include "peer.h" +#include "socket.h" +#include "udp.h" + +/** + * ovpn_socket_release_kref - kref_put callback + * @kref: the kref object + * + * This function releases the lock_sock which was previously + * acquired by kref_put_sock() in ovpn_socket_put() + */ +static void ovpn_socket_release_kref(struct kref *kref) + __releases(sock->sock->sk) +{ + struct ovpn_socket *sock = container_of(kref, struct ovpn_socket, + refcount); + + bh_unlock_sock(sock->sock->sk); + sockfd_put(sock->sock); + kfree_rcu(sock, rcu); +} + +/** + * ovpn_socket_put - decrease reference counter + * @sock: the socket whose reference counter should be decreased + * + * Decreases the refcounter and, if it reached zero, acquires the + * socket lock and calls the release callback. + * This particular behaviour allows the callback to operate under + * lock while atomically getting the refconter to 0. + * + * This function is only used internally. Users willing to release + * references to the ovpn_socket should use ovpn_socket_release() + */ +static void ovpn_socket_put(struct ovpn_socket *sock) +{ + kref_put_sock(&sock->refcount, ovpn_socket_release_kref, + sock->sock->sk); +} + +/** + * ovpn_socket_release - release resources owned by socket user + * @sock: the socket to process + * + * This function should be invoked when the user is shutting + * down and wants to drop its link to the socket. + * + * In case of UDP, the socket is owned by multiple users concurrently, + * therefore the release consists only in decreasing the refcounter. + * Once the refcounter reaches 0, the socket can be fully detached and + * released. In turn, the detach routine will drop a reference to the + * ovpn netdev, pointed by the ovpn_socket. + */ +void ovpn_socket_release(struct ovpn_socket *sock) +{ + if (sock->sock->sk->sk_protocol == IPPROTO_UDP) + ovpn_socket_put(sock); +} + +static bool ovpn_socket_hold(struct ovpn_socket *sock) +{ + return kref_get_unless_zero(&sock->refcount); +} + +static struct ovpn_socket *ovpn_socket_get(struct socket *sock) +{ + struct ovpn_socket *ovpn_sock; + + ovpn_sock = rcu_dereference_sk_user_data(sock->sk); + if (WARN_ON(!ovpn_socket_hold(ovpn_sock))) + ovpn_sock = NULL; + + return ovpn_sock; +} + +static int ovpn_socket_attach(struct socket *sock, struct ovpn_peer *peer) +{ + int ret = -EOPNOTSUPP; + + if (!sock || !peer) + return -EINVAL; + + if (sock->sk->sk_protocol == IPPROTO_UDP) + ret = ovpn_udp_socket_attach(sock, peer->ovpn); + + return ret; +} + +/** + * ovpn_socket_new - create a new socket and initialize it + * @sock: the kernel socket to embed + * @peer: the peer reachable via this socket + * + * Return: an openvpn socket on success or a negative error code otherwise + */ +struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer) +{ + struct ovpn_socket *ovpn_sock; + int ret; + + lock_sock(sock->sk); + + ret = ovpn_socket_attach(sock, peer); + if (ret < 0 && ret != -EALREADY) + goto err_release; + + /* if this socket is already owned by this interface, just increase the + * refcounter and use it as expected. + * + * Since UDP sockets can be used to talk to multiple remote endpoints, + * openvpn normally instantiates only one socket and shares it among all + * its peers. For this reason, when we find out that a socket is already + * used for some other peer in *this* instance, we can happily increase + * its refcounter and use it normally. + */ + if (ret == -EALREADY) { + /* caller is expected to increase the sock refcounter before + * passing it to this function. For this reason we drop it if + * not needed, like when this socket is already owned. + */ + ovpn_sock = ovpn_socket_get(sock); + release_sock(sock->sk); + sockfd_put(sock); + return ovpn_sock; + } + + ovpn_sock = kzalloc(sizeof(*ovpn_sock), GFP_KERNEL); + if (!ovpn_sock) { + ret = -ENOMEM; + goto err_detach; + } + + ovpn_sock->ovpn = peer->ovpn; + ovpn_sock->sock = sock; + kref_init(&ovpn_sock->refcount); + + rcu_assign_sk_user_data(sock->sk, ovpn_sock); + release_sock(sock->sk); + + return ovpn_sock; +err_detach: + if (sock->sk->sk_protocol == IPPROTO_UDP) + ovpn_udp_socket_detach(sock); +err_release: + release_sock(sock->sk); + return ERR_PTR(ret); +} diff --git a/drivers/net/ovpn/socket.h b/drivers/net/ovpn/socket.h new file mode 100644 index 0000000000000000000000000000000000000000..aab26b575df9c886a078c2884900c362a6bf0eb2 --- /dev/null +++ b/drivers/net/ovpn/socket.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* OpenVPN data channel offload + * + * Copyright (C) 2020-2024 OpenVPN, Inc. + * + * Author: James Yonan <james@openvpn.net> + * Antonio Quartulli <antonio@openvpn.net> + */ + +#ifndef _NET_OVPN_SOCK_H_ +#define _NET_OVPN_SOCK_H_ + +#include <linux/net.h> +#include <linux/kref.h> +#include <net/sock.h> + +struct ovpn_priv; +struct ovpn_peer; + +/** + * struct ovpn_socket - a kernel socket referenced in the ovpn code + * @ovpn: ovpn instance owning this socket (UDP only) + * @sock: the low level sock object + * @refcount: amount of contexts currently referencing this object + * @rcu: member used to schedule RCU destructor callback + */ +struct ovpn_socket { + struct ovpn_priv *ovpn; + struct socket *sock; + struct kref refcount; + struct rcu_head rcu; +}; + +struct ovpn_socket *ovpn_socket_new(struct socket *sock, + struct ovpn_peer *peer); +void ovpn_socket_release(struct ovpn_socket *sock); + +#endif /* _NET_OVPN_SOCK_H_ */ diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c new file mode 100644 index 0000000000000000000000000000000000000000..c8a065fbf27d4cb43a73fe8381a4d0d6d4c9de0f --- /dev/null +++ b/drivers/net/ovpn/udp.c @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: GPL-2.0 +/* OpenVPN data channel offload + * + * Copyright (C) 2019-2024 OpenVPN, Inc. + * + * Author: Antonio Quartulli <antonio@openvpn.net> + */ + +#include <linux/netdevice.h> +#include <linux/socket.h> +#include <linux/udp.h> +#include <net/udp.h> + +#include "ovpnstruct.h" +#include "main.h" +#include "socket.h" +#include "udp.h" + +/** + * ovpn_udp_socket_attach - set udp-tunnel CBs on socket and link it to ovpn + * @sock: socket to configure + * @ovpn: the openvp instance to link + * + * After invoking this function, the sock will be controlled by ovpn so that + * any incoming packet may be processed by ovpn first. + * + * Return: 0 on success or a negative error code otherwise + */ +int ovpn_udp_socket_attach(struct socket *sock, struct ovpn_priv *ovpn) +{ + struct ovpn_socket *old_data; + int ret = 0; + + /* make sure no pre-existing encapsulation handler exists */ + rcu_read_lock(); + old_data = rcu_dereference_sk_user_data(sock->sk); + if (!old_data) { + /* socket is currently unused - we can take it */ + rcu_read_unlock(); + return 0; + } + + /* socket is in use. We need to understand if it's owned by this ovpn + * instance or by something else. + * In the former case, we can increase the refcounter and happily + * use it, because the same UDP socket is expected to be shared among + * different peers. + * + * Unlikely TCP, a single UDP socket can be used to talk to many remote + * hosts and therefore openvpn instantiates one only for all its peers + */ + if ((READ_ONCE(udp_sk(sock->sk)->encap_type) == UDP_ENCAP_OVPNINUDP) && + old_data->ovpn == ovpn) { + netdev_dbg(ovpn->dev, + "provided socket already owned by this interface\n"); + ret = -EALREADY; + } else { + netdev_dbg(ovpn->dev, + "provided socket already taken by other user\n"); + ret = -EBUSY; + } + rcu_read_unlock(); + + return ret; +} + +/** + * ovpn_udp_socket_detach - clean udp-tunnel status for this socket + * @sock: the socket to clean + */ +void ovpn_udp_socket_detach(struct socket *sock) +{ +} diff --git a/drivers/net/ovpn/udp.h b/drivers/net/ovpn/udp.h new file mode 100644 index 0000000000000000000000000000000000000000..42e6cccf0fe3958c7742af68b4edfbabc0132f12 --- /dev/null +++ b/drivers/net/ovpn/udp.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* OpenVPN data channel offload + * + * Copyright (C) 2019-2024 OpenVPN, Inc. + * + * Author: Antonio Quartulli <antonio@openvpn.net> + */ + +#ifndef _NET_OVPN_UDP_H_ +#define _NET_OVPN_UDP_H_ + +struct ovpn_priv; +struct socket; + +int ovpn_udp_socket_attach(struct socket *sock, struct ovpn_priv *ovpn); +void ovpn_udp_socket_detach(struct socket *sock); + +#endif /* _NET_OVPN_UDP_H_ */ diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h index d85d671deed3c78f6969189281b9083dcac000c6..edca3e430305a6bffc34e617421f1f3071582e69 100644 --- a/include/uapi/linux/udp.h +++ b/include/uapi/linux/udp.h @@ -43,5 +43,6 @@ struct udphdr { #define UDP_ENCAP_GTP1U 5 /* 3GPP TS 29.060 */ #define UDP_ENCAP_RXRPC 6 #define TCP_ENCAP_ESPINTCP 7 /* Yikes, this is really xfrm encap types. */ +#define UDP_ENCAP_OVPNINUDP 8 /* OpenVPN traffic */ #endif /* _UAPI_LINUX_UDP_H */
This specific structure is used in the ovpn kernel module to wrap and carry around a standard kernel socket. ovpn takes ownership of passed sockets and therefore an ovpn specific objects is attached to them for status tracking purposes. Initially only UDP support is introduced. TCP will come in a later patch. Cc: willemdebruijn.kernel@gmail.com Signed-off-by: Antonio Quartulli <antonio@openvpn.net> --- drivers/net/ovpn/Makefile | 2 + drivers/net/ovpn/socket.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++ drivers/net/ovpn/socket.h | 38 +++++++++++ drivers/net/ovpn/udp.c | 73 +++++++++++++++++++++ drivers/net/ovpn/udp.h | 18 +++++ include/uapi/linux/udp.h | 1 + 6 files changed, 295 insertions(+)