Message ID | 20230821214523.720206-1-jrife@google.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 0bdf399342c5acbd817c9098b6c7ed21f1974312 |
Delegated to: | Netdev Maintainers |
Series | [net-next,v3] net: Avoid address overwrite in kernel_connect |
Hello:

This patch was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Mon, 21 Aug 2023 16:45:23 -0500 you wrote:
> BPF programs that run on connect can rewrite the connect address. For
> the connect system call this isn't a problem, because a copy of the address
> is made when it is moved into kernel space. However, kernel_connect
> simply passes through the address it is given, so the caller may observe
> its address value unexpectedly change.
>
> A practical example where this is problematic is where NFS is combined
> with a system such as Cilium which implements BPF-based load balancing.
> A common pattern in software-defined storage systems is to have an NFS
> mount that connects to a persistent virtual IP which in turn maps to an
> ephemeral server IP. This is usually done to achieve high availability:
> if your server goes down you can quickly spin up a replacement and remap
> the virtual IP to that endpoint. With BPF-based load balancing, mounts
> will forget the virtual IP address when the address rewrite occurs
> because a pointer to the only copy of that address is passed down the
> stack. Server failover then breaks, because clients have forgotten the
> virtual IP address. Reconnects fail and mounts remain broken. This patch
> was tested by setting up a scenario like this and ensuring that NFS
> reconnects worked after applying the patch.
>
> [...]

Here is the summary with links:
  - [net-next,v3] net: Avoid address overwrite in kernel_connect
    https://git.kernel.org/netdev/net-next/c/0bdf399342c5

You are awesome, thank you!
diff --git a/net/socket.c b/net/socket.c
index fdb5233bf560c..848116d06b511 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3567,7 +3567,12 @@ EXPORT_SYMBOL(kernel_accept);
 int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags)
 {
-	return READ_ONCE(sock->ops)->connect(sock, addr, addrlen, flags);
+	struct sockaddr_storage address;
+
+	memcpy(&address, addr, addrlen);
+
+	return READ_ONCE(sock->ops)->connect(sock, (struct sockaddr *)&address,
+					     addrlen, flags);
 }
 EXPORT_SYMBOL(kernel_connect);
 
BPF programs that run on connect can rewrite the connect address. For
the connect system call this isn't a problem, because a copy of the address
is made when it is moved into kernel space. However, kernel_connect
simply passes through the address it is given, so the caller may observe
its address value unexpectedly change.

A practical example where this is problematic is where NFS is combined
with a system such as Cilium which implements BPF-based load balancing.
A common pattern in software-defined storage systems is to have an NFS
mount that connects to a persistent virtual IP which in turn maps to an
ephemeral server IP. This is usually done to achieve high availability:
if your server goes down you can quickly spin up a replacement and remap
the virtual IP to that endpoint. With BPF-based load balancing, mounts
will forget the virtual IP address when the address rewrite occurs
because a pointer to the only copy of that address is passed down the
stack. Server failover then breaks, because clients have forgotten the
virtual IP address. Reconnects fail and mounts remain broken. This patch
was tested by setting up a scenario like this and ensuring that NFS
reconnects worked after applying the patch.

Signed-off-by: Jordan Rife <jrife@google.com>
---
V2 -> V3: Broke up long line
V1 -> V2: Rebased on net-next

 net/socket.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
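
The aliasing problem described in the commit message can be illustrated with a small userspace sketch. This is not kernel code. `fake_connect_hook()` is a hypothetical stand-in for a BPF connect program that rewrites the destination, and `connect_passthrough()` / `connect_with_copy()` are illustrative names that loosely mirror kernel_connect before and after the patch. Both IP addresses are made-up documentation addresses. The length check in `connect_with_copy()` is an addition of this sketch and is not part of the patch above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#define VIP     "192.0.2.10"   /* hypothetical persistent virtual IP */
#define BACKEND "198.51.100.7" /* hypothetical ephemeral server IP */

/* Hypothetical stand-in for a BPF connect hook: rewrites the address in place. */
int fake_connect_hook(struct sockaddr *addr, int addrlen)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)addr;

	if (addrlen >= (int)sizeof(*sin) && sin->sin_family == AF_INET)
		inet_pton(AF_INET, BACKEND, &sin->sin_addr);
	return 0;
}

/* Pre-patch shape: the caller's buffer is handed straight down the stack. */
int connect_passthrough(struct sockaddr *addr, int addrlen)
{
	return fake_connect_hook(addr, addrlen);
}

/* Post-patch shape: the hook only ever sees a stack-local copy. */
int connect_with_copy(struct sockaddr *addr, int addrlen)
{
	struct sockaddr_storage address;

	/* Bounds check added for this sketch only; the patch does not have it. */
	if (addrlen < 0 || (size_t)addrlen > sizeof(address))
		return -1;

	memcpy(&address, addr, (size_t)addrlen);
	return fake_connect_hook((struct sockaddr *)&address, addrlen);
}

/* Fill *sin with the virtual IP, as a caller such as an NFS mount would. */
void make_vip(struct sockaddr_in *sin)
{
	int rc;

	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(2049);
	rc = inet_pton(AF_INET, VIP, &sin->sin_addr);
	assert(rc == 1);
	(void)rc;
}

/* Returns 1 if *sin still holds the virtual IP, 0 otherwise. */
int is_vip(const struct sockaddr_in *sin)
{
	struct in_addr vip;

	if (inet_pton(AF_INET, VIP, &vip) != 1)
		return 0;
	return sin->sin_addr.s_addr == vip.s_addr;
}
```

Calling `make_vip()` and then `connect_passthrough()` leaves the caller's `sockaddr_in` holding the backend address, which is the "forgotten virtual IP" case from the commit message. The same sequence with `connect_with_copy()` leaves the caller's address untouched, so a later reconnect would still target the virtual IP.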