
[net-next,v3] net: Avoid address overwrite in kernel_connect

Message ID 20230821214523.720206-1-jrife@google.com (mailing list archive)
State Accepted
Commit 0bdf399342c5acbd817c9098b6c7ed21f1974312
Delegated to: Netdev Maintainers
Series [net-next,v3] net: Avoid address overwrite in kernel_connect

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1333 this patch: 1333
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 1353 this patch: 1353
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1356 this patch: 1356
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 13 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Jordan Rife Aug. 21, 2023, 9:45 p.m. UTC
BPF programs that run on connect can rewrite the connect address. For
the connect system call this isn't a problem, because a copy of the address
is made when it is moved into kernel space. However, kernel_connect
simply passes through the address it is given, so the caller may observe
its address value unexpectedly change.

A practical example where this is problematic is where NFS is combined
with a system such as Cilium which implements BPF-based load balancing.
A common pattern in software-defined storage systems is to have an NFS
mount that connects to a persistent virtual IP which in turn maps to an
ephemeral server IP. This is usually done to achieve high availability:
if your server goes down you can quickly spin up a replacement and remap
the virtual IP to that endpoint. With BPF-based load balancing, mounts
will forget the virtual IP address when the address rewrite occurs
because a pointer to the only copy of that address is passed down the
stack. Server failover then breaks, because clients have forgotten the
virtual IP address. Reconnects fail and mounts remain broken. This patch
was tested by setting up a scenario like this and ensuring that NFS
reconnects worked after applying the patch.

Signed-off-by: Jordan Rife <jrife@google.com>
---
V2 -> V3: Broke up long line
V1 -> V2: Rebased on net-next

 net/socket.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

patchwork-bot+netdevbpf@kernel.org Aug. 23, 2023, 8:50 a.m. UTC | #1
Hello:

This patch was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Mon, 21 Aug 2023 16:45:23 -0500 you wrote:
> BPF programs that run on connect can rewrite the connect address. For
> the connect system call this isn't a problem, because a copy of the address
> is made when it is moved into kernel space. However, kernel_connect
> simply passes through the address it is given, so the caller may observe
> its address value unexpectedly change.
> 
> A practical example where this is problematic is where NFS is combined
> with a system such as Cilium which implements BPF-based load balancing.
> A common pattern in software-defined storage systems is to have an NFS
> mount that connects to a persistent virtual IP which in turn maps to an
> ephemeral server IP. This is usually done to achieve high availability:
> if your server goes down you can quickly spin up a replacement and remap
> the virtual IP to that endpoint. With BPF-based load balancing, mounts
> will forget the virtual IP address when the address rewrite occurs
> because a pointer to the only copy of that address is passed down the
> stack. Server failover then breaks, because clients have forgotten the
> virtual IP address. Reconnects fail and mounts remain broken. This patch
> was tested by setting up a scenario like this and ensuring that NFS
> reconnects worked after applying the patch.
> 
> [...]

Here is the summary with links:
  - [net-next,v3] net: Avoid address overwrite in kernel_connect
    https://git.kernel.org/netdev/net-next/c/0bdf399342c5

You are awesome, thank you!

Patch

diff --git a/net/socket.c b/net/socket.c
index fdb5233bf560c..848116d06b511 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3567,7 +3567,12 @@  EXPORT_SYMBOL(kernel_accept);
 int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags)
 {
-	return READ_ONCE(sock->ops)->connect(sock, addr, addrlen, flags);
+	struct sockaddr_storage address;
+
+	memcpy(&address, addr, addrlen);
+
+	return READ_ONCE(sock->ops)->connect(sock, (struct sockaddr *)&address,
+					     addrlen, flags);
 }
 EXPORT_SYMBOL(kernel_connect);
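
The effect of the change can be illustrated outside the kernel. The following is a minimal user-space sketch, not kernel code: `rewrite_hook()` and `fake_ops_connect()` are hypothetical stand-ins for a BPF connect hook and for `sock->ops->connect()`, and the two addresses are made-up example values. `connect_passthrough()` mirrors the pre-patch behaviour and `connect_with_copy()` mirrors the patched one. The length check in `connect_with_copy()` is an addition made for this sketch only and is not part of the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define VIP_ADDR     0xC0A80001u /* 192.168.0.1, example virtual IP */
#define BACKEND_ADDR 0x0A000063u /* 10.0.0.99, example backend IP */

/* Hypothetical stand-in for a BPF connect hook: rewrites the destination. */
static void rewrite_hook(struct sockaddr *addr)
{
	if (addr->sa_family == AF_INET) {
		struct sockaddr_in *in = (struct sockaddr_in *)addr;

		in->sin_addr.s_addr = htonl(BACKEND_ADDR);
	}
}

/* Hypothetical stand-in for sock->ops->connect(): runs the hook in place. */
static int fake_ops_connect(struct sockaddr *addr, int addrlen)
{
	(void)addrlen;
	rewrite_hook(addr);
	return 0;
}

/* Pre-patch shape: the caller's buffer is handed straight down the stack. */
static int connect_passthrough(struct sockaddr *addr, int addrlen)
{
	return fake_ops_connect(addr, addrlen);
}

/* Post-patch shape: the rewrite lands on a local copy instead. */
static int connect_with_copy(struct sockaddr *addr, int addrlen)
{
	struct sockaddr_storage address;

	/* Sketch-only guard; the patch itself copies addrlen bytes as-is. */
	if (addrlen < 0 || (size_t)addrlen > sizeof(address))
		return -1;

	memcpy(&address, addr, (size_t)addrlen);
	return fake_ops_connect((struct sockaddr *)&address, addrlen);
}

/* Build an IPv4 address pointing at the example virtual IP. */
static struct sockaddr_in make_vip(void)
{
	struct sockaddr_in in;

	memset(&in, 0, sizeof(in));
	in.sin_family = AF_INET;
	in.sin_port = htons(2049);
	in.sin_addr.s_addr = htonl(VIP_ADDR);
	return in;
}

/* Returns 0 if pass-through clobbers the caller's address and copy keeps it. */
static int demo(void)
{
	struct sockaddr_in a = make_vip();
	struct sockaddr_in b = make_vip();

	connect_passthrough((struct sockaddr *)&a, sizeof(a));
	connect_with_copy((struct sockaddr *)&b, sizeof(b));

	return (a.sin_addr.s_addr == htonl(BACKEND_ADDR) &&
		b.sin_addr.s_addr == htonl(VIP_ADDR)) ? 0 : 1;
}
```

Calling `demo()` from a small `main()` exercises both paths: after `connect_passthrough()` the caller's buffer holds the backend address, which is the situation the NFS mount ends up in, while after `connect_with_copy()` it still holds the virtual IP and can be reused for a reconnect.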