diff mbox series

[RFC,net,1/2] datagram: Rehash sockets only if local address changed for their family

Message ID 20241114215414.3357873-2-sbrivio@redhat.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Headers show
Series Fix race between datagram socket address change and rehash | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 3 this patch: 3
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 4 maintainers not CCed: kuba@kernel.org dsahern@kernel.org horms@kernel.org pabeni@redhat.com
netdev/build_clang success Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 4 this patch: 4
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Stefano Brivio Nov. 14, 2024, 9:54 p.m. UTC
It makes no sense to rehash an IPv4 socket when we change
sk_v6_rcv_saddr, or to rehash an IPv6 socket as inet_rcv_saddr is set:
the secondary hash (including the local address) won't change, because
ipv4_portaddr_hash() and ipv6_portaddr_hash() only take the address
matching the socket family.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/ipv4/datagram.c | 2 +-
 net/ipv6/datagram.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Willem de Bruijn Nov. 15, 2024, 5:48 p.m. UTC | #1
Stefano Brivio wrote:
> It makes no sense to rehash an IPv4 socket when we change
> sk_v6_rcv_saddr, or to rehash an IPv6 socket as inet_rcv_saddr is set:
> the secondary hash (including the local address) won't change, because
> ipv4_portaddr_hash() and ipv6_portaddr_hash() only take the address
> matching the socket family.

Even if this is correct, it sounds like an optimization.
If so, it belongs in net-next.

Avoid making a fix (to net and eventually stable kernels) conditional
on optimizations that are not suitable for stable cherry-picks.

> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  net/ipv4/datagram.c | 2 +-
>  net/ipv6/datagram.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
> index cc6d0bd7b0a9..d52333e921f3 100644
> --- a/net/ipv4/datagram.c
> +++ b/net/ipv4/datagram.c
> @@ -65,7 +65,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
>  		inet->inet_saddr = fl4->saddr;	/* Update source address */
>  	if (!inet->inet_rcv_saddr) {
>  		inet->inet_rcv_saddr = fl4->saddr;
> -		if (sk->sk_prot->rehash)
> +		if (sk->sk_prot->rehash && sk->sk_family == AF_INET)
>  			sk->sk_prot->rehash(sk);

When is sk_family != AF_INET in __ip4_datagram_connect?


>  	}
>  	inet->inet_daddr = fl4->daddr;
> diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
> index fff78496803d..5c28a11128c7 100644
> --- a/net/ipv6/datagram.c
> +++ b/net/ipv6/datagram.c
> @@ -211,7 +211,7 @@ int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
>  		    ipv6_mapped_addr_any(&sk->sk_v6_rcv_saddr)) {
>  			ipv6_addr_set_v4mapped(inet->inet_rcv_saddr,
>  					       &sk->sk_v6_rcv_saddr);
> -			if (sk->sk_prot->rehash)
> +			if (sk->sk_prot->rehash && sk->sk_family == AF_INET6)
>  				sk->sk_prot->rehash(sk);

Even if this is a v4mappedv6 address, sk_family will be AF_INET6.
Stefano Brivio Nov. 15, 2024, 6:10 p.m. UTC | #2
[Updated Mike Manning's address in Cc:]

On Fri, 15 Nov 2024 12:48:29 -0500
Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:

> Stefano Brivio wrote:
> > It makes no sense to rehash an IPv4 socket when we change
> > sk_v6_rcv_saddr, or to rehash an IPv6 socket as inet_rcv_saddr is set:
> > the secondary hash (including the local address) won't change, because
> > ipv4_portaddr_hash() and ipv6_portaddr_hash() only take the address
> > matching the socket family.  
> 
> Even if this is correct, it sounds like an optimization.

It is, see the cover letter.

> If so, it belongs in net-next.

Well, it makes the fix smaller.

> Avoid making a fix (to net and eventually stable kernels) conditional
> on optimizations that are not suitable for stable cherry-picks.

Given that the fix is for an issue that existed for 15 years, I don't
think it's stable material.

Whether it's 'net' material is also debatable, if it looks too big to
you it probably isn't, let's go for net-next even if it's a fix.

> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > ---
> >  net/ipv4/datagram.c | 2 +-
> >  net/ipv6/datagram.c | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
> > index cc6d0bd7b0a9..d52333e921f3 100644
> > --- a/net/ipv4/datagram.c
> > +++ b/net/ipv4/datagram.c
> > @@ -65,7 +65,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
> >  		inet->inet_saddr = fl4->saddr;	/* Update source address */
> >  	if (!inet->inet_rcv_saddr) {
> >  		inet->inet_rcv_saddr = fl4->saddr;
> > -		if (sk->sk_prot->rehash)
> > +		if (sk->sk_prot->rehash && sk->sk_family == AF_INET)
> >  			sk->sk_prot->rehash(sk);  
> 
> When is sk_family != AF_INET in __ip4_datagram_connect?

This happens with dual-stack sockets, that is, IPv6 sockets that don't
have IPV6_V6ONLY set, on which you connect() using an IPv4 address.

I haven't checked whether this makes sense in the bigger picture,
because trying to avoid this case is definitely beyond the scope of this
patch, but you can make it happen quite easily by simply starting a
recent Debian or Fedora with OpenSSH listening on both families
(default settings).

> 
> >  	}
> >  	inet->inet_daddr = fl4->daddr;
> > diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
> > index fff78496803d..5c28a11128c7 100644
> > --- a/net/ipv6/datagram.c
> > +++ b/net/ipv6/datagram.c
> > @@ -211,7 +211,7 @@ int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
> >  		    ipv6_mapped_addr_any(&sk->sk_v6_rcv_saddr)) {
> >  			ipv6_addr_set_v4mapped(inet->inet_rcv_saddr,
> >  					       &sk->sk_v6_rcv_saddr);
> > -			if (sk->sk_prot->rehash)
> > +			if (sk->sk_prot->rehash && sk->sk_family == AF_INET6)
> >  				sk->sk_prot->rehash(sk);  
> 
> Even if this is a v4mappedv6 address, sk_family will be AF_INET6.

I think we could have a case that's symmetric to the one above, even
though I haven't reproduced this one (but I didn't try hard).
Stefano Brivio Nov. 15, 2024, 6:23 p.m. UTC | #3
On Fri, 15 Nov 2024 19:10:24 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:

> [Updated Mike Manning's address in Cc:]
> 
> On Fri, 15 Nov 2024 12:48:29 -0500
> Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> 
> > Stefano Brivio wrote:  
> > > It makes no sense to rehash an IPv4 socket when we change
> > > sk_v6_rcv_saddr, or to rehash an IPv6 socket as inet_rcv_saddr is set:
> > > the secondary hash (including the local address) won't change, because
> > > ipv4_portaddr_hash() and ipv6_portaddr_hash() only take the address
> > > matching the socket family.    
> > 
> > Even if this is correct, it sounds like an optimization.  
> 
> It is, see the cover letter.
> 
> > If so, it belongs in net-next.  
> 
> Well, it makes the fix smaller.
> 
> > Avoid making a fix (to net and eventually stable kernels) conditional
> > on optimizations that are not suitable for stable cherry-picks.  
> 
> Given that the fix is for an issue that existed for 15 years, I don't
> think it's stable material.
> 
> Whether it's 'net' material is also debatable, if it looks too big to
> you it probably isn't, let's go for net-next even if it's a fix.
> 
> > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > > ---
> > >  net/ipv4/datagram.c | 2 +-
> > >  net/ipv6/datagram.c | 2 +-
> > >  2 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
> > > index cc6d0bd7b0a9..d52333e921f3 100644
> > > --- a/net/ipv4/datagram.c
> > > +++ b/net/ipv4/datagram.c
> > > @@ -65,7 +65,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
> > >  		inet->inet_saddr = fl4->saddr;	/* Update source address */
> > >  	if (!inet->inet_rcv_saddr) {
> > >  		inet->inet_rcv_saddr = fl4->saddr;
> > > -		if (sk->sk_prot->rehash)
> > > +		if (sk->sk_prot->rehash && sk->sk_family == AF_INET)
> > >  			sk->sk_prot->rehash(sk);    
> > 
> > When is sk_family != AF_INET in __ip4_datagram_connect?  
> 
> This happens with dual-stack sockets, that is, IPv6 sockets that don't
> have IPV6_V6ONLY set, on which you connect() using an IPv4 address.
> 
> I haven't checked whether this makes sense in the bigger picture,
> because trying to avoid this case is definitely beyond the scope of this
> patch, but you can make it happen quite easily by simply starting a
> recent Debian or Fedora with OpenSSH listening on both families
> (default settings).

Ah, sorry, it's the other way around: the v4 rehash is called on a
AF_INET6 socket in that case.

I can have a look at what I can reproduce with several combinations,
even though I wonder if it isn't just more robust this way (given 2/2).
Stefano Brivio Nov. 19, 2024, 12:33 p.m. UTC | #4
On Fri, 15 Nov 2024 19:23:42 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:

> > On Fri, 15 Nov 2024 12:48:29 -0500
> > Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> >   
> > > Stefano Brivio wrote:    
> > >
> > > [...]
> > >
> > > > diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
> > > > index cc6d0bd7b0a9..d52333e921f3 100644
> > > > --- a/net/ipv4/datagram.c
> > > > +++ b/net/ipv4/datagram.c
> > > > @@ -65,7 +65,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
> > > >  		inet->inet_saddr = fl4->saddr;	/* Update source address */
> > > >  	if (!inet->inet_rcv_saddr) {
> > > >  		inet->inet_rcv_saddr = fl4->saddr;
> > > > -		if (sk->sk_prot->rehash)
> > > > +		if (sk->sk_prot->rehash && sk->sk_family == AF_INET)
> > > >  			sk->sk_prot->rehash(sk);      
> > > 
> > > When is sk_family != AF_INET in __ip4_datagram_connect?    

So, this is the only "mismatching" case (by design) I can actually
reproduce. Long story short:

int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
			   int addr_len)
{

	[...]

	if (usin->sin6_family == AF_INET) {
		if (ipv6_only_sock(sk))
			return -EAFNOSUPPORT;
		err = __ip4_datagram_connect(sk, uaddr, addr_len);

		[...]

here we (intentionally) call __ip4_datagram_connect() on an AF_INET6
socket because we're connecting a dual-stack socket to an IPv4 address.

This happens for me with sshd (from OpenSSH) doing getaddrinfo() at
boot, it's some DNS stuff, but I didn't trace it all the way. You can
also reproduce it with:

  socat UDP6-LISTEN:1337,null-eof STDOUT & { sleep 1; : | socat STDIN UDP4:0:1337,shut-null; }

All in all, I would keep those checks. Even if this is the only case
we currently see, the assumptions __ip4_datagram_connect() <-> AF_INET
and __ip6_datagram_connect() <-> AF_INET6 don't hold in general.

Or do you find them exceedingly verbose / harmful for any other reason?

I would also make it clearer in the commit message why we need them
in the next patch (once net-next reopens).
Willem de Bruijn Nov. 19, 2024, 2:54 p.m. UTC | #5
Stefano Brivio wrote:
> On Fri, 15 Nov 2024 19:23:42 +0100
> Stefano Brivio <sbrivio@redhat.com> wrote:
> 
> > > On Fri, 15 Nov 2024 12:48:29 -0500
> > > Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> > >   
> > > > Stefano Brivio wrote:    
> > > >
> > > > [...]
> > > >
> > > > > diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
> > > > > index cc6d0bd7b0a9..d52333e921f3 100644
> > > > > --- a/net/ipv4/datagram.c
> > > > > +++ b/net/ipv4/datagram.c
> > > > > @@ -65,7 +65,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
> > > > >  		inet->inet_saddr = fl4->saddr;	/* Update source address */
> > > > >  	if (!inet->inet_rcv_saddr) {
> > > > >  		inet->inet_rcv_saddr = fl4->saddr;
> > > > > -		if (sk->sk_prot->rehash)
> > > > > +		if (sk->sk_prot->rehash && sk->sk_family == AF_INET)
> > > > >  			sk->sk_prot->rehash(sk);      
> > > > 
> > > > When is sk_family != AF_INET in __ip4_datagram_connect?    
> 
> So, this is the only "mismatching" case (by design) I can actually
> reproduce. Long story short:
> 
> int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
> 			   int addr_len)
> {
> 
> 	[...]
> 
> 	if (usin->sin6_family == AF_INET) {
> 		if (ipv6_only_sock(sk))
> 			return -EAFNOSUPPORT;
> 		err = __ip4_datagram_connect(sk, uaddr, addr_len);
> 
> 		[...]
> 
> here we (intentionally) call __ip4_datagram_connect() on an AF_INET6
> socket because we're connecting a dual-stack socket to an IPv4 address.
> 
> This happens for me with sshd (from OpenSSH) doing getaddrinfo() at
> boot, it's some DNS stuff, but I didn't trace it all the way. You can
> also reproduce it with:
> 
>   socat UDP6-LISTEN:1337,null-eof STDOUT & { sleep 1; : | socat STDIN UDP4:0:1337,shut-null; }
> 
> All in all, I would keep those checks. Even if this is the only case
> we currently see, the assumptions __ip4_datagram_connect() <-> AF_INET
> and __ip6_datagram_connect() <-> AF_INET6 don't hold in general.
> 
> Or do you find them exceedingly verbose / harmful for any other reason?
> 
> I would also make it clearer in the commit message why we need them
> in the next patch (once net-next reopens).

Thanks for that context. Sounds good to me.
diff mbox series

Patch

diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
index cc6d0bd7b0a9..d52333e921f3 100644
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -65,7 +65,7 @@  int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
 		inet->inet_saddr = fl4->saddr;	/* Update source address */
 	if (!inet->inet_rcv_saddr) {
 		inet->inet_rcv_saddr = fl4->saddr;
-		if (sk->sk_prot->rehash)
+		if (sk->sk_prot->rehash && sk->sk_family == AF_INET)
 			sk->sk_prot->rehash(sk);
 	}
 	inet->inet_daddr = fl4->daddr;
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index fff78496803d..5c28a11128c7 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -211,7 +211,7 @@  int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
 		    ipv6_mapped_addr_any(&sk->sk_v6_rcv_saddr)) {
 			ipv6_addr_set_v4mapped(inet->inet_rcv_saddr,
 					       &sk->sk_v6_rcv_saddr);
-			if (sk->sk_prot->rehash)
+			if (sk->sk_prot->rehash && sk->sk_family == AF_INET6)
 				sk->sk_prot->rehash(sk);
 		}