[v2,1/8] net: explicitly clear the sk pointer, when pf->create fails

Message ID: 20241007213502.28183-2-ignat@cloudflare.com
State: Handled Elsewhere
Series: do not leave dangling sk pointers in pf->create functions

Commit Message

Ignat Korchagin Oct. 7, 2024, 9:34 p.m. UTC
We have recently noticed the exact same KASAN splat as in commit
6cd4a78d962b ("net: do not leave a dangling sk pointer, when socket
creation fails"). That commit did not fully address the problem, as
some pf->create implementations do not use sk_common_release in their
error paths.

For example, we can use the same reproducer as in the above commit, but
changing ping to arping. arping uses AF_PACKET socket and if packet_create
fails, it will just sk_free the allocated sk object.

While we could chase all the pf->create implementations and make sure they
clear the socket's pointer to the freed sk object on error, we can't
guarantee future protocols will not make the same mistake.

So it is easier to just explicitly NULL the sk pointer upon return from
pf->create in __sock_create. We do know that pf->create always releases the
allocated sk object on error, so if the pointer is not NULL, it is
definitely dangling.

Fixes: 6cd4a78d962b ("net: do not leave a dangling sk pointer, when socket creation fails")
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Cc: stable@vger.kernel.org
---
 net/core/sock.c | 3 ---
 net/socket.c    | 7 ++++++-
 2 files changed, 6 insertions(+), 4 deletions(-)
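
The reasoning in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: toy_sock, toy_socket, toy_create_ok, toy_create_fail and toy_sock_create are hypothetical stand-ins for struct sock, struct socket, a pf->create implementation and __sock_create. The failing create frees the object without clearing the pointer it published (as the commit message describes for packet_create using sk_free), and the caller clears the pointer in one place.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct sock and struct socket. */
struct toy_sock { int unused; };
struct toy_socket { struct toy_sock *sk; };

/* Models a ->create that succeeds and leaves a valid sk attached. */
static int toy_create_ok(struct toy_socket *sock)
{
	struct toy_sock *sk = malloc(sizeof(*sk));

	if (!sk)
		return -ENOMEM;
	sock->sk = sk;
	return 0;
}

/* Models a ->create that frees sk on error but leaves sock->sk set. */
static int toy_create_fail(struct toy_socket *sock)
{
	struct toy_sock *sk = malloc(sizeof(*sk));

	if (!sk)
		return -ENOMEM;
	sock->sk = sk;	/* pointer is published to the socket */
	free(sk);	/* error path frees the object ... */
	return -EPERM;	/* ... and sock->sk is now dangling */
}

/* Models the caller-side fix: clear the pointer on any create error. */
static int toy_sock_create(struct toy_socket *sock,
			   int (*create)(struct toy_socket *))
{
	int err = create(sock);

	if (err < 0) {
		/* create released the object; drop the stale pointer */
		sock->sk = NULL;
		return err;
	}
	return 0;
}
```

Calling toy_sock_create(&s, toy_create_fail) returns -EPERM and leaves s.sk set to NULL no matter how the create callback cleaned up, which is why the fix does not depend on each protocol getting its error path right.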

Comments

Kuniyuki Iwashima Oct. 7, 2024, 9:47 p.m. UTC | #1
From: Ignat Korchagin <ignat@cloudflare.com>
Date: Mon,  7 Oct 2024 22:34:55 +0100
> We have recently noticed the exact same KASAN splat as in commit
> 6cd4a78d962b ("net: do not leave a dangling sk pointer, when socket
> creation fails"). The problem is that commit did not fully address the
> problem, as some pf->create implementations do not use sk_common_release
> in their error paths.
> 
> For example, we can use the same reproducer as in the above commit, but
> changing ping to arping. arping uses AF_PACKET socket and if packet_create
> fails, it will just sk_free the allocated sk object.
> 
> While we could chase all the pf->create implementations and make sure they
> NULL the freed sk object on error from the socket, we can't guarantee
> future protocols will not make the same mistake.
> 
> So it is easier to just explicitly NULL the sk pointer upon return from
> pf->create in __sock_create. We do know that pf->create always releases the
> allocated sk object on error, so if the pointer is not NULL, it is
> definitely dangling.
> 
> Fixes: 6cd4a78d962b ("net: do not leave a dangling sk pointer, when socket creation fails")
> Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

Jakub Kicinski Oct. 8, 2024, 12:20 a.m. UTC | #2
On Mon,  7 Oct 2024 22:34:55 +0100 Ignat Korchagin wrote:
> diff --git a/net/socket.c b/net/socket.c
> index 601ad74930ef..042451f01c65 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -1574,8 +1574,13 @@ int __sock_create(struct net *net, int family, int type, int protocol,
>  	rcu_read_unlock();
>  
>  	err = pf->create(net, sock, protocol, kern);
> -	if (err < 0)
> +	if (err < 0) {
> +		/* ->create should release the allocated sock->sk object on error
> +		 * but it may leave the dangling pointer
> +		 */
> +		sock->sk = NULL;
>  		goto out_module_put;
> +	}

This chunk is already in net, as part of the fix you posted earlier.
Please resend the cleanup portion with the other patches for net-next
on Friday (IOW after net -> net-next merge).

Patch

diff --git a/net/core/sock.c b/net/core/sock.c
index 039be95c40cf..e6e04081949c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3819,9 +3819,6 @@  void sk_common_release(struct sock *sk)
 
 	sk->sk_prot->unhash(sk);
 
-	if (sk->sk_socket)
-		sk->sk_socket->sk = NULL;
-
 	/*
 	 * In this point socket cannot receive new packets, but it is possible
 	 * that some packets are in flight because some CPU runs receiver and
diff --git a/net/socket.c b/net/socket.c
index 601ad74930ef..042451f01c65 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1574,8 +1574,13 @@  int __sock_create(struct net *net, int family, int type, int protocol,
 	rcu_read_unlock();
 
 	err = pf->create(net, sock, protocol, kern);
-	if (err < 0)
+	if (err < 0) {
+		/* ->create should release the allocated sock->sk object on error
+		 * but it may leave the dangling pointer
+		 */
+		sock->sk = NULL;
 		goto out_module_put;
+	}
 
 	/*
 	 * Now to bump the refcnt of the [loadable] module that owns this