diff mbox series

GPF in net subsystem

Message ID 20210505200242.31d58452@gmail.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Guessed tree name to be net-next
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cc_maintainers warning 1 maintainers not CCed: linux-hams@vger.kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff fail Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 10 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link

Commit Message

Pavel Skripkin May 5, 2021, 5:02 p.m. UTC
Hi, netdev developers!

I've spent some time debugging this bug
https://syzkaller.appspot.com/bug?id=c670fb9da2ce08f7b5101baa9426083b39ee9f90
and I believe I have found the root cause:

static int nr_accept(struct socket *sock, struct socket *newsock, int flags,
		     bool kern)
{
....
	for (;;) {
		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
		...
		if (!signal_pending(current)) {
			release_sock(sk);
			schedule();
			lock_sock(sk);
			continue;
		}
		...
	}
...
}

While the calling process is scheduled out, another process can release
this socket and set sk->sk_wq to NULL (nr_release() calls
sock_orphan(sk) in this case). The GPF then happens in the next call to
prepare_to_wait().

I came up with this patch, but I'm not an expert in the netdev subsystem
and I'm not sure about it:


I look forward to hearing your perspective on this :)


BTW, I found similar code in:

1) net/ax25/af_ax25.c
2) net/rose/af_rose.c


I hope this helps!

With regards,
Pavel Skripkin

Comments

Jakub Kicinski May 6, 2021, 10:09 p.m. UTC | #1
On Wed, 5 May 2021 20:02:42 +0300 Pavel Skripkin wrote:
> Hi, netdev developers!
> 
> I've spent some time debugging this bug
> https://syzkaller.appspot.com/bug?id=c670fb9da2ce08f7b5101baa9426083b39ee9f90
> and I believe I have found the root cause:
> 
> static int nr_accept(struct socket *sock, struct socket *newsock, int flags,
> 		     bool kern)
> {
> ....
> 	for (;;) {
> 		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
> 		...
> 		if (!signal_pending(current)) {
> 			release_sock(sk);
> 			schedule();
> 			lock_sock(sk);
> 			continue;
> 		}
> 		...
> 	}
> ...
> }
> 
> While the calling process is scheduled out, another process can release
> this socket and set sk->sk_wq to NULL (nr_release() calls
> sock_orphan(sk) in this case). The GPF then happens in the next call to
> prepare_to_wait().

How does it get released midway through an accept call?
Is there a reference imbalance somewhere else in the code?

> I came up with this patch, but I'm not an expert in the netdev subsystem
> and I'm not sure about it:
> 
> diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
> index 6d16e1ab1a8a..89ceddea48e8 100644
> --- a/net/netrom/af_netrom.c
> +++ b/net/netrom/af_netrom.c
> @@ -803,6 +803,10 @@ static int nr_accept(struct socket *sock, struct socket *newsock, int flags,
>  			release_sock(sk);
>  			schedule();
>  			lock_sock(sk);
> +			if (sock_flag(sk, SOCK_DEAD)) {
> +				err = -ECONNABORTED;
> +				goto out_release;
> +			}
>  			continue;
>  		}
>  		err = -ERESTARTSYS;
> 
> I look forward to hearing your perspective on this :)
> 
> 
> BTW, I found similar code in:
> 
> 1) net/ax25/af_ax25.c
> 2) net/rose/af_rose.c
> 
> 
> I hope this helps!
> 
> With regards,
> Pavel Skripkin
Cong Wang May 7, 2021, 12:40 a.m. UTC | #2
On Wed, May 5, 2021 at 10:36 AM Pavel Skripkin <paskripkin@gmail.com> wrote:
>
> Hi, netdev developers!
>
> I've spent some time debugging this bug
> https://syzkaller.appspot.com/bug?id=c670fb9da2ce08f7b5101baa9426083b39ee9f90
> and I believe I have found the root cause:
>
> static int nr_accept(struct socket *sock, struct socket *newsock, int flags,
>                      bool kern)
> {
> ....
>         for (;;) {
>                 prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
>                 ...
>                 if (!signal_pending(current)) {
>                         release_sock(sk);
>                         schedule();
>                         lock_sock(sk);
>                         continue;
>                 }
>                 ...
>         }
> ...
> }
>
> While the calling process is scheduled out, another process can release
> this socket and set sk->sk_wq to NULL (nr_release() calls
> sock_orphan(sk) in this case). The GPF then happens in the next call to
> prepare_to_wait().

Are you sure?

How could another process release this socket while its fd is still
refcnt'ed? That is, accept() has not returned yet at the point of
schedule().

Also, the above pattern is pretty common in the networking subsystem,
see sk_wait_event(), so how come it is only problematic for netrom?

Thanks.

Patch

diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 6d16e1ab1a8a..89ceddea48e8 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -803,6 +803,10 @@  static int nr_accept(struct socket *sock, struct socket *newsock, int flags,
 			release_sock(sk);
 			schedule();
 			lock_sock(sk);
+			if (sock_flag(sk, SOCK_DEAD)) {
+				err = -ECONNABORTED;
+				goto out_release;
+			}
 			continue;
 		}
 		err = -ERESTARTSYS;
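
For readers who want to see the effect of the added check in isolation,
here is a minimal userspace sketch of the wait loop. It is not kernel
code and it does not show that the race is reachable (both replies above
question that). Every name in it (mock_sock, mock_wq, mock_accept,
hook_release, hook_connect) is hypothetical; -EFAULT merely stands in
for the GPF that a NULL sk->sk_wq would cause in prepare_to_wait().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel structures, not the real definitions. */
struct mock_wq { int waiters; };

struct mock_sock {
	bool dead;          /* models the SOCK_DEAD flag */
	struct mock_wq *wq; /* models sk->sk_wq */
	int pending;        /* models connections queued for accept */
};

/* Called at the point where the kernel loop would call schedule(). */
typedef void (*mock_sched_fn)(struct mock_sock *sk);

/* Models sock_orphan(): mark the socket dead and drop its wait queue. */
static void mock_orphan(struct mock_sock *sk)
{
	sk->dead = true;
	sk->wq = NULL;
}

/* "Other task" behaviours that may run while the lock is dropped. */
static void hook_release(struct mock_sock *sk) { mock_orphan(sk); }
static void hook_connect(struct mock_sock *sk) { sk->pending++; }

/*
 * Models the loop in nr_accept(). With check_dead set, it applies the
 * same test the patch adds after lock_sock().
 */
static int mock_accept(struct mock_sock *sk, mock_sched_fn on_schedule,
		       bool check_dead)
{
	assert(sk != NULL && on_schedule != NULL);

	for (;;) {
		/* prepare_to_wait() would dereference the wait queue here. */
		if (sk->wq == NULL)
			return -EFAULT; /* stand-in for the GPF */
		sk->wq->waiters++;

		if (sk->pending > 0) {
			sk->pending--;
			return 0;
		}

		/* release_sock(); schedule(); lock_sock(); */
		on_schedule(sk);

		if (check_dead && sk->dead)
			return -ECONNABORTED;
	}
}
```

Calling mock_accept() with hook_release and check_dead set to false
reaches the NULL wait queue on the second iteration; with check_dead set
to true the loop exits with -ECONNABORTED before touching it.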