[net-next,v2] l2tp: fix possible UAF when cleaning up tunnels

Message ID 20240704152508.1923908-1-jchapman@katalix.com (mailing list archive)
State Accepted
Commit f8ad00f3fb2af98f29aacd7ceb4ecdd5ad3c9a7f
Delegated to: Netdev Maintainers

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 839 this patch: 839
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 846 this patch: 846
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 846 this patch: 846
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 24 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-07-05--06-00 (tests: 695)

Commit Message

James Chapman July 4, 2024, 3:25 p.m. UTC
syzbot reported a UAF caused by a race when the L2TP work queue closes a
tunnel at the same time as a userspace thread closes a session in that
tunnel.

Tunnel cleanup is handled by a work queue which iterates through the
sessions contained within a tunnel, and closes them in turn.

Meanwhile, a userspace thread may arbitrarily close a session, either
via a netlink command or by closing the pppox socket in the case of
l2tp_ppp.

The race condition may occur when l2tp_tunnel_closeall walks the list
of sessions in the tunnel and deletes each one.  Currently this is
implemented using list_for_each_safe, but because the list spinlock is
dropped in the loop body it's possible for other threads to manipulate
the list during list_for_each_safe's list walk.  This can corrupt the
list iterator, causing list_for_each_safe to spin forever.  One
sequence of events which may lead to this is as follows:

 * A tunnel is created, containing two sessions A and B.
 * A thread closes the tunnel, triggering tunnel cleanup via the work
   queue.
 * l2tp_tunnel_closeall runs in the context of the work queue.  It
   removes session A from the tunnel session list, then drops the list
   lock.  At this point the list_for_each_safe temporary variable is
   pointing to the other session on the list, which is session B, and
   the list can be manipulated by other threads since the list lock has
   been released.
 * Userspace closes session B, which removes the session from its parent
   tunnel via l2tp_session_delete.  Since l2tp_tunnel_closeall has
   released the tunnel list lock, l2tp_session_delete is able to call
   list_del_init on the session B list node.
 * Back on the work queue, l2tp_tunnel_closeall resumes execution and
   will now spin forever on the same list entry until the underlying
   session structure is freed, at which point UAF occurs.
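
The underlying issue is that list_for_each_safe caches the next
pointer at the top of each iteration; from include/linux/list.h:

	#define list_for_each_safe(pos, n, head) \
		for (pos = (head)->next, n = pos->next; pos != (head); \
		     pos = n, n = pos->next)

Once the list lock is dropped, nothing prevents another thread from
removing the entry that the cached next pointer refers to, so the walk
resumes on a stale iterator.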

The solution is to iterate over the tunnel's session list using
list_first_entry_or_null to avoid the possibility of the list iterator
pointing at a list item which may be removed during the walk.
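
For reference, list_first_entry_or_null re-reads the list head on
every call rather than relying on state cached across iterations; from
include/linux/list.h:

	#define list_first_entry_or_null(ptr, type, member) ({ \
		struct list_head *head__ = (ptr); \
		struct list_head *pos__ = READ_ONCE(head__->next); \
		pos__ != head__ ? list_entry(pos__, type, member) : NULL; \
	})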

Also, have l2tp_tunnel_closeall ref each session while it processes it
to prevent another thread from freeing it.

	cpu1				cpu2
	---				---
					pppol2tp_release()

	spin_lock_bh(&tunnel->list_lock);
	for (;;) {
		session = list_first_entry_or_null(&tunnel->session_list,
						   struct l2tp_session, list);
		if (!session)
			break;
		list_del_init(&session->list);
		spin_unlock_bh(&tunnel->list_lock);

					l2tp_session_delete(session);

		l2tp_session_delete(session);
		spin_lock_bh(&tunnel->list_lock);
	}
	spin_unlock_bh(&tunnel->list_lock);

Calling l2tp_session_delete on the same session twice isn't a problem
per se, but if cpu2 manages to destruct the socket and drop the session
refcount to zero before cpu1 progresses then the result is a UAF.
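
For reference, l2tp_session_delete already guards against double
deletion via the session's dead flag, roughly as follows (a simplified
sketch of the logic in net/l2tp/l2tp_core.c):

	void l2tp_session_delete(struct l2tp_session *session)
	{
		if (test_and_set_bit(0, &session->dead))
			return;
		/* unhash the session and drop its initial reference */
		...
	}

so the remaining hazard is only the refcount reaching zero while the
other cpu still holds its pointer, which holding a reference across
the unlocked region prevents.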

Reported-by: syzbot+b471b7c936301a59745b@syzkaller.appspotmail.com
Reported-by: syzbot+c041b4ce3a6dfd1e63e2@syzkaller.appspotmail.com
Fixes: d18d3f0a24fc ("l2tp: replace hlist with simple list for per-tunnel session list")

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: Tom Parkin <tparkin@katalix.com>

---
v2:
  - hold session ref when processing tunnel close (Hillf Danton)
v1: https://lore.kernel.org/netdev/20240703185108.1752795-1-jchapman@katalix.com/
---
 net/l2tp/l2tp_core.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

Comments

Hillf Danton July 5, 2024, 10:32 a.m. UTC | #1
On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
>  static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
>  {
>  	struct l2tp_session *session;
> -	struct list_head *pos;
> -	struct list_head *tmp;
>  
>  	spin_lock_bh(&tunnel->list_lock);
>  	tunnel->acpt_newsess = false;
> -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
> -		session = list_entry(pos, struct l2tp_session, list);
> +	for (;;) {
> +		session = list_first_entry_or_null(&tunnel->session_list,
> +						   struct l2tp_session, list);
> +		if (!session)
> +			break;
> +		l2tp_session_inc_refcount(session);
>  		list_del_init(&session->list);
>  		spin_unlock_bh(&tunnel->list_lock);
>  		l2tp_session_delete(session);
>  		spin_lock_bh(&tunnel->list_lock);
> +		l2tp_session_dec_refcount(session);

Bumping refcount up makes it safe for the current cpu to go thru race
after releasing lock, and if it wins the race, dropping refcount makes
the peer head on uaf.
James Chapman July 8, 2024, 10:06 a.m. UTC | #2
On 05/07/2024 11:32, Hillf Danton wrote:
> On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
>> --- a/net/l2tp/l2tp_core.c
>> +++ b/net/l2tp/l2tp_core.c
>> @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
>>   static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
>>   {
>>   	struct l2tp_session *session;
>> -	struct list_head *pos;
>> -	struct list_head *tmp;
>>   
>>   	spin_lock_bh(&tunnel->list_lock);
>>   	tunnel->acpt_newsess = false;
>> -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
>> -		session = list_entry(pos, struct l2tp_session, list);
>> +	for (;;) {
>> +		session = list_first_entry_or_null(&tunnel->session_list,
>> +						   struct l2tp_session, list);
>> +		if (!session)
>> +			break;
>> +		l2tp_session_inc_refcount(session);
>>   		list_del_init(&session->list);
>>   		spin_unlock_bh(&tunnel->list_lock);
>>   		l2tp_session_delete(session);
>>   		spin_lock_bh(&tunnel->list_lock);
>> +		l2tp_session_dec_refcount(session);
> 
> Bumping refcount up makes it safe for the current cpu to go thru race
> after releasing lock, and if it wins the race, dropping refcount makes
> the peer head on uaf.

Thanks for reviewing this. Can you elaborate on what you mean by "makes 
the peer head on uaf", please?
Hillf Danton July 8, 2024, 11:59 a.m. UTC | #3
On Mon, 8 Jul 2024 11:06:25 +0100 James Chapman <jchapman@katalix.com>
> On 05/07/2024 11:32, Hillf Danton wrote:
> > On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
> >> --- a/net/l2tp/l2tp_core.c
> >> +++ b/net/l2tp/l2tp_core.c
> >> @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
> >>   static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
> >>   {
> >>   	struct l2tp_session *session;
> >> -	struct list_head *pos;
> >> -	struct list_head *tmp;
> >>   
> >>   	spin_lock_bh(&tunnel->list_lock);
> >>   	tunnel->acpt_newsess = false;
> >> -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
> >> -		session = list_entry(pos, struct l2tp_session, list);
> >> +	for (;;) {
> >> +		session = list_first_entry_or_null(&tunnel->session_list,
> >> +						   struct l2tp_session, list);
> >> +		if (!session)
> >> +			break;
> >> +		l2tp_session_inc_refcount(session);
> >>   		list_del_init(&session->list);
> >>   		spin_unlock_bh(&tunnel->list_lock);
> >>   		l2tp_session_delete(session);
> >>   		spin_lock_bh(&tunnel->list_lock);
> >> +		l2tp_session_dec_refcount(session);
> > 
> > Bumping refcount up makes it safe for the current cpu to go thru race
> > after releasing lock, and if it wins the race, dropping refcount makes
> > the peer head on uaf.
> 
> Thanks for reviewing this. Can you elaborate on what you mean by "makes 
> the peer head on uaf", please?
>
Given race, there are winner and loser. If the current cpu wins the race,
the loser hits uaf once winner drops refcount.
James Chapman July 8, 2024, 1:57 p.m. UTC | #4
On 08/07/2024 12:59, Hillf Danton wrote:
> On Mon, 8 Jul 2024 11:06:25 +0100 James Chapman <jchapman@katalix.com>
>> On 05/07/2024 11:32, Hillf Danton wrote:
>>> On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
>>>> --- a/net/l2tp/l2tp_core.c
>>>> +++ b/net/l2tp/l2tp_core.c
>>>> @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
>>>>    static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
>>>>    {
>>>>    	struct l2tp_session *session;
>>>> -	struct list_head *pos;
>>>> -	struct list_head *tmp;
>>>>    
>>>>    	spin_lock_bh(&tunnel->list_lock);
>>>>    	tunnel->acpt_newsess = false;
>>>> -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
>>>> -		session = list_entry(pos, struct l2tp_session, list);
>>>> +	for (;;) {
>>>> +		session = list_first_entry_or_null(&tunnel->session_list,
>>>> +						   struct l2tp_session, list);
>>>> +		if (!session)
>>>> +			break;
>>>> +		l2tp_session_inc_refcount(session);
>>>>    		list_del_init(&session->list);
>>>>    		spin_unlock_bh(&tunnel->list_lock);
>>>>    		l2tp_session_delete(session);
>>>>    		spin_lock_bh(&tunnel->list_lock);
>>>> +		l2tp_session_dec_refcount(session);
>>>
>>> Bumping refcount up makes it safe for the current cpu to go thru race
>>> after releasing lock, and if it wins the race, dropping refcount makes
>>> the peer head on uaf.
>>
>> Thanks for reviewing this. Can you elaborate on what you mean by "makes
>> the peer head on uaf", please?
>>
> Given race, there are winner and loser. If the current cpu wins the race,
> the loser hits uaf once winner drops refcount.

I think the session's dead flag would protect against threads racing in 
l2tp_session_delete to delete the same session.
Any thread with a pointer to a session should hold a reference on it to 
prevent the session going away while it is accessed. Am I missing a 
codepath where that's not the case?
Paolo Abeni July 9, 2024, 9:03 a.m. UTC | #5
On Mon, 2024-07-08 at 14:57 +0100, James Chapman wrote:
> On 08/07/2024 12:59, Hillf Danton wrote:
> > On Mon, 8 Jul 2024 11:06:25 +0100 James Chapman <jchapman@katalix.com>
> > > On 05/07/2024 11:32, Hillf Danton wrote:
> > > > On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
> > > > > --- a/net/l2tp/l2tp_core.c
> > > > > +++ b/net/l2tp/l2tp_core.c
> > > > > @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
> > > > >    static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
> > > > >    {
> > > > >    	struct l2tp_session *session;
> > > > > -	struct list_head *pos;
> > > > > -	struct list_head *tmp;
> > > > >    
> > > > >    	spin_lock_bh(&tunnel->list_lock);
> > > > >    	tunnel->acpt_newsess = false;
> > > > > -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
> > > > > -		session = list_entry(pos, struct l2tp_session, list);
> > > > > +	for (;;) {
> > > > > +		session = list_first_entry_or_null(&tunnel->session_list,
> > > > > +						   struct l2tp_session, list);
> > > > > +		if (!session)
> > > > > +			break;
> > > > > +		l2tp_session_inc_refcount(session);
> > > > >    		list_del_init(&session->list);
> > > > >    		spin_unlock_bh(&tunnel->list_lock);
> > > > >    		l2tp_session_delete(session);
> > > > >    		spin_lock_bh(&tunnel->list_lock);
> > > > > +		l2tp_session_dec_refcount(session);
> > > > 
> > > > Bumping refcount up makes it safe for the current cpu to go thru race
> > > > after releasing lock, and if it wins the race, dropping refcount makes
> > > > the peer head on uaf.
> > > 
> > > Thanks for reviewing this. Can you elaborate on what you mean by "makes
> > > the peer head on uaf", please?
> > > 
> > Given race, there are winner and loser. If the current cpu wins the race,
> > the loser hits uaf once winner drops refcount.
> 
> I think the session's dead flag would protect against threads racing in 
> l2tp_session_delete to delete the same session.
> Any thread with a pointer to a session should hold a reference on it to 
> prevent the session going away while it is accessed. Am I missing a 
> codepath where that's not the case?

AFAICS this patch is safe, as the session refcount can't be 0 at
l2tp_session_inc_refcount() time and will drop to 0 after
l2tp_session_dec_refcount() only if no other entity/thread is owning
any reference to the session.

@James: the patch has a formal issue, you should avoid any empty line
in the tag area, specifically between the 'Fixes' and SoB tags.

I'll exceptionally fix this while applying the patch, but please run
checkpatch before your next submission.

Also somewhat related, I think there is still a race condition in
l2tp_tunnel_get_session():

	rcu_read_lock_bh();
	hlist_for_each_entry_rcu(session, session_list, hlist)
		if (session->session_id == session_id) {
			l2tp_session_inc_refcount(session);

I think that at l2tp_session_inc_refcount(), the session refcount could
be 0 due to a concurrent tunnel cleanup. l2tp_session_inc_refcount()
should likely be refcount_inc_not_zero() and the caller should check
the return value.
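
Something like the following sketch (the helper name is invented here
for illustration, assuming the refcount field is session->ref_count):

	/* return false if the session is already on its way to being freed */
	static bool l2tp_session_tryget(struct l2tp_session *session)
	{
		return refcount_inc_not_zero(&session->ref_count);
	}

so the lookup would skip a session whose refcount has already dropped
to zero instead of resurrecting it.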

In any case the latter is a separate issue.

Thanks,

Paolo
James Chapman July 9, 2024, 9:29 a.m. UTC | #6
On 09/07/2024 10:03, Paolo Abeni wrote:
[snip]
> AFAICS this patch is safe, as the session refcount can't be 0 at
> l2tp_session_inc_refcount() time and will drop to 0 after
> l2tp_session_dec_refcount() only if no other entity/thread is owning
> any reference to the session.
> 
> @James: the patch has a formal issue, you should avoid any empty line
> in the tag area, specifically between the 'Fixes' and SoB tags.
> 
> I'll exceptionally fix this while applying the patch, but please run
> checkpatch before your next submission.

Thanks Paolo. Will do. I'll be more careful next time.

> Also somewhat related, I think there is still a race condition in
> l2tp_tunnel_get_session():
> 
> 	rcu_read_lock_bh();
>          hlist_for_each_entry_rcu(session, session_list, hlist)
>                  if (session->session_id == session_id) {
>                          l2tp_session_inc_refcount(session);
> 
> I think that at l2tp_session_inc_refcount(), the session refcount could
> be 0 due to a concurrent tunnel cleanup. l2tp_session_inc_refcount()
> should likely be refcount_inc_not_zero() and the caller should check
> the return value.
> 
> In any case the latter is a separate issue.

I'm currently working on another series which will address this along 
with more l2tp cleanup improvements.
patchwork-bot+netdevbpf@kernel.org July 9, 2024, 9:40 a.m. UTC | #7
Hello:

This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Thu,  4 Jul 2024 16:25:08 +0100 you wrote:
> syzbot reported a UAF caused by a race when the L2TP work queue closes a
> tunnel at the same time as a userspace thread closes a session in that
> tunnel.
> 
> Tunnel cleanup is handled by a work queue which iterates through the
> sessions contained within a tunnel, and closes them in turn.
> 
> [...]

Here is the summary with links:
  - [net-next,v2] l2tp: fix possible UAF when cleaning up tunnels
    https://git.kernel.org/netdev/net-next/c/f8ad00f3fb2a

You are awesome, thank you!

Patch

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 64f446f0930b..2790a51e59e3 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1290,17 +1290,20 @@  static void l2tp_session_unhash(struct l2tp_session *session)
 static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
 {
 	struct l2tp_session *session;
-	struct list_head *pos;
-	struct list_head *tmp;
 
 	spin_lock_bh(&tunnel->list_lock);
 	tunnel->acpt_newsess = false;
-	list_for_each_safe(pos, tmp, &tunnel->session_list) {
-		session = list_entry(pos, struct l2tp_session, list);
+	for (;;) {
+		session = list_first_entry_or_null(&tunnel->session_list,
+						   struct l2tp_session, list);
+		if (!session)
+			break;
+		l2tp_session_inc_refcount(session);
 		list_del_init(&session->list);
 		spin_unlock_bh(&tunnel->list_lock);
 		l2tp_session_delete(session);
 		spin_lock_bh(&tunnel->list_lock);
+		l2tp_session_dec_refcount(session);
 	}
 	spin_unlock_bh(&tunnel->list_lock);
 }