[net,1/2] net/tls: Replace TLS_RX_SYNC_RUNNING with RCU

Message ID	20210524121220.1577321-2-maximmi@nvidia.com (mailing list archive)
State	Superseded
Delegated to:	Netdev Maintainers
Headers	Return-Path: <netdev-owner@kernel.org> Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.112.32 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.112.32; helo=mail.nvidia.com; From: Maxim Mikityanskiy <maximmi@nvidia.com> To: Boris Pismenny <borisp@nvidia.com>, John Fastabend <john.fastabend@gmail.com>, Daniel Borkmann <daniel@iogearbox.net>, "Jakub Kicinski" <kuba@kernel.org>, "David S. Miller" <davem@davemloft.net>, "Aviad Yehezkel" <aviadye@nvidia.com> CC: Tariq Toukan <tariqt@nvidia.com>, <netdev@vger.kernel.org>, "Maxim Mikityanskiy" <maximmi@nvidia.com> Subject: [PATCH net 1/2] net/tls: Replace TLS_RX_SYNC_RUNNING with RCU Date: Mon, 24 May 2021 15:12:19 +0300 Message-ID: <20210524121220.1577321-2-maximmi@nvidia.com> In-Reply-To: <20210524121220.1577321-1-maximmi@nvidia.com> References: <20210524121220.1577321-1-maximmi@nvidia.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain Precedence: bulk
Series	Fix use-after-free after the TLS device goes down and up
	[net,0/2] Fix use-after-free after the TLS device goes down and up
	[net,1/2] net/tls: Replace TLS_RX_SYNC_RUNNING with RCU
	[net,2/2] net/tls: Fix use-after-free after the TLS device goes down and up

Checks

Context	Check	Description
netdev/cover_letter	success	Link
netdev/fixes_present	success	Link
netdev/patch_count	success	Link
netdev/tree_selection	success	Clearly marked for net
netdev/subject_prefix	success	Link
netdev/cc_maintainers	success	CCed 6 of 6 maintainers
netdev/source_inline	success	Was 0 now: 0
netdev/verify_signedoff	success	Link
netdev/module_param	success	Was 0 now: 0
netdev/build_32bit	success	Errors and warnings before: 54 this patch: 54
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/verify_fixes	success	Link
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 34 lines checked
netdev/build_allmodconfig_warn	success	Errors and warnings before: 54 this patch: 54
netdev/header_inline	success	Link

Commit Message

Maxim Mikityanskiy May 24, 2021, 12:12 p.m. UTC

RCU synchronization is guaranteed to finish in finite time, unlike a
busy loop that polls a flag. This patch is a preparation for the bugfix
in the next patch, where the same synchronize_net() call will also be
used to sync with the TX datapath.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
---
 include/net/tls.h    |  1 -
 net/tls/tls_device.c | 10 +++-------
 2 files changed, 3 insertions(+), 8 deletions(-)
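
For readers unfamiliar with the handshake being removed, the following is a rough userspace model of the old TLS_RX_SYNC_RUNNING scheme, not kernel code. All model_* names are hypothetical, C11 atomics stand in for test_and_set_bit(), clear_bit_unlock(), READ_ONCE() and WRITE_ONCE(), and the default sequentially consistent ordering is an assumption chosen to keep the sketch simple. The replacement (rcu_read_lock() on the RX side, synchronize_net() on teardown) is not modelled, since plain userspace C has no RCU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not names from the patch. */
struct model_dev {
	int resync_calls;                   /* times the "driver" callback ran */
};

struct model_ctx {
	_Atomic(struct model_dev *) netdev; /* models tls_ctx->netdev */
	atomic_bool rx_sync_running;        /* models the TLS_RX_SYNC_RUNNING bit */
};

/* RX side, modelled on the pre-patch tls_device_resync_rx().
 * Returns true if the device callback was invoked. */
static bool model_resync_rx(struct model_ctx *ctx)
{
	struct model_dev *dev;
	bool called = false;

	/* test_and_set_bit(): give up if the bit is already held */
	if (atomic_exchange(&ctx->rx_sync_running, true))
		return false;

	dev = atomic_load(&ctx->netdev);    /* READ_ONCE(tls_ctx->netdev) */
	if (dev) {
		dev->resync_calls++;        /* stands in for tls_dev_resync() */
		called = true;
	}

	atomic_store(&ctx->rx_sync_running, false); /* clear_bit_unlock() */
	return called;
}

/* Teardown side, modelled on the pre-patch tls_device_down().
 * Returns how many times the flag had to be polled. */
static unsigned long model_device_down(struct model_ctx *ctx)
{
	unsigned long polls = 0;

	atomic_store(&ctx->netdev, (struct model_dev *)NULL); /* WRITE_ONCE(ctx->netdev, NULL) */

	/* Nothing bounds this loop: it ends only when the RX side clears
	 * the flag. The kernel slept in usleep_range(10, 200) here. */
	while (atomic_load(&ctx->rx_sync_running))
		polls++;

	return polls;
}

/* Single-threaded walk through the sequence; returns the callback count. */
static int model_demo(void)
{
	struct model_dev dev = { 0 };
	struct model_ctx ctx;
	bool first, second;
	unsigned long polls;

	atomic_init(&ctx.netdev, &dev);
	atomic_init(&ctx.rx_sync_running, false);

	first = model_resync_rx(&ctx);   /* device present: callback runs */
	polls = model_device_down(&ctx); /* no resync in flight: no polling */
	second = model_resync_rx(&ctx);  /* device gone: callback skipped */

	assert(first && polls == 0 && !second);
	return dev.resync_calls;
}
```

In this model, teardown waits only because it polls the flag itself. After the patch, the same wait is delegated to synchronize_net(), which returns once pre-existing RCU read-side critical sections have finished.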

Comments

Jakub Kicinski May 24, 2021, 4:05 p.m. UTC | #1

On Mon, 24 May 2021 15:12:19 +0300 Maxim Mikityanskiy wrote:
> RCU synchronization is guaranteed to finish in finite time, unlike a
> busy loop that polls a flag. This patch is a preparation for the bugfix
> in the next patch, where the same synchronize_net() call will also be
> used to sync with the TX datapath.
> 
> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> ---
>  include/net/tls.h    |  1 -
>  net/tls/tls_device.c | 10 +++-------
>  2 files changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/include/net/tls.h b/include/net/tls.h
> index 3eccb525e8f7..6531ace2a68b 100644
> --- a/include/net/tls.h
> +++ b/include/net/tls.h
> @@ -193,7 +193,6 @@ struct tls_offload_context_tx {
>  	(sizeof(struct tls_offload_context_tx) + TLS_DRIVER_STATE_SIZE_TX)
>  
>  enum tls_context_flags {
> -	TLS_RX_SYNC_RUNNING = 0,
>  	/* Unlike RX where resync is driven entirely by the core in TX only
>  	 * the driver knows when things went out of sync, so we need the flag
>  	 * to be atomic.
> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> index 76a6f8c2eec4..171752cd6910 100644
> --- a/net/tls/tls_device.c
> +++ b/net/tls/tls_device.c
> @@ -680,15 +680,13 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx,
>  	struct tls_offload_context_rx *rx_ctx = tls_offload_ctx_rx(tls_ctx);
>  	struct net_device *netdev;
>  
> -	if (WARN_ON(test_and_set_bit(TLS_RX_SYNC_RUNNING, &tls_ctx->flags)))
> -		return;
> -
>  	trace_tls_device_rx_resync_send(sk, seq, rcd_sn, rx_ctx->resync_type);
> +	rcu_read_lock();
>  	netdev = READ_ONCE(tls_ctx->netdev);
>  	if (netdev)
>  		netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn,
>  						   TLS_OFFLOAD_CTX_DIR_RX);

Now this can't sleep right? No bueno.

> -	clear_bit_unlock(TLS_RX_SYNC_RUNNING, &tls_ctx->flags);
> +	rcu_read_unlock();
>  	TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXDEVICERESYNC);
>  }
>

Maxim Mikityanskiy May 25, 2021, 8:52 a.m. UTC | #2

On 2021-05-24 19:05, Jakub Kicinski wrote:
> On Mon, 24 May 2021 15:12:19 +0300 Maxim Mikityanskiy wrote:
>> RCU synchronization is guaranteed to finish in finite time, unlike a
>> busy loop that polls a flag. This patch is a preparation for the bugfix
>> in the next patch, where the same synchronize_net() call will also be
>> used to sync with the TX datapath.
>>
>> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
>> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
>> ---
>>   include/net/tls.h    |  1 -
>>   net/tls/tls_device.c | 10 +++-------
>>   2 files changed, 3 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/net/tls.h b/include/net/tls.h
>> index 3eccb525e8f7..6531ace2a68b 100644
>> --- a/include/net/tls.h
>> +++ b/include/net/tls.h
>> @@ -193,7 +193,6 @@ struct tls_offload_context_tx {
>>   	(sizeof(struct tls_offload_context_tx) + TLS_DRIVER_STATE_SIZE_TX)
>>   
>>   enum tls_context_flags {
>> -	TLS_RX_SYNC_RUNNING = 0,
>>   	/* Unlike RX where resync is driven entirely by the core in TX only
>>   	 * the driver knows when things went out of sync, so we need the flag
>>   	 * to be atomic.
>> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
>> index 76a6f8c2eec4..171752cd6910 100644
>> --- a/net/tls/tls_device.c
>> +++ b/net/tls/tls_device.c
>> @@ -680,15 +680,13 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx,
>>   	struct tls_offload_context_rx *rx_ctx = tls_offload_ctx_rx(tls_ctx);
>>   	struct net_device *netdev;
>>   
>> -	if (WARN_ON(test_and_set_bit(TLS_RX_SYNC_RUNNING, &tls_ctx->flags)))
>> -		return;
>> -
>>   	trace_tls_device_rx_resync_send(sk, seq, rcd_sn, rx_ctx->resync_type);
>> +	rcu_read_lock();
>>   	netdev = READ_ONCE(tls_ctx->netdev);
>>   	if (netdev)
>>   		netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn,
>>   						   TLS_OFFLOAD_CTX_DIR_RX);
> 
> Now this can't sleep right? No bueno.

No, it can't sleep under RCU. However, are you sure it was allowed to 
sleep before my change? I don't think so. Your commit e52972c11d6b 
("net/tls: replace the sleeping lock around RX resync with a bit lock") 
mentions that "RX resync may get called from soft IRQ", which 
essentially means that it can't sleep.

Furthermore, no implementations try to sleep in RX resync, as far as I 
see from reviewing the code. For example, nfp_net_tls_resync uses 
GFP_ATOMIC for RX resync and GFP_KERNEL for TX resync. 
mlx5_fpga_tls_resync_rx also uses GFP_ATOMIC.

So, I don't think I'm breaking anything with my change.

> 
>> -	clear_bit_unlock(TLS_RX_SYNC_RUNNING, &tls_ctx->flags);
>> +	rcu_read_unlock();
>>   	TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXDEVICERESYNC);
>>   }
>>   
>

Jakub Kicinski May 25, 2021, 5:14 p.m. UTC | #3

On Tue, 25 May 2021 11:52:20 +0300 Maxim Mikityanskiy wrote:
> On 2021-05-24 19:05, Jakub Kicinski wrote:
> > On Mon, 24 May 2021 15:12:19 +0300 Maxim Mikityanskiy wrote:  
> >> RCU synchronization is guaranteed to finish in finite time, unlike a
> >> busy loop that polls a flag. This patch is a preparation for the bugfix
> >> in the next patch, where the same synchronize_net() call will also be
> >> used to sync with the TX datapath.
> >>
> >> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
> >> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> >> ---
> >>   include/net/tls.h    |  1 -
> >>   net/tls/tls_device.c | 10 +++-------
> >>   2 files changed, 3 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/include/net/tls.h b/include/net/tls.h
> >> index 3eccb525e8f7..6531ace2a68b 100644
> >> --- a/include/net/tls.h
> >> +++ b/include/net/tls.h
> >> @@ -193,7 +193,6 @@ struct tls_offload_context_tx {
> >>   	(sizeof(struct tls_offload_context_tx) + TLS_DRIVER_STATE_SIZE_TX)
> >>   
> >>   enum tls_context_flags {
> >> -	TLS_RX_SYNC_RUNNING = 0,
> >>   	/* Unlike RX where resync is driven entirely by the core in TX only
> >>   	 * the driver knows when things went out of sync, so we need the flag
> >>   	 * to be atomic.
> >> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> >> index 76a6f8c2eec4..171752cd6910 100644
> >> --- a/net/tls/tls_device.c
> >> +++ b/net/tls/tls_device.c
> >> @@ -680,15 +680,13 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx,
> >>   	struct tls_offload_context_rx *rx_ctx = tls_offload_ctx_rx(tls_ctx);
> >>   	struct net_device *netdev;
> >>   
> >> -	if (WARN_ON(test_and_set_bit(TLS_RX_SYNC_RUNNING, &tls_ctx->flags)))
> >> -		return;
> >> -
> >>   	trace_tls_device_rx_resync_send(sk, seq, rcd_sn, rx_ctx->resync_type);
> >> +	rcu_read_lock();
> >>   	netdev = READ_ONCE(tls_ctx->netdev);
> >>   	if (netdev)
> >>   		netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn,
> >>   						   TLS_OFFLOAD_CTX_DIR_RX);  
> > 
> > Now this can't sleep right? No bueno.  
> 
> No, it can't sleep under RCU. However, are you sure it was allowed to 
> sleep before my change? I don't think so. Your commit e52972c11d6b 
> ("net/tls: replace the sleeping lock around RX resync with a bit lock") 
> mentions that "RX resync may get called from soft IRQ", which 
> essentially means that it can't sleep.
> 
> Furthermore, no implementations try to sleep in RX resync, as far as I 
> see from reviewing the code. For example, nfp_net_tls_resync uses 
> GFP_ATOMIC for RX resync and GFP_KERNEL for TX resync. 
> mlx5_fpga_tls_resync_rx also uses GFP_ATOMIC.
> 
> So, I don't think I'm breaking anything with my change.

You're right.

Patch

diff --git a/include/net/tls.h b/include/net/tls.h
index 3eccb525e8f7..6531ace2a68b 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -193,7 +193,6 @@  struct tls_offload_context_tx {
 	(sizeof(struct tls_offload_context_tx) + TLS_DRIVER_STATE_SIZE_TX)
 
 enum tls_context_flags {
-	TLS_RX_SYNC_RUNNING = 0,
 	/* Unlike RX where resync is driven entirely by the core in TX only
 	 * the driver knows when things went out of sync, so we need the flag
 	 * to be atomic.
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 76a6f8c2eec4..171752cd6910 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -680,15 +680,13 @@  static void tls_device_resync_rx(struct tls_context *tls_ctx,
 	struct tls_offload_context_rx *rx_ctx = tls_offload_ctx_rx(tls_ctx);
 	struct net_device *netdev;
 
-	if (WARN_ON(test_and_set_bit(TLS_RX_SYNC_RUNNING, &tls_ctx->flags)))
-		return;
-
 	trace_tls_device_rx_resync_send(sk, seq, rcd_sn, rx_ctx->resync_type);
+	rcu_read_lock();
 	netdev = READ_ONCE(tls_ctx->netdev);
 	if (netdev)
 		netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn,
 						   TLS_OFFLOAD_CTX_DIR_RX);
-	clear_bit_unlock(TLS_RX_SYNC_RUNNING, &tls_ctx->flags);
+	rcu_read_unlock();
 	TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXDEVICERESYNC);
 }
 
@@ -1300,9 +1298,7 @@  static int tls_device_down(struct net_device *netdev)
 			netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
 							TLS_OFFLOAD_CTX_DIR_RX);
 		WRITE_ONCE(ctx->netdev, NULL);
-		smp_mb__before_atomic(); /* pairs with test_and_set_bit() */
-		while (test_bit(TLS_RX_SYNC_RUNNING, &ctx->flags))
-			usleep_range(10, 200);
+		synchronize_net();
 		dev_put(netdev);
 		list_del_init(&ctx->list);