diff mbox series

[net] tls: do not return error when the tls_bigint overflows in tls_advance_record_sn()

Message ID 20230906065237.2180187-1-liujian56@huawei.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net] tls: do not return error when the tls_bigint overflows in tls_advance_record_sn() | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1330 this patch: 1330
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 1353 this patch: 1353
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1353 this patch: 1353
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

liujian (CE) Sept. 6, 2023, 6:52 a.m. UTC
I got the below warning when do fuzzing test:
BUG: KASAN: null-ptr-deref in scatterwalk_copychunks+0x320/0x470
Read of size 4 at addr 0000000000000008 by task kworker/u8:1/9

CPU: 0 PID: 9 Comm: kworker/u8:1 Tainted: G           OE
Hardware name: linux,dummy-virt (DT)
Workqueue: pencrypt_parallel padata_parallel_worker
Call trace:
 dump_backtrace+0x0/0x420
 show_stack+0x34/0x44
 dump_stack+0x1d0/0x248
 __kasan_report+0x138/0x140
 kasan_report+0x44/0x6c
 __asan_load4+0x94/0xd0
 scatterwalk_copychunks+0x320/0x470
 skcipher_next_slow+0x14c/0x290
 skcipher_walk_next+0x2fc/0x480
 skcipher_walk_first+0x9c/0x110
 skcipher_walk_aead_common+0x380/0x440
 skcipher_walk_aead_encrypt+0x54/0x70
 ccm_encrypt+0x13c/0x4d0
 crypto_aead_encrypt+0x7c/0xfc
 pcrypt_aead_enc+0x28/0x84
 padata_parallel_worker+0xd0/0x2dc
 process_one_work+0x49c/0xbdc
 worker_thread+0x124/0x880
 kthread+0x210/0x260
 ret_from_fork+0x10/0x18

This is because the value of rec_seq of tls_crypto_info configured by the
user program is too large, for example, 0xffffffffffffff. In addition, TLS
is asynchronously accelerated. When tls_do_encryption() returns
-EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow,
skmsg is released before the asynchronous encryption process ends. As a
result, the UAF problem occurs during the asynchronous processing of the
encryption module.

I didn't see the rec_seq overflow causing other problems, so let's get rid
of the overflow error here.

Fixes: 635d93981786 ("net/tls: free record only on encryption error")
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
 net/tls/tls.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Sabrina Dubroca Sept. 6, 2023, 11:02 a.m. UTC | #1
2023-09-06, 14:52:37 +0800, Liu Jian wrote:
> This is because the value of rec_seq of tls_crypto_info configured by the
> user program is too large, for example, 0xffffffffffffff. In addition, TLS
> is asynchronously accelerated. When tls_do_encryption() returns
> -EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow,
> skmsg is released before the asynchronous encryption process ends. As a
> result, the UAF problem occurs during the asynchronous processing of the
> encryption module.
> 
> I didn't see the rec_seq overflow causing other problems, so let's get rid
> of the overflow error here.
> 
> Fixes: 635d93981786 ("net/tls: free record only on encryption error")
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
>  net/tls/tls.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/tls/tls.h b/net/tls/tls.h
> index 28a8c0e80e3c..3f0e10df8053 100644
> --- a/net/tls/tls.h
> +++ b/net/tls/tls.h
> @@ -304,8 +304,7 @@ static inline void
>  tls_advance_record_sn(struct sock *sk, struct tls_prot_info *prot,
>  		      struct cipher_context *ctx)
>  {
> -	if (tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size))
> -		tls_err_abort(sk, -EBADMSG);
> +	tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size);

That seems wrong. We can't allow the record number to wrap, if breaks
the crypto. See for example:
https://datatracker.ietf.org/doc/html/rfc5288#section-6.1

The real fix would be to stop the caller from freeing the skmsg and
record if we go async. Once we go through async crypto, the record etc
don't belong to the caller anymore, they've been transfered to the
async callback. I'd say we need both tests in bpf_exec_tx_verdict:
-EINPROGRESS (from before 635d93981786) and EBADMSG (from
635d93981786).

Actually we need to check for both -EINPROGRESS and -EBUSY as I've
recently found out.

I've been running the selftests with async crypto and have collected a
few fixes that I was going to post this week (but not this one, since
we don't have a selftest for wrapping rec_seq). One of the patches
adds -EBUSY checks for all existing -EINPROGRESS, since the crypto API
can return -EBUSY as well if we're going through the backlog queue.
Jakub Kicinski Sept. 6, 2023, 3:02 p.m. UTC | #2
On Wed, 6 Sep 2023 13:02:37 +0200 Sabrina Dubroca wrote:
> I've been running the selftests with async crypto and have collected a
> few fixes that I was going to post this week (but not this one, since
> we don't have a selftest for wrapping rec_seq). One of the patches
> adds -EBUSY checks for all existing -EINPROGRESS, since the crypto API
> can return -EBUSY as well if we're going through the backlog queue.

BTW is it possible to fake async crypto for a test or does one need 
to have an actual accelerator?
Sabrina Dubroca Sept. 6, 2023, 3:14 p.m. UTC | #3
2023-09-06, 08:02:31 -0700, Jakub Kicinski wrote:
> On Wed, 6 Sep 2023 13:02:37 +0200 Sabrina Dubroca wrote:
> > I've been running the selftests with async crypto and have collected a
> > few fixes that I was going to post this week (but not this one, since
> > we don't have a selftest for wrapping rec_seq). One of the patches
> > adds -EBUSY checks for all existing -EINPROGRESS, since the crypto API
> > can return -EBUSY as well if we're going through the backlog queue.
> 
> BTW is it possible to fake async crypto for a test or does one need 
> to have an actual accelerator?

That's what I did for my tests, forcing AESNI to go async. I'm going
to send my changes as RFC to linux-crypto@. I think syzbot would find
a few more bugs if they let it loose with forced async crypto.

Short version (without the debugfs toggles):

diff --git a/crypto/simd.c b/crypto/simd.c
index edaa479a1ec5..e3f3bf31fcca 100644
--- a/crypto/simd.c
+++ b/crypto/simd.c
@@ -317,7 +317,7 @@ static int simd_aead_encrypt(struct aead_request *req)
 	subreq = aead_request_ctx(req);
 	*subreq = *req;
 
-	if (!crypto_simd_usable() ||
+	if (true /* force async */ || !crypto_simd_usable() ||
 	    (in_atomic() && cryptd_aead_queued(ctx->cryptd_tfm)))
 		child = &ctx->cryptd_tfm->base;
 	else
@@ -338,7 +338,7 @@ static int simd_aead_decrypt(struct aead_request *req)
 	subreq = aead_request_ctx(req);
 	*subreq = *req;
 
-	if (!crypto_simd_usable() ||
+	if (true /* force async */ || !crypto_simd_usable() ||
 	    (in_atomic() && cryptd_aead_queued(ctx->cryptd_tfm)))
 		child = &ctx->cryptd_tfm->base;
 	else
liujian (CE) Sept. 7, 2023, 12:59 p.m. UTC | #4
On 2023/9/6 19:02, Sabrina Dubroca wrote:
> 2023-09-06, 14:52:37 +0800, Liu Jian wrote:
>> This is because the value of rec_seq of tls_crypto_info configured by the
>> user program is too large, for example, 0xffffffffffffff. In addition, TLS
>> is asynchronously accelerated. When tls_do_encryption() returns
>> -EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow,
>> skmsg is released before the asynchronous encryption process ends. As a
>> result, the UAF problem occurs during the asynchronous processing of the
>> encryption module.
>>
>> I didn't see the rec_seq overflow causing other problems, so let's get rid
>> of the overflow error here.
>>
>> Fixes: 635d93981786 ("net/tls: free record only on encryption error")
>> Signed-off-by: Liu Jian <liujian56@huawei.com>
>> ---
>>   net/tls/tls.h | 3 +--
>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/net/tls/tls.h b/net/tls/tls.h
>> index 28a8c0e80e3c..3f0e10df8053 100644
>> --- a/net/tls/tls.h
>> +++ b/net/tls/tls.h
>> @@ -304,8 +304,7 @@ static inline void
>>   tls_advance_record_sn(struct sock *sk, struct tls_prot_info *prot,
>>   		      struct cipher_context *ctx)
>>   {
>> -	if (tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size))
>> -		tls_err_abort(sk, -EBADMSG);
>> +	tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size);
> 
> That seems wrong. We can't allow the record number to wrap, if breaks
> the crypto. See for example:
> https://datatracker.ietf.org/doc/html/rfc5288#section-6.1
> 
> The real fix would be to stop the caller from freeing the skmsg and
> record if we go async. Once we go through async crypto, the record etc
> don't belong to the caller anymore, they've been transfered to the
> async callback. I'd say we need both tests in bpf_exec_tx_verdict:
> -EINPROGRESS (from before 635d93981786) and EBADMSG (from
> 635d93981786).

Thanks for your review~

By the way, does the return of EBADMSG mean that the tls link needs to 
renegotiate the encryption information or re-establish the link?

And is this okay?
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 1ed4a611631f..d1fc295b83b5 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -817,7 +817,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, 
struct sock *sk,
         psock = sk_psock_get(sk);
         if (!psock || !policy) {
                 err = tls_push_record(sk, flags, record_type);
-               if (err && sk->sk_err == EBADMSG) {
+               if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) {
                         *copied -= sk_msg_free(sk, msg);
                         tls_free_open_rec(sk);
                         err = -sk->sk_err;
@@ -846,7 +846,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, 
struct sock *sk,
         switch (psock->eval) {
         case __SK_PASS:
                 err = tls_push_record(sk, flags, record_type);
-               if (err && sk->sk_err == EBADMSG) {
+               if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) {
                         *copied -= sk_msg_free(sk, msg);
                         tls_free_open_rec(sk);
                         err = -sk->sk_err;

> 
> Actually we need to check for both -EINPROGRESS and -EBUSY as I've
> recently found out.
> 
> I've been running the selftests with async crypto and have collected a
> few fixes that I was going to post this week (but not this one, since
> we don't have a selftest for wrapping rec_seq). One of the patches
> adds -EBUSY checks for all existing -EINPROGRESS, since the crypto API
> can return -EBUSY as well if we're going through the backlog queue.
>
Sabrina Dubroca Sept. 8, 2023, 4:41 p.m. UTC | #5
2023-09-07, 20:59:51 +0800, liujian (CE) wrote:
> By the way, does the return of EBADMSG mean that the tls link needs to
> renegotiate the encryption information or re-establish the link?

We currently don't support key updates so closing this socket is the
only option for now. AFAIU when we set EBADMSG, we can't fix that socket.

> And is this okay?

Yes, this is what I had in mind.

> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 1ed4a611631f..d1fc295b83b5 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -817,7 +817,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg,
> struct sock *sk,
>         psock = sk_psock_get(sk);
>         if (!psock || !policy) {
>                 err = tls_push_record(sk, flags, record_type);
> -               if (err && sk->sk_err == EBADMSG) {
> +               if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) {
>                         *copied -= sk_msg_free(sk, msg);
>                         tls_free_open_rec(sk);
>                         err = -sk->sk_err;
> @@ -846,7 +846,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg,
> struct sock *sk,
>         switch (psock->eval) {
>         case __SK_PASS:
>                 err = tls_push_record(sk, flags, record_type);
> -               if (err && sk->sk_err == EBADMSG) {
> +               if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) {
>                         *copied -= sk_msg_free(sk, msg);
>                         tls_free_open_rec(sk);
>                         err = -sk->sk_err;
liujian (CE) Sept. 9, 2023, 7:58 a.m. UTC | #6
On 2023/9/9 0:41, Sabrina Dubroca wrote:
> 2023-09-07, 20:59:51 +0800, liujian (CE) wrote:
>> By the way, does the return of EBADMSG mean that the tls link needs to
>> renegotiate the encryption information or re-establish the link?
> 
> We currently don't support key updates so closing this socket is the
> only option for now. AFAIU when we set EBADMSG, we can't fix that socket.
> 
Got it, thank you for your explanation.

>> And is this okay?
> 
> Yes, this is what I had in mind.
> 
Will send v2, thanks.

>> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
>> index 1ed4a611631f..d1fc295b83b5 100644
>> --- a/net/tls/tls_sw.c
>> +++ b/net/tls/tls_sw.c
>> @@ -817,7 +817,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg,
>> struct sock *sk,
>>          psock = sk_psock_get(sk);
>>          if (!psock || !policy) {
>>                  err = tls_push_record(sk, flags, record_type);
>> -               if (err && sk->sk_err == EBADMSG) {
>> +               if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) {
>>                          *copied -= sk_msg_free(sk, msg);
>>                          tls_free_open_rec(sk);
>>                          err = -sk->sk_err;
>> @@ -846,7 +846,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg,
>> struct sock *sk,
>>          switch (psock->eval) {
>>          case __SK_PASS:
>>                  err = tls_push_record(sk, flags, record_type);
>> -               if (err && sk->sk_err == EBADMSG) {
>> +               if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) {
>>                          *copied -= sk_msg_free(sk, msg);
>>                          tls_free_open_rec(sk);
>>                          err = -sk->sk_err;
>
diff mbox series

Patch

diff --git a/net/tls/tls.h b/net/tls/tls.h
index 28a8c0e80e3c..3f0e10df8053 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -304,8 +304,7 @@  static inline void
 tls_advance_record_sn(struct sock *sk, struct tls_prot_info *prot,
 		      struct cipher_context *ctx)
 {
-	if (tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size))
-		tls_err_abort(sk, -EBADMSG);
+	tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size);
 
 	if (prot->version != TLS_1_3_VERSION &&
 	    prot->cipher_type != TLS_CIPHER_CHACHA20_POLY1305)