Message ID | 20230906065237.2180187-1-liujian56@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tls: do not return error when the tls_bigint overflows in tls_advance_record_sn() | expand |
2023-09-06, 14:52:37 +0800, Liu Jian wrote: > This is because the value of rec_seq of tls_crypto_info configured by the > user program is too large, for example, 0xffffffffffffff. In addition, TLS > is asynchronously accelerated. When tls_do_encryption() returns > -EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow, > skmsg is released before the asynchronous encryption process ends. As a > result, the UAF problem occurs during the asynchronous processing of the > encryption module. > > I didn't see the rec_seq overflow causing other problems, so let's get rid > of the overflow error here. > > Fixes: 635d93981786 ("net/tls: free record only on encryption error") > Signed-off-by: Liu Jian <liujian56@huawei.com> > --- > net/tls/tls.h | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/net/tls/tls.h b/net/tls/tls.h > index 28a8c0e80e3c..3f0e10df8053 100644 > --- a/net/tls/tls.h > +++ b/net/tls/tls.h > @@ -304,8 +304,7 @@ static inline void > tls_advance_record_sn(struct sock *sk, struct tls_prot_info *prot, > struct cipher_context *ctx) > { > - if (tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size)) > - tls_err_abort(sk, -EBADMSG); > + tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size); That seems wrong. We can't allow the record number to wrap, if breaks the crypto. See for example: https://datatracker.ietf.org/doc/html/rfc5288#section-6.1 The real fix would be to stop the caller from freeing the skmsg and record if we go async. Once we go through async crypto, the record etc don't belong to the caller anymore, they've been transfered to the async callback. I'd say we need both tests in bpf_exec_tx_verdict: -EINPROGRESS (from before 635d93981786) and EBADMSG (from 635d93981786). Actually we need to check for both -EINPROGRESS and -EBUSY as I've recently found out. I've been running the selftests with async crypto and have collected a few fixes that I was going to post this week (but not this one, since we don't have a selftest for wrapping rec_seq). One of the patches adds -EBUSY checks for all existing -EINPROGRESS, since the crypto API can return -EBUSY as well if we're going through the backlog queue.
On Wed, 6 Sep 2023 13:02:37 +0200 Sabrina Dubroca wrote: > I've been running the selftests with async crypto and have collected a > few fixes that I was going to post this week (but not this one, since > we don't have a selftest for wrapping rec_seq). One of the patches > adds -EBUSY checks for all existing -EINPROGRESS, since the crypto API > can return -EBUSY as well if we're going through the backlog queue. BTW is it possible to fake async crypto for a test or does one need to have an actual accelerator?
2023-09-06, 08:02:31 -0700, Jakub Kicinski wrote: > On Wed, 6 Sep 2023 13:02:37 +0200 Sabrina Dubroca wrote: > > I've been running the selftests with async crypto and have collected a > > few fixes that I was going to post this week (but not this one, since > > we don't have a selftest for wrapping rec_seq). One of the patches > > adds -EBUSY checks for all existing -EINPROGRESS, since the crypto API > > can return -EBUSY as well if we're going through the backlog queue. > > BTW is it possible to fake async crypto for a test or does one need > to have an actual accelerator? That's what I did for my tests, forcing AESNI to go async. I'm going to send my changes as RFC to linux-crypto@. I think syzbot would find a few more bugs if they let it loose with forced async crypto. Short version (without the debugfs toggles): diff --git a/crypto/simd.c b/crypto/simd.c index edaa479a1ec5..e3f3bf31fcca 100644 --- a/crypto/simd.c +++ b/crypto/simd.c @@ -317,7 +317,7 @@ static int simd_aead_encrypt(struct aead_request *req) subreq = aead_request_ctx(req); *subreq = *req; - if (!crypto_simd_usable() || + if (true /* force async */ || !crypto_simd_usable() || (in_atomic() && cryptd_aead_queued(ctx->cryptd_tfm))) child = &ctx->cryptd_tfm->base; else @@ -338,7 +338,7 @@ static int simd_aead_decrypt(struct aead_request *req) subreq = aead_request_ctx(req); *subreq = *req; - if (!crypto_simd_usable() || + if (true /* force async */ || !crypto_simd_usable() || (in_atomic() && cryptd_aead_queued(ctx->cryptd_tfm))) child = &ctx->cryptd_tfm->base; else
On 2023/9/6 19:02, Sabrina Dubroca wrote: > 2023-09-06, 14:52:37 +0800, Liu Jian wrote: >> This is because the value of rec_seq of tls_crypto_info configured by the >> user program is too large, for example, 0xffffffffffffff. In addition, TLS >> is asynchronously accelerated. When tls_do_encryption() returns >> -EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow, >> skmsg is released before the asynchronous encryption process ends. As a >> result, the UAF problem occurs during the asynchronous processing of the >> encryption module. >> >> I didn't see the rec_seq overflow causing other problems, so let's get rid >> of the overflow error here. >> >> Fixes: 635d93981786 ("net/tls: free record only on encryption error") >> Signed-off-by: Liu Jian <liujian56@huawei.com> >> --- >> net/tls/tls.h | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/net/tls/tls.h b/net/tls/tls.h >> index 28a8c0e80e3c..3f0e10df8053 100644 >> --- a/net/tls/tls.h >> +++ b/net/tls/tls.h >> @@ -304,8 +304,7 @@ static inline void >> tls_advance_record_sn(struct sock *sk, struct tls_prot_info *prot, >> struct cipher_context *ctx) >> { >> - if (tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size)) >> - tls_err_abort(sk, -EBADMSG); >> + tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size); > > That seems wrong. We can't allow the record number to wrap, if breaks > the crypto. See for example: > https://datatracker.ietf.org/doc/html/rfc5288#section-6.1 > > The real fix would be to stop the caller from freeing the skmsg and > record if we go async. Once we go through async crypto, the record etc > don't belong to the caller anymore, they've been transfered to the > async callback. I'd say we need both tests in bpf_exec_tx_verdict: > -EINPROGRESS (from before 635d93981786) and EBADMSG (from > 635d93981786). Thanks for your review~ By the way, does the return of EBADMSG mean that the tls link needs to renegotiate the encryption information or re-establish the link? And is this okay? diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 1ed4a611631f..d1fc295b83b5 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -817,7 +817,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk, psock = sk_psock_get(sk); if (!psock || !policy) { err = tls_push_record(sk, flags, record_type); - if (err && sk->sk_err == EBADMSG) { + if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) { *copied -= sk_msg_free(sk, msg); tls_free_open_rec(sk); err = -sk->sk_err; @@ -846,7 +846,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk, switch (psock->eval) { case __SK_PASS: err = tls_push_record(sk, flags, record_type); - if (err && sk->sk_err == EBADMSG) { + if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) { *copied -= sk_msg_free(sk, msg); tls_free_open_rec(sk); err = -sk->sk_err; > > Actually we need to check for both -EINPROGRESS and -EBUSY as I've > recently found out. > > I've been running the selftests with async crypto and have collected a > few fixes that I was going to post this week (but not this one, since > we don't have a selftest for wrapping rec_seq). One of the patches > adds -EBUSY checks for all existing -EINPROGRESS, since the crypto API > can return -EBUSY as well if we're going through the backlog queue. >
2023-09-07, 20:59:51 +0800, liujian (CE) wrote: > By the way, does the return of EBADMSG mean that the tls link needs to > renegotiate the encryption information or re-establish the link? We currently don't support key updates so closing this socket is the only option for now. AFAIU when we set EBADMSG, we can't fix that socket. > And is this okay? Yes, this is what I had in mind. > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c > index 1ed4a611631f..d1fc295b83b5 100644 > --- a/net/tls/tls_sw.c > +++ b/net/tls/tls_sw.c > @@ -817,7 +817,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, > struct sock *sk, > psock = sk_psock_get(sk); > if (!psock || !policy) { > err = tls_push_record(sk, flags, record_type); > - if (err && sk->sk_err == EBADMSG) { > + if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) { > *copied -= sk_msg_free(sk, msg); > tls_free_open_rec(sk); > err = -sk->sk_err; > @@ -846,7 +846,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, > struct sock *sk, > switch (psock->eval) { > case __SK_PASS: > err = tls_push_record(sk, flags, record_type); > - if (err && sk->sk_err == EBADMSG) { > + if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) { > *copied -= sk_msg_free(sk, msg); > tls_free_open_rec(sk); > err = -sk->sk_err;
On 2023/9/9 0:41, Sabrina Dubroca wrote: > 2023-09-07, 20:59:51 +0800, liujian (CE) wrote: >> By the way, does the return of EBADMSG mean that the tls link needs to >> renegotiate the encryption information or re-establish the link? > > We currently don't support key updates so closing this socket is the > only option for now. AFAIU when we set EBADMSG, we can't fix that socket. > Got it, thank you for your explanation. >> And is this okay? > > Yes, this is what I had in mind. > Will send v2, thanks. >> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c >> index 1ed4a611631f..d1fc295b83b5 100644 >> --- a/net/tls/tls_sw.c >> +++ b/net/tls/tls_sw.c >> @@ -817,7 +817,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, >> struct sock *sk, >> psock = sk_psock_get(sk); >> if (!psock || !policy) { >> err = tls_push_record(sk, flags, record_type); >> - if (err && sk->sk_err == EBADMSG) { >> + if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) { >> *copied -= sk_msg_free(sk, msg); >> tls_free_open_rec(sk); >> err = -sk->sk_err; >> @@ -846,7 +846,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, >> struct sock *sk, >> switch (psock->eval) { >> case __SK_PASS: >> err = tls_push_record(sk, flags, record_type); >> - if (err && sk->sk_err == EBADMSG) { >> + if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) { >> *copied -= sk_msg_free(sk, msg); >> tls_free_open_rec(sk); >> err = -sk->sk_err; >
diff --git a/net/tls/tls.h b/net/tls/tls.h index 28a8c0e80e3c..3f0e10df8053 100644 --- a/net/tls/tls.h +++ b/net/tls/tls.h @@ -304,8 +304,7 @@ static inline void tls_advance_record_sn(struct sock *sk, struct tls_prot_info *prot, struct cipher_context *ctx) { - if (tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size)) - tls_err_abort(sk, -EBADMSG); + tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size); if (prot->version != TLS_1_3_VERSION && prot->cipher_type != TLS_CIPHER_CHACHA20_POLY1305)
I got the below warning when do fuzzing test: BUG: KASAN: null-ptr-deref in scatterwalk_copychunks+0x320/0x470 Read of size 4 at addr 0000000000000008 by task kworker/u8:1/9 CPU: 0 PID: 9 Comm: kworker/u8:1 Tainted: G OE Hardware name: linux,dummy-virt (DT) Workqueue: pencrypt_parallel padata_parallel_worker Call trace: dump_backtrace+0x0/0x420 show_stack+0x34/0x44 dump_stack+0x1d0/0x248 __kasan_report+0x138/0x140 kasan_report+0x44/0x6c __asan_load4+0x94/0xd0 scatterwalk_copychunks+0x320/0x470 skcipher_next_slow+0x14c/0x290 skcipher_walk_next+0x2fc/0x480 skcipher_walk_first+0x9c/0x110 skcipher_walk_aead_common+0x380/0x440 skcipher_walk_aead_encrypt+0x54/0x70 ccm_encrypt+0x13c/0x4d0 crypto_aead_encrypt+0x7c/0xfc pcrypt_aead_enc+0x28/0x84 padata_parallel_worker+0xd0/0x2dc process_one_work+0x49c/0xbdc worker_thread+0x124/0x880 kthread+0x210/0x260 ret_from_fork+0x10/0x18 This is because the value of rec_seq of tls_crypto_info configured by the user program is too large, for example, 0xffffffffffffff. In addition, TLS is asynchronously accelerated. When tls_do_encryption() returns -EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow, skmsg is released before the asynchronous encryption process ends. As a result, the UAF problem occurs during the asynchronous processing of the encryption module. I didn't see the rec_seq overflow causing other problems, so let's get rid of the overflow error here. Fixes: 635d93981786 ("net/tls: free record only on encryption error") Signed-off-by: Liu Jian <liujian56@huawei.com> --- net/tls/tls.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)