Message ID: 20190609115509.26260-8-ard.biesheuvel@linaro.org (mailing list archive)
State: New, archived
Probably harmless to change this code path (build_ntlmssp_auth_blob is
called at session negotiation time so shouldn't have much of a
performance impact).

On the other hand, if we can find optimizations in the encryption and
signing paths, that would be really helpful. There was a lot of focus
on encryption performance at SambaXP last week.

Andreas from Red Hat gave a talk on the improvements in Samba with TLS
implementation of AES-GCM. I added the cifs client implementation of
AES-GCM and noticed it is now faster to encrypt packets than sign them
(performance is about 2 to 3 times faster now with GCM).

On Sun, Jun 9, 2019 at 6:57 AM Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
> The CIFS code uses the sync skcipher API to invoke the ecb(arc4) skcipher,
> of which only a single generic C code implementation exists. This means
> that going through all the trouble of using scatterlists etc buys us
> very little, and we're better off just invoking the arc4 library directly.
>
> Cc: linux-cifs@vger.kernel.org
> Cc: Steve French <sfrench@samba.org>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  fs/cifs/Kconfig       |  2 +-
>  fs/cifs/cifsencrypt.c | 50 +++++---------------
>  2 files changed, 13 insertions(+), 39 deletions(-)
>
> diff --git a/fs/cifs/Kconfig b/fs/cifs/Kconfig
> index aae2b8b2adf5..523e9ea78a28 100644
> --- a/fs/cifs/Kconfig
> +++ b/fs/cifs/Kconfig
> @@ -10,7 +10,7 @@ config CIFS
>  	select CRYPTO_SHA512
>  	select CRYPTO_CMAC
>  	select CRYPTO_HMAC
> -	select CRYPTO_ARC4
> +	select CRYPTO_LIB_ARC4
>  	select CRYPTO_AEAD2
>  	select CRYPTO_CCM
>  	select CRYPTO_ECB
> diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
> index d2a05e46d6f5..d0ab5a38e5d2 100644
> --- a/fs/cifs/cifsencrypt.c
> +++ b/fs/cifs/cifsencrypt.c
> @@ -33,7 +33,7 @@
>  #include <linux/ctype.h>
>  #include <linux/random.h>
>  #include <linux/highmem.h>
> -#include <crypto/skcipher.h>
> +#include <crypto/arc4.h>
>  #include <crypto/aead.h>
>
>  int __cifs_calc_signature(struct smb_rqst *rqst,
> @@ -772,11 +772,9 @@ setup_ntlmv2_rsp(struct cifs_ses *ses, const struct nls_table *nls_cp)
>  int
>  calc_seckey(struct cifs_ses *ses)
>  {
> -	int rc;
> -	struct crypto_skcipher *tfm_arc4;
> -	struct scatterlist sgin, sgout;
> -	struct skcipher_request *req;
> +	struct crypto_arc4_ctx *ctx_arc4;
>  	unsigned char *sec_key;
> +	int rc = 0;
>
>  	sec_key = kmalloc(CIFS_SESS_KEY_SIZE, GFP_KERNEL);
>  	if (sec_key == NULL)
> @@ -784,49 +782,25 @@ calc_seckey(struct cifs_ses *ses)
>
>  	get_random_bytes(sec_key, CIFS_SESS_KEY_SIZE);
>
> -	tfm_arc4 = crypto_alloc_skcipher("ecb(arc4)", 0, CRYPTO_ALG_ASYNC);
> -	if (IS_ERR(tfm_arc4)) {
> -		rc = PTR_ERR(tfm_arc4);
> -		cifs_dbg(VFS, "could not allocate crypto API arc4\n");
> -		goto out;
> -	}
> -
> -	rc = crypto_skcipher_setkey(tfm_arc4, ses->auth_key.response,
> -				    CIFS_SESS_KEY_SIZE);
> -	if (rc) {
> -		cifs_dbg(VFS, "%s: Could not set response as a key\n",
> -			 __func__);
> -		goto out_free_cipher;
> -	}
> -
> -	req = skcipher_request_alloc(tfm_arc4, GFP_KERNEL);
> -	if (!req) {
> +	ctx_arc4 = kmalloc(sizeof(*ctx_arc4), GFP_KERNEL);
> +	if (!ctx_arc4) {
>  		rc = -ENOMEM;
> -		cifs_dbg(VFS, "could not allocate crypto API arc4 request\n");
> -		goto out_free_cipher;
> +		cifs_dbg(VFS, "could not allocate arc4 context\n");
> +		goto out;
>  	}
>
> -	sg_init_one(&sgin, sec_key, CIFS_SESS_KEY_SIZE);
> -	sg_init_one(&sgout, ses->ntlmssp->ciphertext, CIFS_CPHTXT_SIZE);
> -
> -	skcipher_request_set_callback(req, 0, NULL, NULL);
> -	skcipher_request_set_crypt(req, &sgin, &sgout, CIFS_CPHTXT_SIZE, NULL);
> -
> -	rc = crypto_skcipher_encrypt(req);
> -	skcipher_request_free(req);
> -	if (rc) {
> -		cifs_dbg(VFS, "could not encrypt session key rc: %d\n", rc);
> -		goto out_free_cipher;
> -	}
> +	crypto_arc4_set_key(ctx_arc4, ses->auth_key.response,
> +			    CIFS_SESS_KEY_SIZE);
> +	crypto_arc4_crypt(ctx_arc4, ses->ntlmssp->ciphertext, sec_key,
> +			  CIFS_CPHTXT_SIZE);
>
>  	/* make secondary_key/nonce as session key */
>  	memcpy(ses->auth_key.response, sec_key, CIFS_SESS_KEY_SIZE);
>  	/* and make len as that of session key only */
>  	ses->auth_key.len = CIFS_SESS_KEY_SIZE;
>
> -out_free_cipher:
> -	crypto_free_skcipher(tfm_arc4);
>  out:
> +	kfree(ctx_arc4);
>  	kfree(sec_key);
>  	return rc;
> }
> --
> 2.20.1
>
Hi Steve,

On Sun, Jun 09, 2019 at 05:03:25PM -0500, Steve French wrote:
> Probably harmless to change this code path (build_ntlmssp_auth_blob is
> called at session negotiation time so shouldn't have much of a
> performance impact).
>
> On the other hand, if we can find optimizations in the encryption and
> signing paths, that would be really helpful. There was a lot of focus
> on encryption performance at SambaXP last week.
>
> Andreas from Red Hat gave a talk on the improvements in Samba with TLS
> implementation of AES-GCM. I added the cifs client implementation of
> AES-GCM and noticed it is now faster to encrypt packets than sign them
> (performance is about 2 to 3 times faster now with GCM).
>
> On Sun, Jun 9, 2019 at 6:57 AM Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> >
> > The CIFS code uses the sync skcipher API to invoke the ecb(arc4) skcipher,
> > of which only a single generic C code implementation exists. This means
> > that going through all the trouble of using scatterlists etc buys us
> > very little, and we're better off just invoking the arc4 library directly.

This patch only changes RC4 encryption, and to be clear it actually
*improves* the performance of the RC4 encryption, since it removes
unnecessary abstractions. I'd really hope that people wouldn't care
either way, though, since RC4 is insecure and should not be used.

Also it seems that fs/cifs/ supports AES-CCM but not AES-GCM?

	/* pneg_ctxt->Ciphers[0] = SMB2_ENCRYPTION_AES128_GCM;*/ /* not supported yet */
	pneg_ctxt->Ciphers[0] = SMB2_ENCRYPTION_AES128_CCM;

AES-GCM is usually faster than AES-CCM, so if you want to improve
performance, switching from CCM to GCM would do that.

- Eric
On Mon, Jun 10, 2019 at 11:17 AM Eric Biggers <ebiggers@kernel.org> wrote:
>
> Hi Steve,
>
> On Sun, Jun 09, 2019 at 05:03:25PM -0500, Steve French wrote:
> > Probably harmless to change this code path (build_ntlmssp_auth_blob is
> > called at session negotiation time so shouldn't have much of a
> > performance impact).
> >
> > On the other hand, if we can find optimizations in the encryption and
> > signing paths, that would be really helpful. There was a lot of focus
> > on encryption performance at SambaXP last week.
> >
> > Andreas from Red Hat gave a talk on the improvements in Samba with TLS
> > implementation of AES-GCM. I added the cifs client implementation of
> > AES-GCM and noticed it is now faster to encrypt packets than sign them
> > (performance is about 2 to 3 times faster now with GCM).
> >
> > On Sun, Jun 9, 2019 at 6:57 AM Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> > >
> > > The CIFS code uses the sync skcipher API to invoke the ecb(arc4) skcipher,
> > > of which only a single generic C code implementation exists. This means
> > > that going through all the trouble of using scatterlists etc buys us
> > > very little, and we're better off just invoking the arc4 library directly.
>
> This patch only changes RC4 encryption, and to be clear it actually
> *improves* the performance of the RC4 encryption, since it removes
> unnecessary abstractions. I'd really hope that people wouldn't care
> either way, though, since RC4 is insecure and should not be used.
>
> Also it seems that fs/cifs/ supports AES-CCM but not AES-GCM?
>
> 	/* pneg_ctxt->Ciphers[0] = SMB2_ENCRYPTION_AES128_GCM;*/ /* not supported yet */
> 	pneg_ctxt->Ciphers[0] = SMB2_ENCRYPTION_AES128_CCM;
>
> AES-GCM is usually faster than AES-CCM, so if you want to improve
> performance, switching from CCM to GCM would do that.
>
> - Eric

Yes - when I tested the GCM code in cifs.ko last week (the two patches
are currently in the cifs-2.6.git for-next branch and thus in linux-next
and are attached), I was astounded at the improvement - encryption with
GCM is now faster than signing, and copy is more than twice as fast as
encryption was before with CCM (assuming a fast enough network so that
the network is not the bottleneck). We see more benefit in the write
path (copy to server) than the read path (copy from server), but even
the read path gets 80% benefit in my tests, and with the addition of
multichannel support in the next few months, we should see copy from
server on SMB3.1.1 mounts improving even more.

Any other ideas how to improve the encryption or signing in the read or
write path in cifs.ko would be appreciated. We are still slower than
Windows, probably in part due to holding mutexes longer while sending
the frame across the network (including signing or encryption), which
limits parallelism somewhat.
On Mon, 10 Jun 2019 at 19:54, Steve French <smfrench@gmail.com> wrote:
>
> On Mon, Jun 10, 2019 at 11:17 AM Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > Hi Steve,
> >
> > On Sun, Jun 09, 2019 at 05:03:25PM -0500, Steve French wrote:
> > > Probably harmless to change this code path (build_ntlmssp_auth_blob is
> > > called at session negotiation time so shouldn't have much of a
> > > performance impact).
> > >
> > > On the other hand, if we can find optimizations in the encryption and
> > > signing paths, that would be really helpful. There was a lot of focus
> > > on encryption performance at SambaXP last week.
> > >
> > > Andreas from Red Hat gave a talk on the improvements in Samba with TLS
> > > implementation of AES-GCM. I added the cifs client implementation of
> > > AES-GCM and noticed it is now faster to encrypt packets than sign them
> > > (performance is about 2 to 3 times faster now with GCM).
> > >
> > > On Sun, Jun 9, 2019 at 6:57 AM Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> > > >
> > > > The CIFS code uses the sync skcipher API to invoke the ecb(arc4) skcipher,
> > > > of which only a single generic C code implementation exists. This means
> > > > that going through all the trouble of using scatterlists etc buys us
> > > > very little, and we're better off just invoking the arc4 library directly.
> >
> > This patch only changes RC4 encryption, and to be clear it actually
> > *improves* the performance of the RC4 encryption, since it removes
> > unnecessary abstractions. I'd really hope that people wouldn't care
> > either way, though, since RC4 is insecure and should not be used.
> >
> > Also it seems that fs/cifs/ supports AES-CCM but not AES-GCM?
> >
> > 	/* pneg_ctxt->Ciphers[0] = SMB2_ENCRYPTION_AES128_GCM;*/ /* not supported yet */
> > 	pneg_ctxt->Ciphers[0] = SMB2_ENCRYPTION_AES128_CCM;
> >
> > AES-GCM is usually faster than AES-CCM, so if you want to improve
> > performance, switching from CCM to GCM would do that.
> >
> > - Eric
>
> Yes - when I tested the GCM code in cifs.ko last week (the two patches
> are currently in the cifs-2.6.git for-next branch and thus in linux-next
> and are attached), I was astounded at the improvement - encryption with
> GCM is now faster than signing, and copy is more than twice as fast as
> encryption was before with CCM (assuming a fast enough network so that
> the network is not the bottleneck). We see more benefit in the write
> path (copy to server) than the read path (copy from server), but even
> the read path gets 80% benefit in my tests, and with the addition of
> multichannel support in the next few months, we should see copy from
> server on SMB3.1.1 mounts improving even more.
>

I assume this was tested on high-end x86 silicon? The CBC-MAC path in
CCM is strictly sequential, so on systems with deep pipelines, you lose
a lot of speed there. For arm64, I implemented a CCM driver that does
the encryption and authentication in parallel, which mitigates most of
the performance hit, but even then, you will still be running with an
underutilized pipeline (unless you are using low-end silicon which only
has a 2-cycle latency for AES instructions).

> Any other ideas how to improve the encryption or signing in the read or
> write path in cifs.ko would be appreciated. We are still slower than
> Windows, probably in part due to holding mutexes longer while sending
> the frame across the network (including signing or encryption), which
> limits parallelism somewhat.
>

It all depends on whether crypto is still the bottleneck in this case.
On Mon, Jun 10, 2019 at 1:02 PM Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On Mon, 10 Jun 2019 at 19:54, Steve French <smfrench@gmail.com> wrote:
> > Yes - when I tested the GCM code in cifs.ko last week (the two patches
> > are currently in the cifs-2.6.git for-next branch and thus in linux-next
> > and are attached), I was astounded at the improvement - encryption with
> > GCM is now faster than signing,
<snip>
> I assume this was tested on high-end x86 silicon? The CBC-MAC path in
> CCM is strictly sequential, so on systems with deep pipelines, you lose
> a lot of speed there. For arm64, I implemented a CCM driver that does
> the encryption and authentication in parallel, which mitigates most of
> the performance hit, but even then, you will still be running with an
> underutilized pipeline (unless you are using low-end silicon which only
> has a 2-cycle latency for AES instructions).

Was doing most of my testing on an i7-7820HQ 2.9GHz (passmark score 9231).

> > Any other ideas how to improve the encryption or signing in the read or
> > write path in cifs.ko would be appreciated. We are still slower than
> > Windows, probably in part due to holding mutexes longer while sending
> > the frame across the network (including signing or encryption), which
> > limits parallelism somewhat.
>
> It all depends on whether crypto is still the bottleneck in this case.

Trying to wade through traces to see. Unfortunately there are lots of
different configurations (especially varying network latency and network
copy offload) to experiment with.
diff --git a/fs/cifs/Kconfig b/fs/cifs/Kconfig
index aae2b8b2adf5..523e9ea78a28 100644
--- a/fs/cifs/Kconfig
+++ b/fs/cifs/Kconfig
@@ -10,7 +10,7 @@ config CIFS
 	select CRYPTO_SHA512
 	select CRYPTO_CMAC
 	select CRYPTO_HMAC
-	select CRYPTO_ARC4
+	select CRYPTO_LIB_ARC4
 	select CRYPTO_AEAD2
 	select CRYPTO_CCM
 	select CRYPTO_ECB
diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
index d2a05e46d6f5..d0ab5a38e5d2 100644
--- a/fs/cifs/cifsencrypt.c
+++ b/fs/cifs/cifsencrypt.c
@@ -33,7 +33,7 @@
 #include <linux/ctype.h>
 #include <linux/random.h>
 #include <linux/highmem.h>
-#include <crypto/skcipher.h>
+#include <crypto/arc4.h>
 #include <crypto/aead.h>
 
 int __cifs_calc_signature(struct smb_rqst *rqst,
@@ -772,11 +772,9 @@ setup_ntlmv2_rsp(struct cifs_ses *ses, const struct nls_table *nls_cp)
 int
 calc_seckey(struct cifs_ses *ses)
 {
-	int rc;
-	struct crypto_skcipher *tfm_arc4;
-	struct scatterlist sgin, sgout;
-	struct skcipher_request *req;
+	struct crypto_arc4_ctx *ctx_arc4;
 	unsigned char *sec_key;
+	int rc = 0;
 
 	sec_key = kmalloc(CIFS_SESS_KEY_SIZE, GFP_KERNEL);
 	if (sec_key == NULL)
@@ -784,49 +782,25 @@ calc_seckey(struct cifs_ses *ses)
 
 	get_random_bytes(sec_key, CIFS_SESS_KEY_SIZE);
 
-	tfm_arc4 = crypto_alloc_skcipher("ecb(arc4)", 0, CRYPTO_ALG_ASYNC);
-	if (IS_ERR(tfm_arc4)) {
-		rc = PTR_ERR(tfm_arc4);
-		cifs_dbg(VFS, "could not allocate crypto API arc4\n");
-		goto out;
-	}
-
-	rc = crypto_skcipher_setkey(tfm_arc4, ses->auth_key.response,
-				    CIFS_SESS_KEY_SIZE);
-	if (rc) {
-		cifs_dbg(VFS, "%s: Could not set response as a key\n",
-			 __func__);
-		goto out_free_cipher;
-	}
-
-	req = skcipher_request_alloc(tfm_arc4, GFP_KERNEL);
-	if (!req) {
+	ctx_arc4 = kmalloc(sizeof(*ctx_arc4), GFP_KERNEL);
+	if (!ctx_arc4) {
 		rc = -ENOMEM;
-		cifs_dbg(VFS, "could not allocate crypto API arc4 request\n");
-		goto out_free_cipher;
+		cifs_dbg(VFS, "could not allocate arc4 context\n");
+		goto out;
 	}
 
-	sg_init_one(&sgin, sec_key, CIFS_SESS_KEY_SIZE);
-	sg_init_one(&sgout, ses->ntlmssp->ciphertext, CIFS_CPHTXT_SIZE);
-
-	skcipher_request_set_callback(req, 0, NULL, NULL);
-	skcipher_request_set_crypt(req, &sgin, &sgout, CIFS_CPHTXT_SIZE, NULL);
-
-	rc = crypto_skcipher_encrypt(req);
-	skcipher_request_free(req);
-	if (rc) {
-		cifs_dbg(VFS, "could not encrypt session key rc: %d\n", rc);
-		goto out_free_cipher;
-	}
+	crypto_arc4_set_key(ctx_arc4, ses->auth_key.response,
+			    CIFS_SESS_KEY_SIZE);
+	crypto_arc4_crypt(ctx_arc4, ses->ntlmssp->ciphertext, sec_key,
+			  CIFS_CPHTXT_SIZE);
 
 	/* make secondary_key/nonce as session key */
 	memcpy(ses->auth_key.response, sec_key, CIFS_SESS_KEY_SIZE);
 	/* and make len as that of session key only */
 	ses->auth_key.len = CIFS_SESS_KEY_SIZE;
 
-out_free_cipher:
-	crypto_free_skcipher(tfm_arc4);
 out:
+	kfree(ctx_arc4);
 	kfree(sec_key);
 	return rc;
 }
The CIFS code uses the sync skcipher API to invoke the ecb(arc4) skcipher,
of which only a single generic C code implementation exists. This means
that going through all the trouble of using scatterlists etc buys us
very little, and we're better off just invoking the arc4 library directly.

Cc: linux-cifs@vger.kernel.org
Cc: Steve French <sfrench@samba.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 fs/cifs/Kconfig       |  2 +-
 fs/cifs/cifsencrypt.c | 50 +++++---------------
 2 files changed, 13 insertions(+), 39 deletions(-)