Message ID | Y/3N6zFOZeehJQ/p@gondor.apana.org.au (mailing list archive) |
---|---|
State | New, archived |
Series | [v4] crypto: stm32 - Save and restore between each request |
On Tue, Feb 28, 2023 at 10:48 AM Herbert Xu <herbert@gondor.apana.org.au> wrote:

> v4 fixes hmac to not reload the key over and over again causing
> the hash state to be corrupted.

OK I tested this, sadly the same results.

Notice though: the HMAC versions fail on test vector 0 and
the non-MAC:ed fail on vector 1, so I guess that means test
vector 0 works with those?

Here is the complete log:

[    2.997312] alg: extra crypto tests enabled. This is intended for developer use only.
[   15.203609] Key type encrypted registered
[   22.553791] stm32-hash a03c2000.hash: allocated hmac(sha256) fallback
[   22.561976] alg: ahash: stm32-hmac-sha256 test failed (wrong result) on test vector 0, cfg="init+update+final aligned buffer"
[   22.573387] Expected:
[   22.575674] 00000000: a2 1b 1f 5d 4c f4 f7 3a 4d d9 39 75 0f 7a 06 6a
[   22.582160] 00000010: 7f 98 cc 13 1c b1 6a 66 92 75 90 21 cf ab 81 81
[   22.588613] Obtained:
[   22.590917] 00000000: 46 24 76 a8 97 dd fd bd 40 d1 42 0e 08 a5 bc fe
[   22.597368] 00000010: eb 25 c3 e2 ad e6 a0 a9 08 3b 32 7b 9e f9 fc a1
[   22.603865] alg: self-tests for hmac(sha256) using stm32-hmac-sha256 failed (rc=-22)
[   22.603887] ------------[ cut here ]------------
[   22.616297] WARNING: CPU: 1 PID: 75 at crypto/testmgr.c:5864 alg_test.part.0+0x4d0/0x4dc
[   22.624437] alg: self-tests for hmac(sha256) using stm32-hmac-sha256 failed (rc=-22)
[   22.624448] Modules linked in:
[   22.635258] CPU: 1 PID: 75 Comm: cryptomgr_test Not tainted 6.2.0-12020-g1c3e1a0051be #67
[   22.643437] Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
[   22.650405] unwind_backtrace from show_stack+0x10/0x14
[   22.655650] show_stack from dump_stack_lvl+0x40/0x4c
[   22.660724] dump_stack_lvl from __warn+0x94/0xc0
[   22.665447] __warn from warn_slowpath_fmt+0x118/0x164
[   22.670601] warn_slowpath_fmt from alg_test.part.0+0x4d0/0x4dc
[   22.676537] alg_test.part.0 from cryptomgr_test+0x18/0x38
[   22.682037] cryptomgr_test from kthread+0xc0/0xc4
[   22.686843] kthread from ret_from_fork+0x14/0x2c
[   22.691553] Exception stack(0xf0f45fb0 to 0xf0f45ff8)
[   22.696604] 5fa0: 00000000 00000000 00000000 00000000
[   22.704779] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   22.712953] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[   22.719596] ---[ end trace 0000000000000000 ]---
[   22.724494] stm32-hash a03c2000.hash: allocated sha256 fallback
[   22.769732] alg: ahash: stm32-sha256 test failed (wrong result) on test vector 1, cfg="init+update+final aligned buffer"
[   22.780648] Expected:
[   22.782952] 00000000: ba 78 16 bf 8f 01 cf ea 41 41 40 de 5d ae 22 23
[   22.789392] 00000010: b0 03 61 a3 96 17 7a 9c b4 10 ff 61 f2 00 15 ad
[   22.795874] Obtained:
[   22.798147] 00000000: e3 b0 c4 42 98 fc 1c 14 9a fb f4 c8 99 6f b9 24
[   22.804607] 00000010: 27 ae 41 e4 64 9b 93 4c a4 95 99 1b 78 52 b8 55
[   22.811074] alg: self-tests for sha256 using stm32-sha256 failed (rc=-22)
[   22.811083] ------------[ cut here ]------------
[   22.822480] WARNING: CPU: 1 PID: 85 at crypto/testmgr.c:5864 alg_test.part.0+0x4d0/0x4dc
[   22.830607] alg: self-tests for sha256 using stm32-sha256 failed (rc=-22)
[   22.830615] Modules linked in:
[   22.840457] CPU: 1 PID: 85 Comm: cryptomgr_test Tainted: G W 6.2.0-12020-g1c3e1a0051be #67
[   22.850109] Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
[   22.857069] unwind_backtrace from show_stack+0x10/0x14
[   22.862307] show_stack from dump_stack_lvl+0x40/0x4c
[   22.867373] dump_stack_lvl from __warn+0x94/0xc0
[   22.872090] __warn from warn_slowpath_fmt+0x118/0x164
[   22.877237] warn_slowpath_fmt from alg_test.part.0+0x4d0/0x4dc
[   22.883167] alg_test.part.0 from cryptomgr_test+0x18/0x38
[   22.888662] cryptomgr_test from kthread+0xc0/0xc4
[   22.893462] kthread from ret_from_fork+0x14/0x2c
[   22.898169] Exception stack(0xf0f6dfb0 to 0xf0f6dff8)
[   22.903216] dfa0: 00000000 00000000 00000000 00000000
[   22.911388] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   22.919559] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[   22.926182] ---[ end trace 0000000000000000 ]---
[   36.677933] stm32-hash a03c2000.hash: allocated hmac(sha1) fallback
[   36.686991] alg: ahash: stm32-hmac-sha1 test failed (wrong result) on test vector 0, cfg="init+update+final aligned buffer"
[   36.698242] Expected:
[   36.700547] 00000000: b6 17 31 86 55 05 72 64 e2 8b c0 b6 fb 37 8c 8e
[   36.707002] 00000010: f1 46 be 00
[   36.710345] Obtained:
[   36.712624] 00000000: 12 3f d7 8b da 01 00 78 6a e8 6b 76 f5 0f 01 bd
[   36.719072] 00000010: 18 e4 77 f3
[   36.722450] alg: self-tests for hmac(sha1) using stm32-hmac-sha1 failed (rc=-22)
[   36.722472] ------------[ cut here ]------------
[   36.734495] WARNING: CPU: 1 PID: 88 at crypto/testmgr.c:5864 alg_test.part.0+0x4d0/0x4dc
[   36.742628] alg: self-tests for hmac(sha1) using stm32-hmac-sha1 failed (rc=-22)
[   36.742637] Modules linked in:
[   36.753097] CPU: 1 PID: 88 Comm: cryptomgr_test Tainted: G W 6.2.0-12020-g1c3e1a0051be #67
[   36.762754] Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
[   36.769719] unwind_backtrace from show_stack+0x10/0x14
[   36.774963] show_stack from dump_stack_lvl+0x40/0x4c
[   36.780036] dump_stack_lvl from __warn+0x94/0xc0
[   36.784759] __warn from warn_slowpath_fmt+0x118/0x164
[   36.789912] warn_slowpath_fmt from alg_test.part.0+0x4d0/0x4dc
[   36.795847] alg_test.part.0 from cryptomgr_test+0x18/0x38
[   36.801347] cryptomgr_test from kthread+0xc0/0xc4
[   36.806153] kthread from ret_from_fork+0x14/0x2c
[   36.810862] Exception stack(0xf0f79fb0 to 0xf0f79ff8)
[   36.815912] 9fa0: 00000000 00000000 00000000 00000000
[   36.824087] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   36.832261] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[   36.838902] ---[ end trace 0000000000000000 ]---
[   36.843762] stm32-hash a03c2000.hash: allocated sha1 fallback
[   36.889782] alg: ahash: stm32-sha1 test failed (wrong result) on test vector 1, cfg="init+update+final aligned buffer"
[   36.900507] Expected:
[   36.902786] 00000000: a9 99 3e 36 47 06 81 6a ba 3e 25 71 78 50 c2 6c
[   36.909225] 00000010: 9c d0 d8 9d
[   36.912564] Obtained:
[   36.914834] 00000000: da 39 a3 ee 5e 6b 4b 0d 32 55 bf ef 95 60 18 90
[   36.921296] 00000010: af d8 07 09
[   36.924627] alg: self-tests for sha1 using stm32-sha1 failed (rc=-22)
[   36.924635] ------------[ cut here ]------------
[   36.935687] WARNING: CPU: 1 PID: 100 at crypto/testmgr.c:5864 alg_test.part.0+0x4d0/0x4dc
[   36.943902] alg: self-tests for sha1 using stm32-sha1 failed (rc=-22)
[   36.943909] Modules linked in:
[   36.953406] CPU: 1 PID: 100 Comm: cryptomgr_test Tainted: G W 6.2.0-12020-g1c3e1a0051be #67
[   36.963144] Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
[   36.970103] unwind_backtrace from show_stack+0x10/0x14
[   36.975340] show_stack from dump_stack_lvl+0x40/0x4c
[   36.980404] dump_stack_lvl from __warn+0x94/0xc0
[   36.985120] __warn from warn_slowpath_fmt+0x118/0x164
[   36.990266] warn_slowpath_fmt from alg_test.part.0+0x4d0/0x4dc
[   36.996195] alg_test.part.0 from cryptomgr_test+0x18/0x38
[   37.001689] cryptomgr_test from kthread+0xc0/0xc4
[   37.006488] kthread from ret_from_fork+0x14/0x2c
[   37.011193] Exception stack(0xf0f8dfb0 to 0xf0f8dff8)
[   37.016240] dfa0: 00000000 00000000 00000000 00000000
[   37.024411] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   37.032581] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[   37.039222] ---[ end trace 0000000000000000 ]---

Here I have applied a patch like this to see the failing vectors:

commit 1c3e1a0051be234ef109e97075783c28e3b07452 (HEAD -> ux500-fixup-stm32-cryp-herbert-v4)
Author: Linus Walleij <linus.walleij@linaro.org>
Date:   Mon Dec 26 09:53:10 2022 +0100

    test hacks

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index c91e93ece20b..db511293933b 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -1203,6 +1203,10 @@ static int check_hash_result(const char *type,
 	if (memcmp(result, vec->digest, digestsize) != 0) {
 		pr_err("alg: %s: %s test failed (wrong result) on test vector %s, cfg=\"%s\"\n",
 		       type, driver, vec_name, cfg->name);
+		pr_err("Expected:\n");
+		hexdump(vec->digest, digestsize);
+		pr_err("Obtained:\n");
+		hexdump(result, digestsize);
 		return -EINVAL;

I'm a bit lost on what to try next :/

Yours,
Linus Walleij

On Tue, Feb 28, 2023 at 09:50:55PM +0100, Linus Walleij wrote:
>
> OK I tested this, sadly the same results.
>
> Notice though: the HMAC versions fail on test vector 0 and
> the non-MAC:ed fail on vector 1, so I guess that means test
> vector 0 works with those?

Hah, test vector 0 for sha256 is an empty message. While test
vector 1 is the same as test vector 0 for hmac(sha256).

So I guess at least the fallback is still working :)

Cheers,

On Tue, Feb 28, 2023 at 09:50:55PM +0100, Linus Walleij wrote:
>
> Notice though: the HMAC versions fail on test vector 0 and
> the non-MAC:ed fail on vector 1, so I guess that means test
> vector 0 works with those?

The failing vector is the first one where we save the state from
the hardware and then try to restore it.

Is your device ux500 or stm32? Perhaps state saving/restoring is
simply broken on ux500 (as the original ux500 driver didn't support
export/import and always used a fallback)?

Thanks,

On Wed, Mar 01, 2023 at 09:36:08AM +0800, Herbert Xu wrote:
>
> Is your device ux500 or stm32? Perhaps state saving/restoring is
> simply broken on ux500 (as the original ux500 driver didn't support
> export/import and always used a fallback)?

Interesting, I dug up the old ux500 driver and even though it
doesn't have export/import hooked up, it does actually appear to
save/restore hardware state. In fact it seems to do it multiple
times per request, even when it's unnecessary.

I'll try to see if the saving/restoring is subtly different between
ux500 and stm32.

Cheers,

On Wed, Mar 1, 2023 at 2:36 AM Herbert Xu <herbert@gondor.apana.org.au> wrote:

> The failing vector is the first one where we save the state from
> the hardware and then try to restore it.

Yeah that's typical :/

> Is your device ux500 or stm32? Perhaps state saving/restoring is
> simply broken on ux500 (as the original ux500 driver didn't support
> export/import and always used a fallback)?

It's Ux500 but I had no problem with import/export before,
and yeah it has state save/restore in HW.

Yours,
Linus Walleij

On Wed, Mar 01, 2023 at 01:22:13PM +0100, Linus Walleij wrote:
>
> It's Ux500 but I had no problem with import/export before,
> and yeah it has state save/restore in HW.

So with the stm32 driver your ux500 is able to pass the extra
fuzz tests, right? That should indeed test export and import.

Thanks,

On Wed, Mar 01, 2023 at 01:22:13PM +0100, Linus Walleij wrote:
>
> It's Ux500 but I had no problem with import/export before,
> and yeah it has state save/restore in HW.

I think I see the problem. My patch wasn't waiting for the hash
computation to complete before saving the state so obviously it
will get the wrong hash state every single time.

I'll fix this up and some other inconsistencies (my reading of the
documentation is that there are 54 registers (0-53), not 53) and
resend the patch.

Cheers,

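The ordering problem described in the message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_hw`, `toy_hw_write`, `toy_wait_busy` and `toy_save` are hypothetical names, the three-poll latency and the `state * 31 + word` update are invented for illustration, and none of it is taken from the stm32 driver. It shows why a context save that does not first wait for the device to go idle reads back the previous state.

```c
#include <assert.h>
#include <stdint.h>

/* Toy device: a write needs a few status polls before it becomes visible. */
struct toy_hw {
	uint32_t state;		/* what a context save would read back */
	uint32_t in_flight;	/* result that is still being computed */
	int busy;		/* remaining polls until in_flight lands */
};

/* Queue one word; the visible state does not change immediately. */
void toy_hw_write(struct toy_hw *hw, uint32_t word)
{
	hw->in_flight = hw->state * 31u + word;
	hw->busy = 3;
}

/* One status poll; returns nonzero while the device is still busy. */
static int toy_hw_poll_busy(struct toy_hw *hw)
{
	if (hw->busy > 0 && --hw->busy == 0)
		hw->state = hw->in_flight;
	return hw->busy > 0;
}

/* Returns 0 once the device is idle, -1 if it never settles. */
int toy_wait_busy(struct toy_hw *hw, int max_polls)
{
	while (max_polls-- > 0)
		if (!toy_hw_poll_busy(hw))
			return 0;
	return -1;
}

/* Save the context; wait == 0 models saving without waiting first. */
int toy_save(struct toy_hw *hw, uint32_t *saved, int wait)
{
	if (wait && toy_wait_busy(hw, 100))
		return -1;
	*saved = hw->state;
	return 0;
}
```

In this model, calling `toy_save()` with `wait == 0` right after `toy_hw_write()` captures the old state, while `wait == 1` captures the updated one.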
On Thu, Mar 02, 2023 at 02:04:38PM +0800, Herbert Xu wrote:
>
> I think I see the problem. My patch wasn't waiting for the hash
> computation to complete before saving the state so obviously it
> will get the wrong hash state every single time.
>
> I'll fix this up and some other inconsistencies (my reading of the
> documentation is that there are 54 registers (0-53), not 53) and
> resend the patch.

I've split the patch up into smaller chunks for easier testing.

Cheers,

On Sat, Mar 04, 2023 at 05:34:04PM +0800, Herbert Xu wrote:
>
> I've split the patch up into smaller chunks for easier testing.

v6 fixes a bug in the finup patch that caused the new data to be
discarded instead of hashed.

This patch series fixes the import/export functions in the stm32
driver. As usual, a failure in import/export indicates a general
bug in the hash driver that may break as soon as two concurrent
users show up and hash at the same time using any method other
than digest or init+finup.

Cheers,

Hi All,

Sorry for the very (very very) late response.
Thanks for highlighting the issue. I'm worried about the issue seen that
we've fixed at our downstream level. We (ST) are currently working on
upstreaming the new peripheral update for STM32MP13 that fixed the old issue
seen (such as CSR register numbers), and so on....

The issue about the context management relies on a question I've get time to
ask you. There is no internal test purpose (using test manager) that really
show the need of a hash update that needs to be "self-content". We've seen
the issue using openssl use cases that is not using import/export.
I'm wondering to understand the real need of import/export in the framework
if the request must be safe itself?

From hardware point of view, it is a penalty to wait for completion to save
the context after each request. I understand the need of multiple hash
request in // but I was wondering that it can be managed by the
import/export, but it seems I was wrong. The penalty of the context saving
will impact all hash requests where, in a runtime context is probably not
the most important use case.

I'm looking deeper to check with the DMA use case and there is some new HW
restriction on the coming hash version that doesn't allow the read of CSR
register at some times.

BR,
Lionel

ST Restricted

-----Original Message-----
From: Herbert Xu <herbert@gondor.apana.org.au>
Sent: Monday, March 6, 2023 5:42 AM
To: Linus Walleij <linus.walleij@linaro.org>
Cc: Lionel Debieve <lionel.debieve@foss.st.com>; Li kunyu <kunyu@nfschina.com>; davem@davemloft.net; linux-arm-kernel@lists.infradead.org; linux-crypto@vger.kernel.org; linux-kernel@vger.kernel.org; linux-stm32@st-md-mailman.stormreply.com; mcoquelin.stm32@gmail.com
Subject: [v6 PATCH 0/7] crypto: stm32 - Save and restore between each request

On Sat, Mar 04, 2023 at 05:34:04PM +0800, Herbert Xu wrote:
>
> I've split the patch up into smaller chunks for easier testing.

v6 fixes a bug in the finup patch that caused the new data to be discarded
instead of hashed.

This patch series fixes the import/export functions in the stm32 driver. As
usual, a failure in import/export indicates a general bug in the hash driver
that may break as soon as two concurrent users show up and hash at the same
time using any method other than digest or init+finup.

Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

On Tue, Mar 07, 2023 at 02:55:29PM +0100, lionel.debieve@foss.st.com wrote:
>
> The issue about the context management relies on a question I've get time to
> ask you. There is no internal test purpose (using test manager) that really
> show the need of a hash update that needs to be "self-content". We've seen

Indeed this functionality is sorely missed. It shouldn't be
hard to implement, because simply hashing two different requests
interleaved with each other should show the problem:

	init(A) => update(A) => init(B) => update(B) => final(A) => final(B)

> the issue using openssl use cases that is not using import/export.
> I'm wondering to understand the real need of import/export in the framework
> if the request must be safe itself?

The hash state is normally stored in the request context. The
import/export functions let you save a copy of the state for
subsequent processing. The request could then be freed after
the export and re-allocated prior to import, or in other contexts
the request could be reused for a completely different hash in
the time being (init/update/final).

> From hardware point of view, it is a penalty to wait for completion to save
> the context after each request. I understand the need of multiple hash
> request in // but I was wondering that it can be managed by the
> import/export, but it seems I was wrong. The penalty of the context saving
> will impact all hash requests where, in a runtime context is probably not
> the most important use case.

Oh of course we try to avoid unnecessary savings/restoring as
much as we can. That's why we encourage users to use finup/digest
as much as possible, in which case there may be no need to save
and restore at all.

However, if the user has to do a partial update through the
update function, then we have to save the state.

Cheers,

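The interleaving given in the message above can be tried out against a toy model of a device that has only one set of hash registers. This is a minimal sketch, not driver or testmgr code: every `toy_` name is hypothetical, and an FNV-1a style mix stands in for the real compression function. With `save_restore` set, each request keeps its own copy of the state between operations; without it, the state lives only in the shared "hardware" and `init(B)` wipes out what `update(A)` produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_IV 2166136261u

/* Toy stand-in for the single set of hardware hash registers. */
struct toy_hw {
	uint32_t acc;
};

/* Toy per-request context (hypothetical, not the driver's struct). */
struct toy_req {
	uint32_t saved;	/* copy of the hardware state between operations */
};

/* FNV-1a style mixing; stands in for the real hash computation. */
static void toy_hw_feed(struct toy_hw *hw, const char *data)
{
	size_t i, len = strlen(data);

	for (i = 0; i < len; i++)
		hw->acc = (hw->acc ^ (uint8_t)data[i]) * 16777619u;
}

static void toy_init(struct toy_hw *hw, struct toy_req *req)
{
	hw->acc = TOY_IV;
	req->saved = hw->acc;
}

/* With save_restore set, the request owns the state across calls. */
static void toy_update(struct toy_hw *hw, struct toy_req *req,
		       const char *data, int save_restore)
{
	if (save_restore)
		hw->acc = req->saved;	/* restore before touching hw */
	toy_hw_feed(hw, data);
	if (save_restore)
		req->saved = hw->acc;	/* save once the operation ends */
}

static uint32_t toy_final(struct toy_hw *hw, struct toy_req *req,
			  int save_restore)
{
	if (save_restore)
		hw->acc = req->saved;
	return hw->acc;
}

/* Run the interleaved sequence on a and b; return the digest of A. */
uint32_t toy_interleaved(const char *a, const char *b, int save_restore)
{
	struct toy_hw hw;
	struct toy_req ra, rb;
	uint32_t da;

	toy_init(&hw, &ra);
	toy_update(&hw, &ra, a, save_restore);
	toy_init(&hw, &rb);
	toy_update(&hw, &rb, b, save_restore);
	da = toy_final(&hw, &ra, save_restore);
	(void)toy_final(&hw, &rb, save_restore);
	return da;
}

/* Reference: hash a alone, with nothing else using the hardware. */
uint32_t toy_alone(const char *a)
{
	struct toy_hw hw;
	struct toy_req r;

	toy_init(&hw, &r);
	toy_update(&hw, &r, a, 1);
	return toy_final(&hw, &r, 1);
}
```

In this model `toy_interleaved(a, b, 1)` matches `toy_alone(a)`, whereas `toy_interleaved(a, b, 0)` returns the digest of `b` for request A, which is the kind of cross-request corruption the interleaved sequence is meant to expose.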
v7 fixes empty message hashing and moves that into its own patch.
As a result of this arrangement, I have removed the Reviewed-by's
and Tested-by's for those two final patches.

This patch series fixes the import/export functions in the stm32
driver. As usual, a failure in import/export indicates a general
bug in the hash driver that may break as soon as two concurrent
users show up and hash at the same time using any method other
than digest or init+finup.

Cheers,

diff --git a/drivers/crypto/stm32/stm32-hash.c b/drivers/crypto/stm32/stm32-hash.c
index 7bf805563ac2..a4c4cb1735d4 100644
--- a/drivers/crypto/stm32/stm32-hash.c
+++ b/drivers/crypto/stm32/stm32-hash.c
@@ -7,7 +7,6 @@
  */

 #include <linux/clk.h>
-#include <linux/crypto.h>
 #include <linux/delay.h>
 #include <linux/dma-mapping.h>
 #include <linux/dmaengine.h>
@@ -127,6 +126,16 @@ struct stm32_hash_ctx {
 	int keylen;
 };

+struct stm32_hash_state {
+	u16 bufcnt;
+	u16 buflen;
+
+	u8 buffer[HASH_BUFLEN] __aligned(4);
+
+	/* hash state */
+	u32 hw_context[3 + HASH_CSR_REGISTER_NUMBER];
+};
+
 struct stm32_hash_request_ctx {
 	struct stm32_hash_dev *hdev;
 	unsigned long flags;
@@ -134,8 +143,6 @@ struct stm32_hash_request_ctx {

 	u8 digest[SHA256_DIGEST_SIZE] __aligned(sizeof(u32));
 	size_t digcnt;
-	size_t bufcnt;
-	size_t buflen;

 	/* DMA */
 	struct scatterlist *sg;
@@ -149,10 +156,7 @@ struct stm32_hash_request_ctx {

 	u8 data_type;

-	u8 buffer[HASH_BUFLEN] __aligned(sizeof(u32));
-
-	/* Export Context */
-	u32 *hw_context;
+	struct stm32_hash_state state;
 };

 struct stm32_hash_algs_info {
@@ -183,7 +187,6 @@ struct stm32_hash_dev {
 	struct ahash_request *req;
 	struct crypto_engine *engine;

-	int err;
 	unsigned long flags;

 	struct dma_chan *dma_lch;
@@ -326,11 +329,12 @@ static void stm32_hash_write_ctrl(struct stm32_hash_dev *hdev, int bufcnt)

 static void stm32_hash_append_sg(struct stm32_hash_request_ctx *rctx)
 {
+	struct stm32_hash_state *state = &rctx->state;
 	size_t count;

-	while ((rctx->bufcnt < rctx->buflen) && rctx->total) {
+	while ((state->bufcnt < state->buflen) && rctx->total) {
 		count = min(rctx->sg->length - rctx->offset, rctx->total);
-		count = min(count, rctx->buflen - rctx->bufcnt);
+		count = min_t(size_t, count, state->buflen - state->bufcnt);

 		if (count <= 0) {
 			if ((rctx->sg->length == 0) && !sg_is_last(rctx->sg)) {
@@ -341,10 +345,10 @@ static void stm32_hash_append_sg(struct stm32_hash_request_ctx *rctx)
 			}
 		}

-		scatterwalk_map_and_copy(rctx->buffer + rctx->bufcnt, rctx->sg,
-					 rctx->offset, count, 0);
+		scatterwalk_map_and_copy(state->buffer + state->bufcnt,
+					 rctx->sg, rctx->offset, count, 0);

-		rctx->bufcnt += count;
+		state->bufcnt += count;
 		rctx->offset += count;
 		rctx->total -= count;

@@ -413,26 +417,27 @@ static int stm32_hash_xmit_cpu(struct stm32_hash_dev *hdev,
 static int stm32_hash_update_cpu(struct stm32_hash_dev *hdev)
 {
 	struct stm32_hash_request_ctx *rctx = ahash_request_ctx(hdev->req);
+	struct stm32_hash_state *state = &rctx->state;
 	int bufcnt, err = 0, final;

 	dev_dbg(hdev->dev, "%s flags %lx\n", __func__, rctx->flags);

 	final = (rctx->flags & HASH_FLAGS_FINUP);

-	while ((rctx->total >= rctx->buflen) ||
-	       (rctx->bufcnt + rctx->total >= rctx->buflen)) {
+	while ((rctx->total >= state->buflen) ||
+	       (state->bufcnt + rctx->total >= state->buflen)) {
 		stm32_hash_append_sg(rctx);
-		bufcnt = rctx->bufcnt;
-		rctx->bufcnt = 0;
-		err = stm32_hash_xmit_cpu(hdev, rctx->buffer, bufcnt, 0);
+		bufcnt = state->bufcnt;
+		state->bufcnt = 0;
+		err = stm32_hash_xmit_cpu(hdev, state->buffer, bufcnt, 0);
 	}

 	stm32_hash_append_sg(rctx);

 	if (final) {
-		bufcnt = rctx->bufcnt;
-		rctx->bufcnt = 0;
-		err = stm32_hash_xmit_cpu(hdev, rctx->buffer, bufcnt, 1);
+		bufcnt = state->bufcnt;
+		state->bufcnt = 0;
+		err = stm32_hash_xmit_cpu(hdev, state->buffer, bufcnt, 1);

 		/* If we have an IRQ, wait for that, else poll for completion */
 		if (hdev->polled) {
@@ -441,8 +446,20 @@ static int stm32_hash_update_cpu(struct stm32_hash_dev *hdev)
 			hdev->flags |= HASH_FLAGS_OUTPUT_READY;
 			err = 0;
 		}
+	} else {
+		u32 *preg = state->hw_context;
+		int i;
+
+		if (!hdev->pdata->ux500)
+			*preg++ = stm32_hash_read(hdev, HASH_IMR);
+		*preg++ = stm32_hash_read(hdev, HASH_STR);
+		*preg++ = stm32_hash_read(hdev, HASH_CR);
+		for (i = 0; i < HASH_CSR_REGISTER_NUMBER; i++)
+			*preg++ = stm32_hash_read(hdev, HASH_CSR(i));
 	}

+	rctx->flags |= HASH_FLAGS_INIT;
+
 	return err;
 }

@@ -584,10 +601,10 @@ static int stm32_hash_dma_init(struct stm32_hash_dev *hdev)
 static int stm32_hash_dma_send(struct stm32_hash_dev *hdev)
 {
 	struct stm32_hash_request_ctx *rctx = ahash_request_ctx(hdev->req);
+	u32 *buffer = (void *)rctx->state.buffer;
 	struct scatterlist sg[1], *tsg;
 	int err = 0, len = 0, reg, ncp = 0;
 	unsigned int i;
-	u32 *buffer = (void *)rctx->buffer;

 	rctx->sg = hdev->req->src;
 	rctx->total = hdev->req->nbytes;
@@ -615,7 +632,7 @@ static int stm32_hash_dma_send(struct stm32_hash_dev *hdev)

 				ncp = sg_pcopy_to_buffer(
 					rctx->sg, rctx->nents,
-					rctx->buffer, sg->length - len,
+					rctx->state.buffer, sg->length - len,
 					rctx->total - sg->length + len);

 				sg->length = len;
@@ -671,6 +688,8 @@ static int stm32_hash_dma_send(struct stm32_hash_dev *hdev)
 		err = stm32_hash_hmac_dma_send(hdev);
 	}

+	rctx->flags |= HASH_FLAGS_INIT;
+
 	return err;
 }

@@ -749,14 +768,12 @@ static int stm32_hash_init(struct ahash_request *req)
 		return -EINVAL;
 	}

-	rctx->bufcnt = 0;
-	rctx->buflen = HASH_BUFLEN;
+	rctx->state.bufcnt = 0;
+	rctx->state.buflen = HASH_BUFLEN;
 	rctx->total = 0;
 	rctx->offset = 0;
 	rctx->data_type = HASH_DATA_8_BITS;

-	memset(rctx->buffer, 0, HASH_BUFLEN);
-
 	if (ctx->flags & HASH_FLAGS_HMAC)
 		rctx->flags |= HASH_FLAGS_HMAC;

@@ -774,15 +791,16 @@ static int stm32_hash_final_req(struct stm32_hash_dev *hdev)
 {
 	struct ahash_request *req = hdev->req;
 	struct stm32_hash_request_ctx *rctx = ahash_request_ctx(req);
+	struct stm32_hash_state *state = &rctx->state;
+	int buflen = state->bufcnt;
 	int err;
-	int buflen = rctx->bufcnt;

-	rctx->bufcnt = 0;
+	state->bufcnt = 0;

 	if (!(rctx->flags & HASH_FLAGS_CPU))
 		err = stm32_hash_dma_send(hdev);
 	else
-		err = stm32_hash_xmit_cpu(hdev, rctx->buffer, buflen, 1);
+		err = stm32_hash_xmit_cpu(hdev, state->buffer, buflen, 1);

 	/* If we have an IRQ, wait for that, else poll for completion */
 	if (hdev->polled) {
@@ -832,7 +850,7 @@ static void stm32_hash_copy_hash(struct ahash_request *req)
 	__be32 *hash = (void *)rctx->digest;
 	unsigned int i, hashsize;

-	if (hdev->pdata->broken_emptymsg && !req->nbytes)
+	if (hdev->pdata->broken_emptymsg && !(rctx->flags & HASH_FLAGS_INIT))
 		return stm32_hash_emptymsg_fallback(req);

 	switch (rctx->flags & HASH_FLAGS_ALGO_MASK) {
@@ -882,11 +900,6 @@ static void stm32_hash_finish_req(struct ahash_request *req, int err)
 	if (!err && (HASH_FLAGS_FINAL & hdev->flags)) {
 		stm32_hash_copy_hash(req);
 		err = stm32_hash_finish(req);
-		hdev->flags &= ~(HASH_FLAGS_FINAL | HASH_FLAGS_CPU |
-				 HASH_FLAGS_INIT | HASH_FLAGS_DMA_READY |
-				 HASH_FLAGS_OUTPUT_READY | HASH_FLAGS_HMAC |
-				 HASH_FLAGS_HMAC_INIT | HASH_FLAGS_HMAC_FINAL |
-				 HASH_FLAGS_HMAC_KEY);
 	} else {
 		rctx->flags |= HASH_FLAGS_ERRORS;
 	}
@@ -897,67 +910,61 @@ static void stm32_hash_finish_req(struct ahash_request *req, int err)
 	crypto_finalize_hash_request(hdev->engine, req, err);
 }

-static int stm32_hash_hw_init(struct stm32_hash_dev *hdev,
+static void stm32_hash_hw_init(struct stm32_hash_dev *hdev,
 			      struct stm32_hash_request_ctx *rctx)
 {
 	pm_runtime_get_sync(hdev->dev);
-
-	if (!(HASH_FLAGS_INIT & hdev->flags)) {
-		stm32_hash_write(hdev, HASH_CR, HASH_CR_INIT);
-		stm32_hash_write(hdev, HASH_STR, 0);
-		stm32_hash_write(hdev, HASH_DIN, 0);
-		stm32_hash_write(hdev, HASH_IMR, 0);
-		hdev->err = 0;
-	}
-
-	return 0;
 }

-static int stm32_hash_one_request(struct crypto_engine *engine, void *areq);
-static int stm32_hash_prepare_req(struct crypto_engine *engine, void *areq);
-
 static int stm32_hash_handle_queue(struct stm32_hash_dev *hdev,
 				   struct ahash_request *req)
 {
 	return crypto_transfer_hash_request_to_engine(hdev->engine, req);
 }

-static int stm32_hash_prepare_req(struct crypto_engine *engine, void *areq)
+static int stm32_hash_one_request(struct crypto_engine *engine, void *areq)
 {
 	struct ahash_request *req = container_of(areq, struct ahash_request,
 						 base);
 	struct stm32_hash_ctx *ctx = crypto_ahash_ctx(crypto_ahash_reqtfm(req));
 	struct stm32_hash_dev *hdev = stm32_hash_find_dev(ctx);
 	struct stm32_hash_request_ctx *rctx;
+	int err = 0;

 	if (!hdev)
 		return -ENODEV;

+	dev_dbg(hdev->dev, "processing new req, op: %lu, nbytes %d\n",
+		rctx->op, req->nbytes);
+
+	stm32_hash_hw_init(hdev, rctx);
+
 	hdev->req = req;
+	hdev->flags = 0;

 	rctx = ahash_request_ctx(req);

-	dev_dbg(hdev->dev, "processing new req, op: %lu, nbytes %d\n",
-		rctx->op, req->nbytes);
+	if (rctx->flags & HASH_FLAGS_INIT) {
+		u32 *preg = rctx->state.hw_context;
+		u32 reg;
+		int i;

-	return stm32_hash_hw_init(hdev, rctx);
-}
-
-static int stm32_hash_one_request(struct crypto_engine *engine, void *areq)
-{
-	struct ahash_request *req = container_of(areq, struct ahash_request,
-						 base);
-	struct stm32_hash_ctx *ctx = crypto_ahash_ctx(crypto_ahash_reqtfm(req));
-	struct stm32_hash_dev *hdev = stm32_hash_find_dev(ctx);
-	struct stm32_hash_request_ctx *rctx;
-	int err = 0;
+		if (!hdev->pdata->ux500)
+			stm32_hash_write(hdev, HASH_IMR, *preg++);
+		stm32_hash_write(hdev, HASH_STR, *preg++);
+		stm32_hash_write(hdev, HASH_CR, *preg);
+		reg = *preg++ | HASH_CR_INIT;
+		stm32_hash_write(hdev, HASH_CR, reg);

-	if (!hdev)
-		return -ENODEV;
+		for (i = 0; i < HASH_CSR_REGISTER_NUMBER; i++)
+			stm32_hash_write(hdev, HASH_CSR(i), *preg++);

-	hdev->req = req;
+		hdev->flags |= HASH_FLAGS_INIT;

-	rctx = ahash_request_ctx(req);
+		if (rctx->flags & HASH_FLAGS_HMAC)
+			hdev->flags |= HASH_FLAGS_HMAC |
+				       HASH_FLAGS_HMAC_KEY;
+	}

 	if (rctx->op == HASH_OP_UPDATE)
 		err = stm32_hash_update_req(hdev);
@@ -985,6 +992,7 @@ static int stm32_hash_enqueue(struct ahash_request *req, unsigned int op)
 static int stm32_hash_update(struct ahash_request *req)
 {
 	struct stm32_hash_request_ctx *rctx = ahash_request_ctx(req);
+	struct stm32_hash_state *state = &rctx->state;

 	if (!req->nbytes || !(rctx->flags & HASH_FLAGS_CPU))
 		return 0;
@@ -993,7 +1001,7 @@ static int stm32_hash_update(struct ahash_request *req)
 	rctx->sg = req->src;
 	rctx->offset = 0;

-	if ((rctx->bufcnt + rctx->total < rctx->buflen)) {
+	if ((state->bufcnt + rctx->total < state->buflen)) {
 		stm32_hash_append_sg(rctx);
 		return 0;
 	}
@@ -1044,35 +1052,13 @@ static int stm32_hash_digest(struct ahash_request *req)
 static int stm32_hash_export(struct ahash_request *req, void *out)
 {
 	struct stm32_hash_request_ctx *rctx = ahash_request_ctx(req);
-	struct stm32_hash_ctx *ctx = crypto_ahash_ctx(crypto_ahash_reqtfm(req));
-	struct stm32_hash_dev *hdev = stm32_hash_find_dev(ctx);
-	u32 *preg;
-	unsigned int i;
-	int ret;
+	bool empty = !(rctx->flags & HASH_FLAGS_INIT);
+	u8 *p = out;

-	pm_runtime_get_sync(hdev->dev);
-
-	ret = stm32_hash_wait_busy(hdev);
-	if (ret)
-		return ret;
-
-	rctx->hw_context = kmalloc_array(3 + HASH_CSR_REGISTER_NUMBER,
-					 sizeof(u32),
-					 GFP_KERNEL);
+	*(u8 *)p = empty;

-	preg = rctx->hw_context;
-
-	if (!hdev->pdata->ux500)
-		*preg++ = stm32_hash_read(hdev, HASH_IMR);
-	*preg++ = stm32_hash_read(hdev, HASH_STR);
-	*preg++ = stm32_hash_read(hdev, HASH_CR);
-	for (i = 0; i < HASH_CSR_REGISTER_NUMBER; i++)
-		*preg++ = stm32_hash_read(hdev, HASH_CSR(i));
-
-	pm_runtime_mark_last_busy(hdev->dev);
-	pm_runtime_put_autosuspend(hdev->dev);
-
-	memcpy(out, rctx, sizeof(*rctx));
+	if (!empty)
+		memcpy(p + 1, &rctx->state, sizeof(rctx->state));

 	return 0;
 }
@@ -1080,32 +1066,14 @@ static int stm32_hash_export(struct ahash_request *req, void *out)
 static int stm32_hash_import(struct ahash_request *req, const void *in)
 {
 	struct stm32_hash_request_ctx *rctx = ahash_request_ctx(req);
-	struct stm32_hash_ctx *ctx = crypto_ahash_ctx(crypto_ahash_reqtfm(req));
-	struct stm32_hash_dev *hdev = stm32_hash_find_dev(ctx);
-	const u32 *preg = in;
-	u32 reg;
-	unsigned int i;
-
-	memcpy(rctx, in, sizeof(*rctx));
+	const u8 *p = in;

-	preg = rctx->hw_context;
-
-	pm_runtime_get_sync(hdev->dev);
+	stm32_hash_init(req);

-	if (!hdev->pdata->ux500)
-		stm32_hash_write(hdev, HASH_IMR, *preg++);
-	stm32_hash_write(hdev, HASH_STR, *preg++);
-	stm32_hash_write(hdev, HASH_CR, *preg);
-	reg = *preg++ | HASH_CR_INIT;
-	stm32_hash_write(hdev, HASH_CR, reg);
-
-	for (i = 0; i < HASH_CSR_REGISTER_NUMBER; i++)
-		stm32_hash_write(hdev, HASH_CSR(i), *preg++);
-
-	pm_runtime_mark_last_busy(hdev->dev);
-	pm_runtime_put_autosuspend(hdev->dev);
-
-	kfree(rctx->hw_context);
+	if (!*(u8 *)p) {
+		rctx->flags |= HASH_FLAGS_INIT;
+		memcpy(&rctx->state, p + 1, sizeof(rctx->state));
+	}

 	return 0;
 }
@@ -1162,8 +1130,6 @@ static int stm32_hash_cra_init_algs(struct crypto_tfm *tfm,
 		ctx->flags |= HASH_FLAGS_HMAC;

 	ctx->enginectx.op.do_one_request = stm32_hash_one_request;
-	ctx->enginectx.op.prepare_request = stm32_hash_prepare_req;
-	ctx->enginectx.op.unprepare_request = NULL;

 	return stm32_hash_init_fallback(tfm);
 }
@@ -1255,7 +1221,7 @@ static struct ahash_alg algs_md5[] = {
 		.import = stm32_hash_import,
 		.halg = {
 			.digestsize = MD5_DIGEST_SIZE,
-			.statesize = sizeof(struct stm32_hash_request_ctx),
+			.statesize = sizeof(struct stm32_hash_state) + 1,
 			.base = {
 				.cra_name = "md5",
 				.cra_driver_name = "stm32-md5",
@@ -1282,7 +1248,7 @@ static struct ahash_alg algs_md5[] = {
 		.setkey = stm32_hash_setkey,
 		.halg = {
 			.digestsize = MD5_DIGEST_SIZE,
-			.statesize = sizeof(struct stm32_hash_request_ctx),
+			.statesize = sizeof(struct stm32_hash_state) + 1,
 			.base = {
 				.cra_name = "hmac(md5)",
 				.cra_driver_name = "stm32-hmac-md5",
@@ -1311,7 +1277,7 @@ static struct ahash_alg algs_sha1[] = {
 		.import = stm32_hash_import,
 		.halg = {
 			.digestsize = SHA1_DIGEST_SIZE,
-			.statesize = sizeof(struct stm32_hash_request_ctx),
+			.statesize = sizeof(struct stm32_hash_state) + 1,
 			.base = {
 				.cra_name = "sha1",
 				.cra_driver_name = "stm32-sha1",
@@ -1338,7 +1304,7 @@ static struct ahash_alg algs_sha1[] = {
 		.setkey = stm32_hash_setkey,
 		.halg = {
 			.digestsize = SHA1_DIGEST_SIZE,
-			.statesize = sizeof(struct stm32_hash_request_ctx),
+			.statesize = sizeof(struct stm32_hash_state) + 1,
 			.base = {
 				.cra_name = "hmac(sha1)",
 				.cra_driver_name = "stm32-hmac-sha1",
@@ -1367,7 +1333,7 @@ static struct ahash_alg algs_sha224[] = {
 		.import = stm32_hash_import,
 		.halg = {
 			.digestsize = SHA224_DIGEST_SIZE,
-			.statesize = sizeof(struct stm32_hash_request_ctx),
+			.statesize = sizeof(struct stm32_hash_state) + 1,
 			.base = {
 				.cra_name = "sha224",
 				.cra_driver_name = "stm32-sha224",
@@ -1394,7 +1360,7 @@ static struct ahash_alg algs_sha224[] = {
 		.import = stm32_hash_import,
 		.halg = {
 			.digestsize = SHA224_DIGEST_SIZE,
-			.statesize = sizeof(struct stm32_hash_request_ctx),
+			.statesize = sizeof(struct stm32_hash_state) + 1,
 			.base = {
 				.cra_name = "hmac(sha224)",
 				.cra_driver_name = "stm32-hmac-sha224",
@@ -1423,7 +1389,7 @@ static struct ahash_alg algs_sha256[] = {
 		.import = stm32_hash_import,
 		.halg = {
 			.digestsize = SHA256_DIGEST_SIZE,
-			.statesize = sizeof(struct stm32_hash_request_ctx),
+			.statesize = sizeof(struct stm32_hash_state) + 1,
 			.base = {
 				.cra_name = "sha256",
 				.cra_driver_name = "stm32-sha256",
@@ -1450,7 +1416,7 @@ static struct ahash_alg algs_sha256[] = {
 		.setkey = stm32_hash_setkey,
 		.halg = {
 			.digestsize = SHA256_DIGEST_SIZE,
-			.statesize = sizeof(struct stm32_hash_request_ctx),
+			.statesize = sizeof(struct stm32_hash_state) + 1,
 			.base = {
 				.cra_name = "hmac(sha256)",
 				.cra_driver_name = "stm32-hmac-sha256",

v4 fixes hmac to not reload the key over and over again causing
the hash state to be corrupted.

---8<---
The Crypto API hashing paradigm requires the hardware state to
be exported between *each* request because multiple unrelated
hashes may be processed concurrently.

The stm32 hardware is capable of producing the hardware hashing
state but it was only doing it in the export function. This is
not only broken for export as you can't export a kernel pointer
and reimport it, but it also means that concurrent hashing was
fundamentally broken.

Fix this by moving the saving and restoring of hardware hash
state between each and every hashing request.

Also change the emptymsg check in stm32_hash_copy_hash to rely
on whether we have any existing hash state, rather than whether
this particular update request is empty.

Fixes: 8a1012d3f2ab ("crypto: stm32 - Support for STM32 HASH module")
Reported-by: Li kunyu <kunyu@nfschina.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
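
The point about not exporting a kernel pointer can be illustrated with a userspace sketch of the layout the patch moves to: the register snapshot is stored by value inside the state, and the exported blob is one flag byte followed by that state. Everything here is an assumption made for illustration: the `toy_` names, `TOY_BUFLEN` and `TOY_NREGS` are hypothetical, the sizes are not the hardware's, and `started` merely plays the role that HASH_FLAGS_INIT has in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUFLEN 8
#define TOY_NREGS 4	/* illustrative; the real register count differs */

/* State held by value, so a byte copy of it is self-contained. */
struct toy_state {
	uint16_t bufcnt;
	uint8_t buffer[TOY_BUFLEN];
	uint32_t hw_context[TOY_NREGS];	/* register values, not a pointer */
};

struct toy_req {
	int started;	/* nonzero once any hardware state exists */
	struct toy_state state;
};

/* Exported blob: one "empty" flag byte followed by the state. */
#define TOY_STATESIZE (sizeof(struct toy_state) + 1)

void toy_export(const struct toy_req *req, uint8_t *out)
{
	out[0] = !req->started;		/* 1 means "no hardware state yet" */
	if (req->started)
		memcpy(out + 1, &req->state, sizeof(req->state));
}

void toy_import(struct toy_req *req, const uint8_t *in)
{
	memset(req, 0, sizeof(*req));	/* start from a freshly initialised request */
	if (!in[0]) {
		req->started = 1;
		memcpy(&req->state, in + 1, sizeof(req->state));
	}
}
```

Because the blob contains only values, the request it was exported from can be freed or reused before `toy_import()` runs on a different request, which a blob holding a heap pointer would not survive.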