Message ID | 20211003181413.12465-9-nstange@suse.de (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Herbert Xu |
Headers | show |
Series | crypto: api - priorize tested algorithms in lookup | expand |
On Sun, Oct 03, 2021 at 08:14:13PM +0200, Nicolai Stange wrote: > crypto_alg_mod_lookup() invokes the crypto_larval_lookup() helper > to run the actual search for matching crypto_alg implementation and > larval entries. The latter is currently considering only the individual > entries' relative ->cra_priority for determining which one out of multiple > matches to return. This means that it would potentially dismiss a matching > crypto_alg implementation in working state in favor of some pending > testing larval of higher ->cra_priority. Now, if the testmgr instance > invoked asynchronously on that testing larval came to the conclusion that > it should mark the tests as failed, any pending crypto_alg_mod_lookup() > waiting for it would be made to fail as well with -EAGAIN. > > In summary, crypto_alg_mod_lookup() can fail spuriously with -EAGAIN even > though an implementation in working state would have been available, namely > if the testmgr asynchronously marked another, competing implementation of > higher ->cra_priority as failed. > > This is normally not a problem at all with upstream, because the situation > where one algorithm passed its tests, but another competing one failed to > do so, would indicate a bug anyway. > > However, for downstream distributions seeking FIPS certification, simply > amending the list in crypto/testmgr.c with ->fips_allowed = 0 entries > matching on ->cra_driver_name would provide a convenient way of > selectively blacklisting implementations from drivers/crypto in fips > mode. Note that in this scenario failure of competing crypto_alg > implementations would become more common, in particular during device > enumeration. If the algorithm in question happened to be needed for e.g. > module signature verification, module loading could spuriously fail during > bootup, which is certainly not desired. > > For transparency: this has not actually been observed, I merely came to > the conclusion that it would be possible by reading the code. > > Make crypto_alg_lookup() run an additional search for non-larval matches > upfront in the common case that the request has been made for > CRYPTO_ALG_TESTED instances. > > Signed-off-by: Nicolai Stange <nstange@suse.de> > --- > crypto/api.c | 21 ++++++++++++++++++++- > 1 file changed, 20 insertions(+), 1 deletion(-) It's not clear that this new behaviour is desirable. For example, when we construct certain complex algorithms, they may depend on a generic version of that same algorithm as a fallback. We do not want users to get the generic version while the better version is being tested. Can you please explain what your failure scenario and perhaps we can come up with another way of resolving your problem? Thanks,
Herbert Xu <herbert@gondor.apana.org.au> writes: > On Sun, Oct 03, 2021 at 08:14:13PM +0200, Nicolai Stange wrote: >> crypto_alg_mod_lookup() invokes the crypto_larval_lookup() helper >> to run the actual search for matching crypto_alg implementation and >> larval entries. The latter is currently considering only the individual >> entries' relative ->cra_priority for determining which one out of multiple >> matches to return. This means that it would potentially dismiss a matching >> crypto_alg implementation in working state in favor of some pending >> testing larval of higher ->cra_priority. Now, if the testmgr instance >> invoked asynchronously on that testing larval came to the conclusion that >> it should mark the tests as failed, any pending crypto_alg_mod_lookup() >> waiting for it would be made to fail as well with -EAGAIN. >> >> In summary, crypto_alg_mod_lookup() can fail spuriously with -EAGAIN even >> though an implementation in working state would have been available, namely >> if the testmgr asynchronously marked another, competing implementation of >> higher ->cra_priority as failed. >> >> This is normally not a problem at all with upstream, because the situation >> where one algorithm passed its tests, but another competing one failed to >> do so, would indicate a bug anyway. >> >> However, for downstream distributions seeking FIPS certification, simply >> amending the list in crypto/testmgr.c with ->fips_allowed = 0 entries >> matching on ->cra_driver_name would provide a convenient way of >> selectively blacklisting implementations from drivers/crypto in fips >> mode. Note that in this scenario failure of competing crypto_alg >> implementations would become more common, in particular during device >> enumeration. If the algorithm in question happened to be needed for e.g. >> module signature verification, module loading could spuriously fail during >> bootup, which is certainly not desired. >> >> For transparency: this has not actually been observed, I merely came to >> the conclusion that it would be possible by reading the code. >> >> Make crypto_alg_lookup() run an additional search for non-larval matches >> upfront in the common case that the request has been made for >> CRYPTO_ALG_TESTED instances. >> >> Signed-off-by: Nicolai Stange <nstange@suse.de> >> --- >> crypto/api.c | 21 ++++++++++++++++++++- >> 1 file changed, 20 insertions(+), 1 deletion(-) Hi Herbert, > It's not clear that this new behaviour is desirable. For example, > when we construct certain complex algorithms, they may depend on a > generic version of that same algorithm as a fallback. We do not > want users to get the generic version while the better version is > being tested. Ah I see, you mean something like having 3+ providers of "algXY", - algXY_driver0, ->cra_priority = 100 - algXY_driver1, ->cra_priority = 200 - algXY_driver1, ->cra_priority = 300 where the latter needs a different "algXY" as a fallback? Hmm yes, I haven't thought of this and my patch here would indeed break it. > Can you please explain what your failure scenario and perhaps we > can come up with another way of resolving your problem? In order to keep a FIPS certification manageable in terms of scope, we're looking for a way to disable everything under drivers/crypto iff fips_enabled == 1. The most convenient way to achieve this downstream would be to add dummy entries to testmgr.c like so: static int alg_test_nop(const struct alg_test_desc *desc, const char *driver, u32 type, u32 mask) { /* Succeed in non-FIPS mode. */ return 0; } static const struct alg_test_desc alg_test_descs[] = { ..., { .alg = "sha256-padlock-nano", .test = alg_test_nop, .fips_allowed = 0, }, ... }; The concern is about e.g the following sequence of events during boot: - "sha256-padlock-nano" gets registered, the test gets scheduled. - An unrelated modprobe is getting invoked. The signature verification code, e.g pkcs7_digest(), requests "sha256", finds the pending "sha256-padlock-nano" testing larval and puts itself in a wait for it. - The scheduled "sha256-padlock-nano" test gets to run and, as per ->fips_allowed == 0, is forced to fail with -EINVAL. - pkcs7_digest() wakes up and fails with -EAGAIN due to the "failed" testing larval. - The unrelated modprobe fails even though sha256-generic would have been available all the time. FWIW, I picked sha256-padlock-nano and modprobe at random for the sake of providing an example here -- I haven't checked in detail, but I guess that e.g. the combination of dm-crypt + a number of different FIPS approved algorithms could be similarly susceptible, too. So to recap, my initial approach for working around the above was to make crypto_alg_lookup() to skip over testing larvals in an additional, first search. As you said, this would break the "fallback" scenario though. As an alternative, how about not doing the additional search for non-larvals upfront, but only as a resort in case crypto_larval_wait() returned failure instead? But granted, the scenario above is not an issue at all for upstream with a pristine testmgr.c and it would be quite relatable if you wouldn't like to get bothered with any of this. I'm only bringing it up because others might perhaps come across this as well... Thanks! Nicolai
On Mon, Oct 11, 2021 at 10:34:11AM +0200, Nicolai Stange wrote: > > In order to keep a FIPS certification manageable in terms of scope, > we're looking for a way to disable everything under drivers/crypto iff > fips_enabled == 1. The most convenient way to achieve this downstream > would be to add dummy entries to testmgr.c like so: How about testing based on the flag CRYPTO_ALG_KERN_DRIVER_ONLY? That flag is meant to correspond to pretty much exactly drivers/crypto. Cheers,
Herbert Xu <herbert@gondor.apana.org.au> writes: > On Mon, Oct 11, 2021 at 10:34:11AM +0200, Nicolai Stange wrote: >> >> In order to keep a FIPS certification manageable in terms of scope, >> we're looking for a way to disable everything under drivers/crypto iff >> fips_enabled == 1. The most convenient way to achieve this downstream >> would be to add dummy entries to testmgr.c like so: > > How about testing based on the flag CRYPTO_ALG_KERN_DRIVER_ONLY? > That flag is meant to correspond to pretty much exactly > drivers/crypto. Hmm, I checked a couple of crypto/drivers and it seems like not all are setting this flag: random examples would include cavium/nitrox/, chelsio/, padlock*.c, vmx/, ... Many thanks! Nicolai
On Wed, Oct 27, 2021 at 11:59:26AM +0200, Nicolai Stange wrote: > > Hmm, I checked a couple of crypto/drivers and it seems like not all are > setting this flag: random examples would include cavium/nitrox/, > chelsio/, padlock*.c, vmx/, ... For padlock and vmx this is correct as they're just like aesni. The others are omissions and should be fixed. Thanks,
diff --git a/crypto/api.c b/crypto/api.c index 594c494a27d9..4251eedef668 100644 --- a/crypto/api.c +++ b/crypto/api.c @@ -239,8 +239,25 @@ static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type, struct crypto_alg *alg; down_read(&crypto_alg_sem); + /* + * If the request is for an algorithm having passed its + * selftests, try to serve it with a matching instance already + * in operational state. That is, skip pending larvals in a + * first search: for these it is not guaranteed that they will + * eventually turn out successful and it would be a pity to + * potentially fail the request even though a working + * implementation would have been available. If OTOH the + * request is *not* for an algorithm having passed its + * selftest (yet), no larval can match it anyway, so the + * CRYPTO_ALG_LARVAL in the mask below won't make a + * difference. + */ + alg = __crypto_alg_lookup(name, type, mask | CRYPTO_ALG_LARVAL); + if (alg || !(type & CRYPTO_ALG_TESTED)) + goto out; + alg = __crypto_alg_lookup(name, type, mask); - if (!alg && (type & CRYPTO_ALG_TESTED)) { + if (!alg) { /* * Check whether there's an instance which failed the * selftests in order to avoid pointless retries. @@ -254,6 +271,8 @@ static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type, alg = ERR_PTR(-ELIBBAD); } } + +out: up_read(&crypto_alg_sem); return alg;
crypto_alg_mod_lookup() invokes the crypto_larval_lookup() helper to run the actual search for matching crypto_alg implementation and larval entries. The latter is currently considering only the individual entries' relative ->cra_priority for determining which one out of multiple matches to return. This means that it would potentially dismiss a matching crypto_alg implementation in working state in favor of some pending testing larval of higher ->cra_priority. Now, if the testmgr instance invoked asynchronously on that testing larval came to the conclusion that it should mark the tests as failed, any pending crypto_alg_mod_lookup() waiting for it would be made to fail as well with -EAGAIN. In summary, crypto_alg_mod_lookup() can fail spuriously with -EAGAIN even though an implementation in working state would have been available, namely if the testmgr asynchronously marked another, competing implementation of higher ->cra_priority as failed. This is normally not a problem at all with upstream, because the situation where one algorithm passed its tests, but another competing one failed to do so, would indicate a bug anyway. However, for downstream distributions seeking FIPS certification, simply amending the list in crypto/testmgr.c with ->fips_allowed = 0 entries matching on ->cra_driver_name would provide a convenient way of selectively blacklisting implementations from drivers/crypto in fips mode. Note that in this scenario failure of competing crypto_alg implementations would become more common, in particular during device enumeration. If the algorithm in question happened to be needed for e.g. module signature verification, module loading could spuriously fail during bootup, which is certainly not desired. For transparency: this has not actually been observed, I merely came to the conclusion that it would be possible by reading the code. Make crypto_alg_lookup() run an additional search for non-larval matches upfront in the common case that the request has been made for CRYPTO_ALG_TESTED instances. Signed-off-by: Nicolai Stange <nstange@suse.de> --- crypto/api.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-)