Message ID | 20240415213719.120673-2-ebiggers@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Optimize dm-verity and fsverity using multibuffer hashing | expand |
Eric Biggers <ebiggers@kernel.org> wrote: > > The new API is part of the "shash" algorithm type, as it does not make > sense in "ahash". It does a "finup" operation rather than a "digest" > operation in order to support the salt that is used by dm-verity and > fs-verity. There is no fallback implementation that does two regular > finups if the underlying algorithm doesn't support finup2x, since users > probably will want to avoid the overhead of queueing up multiple hashes > when multibuffer hashing won't actually be used anyway. For your intended users, will the SIMD fallback ever be invoked? Cheers,
On Fri, Apr 19, 2024 at 06:35:01PM +0800, Herbert Xu wrote: > Eric Biggers <ebiggers@kernel.org> wrote: > > > > The new API is part of the "shash" algorithm type, as it does not make > > sense in "ahash". It does a "finup" operation rather than a "digest" > > operation in order to support the salt that is used by dm-verity and > > fs-verity. There is no fallback implementation that does two regular > > finups if the underlying algorithm doesn't support finup2x, since users > > probably will want to avoid the overhead of queueing up multiple hashes > > when multibuffer hashing won't actually be used anyway. > > For your intended users, will the SIMD fallback ever be invoked? > If you mean the fallback to scalar instructions when !crypto_simd_usable(), by default dm-verity and fs-verity do all hashing in process context, in which case the scalar fallback will never be used. dm-verity does support the 'try_verify_in_tasklet' option which makes hashing sometimes happen in softirq context, and x86 Linux has an edge case where if a softirq comes in while the kernel is in the middle of using SIMD instructions, SIMD instructions can't be used during that softirq. So in theory the !crypto_simd_usable() case could be reached then. Either way, I have the fallback implemented in the x86 and arm64 SHA-256 glue code for consistency with the rest of the crypto_shash API anyway. If you mean falling back to two crypto_shash_finup() when the algorithm doesn't support crypto_shash_finup2x(), my patches to dm-verity and fs-verity do that. Modern x86_64 and arm64 systems will use crypto_shash_finup2x(), but dm-verity and fs-verity need to work on all architectures and on older CPUs too. The alternative would be to put the fallback to two crypto_shash_finup() directly in crypto_shash_finup2x() and have the users call crypto_shash_finup2x() unconditionally (similar to how crypto_shash_digest() can be called even if the underlying shash_alg doesn't implement ->digest()). That would make for slightly simpler code, though it feels a bit awkward to queue up multiple blocks for multibuffer hashing when multibuffer hashing won't actually be used. Let me know if you have a preference about this. - Eric
On Fri, Apr 19, 2024 at 09:30:07AM -0700, Eric Biggers wrote: > > If you mean the fallback to scalar instructions when !crypto_simd_usable(), by > default dm-verity and fs-verity do all hashing in process context, in which case > the scalar fallback will never be used. dm-verity does support the > 'try_verify_in_tasklet' option which makes hashing sometimes happen in softirq > context, and x86 Linux has an edge case where if a softirq comes in while the > kernel is in the middle of using SIMD instructions, SIMD instructions can't be > used during that softirq. So in theory the !crypto_simd_usable() case could be > reached then. Either way, I have the fallback implemented in the x86 and arm64 > SHA-256 glue code for consistency with the rest of the crypto_shash API anyway. OK that's good to hear. So if they enable try_verify_in_tasklet then they will only have themselves to blame :) > If you mean falling back to two crypto_shash_finup() when the algorithm doesn't > support crypto_shash_finup2x(), my patches to dm-verity and fs-verity do that. > Modern x86_64 and arm64 systems will use crypto_shash_finup2x(), but dm-verity > and fs-verity need to work on all architectures and on older CPUs too. The > alternative would be to put the fallback to two crypto_shash_finup() directly in > crypto_shash_finup2x() and have the users call crypto_shash_finup2x() > unconditionally (similar to how crypto_shash_digest() can be called even if the > underlying shash_alg doesn't implement ->digest()). That would make for > slightly simpler code, though it feels a bit awkward to queue up multiple blocks > for multibuffer hashing when multibuffer hashing won't actually be used. Let me > know if you have a preference about this. No I don't think it's necessary for the time being. Thanks,
diff --git a/include/crypto/hash.h b/include/crypto/hash.h index 5d61f576cfc86..3bb1b0b7b1242 100644 --- a/include/crypto/hash.h +++ b/include/crypto/hash.h @@ -198,10 +198,13 @@ struct shash_desc { * @finup: see struct ahash_alg * @digest: see struct ahash_alg * @export: see struct ahash_alg * @import: see struct ahash_alg * @setkey: see struct ahash_alg + * @finup2x: **[optional]** Finish calculating the digests of two equal-length + * messages, interleaving the instructions to potentially achieve + * better performance than hashing each message individually. * @init_tfm: Initialize the cryptographic transformation object. * This function is called only once at the instantiation * time, right after the transformation context was * allocated. In case the cryptographic hardware has * some special requirements which need to be handled @@ -229,10 +232,12 @@ struct shash_alg { unsigned int len, u8 *out); int (*export)(struct shash_desc *desc, void *out); int (*import)(struct shash_desc *desc, const void *in); int (*setkey)(struct crypto_shash *tfm, const u8 *key, unsigned int keylen); + int (*finup2x)(struct shash_desc *desc, const u8 *data1, + const u8 *data2, unsigned int len, u8 *out1, u8 *out2); int (*init_tfm)(struct crypto_shash *tfm); void (*exit_tfm)(struct crypto_shash *tfm); int (*clone_tfm)(struct crypto_shash *dst, struct crypto_shash *src); unsigned int descsize; @@ -771,10 +776,15 @@ static inline unsigned int crypto_shash_digestsize(struct crypto_shash *tfm) static inline unsigned int crypto_shash_statesize(struct crypto_shash *tfm) { return crypto_shash_alg(tfm)->statesize; } +static inline bool crypto_shash_supports_finup2x(struct crypto_shash *tfm) +{ + return crypto_shash_alg(tfm)->finup2x != NULL; +} + static inline u32 crypto_shash_get_flags(struct crypto_shash *tfm) { return crypto_tfm_get_flags(crypto_shash_tfm(tfm)); } @@ -864,10 +874,34 @@ int crypto_shash_digest(struct shash_desc *desc, const u8 *data, * Return: 0 on success; < 0 if an error occurred. */ int crypto_shash_tfm_digest(struct crypto_shash *tfm, const u8 *data, unsigned int len, u8 *out); +/** + * crypto_shash_finup2x() - finish hashing two equal-length messages + * @desc: the hash state that will be forked for the two messages. This + * contains the state after hashing a (possibly-empty) common prefix of + * the two messages. + * @data1: the first message (not including any common prefix from @desc) + * @data2: the second message (not including any common prefix from @desc) + * @len: length of @data1 and @data2 in bytes + * @out1: output buffer for first message digest + * @out2: output buffer for second message digest + * + * Users must check crypto_shash_supports_finup2x(tfm) before calling this. + * + * Context: Any context. + * Return: 0 on success; a negative errno value on failure. + */ +static inline int crypto_shash_finup2x(struct shash_desc *desc, + const u8 *data1, const u8 *data2, + unsigned int len, u8 *out1, u8 *out2) +{ + return crypto_shash_alg(desc->tfm)->finup2x(desc, data1, data2, len, + out1, out2); +} + /** * crypto_shash_export() - extract operational state for message digest * @desc: reference to the operational state handle whose state is exported * @out: output buffer of sufficient size that can hold the hash state *