Message ID | 20190510111547.15310-1-jthumshirn@suse.de (mailing list archive) |
---|---|
Series | Add support for SHA-256 checksums |
On Fri, May 10, 2019 at 01:15:30PM +0200, Johannes Thumshirn wrote:
> This patchset adds support for adding new checksum types in BTRFS.
>
> Currently BTRFS only supports CRC32C as data and metadata checksum, which is
> good if you only want to detect errors due to data corruption in hardware.
>
> But CRC32C isn't able to cover other use-cases like de-duplication or
> cryptographically safe data integrity guarantees.
>
> The following properties made SHA-256 interesting for these use-cases:
> - Still considered cryptographically sound
> - Reasonably well understood by the security industry
> - Result fits into the 32 bytes/256 bits we have for the checksum in the
>   on-disk format
> - Low enough collision probability to make it feasible for data
>   de-duplication
> - Fast enough to calculate and offloadable to crypto hardware via the kernel's
>   crypto_shash framework.
>
> The patchset also provides mechanisms for plumbing in different hash
> algorithms relatively easily.

Once the code is ready for more checksum algos, we'll pick candidates and
my idea is to select 1 fast (not necessarily strong, but better than crc32c)
and 1 strong (but slow, and sha256 is the candidate at the moment).

The discussion from 2014 on that topic brought a lot of useful information,
though some algos could have evolved since.

https://lore.kernel.org/linux-btrfs/1416806586-18050-1-git-send-email-bo.li.liu@oracle.com/

In about 5 years' time we can revisit the algos and potentially add more,
so I hope we'll be able to agree to add just 2 in this round.

The minimum selection criteria for a digest algorithm:

- is provided by the Linux kernel crypto subsystem
- has a license that allows its use in bootloader code (GRUB at least)
- the implementation is available for btrfs-progs, either as a small
  library or as a .c file that can be used directly
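The size constraint from the cover letter (a digest has to fit the 32-byte checksum field) combined with the idea of pluggable checksum types can be sketched in user space. This is only an illustration in Python, not the patchset's kernel code: the registry `CSUM_TYPES`, the helper `checksum()`, the zero padding and the byte order are all assumptions made for the example, and `zlib.crc32` is plain CRC-32, used here as a stand-in because the standard library has no CRC32C.

```python
import hashlib
import zlib

# Size of the checksum field in the on-disk format, per the cover letter.
CSUM_FIELD_BYTES = 32


def _crc32(data: bytes) -> bytes:
    # Stand-in only: zlib implements CRC-32 (IEEE), not the CRC32C btrfs uses.
    # The little-endian encoding is an assumption for this sketch.
    return zlib.crc32(data).to_bytes(4, "little")


# Hypothetical registry: checksum name -> function returning the raw digest.
CSUM_TYPES = {
    "crc32": _crc32,
    "sha256": lambda data: hashlib.sha256(data).digest(),
    "blake2b-256": lambda data: hashlib.blake2b(data, digest_size=32).digest(),
}


def checksum(name: str, data: bytes) -> bytes:
    """Compute a checksum and zero-pad it to the fixed field size."""
    digest = CSUM_TYPES[name](data)
    if len(digest) > CSUM_FIELD_BYTES:
        raise ValueError(f"{name} digest does not fit in {CSUM_FIELD_BYTES} bytes")
    return digest.ljust(CSUM_FIELD_BYTES, b"\0")


if __name__ == "__main__":
    block = b"\0" * 4096
    for name in CSUM_TYPES:
        print(name, checksum(name, block).hex())
```

Adding a candidate then amounts to one more registry entry, provided its digest is at most 32 bytes; a 512-bit digest would be rejected by the size check.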
> -----Original Message-----
> From: linux-btrfs-owner@vger.kernel.org <linux-btrfs-owner@vger.kernel.org>
> On Behalf Of David Sterba
> Sent: Thursday, 16 May 2019 3:27 AM
> To: Johannes Thumshirn <jthumshirn@suse.de>
> Cc: David Sterba <dsterba@suse.com>; Linux BTRFS Mailinglist
> <linux-btrfs@vger.kernel.org>
> Subject: Re: [PATCH 00/17] Add support for SHA-256 checksums
>
> Once the code is ready for more checksum algos, we'll pick candidates and
> my idea is to select 1 fast (not necessarily strong, but better than crc32c)
> and 1 strong (but slow, and sha256 is the candidate at the moment).
>
> The discussion from 2014 on that topic brought a lot of useful information,
> though some algos could have evolved since.
>
> https://lore.kernel.org/linux-btrfs/1416806586-18050-1-git-send-email-bo.li.liu@oracle.com/
>
> In about 5 years' time we can revisit the algos and potentially add more,
> so I hope we'll be able to agree to add just 2 in this round.
>
> The minimum selection criteria for a digest algorithm:
>
> - is provided by the Linux kernel crypto subsystem
> - has a license that allows its use in bootloader code (GRUB at least)
> - the implementation is available for btrfs-progs, either as a small
>   library or as a .c file that can be used directly

Xxhash would be a good candidate. It's extremely fast and almost crypto
secure. Has been in the kernel for ~2 years iirc.

Paul.
On 16.05.19 at 9:30, Paul Jones wrote:
>
>> -----Original Message-----
>> From: linux-btrfs-owner@vger.kernel.org <linux-btrfs-owner@vger.kernel.org>
>> On Behalf Of David Sterba
>> Sent: Thursday, 16 May 2019 3:27 AM
>> To: Johannes Thumshirn <jthumshirn@suse.de>
>> Cc: David Sterba <dsterba@suse.com>; Linux BTRFS Mailinglist
>> <linux-btrfs@vger.kernel.org>
>> Subject: Re: [PATCH 00/17] Add support for SHA-256 checksums
>>
>> Once the code is ready for more checksum algos, we'll pick candidates and
>> my idea is to select 1 fast (not necessarily strong, but better than crc32c)
>> and 1 strong (but slow, and sha256 is the candidate at the moment).
>>
>> The discussion from 2014 on that topic brought a lot of useful information,
>> though some algos could have evolved since.
>>
>> https://lore.kernel.org/linux-btrfs/1416806586-18050-1-git-send-email-bo.li.liu@oracle.com/
>>
>> In about 5 years' time we can revisit the algos and potentially add more,
>> so I hope we'll be able to agree to add just 2 in this round.
>>
>> The minimum selection criteria for a digest algorithm:
>>
>> - is provided by the Linux kernel crypto subsystem
>> - has a license that allows its use in bootloader code (GRUB at least)
>> - the implementation is available for btrfs-progs, either as a small
>>   library or as a .c file that can be used directly
>
> Xxhash would be a good candidate. It's extremely fast and almost crypto
> secure. Has been in the kernel for ~2 years iirc.

Disclaimer: not a cryptographer. But according to the official site,
xxHash is a non-cryptographic hash. From the (draft) spec:

  It is labelled non-cryptographic, and is not meant to avoid intentional
  collisions (same digest for 2 different messages), or to prevent
  producing a message with predefined digest.

This doesn't disqualify it; however, we need to be aware of its limitations.
Perhaps it could be used as a replacement for crc32c, but definitely not
as a secure crypto hash.

>
> Paul.
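What "not meant to avoid intentional collisions" means in practice can be illustrated with the checksum btrfs already uses. The sketch below is an illustration only and says nothing specific about xxHash: it uses Python's `zlib.crc32` (CRC-32, assumed here as a stand-in for CRC32C, which has the same algebraic structure) and the hypothetical helpers `xor3`, `crc32_is_affine` and `sha256_is_affine`. For equal-length messages, the CRC of `a ^ b ^ c` equals the XOR of the three individual CRCs, so anyone can predict how a chosen change to the data changes the checksum. SHA-256 has no such structure.

```python
import hashlib
import os
import zlib


def xor3(a: bytes, b: bytes, c: bytes) -> bytes:
    """Bytewise XOR of three equal-length byte strings."""
    return bytes(x ^ y ^ z for x, y, z in zip(a, b, c))


def crc32_is_affine(a: bytes, b: bytes, c: bytes) -> bool:
    """Check crc(a ^ b ^ c) == crc(a) ^ crc(b) ^ crc(c) for equal lengths."""
    combined = zlib.crc32(xor3(a, b, c))
    return combined == zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)


def sha256_is_affine(a: bytes, b: bytes, c: bytes) -> bool:
    """Same check for SHA-256; a cryptographic hash should not pass it."""
    def digest(msg: bytes) -> bytes:
        return hashlib.sha256(msg).digest()

    return digest(xor3(a, b, c)) == xor3(digest(a), digest(b), digest(c))


if __name__ == "__main__":
    a, b, c = (os.urandom(4096) for _ in range(3))
    print("crc32 relation holds: ", crc32_is_affine(a, b, c))
    print("sha256 relation holds:", sha256_is_affine(a, b, c))
```

This predictability is harmless for detecting random hardware corruption, which is the use-case CRC32C covers today, but it is exactly what rules such a function out for the integrity guarantees discussed in this thread.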
On Thu, May 16, 2019 at 11:16:38AM +0300, Nikolay Borisov wrote:
> It is labelled non-cryptographic, and is not meant to avoid intentional
> collisions (same digest for 2 different messages), or to prevent
> producing a message with predefined digest.
>
> This doesn't disqualify it; however, we need to be aware of its limitations.
> Perhaps it could be used as a replacement for crc32c, but definitely not
> as a secure crypto hash.

Agreed, but David's plan was to have 3 hashes, and xxhash seems like a good
fit for the third one: a fast option that is stronger than crc32c but not
cryptographically secure.

I'll be looking into it for v3.
On Wednesday, 15 May 2019 19:27:21 (CEST), David Sterba wrote:
> Once the code is ready for more checksum algos, we'll pick candidates
> and my idea is to select 1 fast (not necessarily strong, but better
> than crc32c) and 1 strong (but slow, and sha256 is the candidate at the
> moment)

Modern CPUs have SHA256 instructions, is it actually that slow? (not sure
how fast these instructions are)

If btrfs needs an algorithm with a good performance/security ratio, I would
suggest considering BLAKE2 [1]. It is based on the BLAKE algorithm that made
it to the final round in the SHA3 competition, it is considered pretty secure
(above SHA2 at least), and it was designed to take advantage of modern CPU
features and be as fast as possible - it even beats SHA1 in that regard. It
is not currently in the kernel, but Wireguard uses it and will add an
implementation when it's merged (but Wireguard doesn't use the crypto layer
for some reason...)
On Fri, May 17, 2019 at 08:36:23PM +0200, Diego Calleja wrote:
> Modern CPUs have SHA256 instructions, is it actually that slow? (not sure
> how fast these instructions are)

This still is subject to evaluation.

> If btrfs needs an algorithm with a good performance/security ratio, I would
> suggest considering BLAKE2 [1]. It is based on the BLAKE algorithm that made
> it to the final round in the SHA3 competition, it is considered pretty secure
> (above SHA2 at least), and it was designed to take advantage of modern CPU
> features and be as fast as possible - it even beats SHA1 in that regard. It
> is not currently in the kernel, but Wireguard uses it and will add an
> implementation when it's merged (but Wireguard doesn't use the crypto layer
> for some reason...)

SHA3 is on my list of other candidates to look at for a performance
evaluation. As for BLAKE2, I haven't done too much research on it and I'm
not a cryptographer, so I have to trust FIPS et al.

One other (non-crypto) hash that is often mentioned is XXHash, which is
in the kernel but not yet wired up to the kernel's crypto framework; this
shouldn't be too hard to do.

Byte,
    Johannes
On Fri, May 17, 2019 at 09:07:03PM +0200, Johannes Thumshirn wrote:
> On Fri, May 17, 2019 at 08:36:23PM +0200, Diego Calleja wrote:
> > If btrfs needs an algorithm with a good performance/security ratio, I would
> > suggest considering BLAKE2 [1]. It is based on the BLAKE algorithm that made
> > it to the final round in the SHA3 competition, it is considered pretty secure
> > (above SHA2 at least), and it was designed to take advantage of modern CPU
> > features and be as fast as possible - it even beats SHA1 in that regard. It
> > is not currently in the kernel, but Wireguard uses it and will add an
> > implementation when it's merged (but Wireguard doesn't use the crypto layer
> > for some reason...)
>
> SHA3 is on my list of other candidates to look at for a performance
> evaluation. As for BLAKE2 I haven't done too much research on it and I'm not a
> cryptographer so I have to trust FIPS et al.

"Trust FIPS" is the main problem here. Until recently, FIPS certification
required implementing this nice random generator:
https://en.wikipedia.org/wiki/Dual_EC_DRBG

Thus, a good number of people are reluctant to use hash functions chosen by
NIST (and published as FIPS).

BLAKE2 is also a good deal faster on most hardware:
https://bench.cr.yp.to/results-sha3.html

Even with sha_ni, SHA256 wins only on Zen AMDs: sha_ni-equipped Intels have
superior SIMD, thus BLAKE2 is still faster. And without sha_ni, the
difference is drastic.

Meow!
On Sat, May 18, 2019 at 02:38:08AM +0200, Adam Borowski wrote:
> On Fri, May 17, 2019 at 09:07:03PM +0200, Johannes Thumshirn wrote:
> > On Fri, May 17, 2019 at 08:36:23PM +0200, Diego Calleja wrote:
> > > If btrfs needs an algorithm with a good performance/security ratio, I would
> > > suggest considering BLAKE2 [1]. It is based on the BLAKE algorithm that made
> > > it to the final round in the SHA3 competition, it is considered pretty secure
> > > (above SHA2 at least), and it was designed to take advantage of modern CPU
> > > features and be as fast as possible - it even beats SHA1 in that regard. It
> > > is not currently in the kernel, but Wireguard uses it and will add an
> > > implementation when it's merged (but Wireguard doesn't use the crypto layer
> > > for some reason...)
> >
> > SHA3 is on my list of other candidates to look at for a performance
> > evaluation. As for BLAKE2 I haven't done too much research on it and I'm not a
> > cryptographer so I have to trust FIPS et al.
>
> "Trust FIPS" is the main problem here. Until recently, FIPS certification
> required implementing this nice random generator:
> https://en.wikipedia.org/wiki/Dual_EC_DRBG
>
> Thus, a good number of people are reluctant to use hash functions chosen by
> NIST (and published as FIPS).

I know, but please also understand that there are applications which do
require FIPS-certified algorithms.

Byte,
    Johannes
On 2019-05-20 03:47, Johannes Thumshirn wrote:
> On Sat, May 18, 2019 at 02:38:08AM +0200, Adam Borowski wrote:
>> On Fri, May 17, 2019 at 09:07:03PM +0200, Johannes Thumshirn wrote:
>>> On Fri, May 17, 2019 at 08:36:23PM +0200, Diego Calleja wrote:
>>>> If btrfs needs an algorithm with a good performance/security ratio, I would
>>>> suggest considering BLAKE2 [1]. It is based on the BLAKE algorithm that made
>>>> it to the final round in the SHA3 competition, it is considered pretty secure
>>>> (above SHA2 at least), and it was designed to take advantage of modern CPU
>>>> features and be as fast as possible - it even beats SHA1 in that regard. It
>>>> is not currently in the kernel, but Wireguard uses it and will add an
>>>> implementation when it's merged (but Wireguard doesn't use the crypto layer
>>>> for some reason...)
>>>
>>> SHA3 is on my list of other candidates to look at for a performance
>>> evaluation. As for BLAKE2 I haven't done too much research on it and I'm not a
>>> cryptographer so I have to trust FIPS et al.
>>
>> "Trust FIPS" is the main problem here. Until recently, FIPS certification
>> required implementing this nice random generator:
>> https://en.wikipedia.org/wiki/Dual_EC_DRBG
>>
>> Thus, a good number of people are reluctant to use hash functions chosen by
>> NIST (and published as FIPS).
>
> I know, but please also understand that there are applications which do
> require FIPS-certified algorithms.

Those would also be cryptographic applications, which BTRFS is not. If
you're in one of those situations and need to have cryptographic
verification of files on the system, you need to be using either IMA,
dm-verity, or dm-integrity.
On 2019-05-17 14:36, Diego Calleja wrote:
> On Wednesday, 15 May 2019 19:27:21 (CEST), David Sterba wrote:
>> Once the code is ready for more checksum algos, we'll pick candidates
>> and my idea is to select 1 fast (not necessarily strong, but better
>> than crc32c) and 1 strong (but slow, and sha256 is the candidate at the
>> moment)
>
> Modern CPUs have SHA256 instructions, is it actually that slow? (not sure
> how fast these instructions are)
>
> If btrfs needs an algorithm with a good performance/security ratio, I would
> suggest considering BLAKE2 [1]. It is based on the BLAKE algorithm that made
> it to the final round in the SHA3 competition, it is considered pretty secure
> (above SHA2 at least), and it was designed to take advantage of modern CPU
> features and be as fast as possible - it even beats SHA1 in that regard. It
> is not currently in the kernel, but Wireguard uses it and will add an
> implementation when it's merged (but Wireguard doesn't use the crypto layer
> for some reason...)

If anything, I'd argue for BLAKE2 instead of SHA256 as the 'slow' hash, as
it's got equivalent or better strength but runs significantly faster.

For the fast hash, we should probably be looking more at stuff like xxhash
or murmur3, both of which make CRC32c look slow by comparison (at least,
when you don't have hardware acceleration for the CRC calculations).
On Mon, May 20, 2019 at 07:34:34AM -0400, Austin S. Hemmelgarn wrote:
> Those would also be cryptographic applications, which BTRFS is not. If
> you're in one of those situations and need to have cryptographic
> verification of files on the system, you need to be using either IMA,
> dm-verity, or dm-integrity.

This is a system we're aiming at in the followups to this series, but we
haven't ultimately validated the design yet.
On Fri, May 17, 2019 at 08:36:23PM +0200, Diego Calleja wrote:
> On Wednesday, 15 May 2019 19:27:21 (CEST), David Sterba wrote:
> > Once the code is ready for more checksum algos, we'll pick candidates
> > and my idea is to select 1 fast (not necessarily strong, but better
> > than crc32c) and 1 strong (but slow, and sha256 is the candidate at the
> > moment)
>
> Modern CPUs have SHA256 instructions, is it actually that slow? (not sure
> how fast these instructions are)
>
> If btrfs needs an algorithm with a good performance/security ratio, I would
> suggest considering BLAKE2 [1]. It is based on the BLAKE algorithm that made
> it to the final round in the SHA3 competition, it is considered pretty secure
> (above SHA2 at least), and it was designed to take advantage of modern CPU
> features and be as fast as possible - it even beats SHA1 in that regard. It
> is not currently in the kernel, but Wireguard uses it and will add an
> implementation when it's merged (but Wireguard doesn't use the crypto layer
> for some reason...)

BLAKE2 looks like a good candidate. I have glue code to export it as a
crypto module, so we'll be able to test it at least.

I'm not sure about SHA3 for performance reasons: it comes out slower than
SHA256, and that one is already considered slow.
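The relative speed claims in this thread (BLAKE2 versus SHA256, SHA3 slower than both) depend heavily on the CPU and the implementation, so they are best measured on the machine in question. A rough user-space sketch follows, assuming Python's `hashlib` and `zlib` as stand-ins for the kernel implementations; the names `CANDIDATES` and `throughput_mib_s` are made up for this example, `zlib.crc32` is CRC-32 rather than CRC32C, and the numbers it prints say nothing about in-kernel performance or about hardware offload.

```python
import hashlib
import time
import zlib

# Candidate functions; each takes a buffer and returns a checksum or digest.
CANDIDATES = {
    "crc32 (zlib, not crc32c)": zlib.crc32,
    "sha256": lambda buf: hashlib.sha256(buf).digest(),
    "sha3-256": lambda buf: hashlib.sha3_256(buf).digest(),
    "blake2b-256": lambda buf: hashlib.blake2b(buf, digest_size=32).digest(),
}


def throughput_mib_s(func, block_size: int = 4096, blocks: int = 2000) -> float:
    """Hash `blocks` buffers of `block_size` bytes and return MiB/s."""
    buf = bytes(range(256)) * (block_size // 256)
    start = time.perf_counter()
    for _ in range(blocks):
        func(buf)
    elapsed = time.perf_counter() - start
    return (len(buf) * blocks) / (1024 * 1024) / elapsed


if __name__ == "__main__":
    for name, func in CANDIDATES.items():
        print(f"{name:26s} {throughput_mib_s(func):10.1f} MiB/s")
```

The 4096-byte default block size is an assumption chosen to resemble a typical filesystem block; per-call overhead in Python is significant at that size, so only the rough ordering of the results is meaningful.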