Message ID | 20190822114029.11225-1-jthumshirn@suse.de (mailing list archive) |
---|---|
Headers | show |
Series | Support xxhash64 checksums | expand |
On 8/22/19 1:40 PM, Johannes Thumshirn wrote: > Now that Nikolay's XXHASH64 support for the Crypto API has landed and BTRFS is > prepared for an easy addition of new checksums, this patchset implements > XXHASH64 as a second, fast but not cryptographically secure checksum hash. Question from the cheap seats.. :) I know that crc32c-intel uses native SSE 4.2 instructions, but so far I have been unable to find benchmarks or explanations why adding xxhash64 benefits btrfs. All benchmarks seem to be against crc32c in *software*, not the SSE4.2-enabled version (or I can't read). I mean, it's great that xxhash64 is really fast for a software implementation, but how does btrfs benefit from this compared to using crc32-intel? Verifying that plugging in other hash impls works (e.g. as preparation for stronger impls) has value, but it's probably not something most users care about. Maybe there are obscure downsides to crc32c-intel like instruction latency (def. a problem for AVX512), cache pollution..? Just curious. thanks, Holger
On Thu, Aug 22, 2019 at 02:28:53PM +0200, Holger Hoffstätte wrote: > On 8/22/19 1:40 PM, Johannes Thumshirn wrote: > > Now that Nikolay's XXHASH64 support for the Crypto API has landed and BTRFS is > > prepared for an easy addition of new checksums, this patchset implements > > XXHASH64 as a second, fast but not cryptographically secure checksum hash. > > Question from the cheap seats.. :) > > I know that crc32c-intel uses native SSE 4.2 instructions, but so far I have > been unable to find benchmarks or explanations why adding xxhash64 benefits > btrfs. All benchmarks seem to be against crc32c in *software*, not the > SSE4.2-enabled version (or I can't read). I mean, it's great that xxhash64 is > really fast for a software implementation, but how does btrfs benefit from this > compared to using crc32-intel? > > Verifying that plugging in other hash impls works (e.g. as preparation for > stronger impls) has value, but it's probably not something most > users care about. > > Maybe there are obscure downsides to crc32c-intel like instruction latency > (def. a problem for AVX512), cache pollution..? > > Just curious. It's not so much about the performance aspect of xxhash64 vs crc32c. xxhash64 has a lower collission proability compared to crc32c, which for instance makes it a good candidate to use for de-duplication. HTH, Johannes
Am Do., 22. Aug. 2019 um 16:41 Uhr schrieb Holger Hoffstätte
<holger@applied-asynchrony.com>:
> but how does btrfs benefit from this compared to using crc32-intel?
As i know, crc32c is as far as ~3x faster than xxhash. But xxHash was
created with a differend design goal.
If you using a cpu without hardware crc32 support, xxHash provides you
a maximum portability and speed. Look at arm, mips, power, etc. or old
intel cpus like Core 2 Duo.
> -----Original Message----- > From: linux-btrfs-owner@vger.kernel.org <linux-btrfs- > owner@vger.kernel.org> On Behalf Of Peter Becker > Sent: Friday, 23 August 2019 1:40 AM > To: Holger Hoffstätte <holger@applied-asynchrony.com> > Cc: Linux BTRFS Mailinglist <linux-btrfs@vger.kernel.org> > Subject: Re: [PATCH v2 0/4] Support xxhash64 checksums > > Am Do., 22. Aug. 2019 um 16:41 Uhr schrieb Holger Hoffstätte > <holger@applied-asynchrony.com>: > > but how does btrfs benefit from this compared to using crc32-intel? > > As i know, crc32c is as far as ~3x faster than xxhash. But xxHash was created > with a differend design goal. > If you using a cpu without hardware crc32 support, xxHash provides you a > maximum portability and speed. Look at arm, mips, power, etc. or old intel > cpus like Core 2 Duo. I've got a modified version of smhasher (https://github.com/PeeJay/smhasher) that tests speed and cryptographics of various hashing functions. Crc32 Software - 379.91 MiB/sec Crc32 Hardware - 7338.60 MiB/sec XXhash64 Software - 12094.40 MiB/sec Testing done on a 1st Gen Ryzen. Impressive numbers from XXhash64. Paul.
> -----Original Message----- > From: linux-btrfs-owner@vger.kernel.org <linux-btrfs- > owner@vger.kernel.org> On Behalf Of Paul Jones > Sent: Friday, 23 August 2019 7:39 PM > To: Peter Becker <floyd.net@gmail.com>; Holger Hoffstätte > <holger@applied-asynchrony.com> > Cc: Linux BTRFS Mailinglist <linux-btrfs@vger.kernel.org> > Subject: RE: [PATCH v2 0/4] Support xxhash64 checksums > > > -----Original Message----- > > From: linux-btrfs-owner@vger.kernel.org <linux-btrfs- > > owner@vger.kernel.org> On Behalf Of Peter Becker > > Sent: Friday, 23 August 2019 1:40 AM > > To: Holger Hoffstätte <holger@applied-asynchrony.com> > > Cc: Linux BTRFS Mailinglist <linux-btrfs@vger.kernel.org> > > Subject: Re: [PATCH v2 0/4] Support xxhash64 checksums > > > > Am Do., 22. Aug. 2019 um 16:41 Uhr schrieb Holger Hoffstätte > > <holger@applied-asynchrony.com>: > > > but how does btrfs benefit from this compared to using crc32-intel? > > > > As i know, crc32c is as far as ~3x faster than xxhash. But xxHash was > > created with a differend design goal. > > If you using a cpu without hardware crc32 support, xxHash provides you > > a maximum portability and speed. Look at arm, mips, power, etc. or old > > intel cpus like Core 2 Duo. > > I've got a modified version of smhasher > (https://github.com/PeeJay/smhasher) that tests speed and cryptographics > of various hashing functions. I forgot to add xxhash32 Crc32 Software - 379.91 MiB/sec Crc32 Hardware - 7338.60 MiB/sec XXhash64 Software - 12094.40 MiB/sec XXhash32 Software - 6060.11 MiB/sec Testing done on a 1st Gen Ryzen. Impressive numbers from XXhash64. Paul.
On Fri, Aug 23, 2019 at 09:43:22AM +0000, Paul Jones wrote: > > > Am Do., 22. Aug. 2019 um 16:41 Uhr schrieb Holger Hoffstätte > > > <holger@applied-asynchrony.com>: > > > > but how does btrfs benefit from this compared to using crc32-intel? > > > > > > As i know, crc32c is as far as ~3x faster than xxhash. But xxHash was > > > created with a differend design goal. > > > If you using a cpu without hardware crc32 support, xxHash provides you > > > a maximum portability and speed. Look at arm, mips, power, etc. or old > > > intel cpus like Core 2 Duo. > > > > I've got a modified version of smhasher > > (https://github.com/PeeJay/smhasher) that tests speed and cryptographics > > of various hashing functions. > > I forgot to add xxhash32 > > Crc32 Software - 379.91 MiB/sec > Crc32 Hardware - 7338.60 MiB/sec > XXhash64 Software - 12094.40 MiB/sec > XXhash32 Software - 6060.11 MiB/sec > > Testing done on a 1st Gen Ryzen. Impressive numbers from XXhash64. Newest biggest Threadripper (2990WX, no 3* version released yet): crc32 - 492.75 MiB/sec crc32hw - 9447.37 MiB/sec crc64 - 1959.51 MiB/sec xxhash32 - 7479.29 MiB/sec xxhash64 - 14911.58 MiB/sec An old Skylake (i7-6700): crc32 - 359.32 MiB/sec crc32hw - 21119.68 MiB/sec crc64 - 1656.34 MiB/sec xxhash32 - 5989.87 MiB/sec xxhash64 - 11949.41 MiB/sec Cascade Lake (0000%@): crc32hw 1.92× as fast as xxhash64. So you want crc32hw on Intel, xxhash64 on AMD. crc32 also allows going back to old kernels; the improved collision resistance of xxhash64 is not a reason as if you intend to dedupe you want a crypto hash so you don't need to verify. Meow!
On 2019-08-23 13:08, Adam Borowski wrote: > the improved collision > resistance of xxhash64 is not a reason as if you intend to dedupe you want > a crypto hash so you don't need to verify. The improved collision resistance is a roughly 10 orders of magnitude reduction in the chance of a collision. That may not matter for most, but it's a significant improvement for anybody operating at large enough scale that media errors are commonplace. Also, you would still need to verify even if you're using whatever the fanciest new collision resistant cryptographic hash is, because the number of possible input values is still more than _nine thousand_ orders of magnitude larger than the total number of output values even if we use a 512-bit cryptographic hash.
On Mon, Aug 26, 2019 at 08:27:15AM -0400, Austin S. Hemmelgarn wrote: > On 2019-08-23 13:08, Adam Borowski wrote: > > the improved collision > > resistance of xxhash64 is not a reason as if you intend to dedupe you want > > a crypto hash so you don't need to verify. > > The improved collision resistance is a roughly 10 orders of magnitude > reduction in the chance of a collision. That may not matter for most, but > it's a significant improvement for anybody operating at large enough scale > that media errors are commonplace. Hash size doesn't matter vs media errors. You don't have billions of mismatches: the first one is a cause of alarm, so 1-in-4294967296 chance of failing to notice it hardly ever matters (even though it _can_ happen in real life as opposed to collisions below). I can think of a bigger hash useful in three cases: * recovering from a split-brain RAID * recovering from one disk of a RAID having had a large piece scribbled upon * finding candidates for deduplication (but see below why not 64-bit) > Also, you would still need to verify even if you're using whatever the > fanciest new collision resistant cryptographic hash is, because the number > of possible input values is still more than _nine thousand_ orders of > magnitude larger than the total number of output values even if we use a > 512-bit cryptographic hash. You're underestimating how rare crypto-strength hash collisions are. There are two scenarios: unintentional, and malicious. Let's go with unintentional first: the age of the Universe is 2^58.5 seconds. The fastest disk (non-pmem) is NVMe-connected Optane, at 240000 IOPS. That's 2^17.8. With a 256-bit hash, the mass of machines needed for a single expected collision within the age of Universe exceeds the mass of observable Universe itself. So, malicious. We demand a non-broken hash, which in crypto speak means there's no known attack better than brute force. An iterative approach is right out; the best space-time tradeoff is birthday attack, which requires storage size akin to the root of # of combinations (ie, half the hash length). It's drastically better: at current best storage densities, you'd need only the mass of the Earth. Please let me know when you'll build that Earth-sized computer, so I can migrate from weak SHA256 to eg. BLAKE2b. On the other hand, computers and memories get hit by cosmic rays, thermal noise, and so on at a non-negligible rate. Any theoretical chance of a hash collision is dwarfed by flaws of technology we have. Or, eg, by the chance that you'll get hit by multiple lightings the next time you leave your house. Thus: no, you don't need to recheck after SHA256. Meow!