diff mbox series

[5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

Message ID 20200916034307.2092020-7-nickrterrell@gmail.com (mailing list archive)
State New, archived
Headers show
Series Update to zstd-1.4.6 | expand

Commit Message

Nick Terrell Sept. 16, 2020, 3:42 a.m. UTC
From: Nick Terrell <terrelln@fb.com>

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell <terrelln@fb.com>
---
 fs/btrfs/zstd.c | 48 ++++++++++++++++++++++++++++--------------------
 1 file changed, 28 insertions(+), 20 deletions(-)

Comments

Christoph Hellwig Sept. 16, 2020, 8:49 a.m. UTC | #1
On Tue, Sep 15, 2020 at 08:42:59PM -0700, Nick Terrell wrote:
> From: Nick Terrell <terrelln@fb.com>
> 
> Move away from the compatibility wrapper to the zstd-1.4.6 API. This
> code is functionally equivalent.

Again, please use sensible names  And no one gives a fuck if this bad
API is "zstd-1.4.6" as the Linux kernel uses its own APIs, not some
random mess from a badly written userspace package.
Chris Mason Sept. 16, 2020, 2:20 p.m. UTC | #2
On 16 Sep 2020, at 4:49, Christoph Hellwig wrote:

> On Tue, Sep 15, 2020 at 08:42:59PM -0700, Nick Terrell wrote:
>> From: Nick Terrell <terrelln@fb.com>
>>
>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This
>> code is functionally equivalent.
>
> Again, please use sensible names  And no one gives a fuck if this bad
> API is "zstd-1.4.6" as the Linux kernel uses its own APIs, not some
> random mess from a badly written userspace package.

Hi Christoph,

It’s not completely clear what you’re asking for here.  If the API 
matches what’s in zstd-1.4.6, that seems like a reasonable way to 
label it.  That’s what the upstream is for this code.

I’m also not sure why we’re taking extra time to shit on the zstd 
userspace package.  Can we please be constructive or at least 
actionable?

-chris
Christoph Hellwig Sept. 16, 2020, 2:30 p.m. UTC | #3
On Wed, Sep 16, 2020 at 10:20:52AM -0400, Chris Mason wrote:
> It???s not completely clear what you???re asking for here.  If the API
> matches what???s in zstd-1.4.6, that seems like a reasonable way to label
> it.  That???s what the upstream is for this code.
> 
> I???m also not sure why we???re taking extra time to shit on the zstd
> userspace package.  Can we please be constructive or at least actionable?

Because it really doesn't matter that these crappy APIs he is
introducing match anything, especially not something done as horribly
as the zstd API.  We'll need to do this properly, and claiming
compliance to some version of this lousy API is completely irrelevant
for the kernel.
Chris Mason Sept. 16, 2020, 2:43 p.m. UTC | #4
On 16 Sep 2020, at 10:30, Christoph Hellwig wrote:

> On Wed, Sep 16, 2020 at 10:20:52AM -0400, Chris Mason wrote:
>> It???s not completely clear what you???re asking for here.  If the 
>> API
>> matches what???s in zstd-1.4.6, that seems like a reasonable way to 
>> label
>> it.  That???s what the upstream is for this code.
>>
>> I???m also not sure why we???re taking extra time to shit on the zstd
>> userspace package.  Can we please be constructive or at least 
>> actionable?
>
> Because it really doesn't matter that these crappy APIs he is
> introducing match anything, especially not something done as horribly
> as the zstd API.  We'll need to do this properly, and claiming
> compliance to some version of this lousy API is completely irrelevant
> for the kernel.

If the underlying goal is to closely follow the upstream of another 
project, we’re much better off using those APIs as provided.

Otherwise we just end up with drift and kernel-specific bugs that are 
harder to debug.  To the extent those APIs make us contort the kernel 
code, I’m sure Nick is interested in improving things in both places.

There are probably 1000 constructive ways to have that conversation.  
Please choose one of those instead of being an asshole.

-chris
Christoph Hellwig Sept. 16, 2020, 2:46 p.m. UTC | #5
On Wed, Sep 16, 2020 at 10:43:04AM -0400, Chris Mason wrote:
> Otherwise we just end up with drift and kernel-specific bugs that are harder
> to debug.  To the extent those APIs make us contort the kernel code, I???m
> sure Nick is interested in improving things in both places.

Seriously, we do not care elsewhere.  Why would zlib be any different?

> There are probably 1000 constructive ways to have that conversation.  Please
> choose one of those instead of being an asshole.

I think you are the asshole here by ignoring the practices we are using
elsewhere and think your employers pet project is somehow special.  It
is not, and claiming so is everything but constructive.
Chris Mason Sept. 16, 2020, 3:01 p.m. UTC | #6
On 16 Sep 2020, at 10:46, Christoph Hellwig wrote:

> On Wed, Sep 16, 2020 at 10:43:04AM -0400, Chris Mason wrote:
>> Otherwise we just end up with drift and kernel-specific bugs that are 
>> harder
>> to debug.  To the extent those APIs make us contort the kernel code, 
>> I???m
>> sure Nick is interested in improving things in both places.
>
> Seriously, we do not care elsewhere.  Why would zlib be any different?

Is the zlib upstream active?  Or trying to sync active development with 
the kernel?  I’d suggest the same path for them if they were.

>
>> There are probably 1000 constructive ways to have that conversation.  
>> Please
>> choose one of those instead of being an asshole.
>
> I think you are the asshole here by ignoring the practices we are 
> using
> elsewhere and think your employers pet project is somehow special.  It
> is not, and claiming so is everything but constructive.

I’m happy to advocate for more constructive discussion for anyone’s 
project.  I tend to pick threads where I have context and I know the 
people involved.

The kernel best practices are pragmatic.  As one of many users of any 
established-non-kernel project, there’s a compromise between the APIs 
they are using for a broad base of users and us.  I’m sure they are 
interested in improving life for all of their users, while also 
improving maintainability for us.

-chris
Eric Biggers Sept. 16, 2020, 6:27 p.m. UTC | #7
On Wed, Sep 16, 2020 at 03:46:18PM +0100, Christoph Hellwig wrote:
> On Wed, Sep 16, 2020 at 10:43:04AM -0400, Chris Mason wrote:
> > Otherwise we just end up with drift and kernel-specific bugs that are harder
> > to debug.  To the extent those APIs make us contort the kernel code, I???m
> > sure Nick is interested in improving things in both places.
> 
> Seriously, we do not care elsewhere.  Why would zlib be any different?
> 
> > There are probably 1000 constructive ways to have that conversation.  Please
> > choose one of those instead of being an asshole.
> 
> I think you are the asshole here by ignoring the practices we are using
> elsewhere and think your employers pet project is somehow special.  It
> is not, and claiming so is everything but constructive.
> 

The userspace Zstandard library is widely used and has been heavily reviewed,
tested, and fuzzed.

The options are either (a) write and maintain a separate kernel implementation
of Zstandard, or (b) periodically sync from upstream and make minimal, easily
reviewable changes to integrate with the kernel.

I don't see option (a) working for Zstandard.  For short and simple code, it's
the usual Linux kernel practice and it works fine.  But it's far too hard to
write and maintain a good implementation of Zstandard -- meaning correct, fast,
fully fuzzed, and supporting all needed compression levels.  Optimizing
compressors and decompressors is really hard.  A "naive" implementation wouldn't
be too hard, but it would be slow and wouldn't support high compression ratios.

Similarly, some of the crypto assembly code in the kernel is taken from the
OpenSSL project, since the kernel community doesn't have the capacity to
properly optimize algorithms like Poly1305 for x86, arm, arm64, mips, ...

If your main concern is about the camel case function naming, that doesn't seem
very important, relatively speaking.

- Eric
Nick Terrell Sept. 16, 2020, 7:18 p.m. UTC | #8
> On Sep 16, 2020, at 7:46 AM, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Wed, Sep 16, 2020 at 10:43:04AM -0400, Chris Mason wrote:
>> Otherwise we just end up with drift and kernel-specific bugs that are harder
>> to debug.  To the extent those APIs make us contort the kernel code, I???m
>> sure Nick is interested in improving things in both places.
> 
> Seriously, we do not care elsewhere.  Why would zlib be any different?
> 
>> There are probably 1000 constructive ways to have that conversation.  Please
>> choose one of those instead of being an asshole.
> 
> I think you are the asshole here by ignoring the practices we are using
> elsewhere and think your employers pet project is somehow special.  It
> is not, and claiming so is everything but constructive.

My goal in updating the zstd kernel to use the upstream API directly is to
make frequent syncs into the kernel easy. This is important so the kernel
doesn't miss out on bug fixes and performance improvements.

The upstream zstd is continuously fuzzed and is battle tested in production
and across many different projects external to Facebook. That means that
zstd-1.4.6 has an additional 3 years of continuous fuzzing, as well as
improvements to our fuzz and test suite.

The zstd version in the kernel works fine. But, you can see that the version
that got imported stagnated where upstream had 14 released versions. I
don't think it makes sense to have kernel developers maintain their own copy
of zstd. Their time would be better spent working on the rest of the kernel.
Using upstream directly lets the kernel profit from the work that we, the zstd
developers, are doing. And it still allows kernel developers to fix bugs if any
show up, and we can back-port them to upstream.

For example, I’ve measured that BtrFS decompression + read performance
is improved 15% with this patch. And ZRAM performance improves 30%.
And SquashFS decompression + read performance improves 15%.

Admittedly, the API provided for static workspace allocation is verbose. Most
zstd users don’t need it, so our efforts to improve the ergonomics of the API
haven’t been focused here. At this point, we couldn’t rename these APIs easily,
since we have users relying on our API. It could be done, because we don’t
guarantee ABI stability for this portion of the API, but we would have to have
a good reason for it.

One possibility is to have a kernel wrapper on top of the zstd API to make it
more ergonomic. I personally don’t really see the value in it, since it adds
another layer of indirection between zstd and the caller, but it could be done.

Of all the compressors in the kernel, only lz4 and zstd are under active
development. And lz4 has switched to using the upstream API directly.
Xz does see a little bit of development, but nothing has been synced to the
kernel.

Best,
Nick
Rik van Riel Sept. 17, 2020, 1:35 a.m. UTC | #9
On Wed, 2020-09-16 at 15:18 -0400, Nick Terrell wrote:

> The zstd version in the kernel works fine. But, you can see that the
> version
> that got imported stagnated where upstream had 14 released versions.
> I
> don't think it makes sense to have kernel developers maintain their
> own copy
> of zstd. Their time would be better spent working on the rest of the
> kernel.
> Using upstream directly lets the kernel profit from the work that we,
> the zstd
> developers, are doing. And it still allows kernel developers to fix
> bugs if any
> show up, and we can back-port them to upstream.

I can't argue with that.

> One possibility is to have a kernel wrapper on top of the zstd API to
> make it
> more ergonomic. I personally don’t really see the value in it, since
> it adds
> another layer of indirection between zstd and the caller, but it
> could be done.

Zstd would not be the first part of the kernel to
come from somewhere else, and have wrappers when
it gets integrated into the kernel. There certainly
is precedence there.

It would be interesting to know what Christoph's
preference is.
Christoph Hellwig Sept. 17, 2020, 10:04 a.m. UTC | #10
On Wed, Sep 16, 2020 at 09:35:51PM -0400, Rik van Riel wrote:
> > One possibility is to have a kernel wrapper on top of the zstd API to
> > make it
> > more ergonomic. I personally don???t really see the value in it, since
> > it adds
> > another layer of indirection between zstd and the caller, but it
> > could be done.
> 
> Zstd would not be the first part of the kernel to
> come from somewhere else, and have wrappers when
> it gets integrated into the kernel. There certainly
> is precedence there.
> 
> It would be interesting to know what Christoph's
> preference is.

Yes, I think kernel wrappers would be a pretty sensible step forward.
That also avoid the need to do strange upgrades to a new version,
and instead we can just change APIs on a as-needed basis.
Chris Mason Sept. 17, 2020, 2:28 p.m. UTC | #11
On 17 Sep 2020, at 6:04, Christoph Hellwig wrote:

> On Wed, Sep 16, 2020 at 09:35:51PM -0400, Rik van Riel wrote:
>>> One possibility is to have a kernel wrapper on top of the zstd API 
>>> to
>>> make it
>>> more ergonomic. I personally don???t really see the value in it, 
>>> since
>>> it adds
>>> another layer of indirection between zstd and the caller, but it
>>> could be done.
>>
>> Zstd would not be the first part of the kernel to
>> come from somewhere else, and have wrappers when
>> it gets integrated into the kernel. There certainly
>> is precedence there.
>>
>> It would be interesting to know what Christoph's
>> preference is.
>
> Yes, I think kernel wrappers would be a pretty sensible step forward.
> That also avoid the need to do strange upgrades to a new version,
> and instead we can just change APIs on a as-needed basis.

When we add wrappers, we end up creating a kernel specific API that 
doesn’t match the upstream zstd docs, and it doesn’t leverage as 
much of the zstd fuzzing and testing.

So we’re actually making kernel zstd slightly less usable in hopes 
that our kernel specific part of the API is familiar enough to us that 
it makes zstd more usable.  There’s no way to compare the two until 
the wrappers are done, but given the code today I’d prefer that we 
focus on making it really easy to track upstream.  I really understand 
Christoph’s side here, but I’d rather ride a camel with the group 
than go it alone.

I’d also much rather spend time on any problems where the structure of 
the zstd APIs don’t fit the kernel’s needs.  The btrfs streaming 
compression/decompression looks pretty clean to me, but I think Johannes 
mentioned some possibilities to improve things for zswap (optimizations 
for page-at-atime).  If there are places where the zstd memory 
management or error handling don’t fit naturally into the kernel, that 
would also be higher on my list.

Fixing those are probably going to be much easier if we’re close to 
the zstd upstream, again so that we can leverage testing and long term 
code maintenance done there.

-chris
Nick Terrell Sept. 17, 2020, 5:57 p.m. UTC | #12
> On Sep 17, 2020, at 7:28 AM, Chris Mason <clm@fb.com> wrote:
> 
> On 17 Sep 2020, at 6:04, Christoph Hellwig wrote:
> 
>> On Wed, Sep 16, 2020 at 09:35:51PM -0400, Rik van Riel wrote:
>>>> One possibility is to have a kernel wrapper on top of the zstd API to
>>>> make it
>>>> more ergonomic. I personally don???t really see the value in it, since
>>>> it adds
>>>> another layer of indirection between zstd and the caller, but it
>>>> could be done.
>>> 
>>> Zstd would not be the first part of the kernel to
>>> come from somewhere else, and have wrappers when
>>> it gets integrated into the kernel. There certainly
>>> is precedence there.
>>> 
>>> It would be interesting to know what Christoph's
>>> preference is.
>> 
>> Yes, I think kernel wrappers would be a pretty sensible step forward.
>> That also avoid the need to do strange upgrades to a new version,
>> and instead we can just change APIs on a as-needed basis.
> 
> When we add wrappers, we end up creating a kernel specific API that doesn’t match the upstream zstd docs, and it doesn’t leverage as much of the zstd fuzzing and testing.
> 
> So we’re actually making kernel zstd slightly less usable in hopes that our kernel specific part of the API is familiar enough to us that it makes zstd more usable.  There’s no way to compare the two until the wrappers are done, but given the code today I’d prefer that we focus on making it really easy to track upstream.  I really understand Christoph’s side here, but I’d rather ride a camel with the group than go it alone.
> 
> I’d also much rather spend time on any problems where the structure of the zstd APIs don’t fit the kernel’s needs.  The btrfs streaming compression/decompression looks pretty clean to me, but I think Johannes mentioned some possibilities to improve things for zswap (optimizations for page-at-atime).  If there are places where the zstd memory management or error handling don’t fit naturally into the kernel, that would also be higher on my list.

This update includes the recent optimizations for ZSwap that I've made, which
gives a 30% speed boost for page-at-a-time decompression.

We're very open to improving and changing zstd to better fit the needs of the
kernel. If there are use cases that can't use the existing API, or the existing
API isn't optimal, or any other problems, we’re happy to help figure out the
best solution. Opening an issue on our upstream GitHub repo is the best way to
get our attention

-Nick

> Fixing those are probably going to be much easier if we’re close to the zstd upstream, again so that we can leverage testing and long term code maintenance done there.
> 
> -chris
diff mbox series

Patch

diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index a7367ff573d4..6b466e090cd7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@ 
 #include <linux/refcount.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
@@ -159,8 +159,8 @@  static void zstd_calc_ws_mem_sizes(void)
 			zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
 		size_t level_size =
 			max_t(size_t,
-			      ZSTD_CStreamWorkspaceBound(params.cParams),
-			      ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+			      ZSTD_estimateCStreamSize_usingCParams(params.cParams),
+			      ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
 
 		max_size = max_t(size_t, max_size, level_size);
 		zstd_ws_mem_sizes[level - 1] = max_size;
@@ -389,13 +389,23 @@  int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 	*total_in = 0;
 
 	/* Initialize the stream */
-	stream = ZSTD_initCStream(params, len, workspace->mem,
-			workspace->size);
+	stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
 	if (!stream) {
-		pr_warn("BTRFS: ZSTD_initCStream failed\n");
+		pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
 		ret = -EIO;
 		goto out;
 	}
+	{
+		size_t ret2;
+
+		ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
+		if (ZSTD_isError(ret2)) {
+			pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n",
+					ZSTD_getErrorName(ret2));
+			ret = -EIO;
+			goto out;
+		}
+	}
 
 	/* map in the first page of input data */
 	in_page = find_get_page(mapping, start >> PAGE_SHIFT);
@@ -421,8 +431,8 @@  int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 		ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
 				&workspace->in_buf);
 		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+			pr_debug("BTRFS: ZSTD_compressStream returned %s\n",
+					ZSTD_getErrorName(ret2));
 			ret = -EIO;
 			goto out;
 		}
@@ -489,8 +499,8 @@  int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 
 		ret2 = ZSTD_endStream(stream, &workspace->out_buf);
 		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_endStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+			pr_debug("BTRFS: ZSTD_endStream returned %s\n",
+					ZSTD_getErrorName(ret2));
 			ret = -EIO;
 			goto out;
 		}
@@ -557,10 +567,9 @@  int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 	unsigned long buf_start;
 	unsigned long total_out = 0;
 
-	stream = ZSTD_initDStream(
-			ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+	stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
 	if (!stream) {
-		pr_debug("BTRFS: ZSTD_initDStream failed\n");
+		pr_debug("BTRFS: ZSTD_initStaticDStream failed\n");
 		ret = -EIO;
 		goto done;
 	}
@@ -579,8 +588,8 @@  int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 		ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
 				&workspace->in_buf);
 		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+			pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+					ZSTD_getErrorName(ret2));
 			ret = -EIO;
 			goto done;
 		}
@@ -633,10 +642,9 @@  int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 	unsigned long pg_offset = 0;
 	char *kaddr;
 
-	stream = ZSTD_initDStream(
-			ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+	stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
 	if (!stream) {
-		pr_warn("BTRFS: ZSTD_initDStream failed\n");
+		pr_warn("BTRFS: ZSTD_initStaticDStream failed\n");
 		ret = -EIO;
 		goto finish;
 	}
@@ -667,8 +675,8 @@  int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 		ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
 				&workspace->in_buf);
 		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+			pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+					ZSTD_getErrorName(ret2));
 			ret = -EIO;
 			goto finish;
 		}