Message ID | 20220929215515.276486-1-gpiccoli@igalia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [REGRESSION] Revert "pstore: migrate to crypto acomp interface" | expand |
On Thu, Sep 29, 2022 at 06:55:15PM -0300, Guilherme G. Piccoli wrote: > This reverts commit e4f0a7ec586b7644107839f5394fb685cf1aadcc. > > When using this new interface, both efi_pstore and ramoops > backends are unable to properly decompress dmesg if using > zstd, lz4 and lzo algorithms (and maybe more). It does succeed > with deflate though. Hi! Thanks for looking at this. I wasn't able to reproduce the problem, initially. Booting with pstore.backend=ramoops pstore.compress=zstd and writing to /dev/pmsg0, after a reboot I'm able to read it back. > The message observed in the kernel log is: > > [2.328828] pstore: crypto_acomp_decompress failed, ret = -22! Hmm, that's EINVAL. > The pstore infrastructure is able to collect the dmesg with > both backends tested, but since decompression fails it's > unreadable. With this revert everything is back to normal. > > Fixes: e4f0a7ec586b ("pstore: migrate to crypto acomp interface") > Cc: Ard Biesheuvel <ardb@kernel.org> > Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> A reminder to myself, I keep getting surprised that systemd is stealing the pstore filesystem contents and moving them into /var/lib/systemd/pstore/ > Hi Ard, Thorsten and pstore maintainers. I've found this yday > during pstore development - it was "hidden" since I was using > deflate. Tried some fixes (I plan to submit a cast fix for a > long-term issue later), but nothing I tried fixed this. What's your setup for this? I'm using emulated NVDIMM through qemu for a ramoops backend. But trying this with the EFI backend (booting undef EFI with pstore.backend=efi), I _do_ see the problem. That's weird... I suspect there's some back interaction with buffer size differences between ramoops and EFI & deflate and zstd. And I can confirm EFI+zstd with the acomp change reverted fixes it. Ard, anything jump to mind for you? > So, I thought in sending this revert - feel free to ignore it if > anybody comes with a proper fix for the async compress interface > proposed by Ard. The idea of the revert is because the 6.0-rc > cycle is nearly over, and would be nice to avoid introducing > this regression. > > Also, I searched some mailing list discussions / submission of > the patch ("pstore: migrate to crypto acomp interface"), but > couldn't find it - can any of you point it to me in case it's > in some archive? Hm, it's possible this was just sent directly to me? If that's true, I apologize for not re-posting it to lkml. I suspect I didn't notice at the time that it wasn't CCed to a list. > Thanks in advance, and sorry for reporting this so late in the > cycle, I wish I'd found it before. No worries! Whatever the case, there's always -stable updates. :) -Kees
[Note: this mail is primarily send for documentation purposes and/or for regzbot, my Linux kernel regression tracking bot. That's why I removed most or all folks from the list of recipients, but left any that looked like a mailing lists. These mails usually contain '#forregzbot' in the subject, to make them easy to spot and filter out.] [TLDR: I'm adding this regression report to the list of tracked regressions; all text from me you find below is based on a few templates paragraphs you might have encountered already already in similar form.] Hi, this is your Linux kernel regression tracker. Thx for the report. On 29.09.22 23:55, Guilherme G. Piccoli wrote: > This reverts commit e4f0a7ec586b7644107839f5394fb685cf1aadcc. > > When using this new interface, both efi_pstore and ramoops > backends are unable to properly decompress dmesg if using > zstd, lz4 and lzo algorithms (and maybe more). It does succeed > with deflate though. > > The message observed in the kernel log is: > > [2.328828] pstore: crypto_acomp_decompress failed, ret = -22! > > The pstore infrastructure is able to collect the dmesg with > both backends tested, but since decompression fails it's > unreadable. With this revert everything is back to normal. > > Fixes: e4f0a7ec586b ("pstore: migrate to crypto acomp interface") > Cc: Ard Biesheuvel <ardb@kernel.org> > Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> > > --- > > Hi Ard, Thorsten and pstore maintainers. I've found this yday > during pstore development - it was "hidden" since I was using > deflate. Tried some fixes (I plan to submit a cast fix for a > long-term issue later), but nothing I tried fixed this. > > So, I thought in sending this revert - feel free to ignore it if > anybody comes with a proper fix for the async compress interface > proposed by Ard. The idea of the revert is because the 6.0-rc > cycle is nearly over, and would be nice to avoid introducing > this regression. > > Also, I searched some mailing list discussions / submission of > the patch ("pstore: migrate to crypto acomp interface"), but > couldn't find it - can any of you point it to me in case it's > in some archive? > > Thanks in advance, and sorry for reporting this so late in the > cycle, I wish I'd found it before. Thanks for the report. To be sure below issue doesn't fall through the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression tracking bot: #regzbot ^introduced e4f0a7ec586b #regzbot title pstore: efi_pstore and ramoops backends are sometimes unable to properly decompress dmesg #regzbot ignore-activity This isn't a regression? This issue or a fix for it are already discussed somewhere else? It was fixed already? You want to clarify when the regression started to happen? Or point out I got the title or something else totally wrong? Then just reply -- ideally with also telling regzbot about it, as explained here: https://linux-regtracking.leemhuis.info/tracked-regression/ Reminder for developers: When fixing the issue, add 'Link:' tags pointing to the report (the mail this one replies to), as explained for in the Linux kernel's documentation; above webpage explains why this is important for tracked regressions. Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight.
On 30/09/2022 00:29, Kees Cook wrote: > [...] > > Hi! > > Thanks for looking at this. I wasn't able to reproduce the problem, > initially. Booting with pstore.backend=ramoops pstore.compress=zstd and > writing to /dev/pmsg0, after a reboot I'm able to read it back. > Hi Kees, thanks a lot for your attention! IIUC, compression applies to dmesg only, correct? > [...] > What's your setup for this? I'm using emulated NVDIMM through qemu for > a ramoops backend. But trying this with the EFI backend (booting > undef EFI with pstore.backend=efi), I _do_ see the problem. That's > weird... I suspect there's some back interaction with buffer size > differences between ramoops and EFI & deflate and zstd. > > And I can confirm EFI+zstd with the acomp change reverted fixes it. > I'm using qemu but was able to use real HW (Steam Deck). In both cases, kernel is not using the entire RAM ("mem=" parameter, for example) so we can use a bit for ramoops. Also, both setups are UEFI, hence I can also use efi_pstore. > [...] > Hm, it's possible this was just sent directly to me? If that's true, I > apologize for not re-posting it to lkml. I suspect I didn't notice at > the time that it wasn't CCed to a list. No need for apologies, thanks for the clarification! How about if we add a mailing list in the pstore entry on MAINTAINERS file, since it's just composed for you and 3 other people now? I mean, "officially" speaking, it should be enough to send a patch for the 4 maintainers with no list in CC, and that's bad for achieving purposes. What list should be the best, fsdevel? Lkml? > > No worries! Whatever the case, there's always -stable updates. :) Heheh you're right! But for something like this (pstore/dmesg compression broke for the most backends), I'd be glad if we could fix it before the release. Cheers, Guilherme
On Fri, 30 Sept 2022 at 14:39, Guilherme G. Piccoli <gpiccoli@igalia.com> wrote: > > On 30/09/2022 00:29, Kees Cook wrote: > > [...] > > > > Hi! > > > > Thanks for looking at this. I wasn't able to reproduce the problem, > > initially. Booting with pstore.backend=ramoops pstore.compress=zstd and > > writing to /dev/pmsg0, after a reboot I'm able to read it back. > > > > Hi Kees, thanks a lot for your attention! > IIUC, compression applies to dmesg only, correct? > > > > [...] > > What's your setup for this? I'm using emulated NVDIMM through qemu for > > a ramoops backend. But trying this with the EFI backend (booting > > undef EFI with pstore.backend=efi), I _do_ see the problem. That's > > weird... I suspect there's some back interaction with buffer size > > differences between ramoops and EFI & deflate and zstd. > > > > And I can confirm EFI+zstd with the acomp change reverted fixes it. > > > > I'm using qemu but was able to use real HW (Steam Deck). In both cases, > kernel is not using the entire RAM ("mem=" parameter, for example) so we > can use a bit for ramoops. Also, both setups are UEFI, hence I can also > use efi_pstore. > Does this help? diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index b2fd3c20e7c2..c0b609d7d04e 100644 --- a/fs/pstore/platform.c +++ b/fs/pstore/platform.c @@ -292,7 +292,7 @@ static int pstore_compress(const void *in, void *out, return ret; } - return outlen; + return creq->dlen; } static void allocate_buf_for_compression(void) > > > [...] > > Hm, it's possible this was just sent directly to me? If that's true, I > > apologize for not re-posting it to lkml. I suspect I didn't notice at > > the time that it wasn't CCed to a list. > > No need for apologies, thanks for the clarification! How about if we add > a mailing list in the pstore entry on MAINTAINERS file, since it's just > composed for you and 3 other people now? I mean, "officially" speaking, > it should be enough to send a patch for the 4 maintainers with no list > in CC, and that's bad for achieving purposes. What list should be the > best, fsdevel? Lkml? > Makes sense > > > > > No worries! Whatever the case, there's always -stable updates. :) > > Heheh you're right! But for something like this (pstore/dmesg > compression broke for the most backends), I'd be glad if we could fix it > before the release. Yeah better to revert - this was not a critical change anyway. But I think the tweak above should fix things (it works for me here)
On 30/09/2022 12:51, Ard Biesheuvel wrote: > [...] > > Does this help? > > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c > index b2fd3c20e7c2..c0b609d7d04e 100644 > --- a/fs/pstore/platform.c > +++ b/fs/pstore/platform.c > @@ -292,7 +292,7 @@ static int pstore_compress(const void *in, void *out, > return ret; > } > > - return outlen; > + return creq->dlen; > } > > static void allocate_buf_for_compression(void) > Thanks a lot Ard, this seems to be the fix! Tested with lz4/zstd/deflate in both ramoops/efi backends, and all worked fine. It makes sense, outlen was modified in the previous API and not in the acomp thing, so it was a good catch =) >> Heheh you're right! But for something like this (pstore/dmesg >> compression broke for the most backends), I'd be glad if we could fix it >> before the release. > > Yeah better to revert - this was not a critical change anyway. But I > think the tweak above should fix things (it works for me here) Agreed - in fact seems it was reverted already. More than that, I found yet another small issue in the acomp refactor, a memory leak - attached is a patch with the fix, feel free to integrate in your acomp refactor when re-submitting (I mean, feel free to just integrate the code, don't need to send it as a separate patch/fix). I'm also working some fixes in implicit conversions in pstore that aren't great (unsigned -> int in many places), I'll send some stuff next week. Cheers, Guilherme
On Fri, Sep 30, 2022 at 03:31:17PM -0300, Guilherme G. Piccoli wrote: > On 30/09/2022 12:51, Ard Biesheuvel wrote: > > [...] > > > > Does this help? > > > > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c > > index b2fd3c20e7c2..c0b609d7d04e 100644 > > --- a/fs/pstore/platform.c > > +++ b/fs/pstore/platform.c > > @@ -292,7 +292,7 @@ static int pstore_compress(const void *in, void *out, > > return ret; > > } > > > > - return outlen; > > + return creq->dlen; > > } > > > > static void allocate_buf_for_compression(void) > > > > Thanks a lot Ard, this seems to be the fix! Tested with lz4/zstd/deflate > in both ramoops/efi backends, and all worked fine. It makes sense, > outlen was modified in the previous API and not in the acomp thing, so > it was a good catch =) > > > >> Heheh you're right! But for something like this (pstore/dmesg > >> compression broke for the most backends), I'd be glad if we could fix it > >> before the release. > > > > Yeah better to revert - this was not a critical change anyway. But I > > think the tweak above should fix things (it works for me here) > > Agreed - in fact seems it was reverted already. More than that, I found > yet another small issue in the acomp refactor, a memory leak - attached > is a patch with the fix, feel free to integrate in your acomp refactor > when re-submitting (I mean, feel free to just integrate the code, don't > need to send it as a separate patch/fix). > > I'm also working some fixes in implicit conversions in pstore that > aren't great (unsigned -> int in many places), I'll send some stuff next > week. Awesome! Thank you both! I'll probably be busy for the next week with the merge window, but after that I'll pull new stuff into -next for pstore. I appreciate more people poking around in the code. :)
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index b2fd3c20e7c2..0c034ea39954 100644 --- a/fs/pstore/platform.c +++ b/fs/pstore/platform.c @@ -28,14 +28,11 @@ #include <linux/crypto.h> #include <linux/string.h> #include <linux/timer.h> -#include <linux/scatterlist.h> #include <linux/slab.h> #include <linux/uaccess.h> #include <linux/jiffies.h> #include <linux/workqueue.h> -#include <crypto/acompress.h> - #include "internal.h" /* @@ -93,8 +90,7 @@ module_param(compress, charp, 0444); MODULE_PARM_DESC(compress, "compression to use"); /* Compression parameters */ -static struct crypto_acomp *tfm; -static struct acomp_req *creq; +static struct crypto_comp *tfm; struct pstore_zbackend { int (*zbufsize)(size_t size); @@ -272,21 +268,12 @@ static const struct pstore_zbackend zbackends[] = { static int pstore_compress(const void *in, void *out, unsigned int inlen, unsigned int outlen) { - struct scatterlist src, dst; int ret; if (!IS_ENABLED(CONFIG_PSTORE_COMPRESS)) return -EINVAL; - sg_init_table(&src, 1); - sg_set_buf(&src, in, inlen); - - sg_init_table(&dst, 1); - sg_set_buf(&dst, out, outlen); - - acomp_request_set_params(creq, &src, &dst, inlen, outlen); - - ret = crypto_acomp_compress(creq); + ret = crypto_comp_compress(tfm, in, inlen, out, &outlen); if (ret) { pr_err("crypto_comp_compress failed, ret = %d!\n", ret); return ret; @@ -297,7 +284,7 @@ static int pstore_compress(const void *in, void *out, static void allocate_buf_for_compression(void) { - struct crypto_acomp *acomp; + struct crypto_comp *ctx; int size; char *buf; @@ -309,7 +296,7 @@ static void allocate_buf_for_compression(void) if (!psinfo || tfm) return; - if (!crypto_has_acomp(zbackend->name, 0, CRYPTO_ALG_ASYNC)) { + if (!crypto_has_comp(zbackend->name, 0, 0)) { pr_err("Unknown compression: %s\n", zbackend->name); return; } @@ -328,24 +315,16 @@ static void allocate_buf_for_compression(void) return; } - acomp = crypto_alloc_acomp(zbackend->name, 0, CRYPTO_ALG_ASYNC); - if (IS_ERR_OR_NULL(acomp)) { + ctx = crypto_alloc_comp(zbackend->name, 0, 0); + if (IS_ERR_OR_NULL(ctx)) { kfree(buf); pr_err("crypto_alloc_comp('%s') failed: %ld\n", zbackend->name, - PTR_ERR(acomp)); - return; - } - - creq = acomp_request_alloc(acomp); - if (!creq) { - crypto_free_acomp(acomp); - kfree(buf); - pr_err("acomp_request_alloc('%s') failed\n", zbackend->name); + PTR_ERR(ctx)); return; } /* A non-NULL big_oops_buf indicates compression is available. */ - tfm = acomp; + tfm = ctx; big_oops_buf_sz = size; big_oops_buf = buf; @@ -355,8 +334,7 @@ static void allocate_buf_for_compression(void) static void free_buf_for_compression(void) { if (IS_ENABLED(CONFIG_PSTORE_COMPRESS) && tfm) { - acomp_request_free(creq); - crypto_free_acomp(tfm); + crypto_free_comp(tfm); tfm = NULL; } kfree(big_oops_buf); @@ -693,8 +671,6 @@ static void decompress_record(struct pstore_record *record) int ret; int unzipped_len; char *unzipped, *workspace; - struct acomp_req *dreq; - struct scatterlist src, dst; if (!IS_ENABLED(CONFIG_PSTORE_COMPRESS) || !record->compressed) return; @@ -718,30 +694,16 @@ static void decompress_record(struct pstore_record *record) if (!workspace) return; - dreq = acomp_request_alloc(tfm); - if (!dreq) { - kfree(workspace); - return; - } - - sg_init_table(&src, 1); - sg_set_buf(&src, record->buf, record->size); - - sg_init_table(&dst, 1); - sg_set_buf(&dst, workspace, unzipped_len); - - acomp_request_set_params(dreq, &src, &dst, record->size, unzipped_len); - /* After decompression "unzipped_len" is almost certainly smaller. */ - ret = crypto_acomp_decompress(dreq); + ret = crypto_comp_decompress(tfm, record->buf, record->size, + workspace, &unzipped_len); if (ret) { - pr_err("crypto_acomp_decompress failed, ret = %d!\n", ret); + pr_err("crypto_comp_decompress failed, ret = %d!\n", ret); kfree(workspace); return; } /* Append ECC notice to decompressed buffer. */ - unzipped_len = dreq->dlen; memcpy(workspace + unzipped_len, record->buf + record->size, record->ecc_notice_size); @@ -749,7 +711,6 @@ static void decompress_record(struct pstore_record *record) unzipped = kmemdup(workspace, unzipped_len + record->ecc_notice_size, GFP_KERNEL); kfree(workspace); - acomp_request_free(dreq); if (!unzipped) return;
This reverts commit e4f0a7ec586b7644107839f5394fb685cf1aadcc. When using this new interface, both efi_pstore and ramoops backends are unable to properly decompress dmesg if using zstd, lz4 and lzo algorithms (and maybe more). It does succeed with deflate though. The message observed in the kernel log is: [2.328828] pstore: crypto_acomp_decompress failed, ret = -22! The pstore infrastructure is able to collect the dmesg with both backends tested, but since decompression fails it's unreadable. With this revert everything is back to normal. Fixes: e4f0a7ec586b ("pstore: migrate to crypto acomp interface") Cc: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> --- Hi Ard, Thorsten and pstore maintainers. I've found this yday during pstore development - it was "hidden" since I was using deflate. Tried some fixes (I plan to submit a cast fix for a long-term issue later), but nothing I tried fixed this. So, I thought in sending this revert - feel free to ignore it if anybody comes with a proper fix for the async compress interface proposed by Ard. The idea of the revert is because the 6.0-rc cycle is nearly over, and would be nice to avoid introducing this regression. Also, I searched some mailing list discussions / submission of the patch ("pstore: migrate to crypto acomp interface"), but couldn't find it - can any of you point it to me in case it's in some archive? Thanks in advance, and sorry for reporting this so late in the cycle, I wish I'd found it before. Cheers, Guilherme fs/pstore/platform.c | 63 +++++++++----------------------------------- 1 file changed, 12 insertions(+), 51 deletions(-)