Message ID | 20221006224212.569555-3-gpiccoli@igalia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Some pstore improvements | expand |
On Thu, Oct 06, 2022 at 07:42:06PM -0300, Guilherme G. Piccoli wrote: > Currently this tuning is only exposed as a filesystem option, > but most Linux distros automatically mount pstore, hence changing > this setting requires remounting it. Also, if that mount option > wasn't explicitly set it doesn't show up in mount information, > so users cannot check what is the current value of kmsg_bytes. > > Let's then expose it as a module parameter, allowing both user > visibility at all times (even if not manually set) and also the > possibility of setting that as a boot/module parameter. I've been meaning to do this too. :) > > Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> > --- > fs/pstore/platform.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c > index c32957e4b256..be05090076ce 100644 > --- a/fs/pstore/platform.c > +++ b/fs/pstore/platform.c > @@ -89,6 +89,11 @@ static char *compress = > module_param(compress, charp, 0444); > MODULE_PARM_DESC(compress, "compression to use"); > > +/* How much of the kernel log to snapshot */ > +unsigned long kmsg_bytes = CONFIG_PSTORE_DEFAULT_KMSG_BYTES; > +module_param(kmsg_bytes, ulong, 0444); > +MODULE_PARM_DESC(kmsg_bytes, "amount of kernel log to snapshot (in bytes)"); > + > /* Compression parameters */ > static struct crypto_comp *tfm; > > @@ -100,9 +105,6 @@ struct pstore_zbackend { > static char *big_oops_buf; > static size_t big_oops_buf_sz; > > -/* How much of the console log to snapshot */ > -unsigned long kmsg_bytes = CONFIG_PSTORE_DEFAULT_KMSG_BYTES; > - > void pstore_set_kmsg_bytes(int bytes) > { > kmsg_bytes = bytes; > -- > 2.38.0 Doing a mount will override the result, so I wonder if there should be two variables, etc... not a concern for the normal use case. Also, I've kind of wanted to get rid of a "default" for this and instead use a value based on the compression vs record sizes, etc. But I didn't explore it.
On 06/10/2022 20:32, Kees Cook wrote: > [...] > Doing a mount will override the result, so I wonder if there should be > two variables, etc... not a concern for the normal use case. > > Also, I've kind of wanted to get rid of a "default" for this and instead > use a value based on the compression vs record sizes, etc. But I didn't > explore it. > For some reason I forgot to respond that, sorry! I didn't understand exactly how the mount would override things; I've done some tests: (1) booted with the new kmsg_bytes module parameter set to 64k, and it was preserved across multiple mount/umount cycles. (2) When I manually had "-o kmsg_bytes=16k" set during the mount operation, it worked as expected, setting the thing to 16k (and reflecting in the module parameter, as observed in /sys/modules). Maybe I'm missing something? Now, regarding the idea of setting that as a function of compression/record_sizes, I feel it makes sense and could be worked, like a heuristic right? In the end, if you think properly, what is the purpose of kmsg_bytes? Wouldn't make sense to just fill the record_size with the maximum amount of data it can handle? Of course there is the partitioning thing, but in the end kmsg_bytes seems a mechanism to _restrict_ the data collection, so maybe the default would be a value that means "save whatever you can handle" (maybe 0), and if the parameter/mount option is set, then pstore would restrict the saved size. Thoughts? Thanks, Guilherme
On Wed, Oct 12, 2022 at 12:33:36PM -0300, Guilherme G. Piccoli wrote: > On 06/10/2022 20:32, Kees Cook wrote: > > [...] > > Doing a mount will override the result, so I wonder if there should be > > two variables, etc... not a concern for the normal use case. > > > > Also, I've kind of wanted to get rid of a "default" for this and instead > > use a value based on the compression vs record sizes, etc. But I didn't > > explore it. > > > > For some reason I forgot to respond that, sorry! > > I didn't understand exactly how the mount would override things; I've > done some tests: > > (1) booted with the new kmsg_bytes module parameter set to 64k, and it > was preserved across multiple mount/umount cycles. > > (2) When I manually had "-o kmsg_bytes=16k" set during the mount > operation, it worked as expected, setting the thing to 16k (and > reflecting in the module parameter, as observed in /sys/modules). What I was imagining was the next step: (3) umount, unload the backend, load a new backend, and mount it without kmsg_bytes specified -- kmsg_bytes will be 16k, not 64k. It's a pretty extreme corner-case, I realize. :) However, see below... > In the end, if you think properly, what is the purpose of kmsg_bytes? > Wouldn't make sense to just fill the record_size with the maximum amount > of data it can handle? Of course there is the partitioning thing, but in > the end kmsg_bytes seems a mechanism to _restrict_ the data collection, > so maybe the default would be a value that means "save whatever you can > handle" (maybe 0), and if the parameter/mount option is set, then pstore > would restrict the saved size. Right, kmsg_bytes is the maximum size to save from the console on a crash. The design of the ram backend was to handle really small amounts of persistent RAM -- if a single crash would eat all of it and possibly wrap around, it could write over useful parts at the end (since it's written from the end to the front). However, I think somewhere along the way, stricter logic was added to the ram backend: /* * Explicitly only take the first part of any new crash. * If our buffer is larger than kmsg_bytes, this can never happen, * and if our buffer is smaller than kmsg_bytes, we don't want the * report split across multiple records. */ if (record->part != 1) return -ENOSPC; This limits it to just a single record. However, this does _not_ exist for other backends, so they will see up to kmsg_bytes-size dumps split across psinfo->bufsize many records. For the backends, this record size is not always fixed: - efi uses 1024, even though it allocates 4096 (as was pointed out earlier) - zone uses kmsg_bytes - acpi-erst uses some ACPI value from ACPI_ERST_GET_ERROR_LENGTH - ppc-nvram uses the configured size of nvram partition Honestly, it seems like the 64k default is huge, but I don't think it should be "unlimited" given the behaviors of ppc-nvram, and acpi-erst. For ram and efi, it's effectively unlimited because of the small bufsizes (and the "only 1 record" logic in ram). Existing documentation I can find online seem to imply making it smaller (8000 bytes[1], 16000 bytes), but without justification. Even the "main" documentation[2] doesn't mention it. -Kees [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/pstore [2] https://docs.kernel.org/admin-guide/ramoops.html
Hi Kees, my apologies for the (big) delay in answering that! I kept it marked to respond after tests, ended-up forgetting, but now did all the tests and finally I'm able to respond. On 12/10/2022 14:58, Kees Cook wrote: > [...] >> I didn't understand exactly how the mount would override things; I've >> done some tests: >> >> (1) booted with the new kmsg_bytes module parameter set to 64k, and it >> was preserved across multiple mount/umount cycles. >> >> (2) When I manually had "-o kmsg_bytes=16k" set during the mount >> operation, it worked as expected, setting the thing to 16k (and >> reflecting in the module parameter, as observed in /sys/modules). > > What I was imagining was the next step: > > (3) umount, unload the backend, load a new backend, and mount it > without kmsg_bytes specified -- kmsg_bytes will be 16k, not 64k. > > It's a pretty extreme corner-case, I realize. :) However, see below... Oh okay, thanks for pointing that! Indeed, in your test-case I've faced the issue of the retained kmsg_bytes...although, I agree it's an extreme corner-case heheh > [...] > Right, kmsg_bytes is the maximum size to save from the console on a > crash. The design of the ram backend was to handle really small amounts > of persistent RAM -- if a single crash would eat all of it and possibly > wrap around, it could write over useful parts at the end (since it's > written from the end to the front). However, I think somewhere along > the way, stricter logic was added to the ram backend: > > /* > * Explicitly only take the first part of any new crash. > * If our buffer is larger than kmsg_bytes, this can never happen, > * and if our buffer is smaller than kmsg_bytes, we don't want the > * report split across multiple records. > */ > if (record->part != 1) > return -ENOSPC; > > This limits it to just a single record. Indeed, and I already considered that in the past...why was that restricted to a single record, right? I had plans to change it, lemme know your thoughts. (Reference: https://lore.kernel.org/linux-fsdevel/a21201cf-1e5f-fed1-356d-42c83a66fa57@igalia.com/) > However, this does _not_ exist for other backends, so they will see up > to kmsg_bytes-size dumps split across psinfo->bufsize many records. For > the backends, this record size is not always fixed: > > - efi uses 1024, even though it allocates 4096 (as was pointed out earlier) > - zone uses kmsg_bytes > - acpi-erst uses some ACPI value from ACPI_ERST_GET_ERROR_LENGTH > - ppc-nvram uses the configured size of nvram partition > > Honestly, it seems like the 64k default is huge, but I don't think it > should be "unlimited" given the behaviors of ppc-nvram, and acpi-erst. > For ram and efi, it's effectively unlimited because of the small bufsizes > (and the "only 1 record" logic in ram). > > Existing documentation I can find online seem to imply making it smaller > (8000 bytes[1], 16000 bytes), but without justification. Even the "main" > documentation[2] doesn't mention it. Right! Also, on top of that, there is a kind of "tricky" logic in which this value is not always respected. For example, in the Steam Deck case we have a region of ~10MB, and set record size of the ramoops backend to 2MB. This is the amount collected, it doesn't respect kmsg_bytes, since it checks the amount dumped vs kmsg_bytes effectively _after_ the first part is written (which in the ramoops case, it's a single write), hence this check is never "respected" there. I don't consider that as a bug, more a flexibility for the ramoops case heh In any way, lemme know if you want to have a "revamp" in the meaning of kmsg_bytes, I'd be glad in discuss/work on that =) Thanks, Guilherme
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index c32957e4b256..be05090076ce 100644 --- a/fs/pstore/platform.c +++ b/fs/pstore/platform.c @@ -89,6 +89,11 @@ static char *compress = module_param(compress, charp, 0444); MODULE_PARM_DESC(compress, "compression to use"); +/* How much of the kernel log to snapshot */ +unsigned long kmsg_bytes = CONFIG_PSTORE_DEFAULT_KMSG_BYTES; +module_param(kmsg_bytes, ulong, 0444); +MODULE_PARM_DESC(kmsg_bytes, "amount of kernel log to snapshot (in bytes)"); + /* Compression parameters */ static struct crypto_comp *tfm; @@ -100,9 +105,6 @@ struct pstore_zbackend { static char *big_oops_buf; static size_t big_oops_buf_sz; -/* How much of the console log to snapshot */ -unsigned long kmsg_bytes = CONFIG_PSTORE_DEFAULT_KMSG_BYTES; - void pstore_set_kmsg_bytes(int bytes) { kmsg_bytes = bytes;
Currently this tuning is only exposed as a filesystem option, but most Linux distros automatically mount pstore, hence changing this setting requires remounting it. Also, if that mount option wasn't explicitly set it doesn't show up in mount information, so users cannot check what is the current value of kmsg_bytes. Let's then expose it as a module parameter, allowing both user visibility at all times (even if not manually set) and also the possibility of setting that as a boot/module parameter. Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> --- fs/pstore/platform.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)