Message ID | 20230501165450.15352-2-surenb@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Memory allocation profiling | expand |
On Mon, 01 May 2023, Suren Baghdasaryan wrote: >From: Kent Overstreet <kent.overstreet@linux.dev> > >Previously, string_get_size() outputted a space between the number and >the units, i.e. > 9.88 MiB > >This changes it to > 9.88MiB > >which allows it to be parsed correctly by the 'sort -h' command. Wouldn't this break users that already parse it the current way? Thanks, Davidlohr
On Mon, May 01, 2023 at 11:13:15AM -0700, Davidlohr Bueso wrote: > On Mon, 01 May 2023, Suren Baghdasaryan wrote: > > > From: Kent Overstreet <kent.overstreet@linux.dev> > > > > Previously, string_get_size() outputted a space between the number and > > the units, i.e. > > 9.88 MiB > > > > This changes it to > > 9.88MiB > > > > which allows it to be parsed correctly by the 'sort -h' command. > > Wouldn't this break users that already parse it the current way? It's not impossible - but it's not used in very many places and we wouldn't be printing in human-readable units if it was meant to be parsed - it's mainly used for debug output currently. If someone raises a specific objection we'll do something different, otherwise I think standardizing on what userspace tooling already parses is a good idea.
On Mon, May 1, 2023 at 10:36 PM Kent Overstreet <kent.overstreet@linux.dev> wrote: > > On Mon, May 01, 2023 at 11:13:15AM -0700, Davidlohr Bueso wrote: > > On Mon, 01 May 2023, Suren Baghdasaryan wrote: > > > > > From: Kent Overstreet <kent.overstreet@linux.dev> > > > > > > Previously, string_get_size() outputted a space between the number and > > > the units, i.e. > > > 9.88 MiB > > > > > > This changes it to > > > 9.88MiB > > > > > > which allows it to be parsed correctly by the 'sort -h' command. But why do we need that? What's the use case? > > Wouldn't this break users that already parse it the current way? > > It's not impossible - but it's not used in very many places and we > wouldn't be printing in human-readable units if it was meant to be > parsed - it's mainly used for debug output currently. > > If someone raises a specific objection we'll do something different, > otherwise I think standardizing on what userspace tooling already parses > is a good idea. Yes, I NAK this on the basis of https://english.stackexchange.com/a/2911/153144
On Mon, May 01, 2023 at 10:57:07PM +0300, Andy Shevchenko wrote: > On Mon, May 1, 2023 at 10:36 PM Kent Overstreet > <kent.overstreet@linux.dev> wrote: > > > > On Mon, May 01, 2023 at 11:13:15AM -0700, Davidlohr Bueso wrote: > > > On Mon, 01 May 2023, Suren Baghdasaryan wrote: > > > > > > > From: Kent Overstreet <kent.overstreet@linux.dev> > > > > > > > > Previously, string_get_size() outputted a space between the number and > > > > the units, i.e. > > > > 9.88 MiB > > > > > > > > This changes it to > > > > 9.88MiB > > > > > > > > which allows it to be parsed correctly by the 'sort -h' command. > > But why do we need that? What's the use case? As was in the commit message: to produce output that sort -h knows how to parse. > > > Wouldn't this break users that already parse it the current way? > > > > It's not impossible - but it's not used in very many places and we > > wouldn't be printing in human-readable units if it was meant to be > > parsed - it's mainly used for debug output currently. > > > > If someone raises a specific objection we'll do something different, > > otherwise I think standardizing on what userspace tooling already parses > > is a good idea. > > Yes, I NAK this on the basis of > https://english.stackexchange.com/a/2911/153144 Not sure I find a style guide on stackexchange more compelling than interop with a tool everyone already has installed :)
* Andy Shevchenko <andy.shevchenko@gmail.com> [230501 15:57]: > On Mon, May 1, 2023 at 10:36 PM Kent Overstreet > <kent.overstreet@linux.dev> wrote: > > > > On Mon, May 01, 2023 at 11:13:15AM -0700, Davidlohr Bueso wrote: > > > On Mon, 01 May 2023, Suren Baghdasaryan wrote: > > > > > > > From: Kent Overstreet <kent.overstreet@linux.dev> > > > > > > > > Previously, string_get_size() outputted a space between the number and > > > > the units, i.e. > > > > 9.88 MiB > > > > > > > > This changes it to > > > > 9.88MiB > > > > > > > > which allows it to be parsed correctly by the 'sort -h' command. > > But why do we need that? What's the use case? > > > > Wouldn't this break users that already parse it the current way? > > > > It's not impossible - but it's not used in very many places and we > > wouldn't be printing in human-readable units if it was meant to be > > parsed - it's mainly used for debug output currently. > > > > If someone raises a specific objection we'll do something different, > > otherwise I think standardizing on what userspace tooling already parses > > is a good idea. > > Yes, I NAK this on the basis of > https://english.stackexchange.com/a/2911/153144 This fixes the output to be better aligned with: the output of ls -sh the input expected by find -size Are there counter-examples of commands that follow the SI Brochure? Thanks, Liam
On Mon, May 01, 2023 at 05:33:49PM -0400, Liam R. Howlett wrote: > * Andy Shevchenko <andy.shevchenko@gmail.com> [230501 15:57]: > This fixes the output to be better aligned with: > the output of ls -sh > the input expected by find -size > > Are there counter-examples of commands that follow the SI Brochure? Even perf, which is included in the kernel tree, doesn't include the space - example perf top output: 0 bcachefs:move_extent_fail 0 bcachefs:move_extent_alloc_mem_fail 3 bcachefs:move_data 0 bcachefs:evacuate_bucket 0 bcachefs:copygc 2 bcachefs:copygc_wait 195K bcachefs:transaction_commit 0 bcachefs:trans_restart_injected (I'm also going to need to submit a patch that deletes or makes optional the B suffix, just because we're using human readable units doesn't mean it's bytes).
On Mon, May 01, 2023 at 10:57:07PM +0300, Andy Shevchenko wrote:
> But why do we need that? What's the use case?
It looks like we missed you on the initial CC, here's the use case:
https://lore.kernel.org/linux-fsdevel/ZFAsm0XTqC%2F%2Ff4FP@P9FQF9L96D/T/#mdda814a8c569e2214baa31320912b0ef83432fa9
On Mon, 2023-05-01 at 15:35 -0400, Kent Overstreet wrote: > On Mon, May 01, 2023 at 11:13:15AM -0700, Davidlohr Bueso wrote: > > On Mon, 01 May 2023, Suren Baghdasaryan wrote: > > > > > From: Kent Overstreet <kent.overstreet@linux.dev> > > > > > > Previously, string_get_size() outputted a space between the > > > number and the units, i.e. > > > 9.88 MiB > > > > > > This changes it to > > > 9.88MiB > > > > > > which allows it to be parsed correctly by the 'sort -h' command. > > > > Wouldn't this break users that already parse it the current way? > > It's not impossible - but it's not used in very many places and we > wouldn't be printing in human-readable units if it was meant to be > parsed - it's mainly used for debug output currently. It is not used just for debug. It's used all over the kernel for printing out device sizes. The output mostly goes to the kernel print buffer, so it's anyone's guess as to what, if any, tools are parsing it, but the concern about breaking log parsers seems to be a valid one. > If someone raises a specific objection we'll do something different, > otherwise I think standardizing on what userspace tooling already > parses is a good idea. If you want to omit the space, why not simply add your own variant? A string_get_size_nospace() which would use most of the body of this one as a helper function but give its own snprintf format string at the end. It's only a couple of lines longer as a patch and has the bonus that it definitely wouldn't break anything by altering an existing output. James
On Mon, May 01, 2023 at 10:22:18PM -0400, James Bottomley wrote: > It is not used just for debug. It's used all over the kernel for > printing out device sizes. The output mostly goes to the kernel print > buffer, so it's anyone's guess as to what, if any, tools are parsing > it, but the concern about breaking log parsers seems to be a valid one. Ok, there is sd_print_capacity() - but who in their right mind would be trying to scrape device sizes, in human readable units, from log messages when it's available in sysfs/procfs (actually, is it in sysfs? if not, that's an oversight) in more reasonable units? Correct me if I'm wrong, but I've yet to hear about kernel log messages being consider a stable interface, and this seems a bit out there. But, you did write the code :) > > If someone raises a specific objection we'll do something different, > > otherwise I think standardizing on what userspace tooling already > > parses is a good idea. > > If you want to omit the space, why not simply add your own variant? A > string_get_size_nospace() which would use most of the body of this one > as a helper function but give its own snprintf format string at the > end. It's only a couple of lines longer as a patch and has the bonus > that it definitely wouldn't break anything by altering an existing > output. I'm happy to do that - I just wanted to post this version first to see if we can avoid the fragmentation and do a bit of standardizing with how everything else seems to do that.
On Tue, May 2, 2023 at 6:18 AM Kent Overstreet <kent.overstreet@linux.dev> wrote: > On Mon, May 01, 2023 at 10:22:18PM -0400, James Bottomley wrote: ... > > > If someone raises a specific objection we'll do something different, > > > otherwise I think standardizing on what userspace tooling already > > > parses is a good idea. > > > > If you want to omit the space, why not simply add your own variant? A > > string_get_size_nospace() which would use most of the body of this one > > as a helper function but give its own snprintf format string at the > > end. It's only a couple of lines longer as a patch and has the bonus > > that it definitely wouldn't break anything by altering an existing > > output. > > I'm happy to do that - I just wanted to post this version first to see > if we can avoid the fragmentation and do a bit of standardizing with > how everything else seems to do that. Actually instead of producing zillions of variants, do a %p extension to the printf() and that's it. We have, for example, %pt with T and with space to follow users that want one or the other variant. Same can be done with string_get_size().
On Tue, May 02, 2023 at 08:33:57AM +0300, Andy Shevchenko wrote: > Actually instead of producing zillions of variants, do a %p extension > to the printf() and that's it. We have, for example, %pt with T and > with space to follow users that want one or the other variant. Same > can be done with string_get_size(). God no.
On Mon, 01 May 2023, Suren Baghdasaryan <surenb@google.com> wrote: > From: Kent Overstreet <kent.overstreet@linux.dev> > > Previously, string_get_size() outputted a space between the number and > the units, i.e. > 9.88 MiB > > This changes it to > 9.88MiB > > which allows it to be parsed correctly by the 'sort -h' command. The former is easier for humans to parse, and that should be preferred. 'sort -h' is supposed to compare "human readable numbers", so arguably sort does not do its job here. BR, Jani. > > Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> > Signed-off-by: Suren Baghdasaryan <surenb@google.com> > Cc: Andy Shevchenko <andy@kernel.org> > Cc: Michael Ellerman <mpe@ellerman.id.au> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Cc: Paul Mackerras <paulus@samba.org> > Cc: "Michael S. Tsirkin" <mst@redhat.com> > Cc: Jason Wang <jasowang@redhat.com> > Cc: "Noralf Trønnes" <noralf@tronnes.org> > Cc: Jens Axboe <axboe@kernel.dk> > --- > lib/string_helpers.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/lib/string_helpers.c b/lib/string_helpers.c > index 230020a2e076..593b29fece32 100644 > --- a/lib/string_helpers.c > +++ b/lib/string_helpers.c > @@ -126,8 +126,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units, > else > unit = units_str[units][i]; > > - snprintf(buf, len, "%u%s %s", (u32)size, > - tmp, unit); > + snprintf(buf, len, "%u%s%s", (u32)size, tmp, unit); > } > EXPORT_SYMBOL(string_get_size);
On Mon, 2023-05-01 at 23:17 -0400, Kent Overstreet wrote: > On Mon, May 01, 2023 at 10:22:18PM -0400, James Bottomley wrote: > > It is not used just for debug. It's used all over the kernel for > > printing out device sizes. The output mostly goes to the kernel > > print buffer, so it's anyone's guess as to what, if any, tools are > > parsing it, but the concern about breaking log parsers seems to be > > a valid one. > > Ok, there is sd_print_capacity() - but who in their right mind would > be trying to scrape device sizes, in human readable units, If you bother to google "kernel log parser", you'll discover it's quite an active area which supports a load of company business models. > from log messages when it's available in sysfs/procfs (actually, is > it in sysfs? if not, that's an oversight) in more reasonable units? It's not in sysfs, no. As aren't a lot of things, which is why log parsing for system monitoring is big business. > Correct me if I'm wrong, but I've yet to hear about kernel log > messages being consider a stable interface, and this seems a bit out > there. It might not be listed as stable, but when it's known there's a large ecosystem out there consuming it we shouldn't break it just because you feel like it. You should have a good reason and the break should be unavoidable. I wanted my output in a particular form so I thought I'd change everyone else's output as well isn't a good reason and it only costs a couple of lines to avoid. > But, you did write the code :) > > > > If someone raises a specific objection we'll do something > > > different, otherwise I think standardizing on what userspace > > > tooling already parses is a good idea. > > > > If you want to omit the space, why not simply add your own > > variant? A string_get_size_nospace() which would use most of the > > body of this one as a helper function but give its own snprintf > > format string at the end. It's only a couple of lines longer as a > > patch and has the bonus that it definitely wouldn't break anything > > by altering an existing output. > > I'm happy to do that - I just wanted to post this version first to > see if we can avoid the fragmentation and do a bit of standardizing > with how everything else seems to do that. What fragmentation? To do this properly you move the whole of the current function to a helper which takes a format sting, say with a double underscore prefix, then the existing function and what you want become one line additions calling the helper with their specific format string. There's no fragmentation of the base function at all. James
On Tue, May 2, 2023 at 9:22 AM Kent Overstreet <kent.overstreet@linux.dev> wrote: > On Tue, May 02, 2023 at 08:33:57AM +0300, Andy Shevchenko wrote: > > Actually instead of producing zillions of variants, do a %p extension > > to the printf() and that's it. We have, for example, %pt with T and > > with space to follow users that want one or the other variant. Same > > can be done with string_get_size(). > > God no. Any elaboration what's wrong with that? God no for zillion APIs for almost the same. Today you want space, tomorrow some other (special) delimiter.
On Tue, May 02, 2023 at 07:42:59AM -0400, James Bottomley wrote: > On Mon, 2023-05-01 at 23:17 -0400, Kent Overstreet wrote: > > On Mon, May 01, 2023 at 10:22:18PM -0400, James Bottomley wrote: > > > It is not used just for debug. It's used all over the kernel for > > > printing out device sizes. The output mostly goes to the kernel > > > print buffer, so it's anyone's guess as to what, if any, tools are > > > parsing it, but the concern about breaking log parsers seems to be > > > a valid one. > > > > Ok, there is sd_print_capacity() - but who in their right mind would > > be trying to scrape device sizes, in human readable units, > > If you bother to google "kernel log parser", you'll discover it's quite > an active area which supports a load of company business models. That doesn't mean log messages are unchangable ABI. Indeed, we had the whole "printk_index_emit()" addition recently to create an external index of printk message formats for such applications to use. [*] > > from log messages when it's available in sysfs/procfs (actually, is > > it in sysfs? if not, that's an oversight) in more reasonable units? > > It's not in sysfs, no. As aren't a lot of things, which is why log > parsing for system monitoring is big business. And that big business is why printk_index_emit() exists to allow them to easily determine how log messages change format and come and go across different kernel versions. > > Correct me if I'm wrong, but I've yet to hear about kernel log > > messages being consider a stable interface, and this seems a bit out > > there. > > It might not be listed as stable, but when it's known there's a large > ecosystem out there consuming it we shouldn't break it just because you > feel like it. But we've solved this problem already, yes? If the userspace applications are not using the kernel printk format index to detect such changes between kernel version, then they should be. This makes trivial issues like whether we have a space or not between units is completely irrelevant because the entry in the printk format index for the log output we emit will match whatever is output by the kernel.... Cheers, Dave. [*] commit 337015573718b161891a3473d25f59273f2e626b Author: Chris Down <chris@chrisdown.name> Date: Tue Jun 15 17:52:53 2021 +0100 printk: Userspace format indexing support We have a number of systems industry-wide that have a subset of their functionality that works as follows: 1. Receive a message from local kmsg, serial console, or netconsole; 2. Apply a set of rules to classify the message; 3. Do something based on this classification (like scheduling a remediation for the machine), rinse, and repeat. As a couple of examples of places we have this implemented just inside Facebook, although this isn't a Facebook-specific problem, we have this inside our netconsole processing (for alarm classification), and as part of our machine health checking. We use these messages to determine fairly important metrics around production health, and it's important that we get them right. While for some kinds of issues we have counters, tracepoints, or metrics with a stable interface which can reliably indicate the issue, in order to react to production issues quickly we need to work with the interface which most kernel developers naturally use when developing: printk. Most production issues come from unexpected phenomena, and as such usually the code in question doesn't have easily usable tracepoints or other counters available for the specific problem being mitigated. We have a number of lines of monitoring defence against problems in production (host metrics, process metrics, service metrics, etc), and where it's not feasible to reliably monitor at another level, this kind of pragmatic netconsole monitoring is essential. As one would expect, monitoring using printk is rather brittle for a number of reasons -- most notably that the message might disappear entirely in a new version of the kernel, or that the message may change in some way that the regex or other classification methods start to silently fail. One factor that makes this even harder is that, under normal operation, many of these messages are never expected to be hit. For example, there may be a rare hardware bug which one wants to detect if it was to ever happen again, but its recurrence is not likely or anticipated. This precludes using something like checking whether the printk in question was printed somewhere fleetwide recently to determine whether the message in question is still present or not, since we don't anticipate that it should be printed anywhere, but still need to monitor for its future presence in the long-term. This class of issue has happened on a number of occasions, causing unhealthy machines with hardware issues to remain in production for longer than ideal. As a recent example, some monitoring around blk_update_request fell out of date and caused semi-broken machines to remain in production for longer than would be desirable. Searching through the codebase to find the message is also extremely fragile, because many of the messages are further constructed beyond their callsite (eg. btrfs_printk and other module-specific wrappers, each with their own functionality). Even if they aren't, guessing the format and formulation of the underlying message based on the aesthetics of the message emitted is not a recipe for success at scale, and our previous issues with fleetwide machine health checking demonstrate as much. This provides a solution to the issue of silently changed or deleted printks: we record pointers to all printk format strings known at compile time into a new .printk_index section, both in vmlinux and modules. At runtime, this can then be iterated by looking at <debugfs>/printk/index/<module>, which emits the following format, both readable by humans and able to be parsed by machines: $ head -1 vmlinux; shuf -n 5 vmlinux # <level[,flags]> filename:line function "format" <5> block/blk-settings.c:661 disk_stack_limits "%s: Warning: Device %s is misaligned\n" <4> kernel/trace/trace.c:8296 trace_create_file "Could not create tracefs '%s' entry\n" <6> arch/x86/kernel/hpet.c:144 _hpet_print_config "hpet: %s(%d):\n" <6> init/do_mounts.c:605 prepare_namespace "Waiting for root device %s...\n" <6> drivers/acpi/osl.c:1410 acpi_no_auto_serialize_setup "ACPI: auto-serialization disabled\n" This mitigates the majority of cases where we have a highly-specific printk which we want to match on, as we can now enumerate and check whether the format changed or the printk callsite disappeared entirely in userspace. This allows us to catch changes to printks we monitor earlier and decide what to do about it before it becomes problematic. There is no additional runtime cost for printk callers or printk itself, and the assembly generated is exactly the same. Signed-off-by: Chris Down <chris@chrisdown.name> Cc: Petr Mladek <pmladek@suse.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Jessica Yu <jeyu@kernel.org> # for module.{c,h} Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/e42070983637ac5e384f17fbdbe86d19c7b212a5.1623775748.git.chris@chrisdown.name
On Tue, May 02, 2023 at 06:19:27PM +0300, Andy Shevchenko wrote: > On Tue, May 2, 2023 at 9:22 AM Kent Overstreet > <kent.overstreet@linux.dev> wrote: > > On Tue, May 02, 2023 at 08:33:57AM +0300, Andy Shevchenko wrote: > > > Actually instead of producing zillions of variants, do a %p extension > > > to the printf() and that's it. We have, for example, %pt with T and > > > with space to follow users that want one or the other variant. Same > > > can be done with string_get_size(). > > > > God no. > > Any elaboration what's wrong with that? I'm really not a fan of %p extensions in general (they are what people reach for because we can't standardize on a common string output API), but when we'd be passing it bare integers the lack of type safety would be a particularly big footgun. > God no for zillion APIs for almost the same. Today you want space, > tomorrow some other (special) delimiter. No, I just want to delete the space and output numbers the same way everyone else does. And if we are stuck with two string_get_size() functions, %p extensions in no way improve the situation.
On Wed, May 3, 2023 at 5:07 AM Kent Overstreet <kent.overstreet@linux.dev> wrote: > On Tue, May 02, 2023 at 06:19:27PM +0300, Andy Shevchenko wrote: > > On Tue, May 2, 2023 at 9:22 AM Kent Overstreet > > <kent.overstreet@linux.dev> wrote: > > > On Tue, May 02, 2023 at 08:33:57AM +0300, Andy Shevchenko wrote: > > > > Actually instead of producing zillions of variants, do a %p extension > > > > to the printf() and that's it. We have, for example, %pt with T and > > > > with space to follow users that want one or the other variant. Same > > > > can be done with string_get_size(). > > > > > > God no. > > > > Any elaboration what's wrong with that? > > I'm really not a fan of %p extensions in general (they are what people > reach for because we can't standardize on a common string output API), The whole story behind, for example, %pt is to _standardize_ the output of the same stanza in the kernel. > but when we'd be passing it bare integers the lack of type safety would > be a particularly big footgun. There is no difference to any other place in the kernel where we can shoot into our foot. > > God no for zillion APIs for almost the same. Today you want space, > > tomorrow some other (special) delimiter. > > No, I just want to delete the space and output numbers the same way > everyone else does. And if we are stuck with two string_get_size() > functions, %p extensions in no way improve the situation. I think it's exactly for the opposite, i.e. standardize that output once and for all.
On Wed, May 03, 2023 at 09:30:11AM +0300, Andy Shevchenko wrote: > On Wed, May 3, 2023 at 5:07 AM Kent Overstreet > <kent.overstreet@linux.dev> wrote: > > On Tue, May 02, 2023 at 06:19:27PM +0300, Andy Shevchenko wrote: > > > On Tue, May 2, 2023 at 9:22 AM Kent Overstreet > > > <kent.overstreet@linux.dev> wrote: > > > > On Tue, May 02, 2023 at 08:33:57AM +0300, Andy Shevchenko wrote: > > > > > Actually instead of producing zillions of variants, do a %p extension > > > > > to the printf() and that's it. We have, for example, %pt with T and > > > > > with space to follow users that want one or the other variant. Same > > > > > can be done with string_get_size(). > > > > > > > > God no. > > > > > > Any elaboration what's wrong with that? > > > > I'm really not a fan of %p extensions in general (they are what people > > reach for because we can't standardize on a common string output API), > > The whole story behind, for example, %pt is to _standardize_ the > output of the same stanza in the kernel. Wtf does this have to do with the rest of the discussion? The %p thing seems like a total non sequitar and a distraction. I'm not getting involved with that. All I'm interested in is fixing the memory allocation profiling output to make it more usable. > > but when we'd be passing it bare integers the lack of type safety would > > be a particularly big footgun. > > There is no difference to any other place in the kernel where we can > shoot into our foot. Yeah, no, absolutely not. Passing different size integers to string_get_size() is fine; passing pointers to different size integers to a %p extension will explode and the compiler won't be able to warn. > > > > God no for zillion APIs for almost the same. Today you want space, > > > tomorrow some other (special) delimiter. > > > > No, I just want to delete the space and output numbers the same way > > everyone else does. And if we are stuck with two string_get_size() > > functions, %p extensions in no way improve the situation. > > I think it's exactly for the opposite, i.e. standardize that output > once and for all. So, are you dropping your NACK then, so we can standardize the kernel on the way everything else does it?
On Wed, May 3, 2023 at 10:13 AM Kent Overstreet <kent.overstreet@linux.dev> wrote: > On Wed, May 03, 2023 at 09:30:11AM +0300, Andy Shevchenko wrote: > > On Wed, May 3, 2023 at 5:07 AM Kent Overstreet > > <kent.overstreet@linux.dev> wrote: > > > On Tue, May 02, 2023 at 06:19:27PM +0300, Andy Shevchenko wrote: > > > > On Tue, May 2, 2023 at 9:22 AM Kent Overstreet > > > > <kent.overstreet@linux.dev> wrote: > > > > > On Tue, May 02, 2023 at 08:33:57AM +0300, Andy Shevchenko wrote: > > > > > > Actually instead of producing zillions of variants, do a %p extension > > > > > > to the printf() and that's it. We have, for example, %pt with T and > > > > > > with space to follow users that want one or the other variant. Same > > > > > > can be done with string_get_size(). > > > > > > > > > > God no. > > > > > > > > Any elaboration what's wrong with that? > > > > > > I'm really not a fan of %p extensions in general (they are what people > > > reach for because we can't standardize on a common string output API), > > > > The whole story behind, for example, %pt is to _standardize_ the > > output of the same stanza in the kernel. > > Wtf does this have to do with the rest of the discussion? The %p thing > seems like a total non sequitar and a distraction. > > I'm not getting involved with that. All I'm interested in is fixing the > memory allocation profiling output to make it more usable. > > > > but when we'd be passing it bare integers the lack of type safety would > > > be a particularly big footgun. > > > > There is no difference to any other place in the kernel where we can > > shoot into our foot. > > Yeah, no, absolutely not. Passing different size integers to > string_get_size() is fine; passing pointers to different size integers > to a %p extension will explode and the compiler won't be able to warn. This is another topic. Yes, there is a discussion to have a compiler plugin to check this. > > > > God no for zillion APIs for almost the same. Today you want space, > > > > tomorrow some other (special) delimiter. > > > > > > No, I just want to delete the space and output numbers the same way > > > everyone else does. And if we are stuck with two string_get_size() > > > functions, %p extensions in no way improve the situation. > > > > I think it's exactly for the opposite, i.e. standardize that output > > once and for all. > > So, are you dropping your NACK then, so we can standardize the kernel on > the way everything else does it? No, you are breaking existing users. The NAK stays. The whole discussion after that is to make the way on how users can utilize your format and existing format without multiplying APIs.
On Wed, May 03, 2023 at 12:12:12PM +0300, Andy Shevchenko wrote: > > So, are you dropping your NACK then, so we can standardize the kernel on > > the way everything else does it? > > No, you are breaking existing users. The NAK stays. > The whole discussion after that is to make the way on how users can > utilize your format and existing format without multiplying APIs. Dave seems to think we shouldn't be, and I'm in agreement.
On 5/3/23 00:50, Dave Chinner wrote: > On Tue, May 02, 2023 at 07:42:59AM -0400, James Bottomley wrote: >> On Mon, 2023-05-01 at 23:17 -0400, Kent Overstreet wrote: >> > On Mon, May 01, 2023 at 10:22:18PM -0400, James Bottomley wrote: >> > > It is not used just for debug. It's used all over the kernel for >> > > printing out device sizes. The output mostly goes to the kernel >> > > print buffer, so it's anyone's guess as to what, if any, tools are >> > > parsing it, but the concern about breaking log parsers seems to be >> > > a valid one. >> > >> > Ok, there is sd_print_capacity() - but who in their right mind would >> > be trying to scrape device sizes, in human readable units, >> >> If you bother to google "kernel log parser", you'll discover it's quite >> an active area which supports a load of company business models. > > That doesn't mean log messages are unchangable ABI. Indeed, we had > the whole "printk_index_emit()" addition recently to create > an external index of printk message formats for such applications to > use. [*] > >> > from log messages when it's available in sysfs/procfs (actually, is >> > it in sysfs? if not, that's an oversight) in more reasonable units? >> >> It's not in sysfs, no. As aren't a lot of things, which is why log >> parsing for system monitoring is big business. > > And that big business is why printk_index_emit() exists to allow > them to easily determine how log messages change format and come and > go across different kernel versions. > >> > Correct me if I'm wrong, but I've yet to hear about kernel log >> > messages being consider a stable interface, and this seems a bit out >> > there. >> >> It might not be listed as stable, but when it's known there's a large >> ecosystem out there consuming it we shouldn't break it just because you >> feel like it. > > But we've solved this problem already, yes? > > If the userspace applications are not using the kernel printk format > index to detect such changes between kernel version, then they > should be. This makes trivial issues like whether we have a space or > not between units is completely irrelevant because the entry in the > printk format index for the log output we emit will match whatever > is output by the kernel.... If I understand that correctly from the commit changelog, this would have indeed helped, but if the change was reflected in format string. But with string_get_size() it's always an %s and the change of the helper's or a switch to another variant of the helper that would omit the space, wouldn't be reflected in the format string at all? I guess that would be an argument for Andy's suggestion for adding a new %pt / %pT which would then be reflected in the format string. And also more concise to use than using the helper, fwiw. > Cheers, > > Dave. > > [*] > commit 337015573718b161891a3473d25f59273f2e626b > Author: Chris Down <chris@chrisdown.name> > Date: Tue Jun 15 17:52:53 2021 +0100 > > printk: Userspace format indexing support > > We have a number of systems industry-wide that have a subset of their > functionality that works as follows: > > 1. Receive a message from local kmsg, serial console, or netconsole; > 2. Apply a set of rules to classify the message; > 3. Do something based on this classification (like scheduling a > remediation for the machine), rinse, and repeat. > > As a couple of examples of places we have this implemented just inside > Facebook, although this isn't a Facebook-specific problem, we have this > inside our netconsole processing (for alarm classification), and as part > of our machine health checking. We use these messages to determine > fairly important metrics around production health, and it's important > that we get them right. > > While for some kinds of issues we have counters, tracepoints, or metrics > with a stable interface which can reliably indicate the issue, in order > to react to production issues quickly we need to work with the interface > which most kernel developers naturally use when developing: printk. > > Most production issues come from unexpected phenomena, and as such > usually the code in question doesn't have easily usable tracepoints or > other counters available for the specific problem being mitigated. We > have a number of lines of monitoring defence against problems in > production (host metrics, process metrics, service metrics, etc), and > where it's not feasible to reliably monitor at another level, this kind > of pragmatic netconsole monitoring is essential. > > As one would expect, monitoring using printk is rather brittle for a > number of reasons -- most notably that the message might disappear > entirely in a new version of the kernel, or that the message may change > in some way that the regex or other classification methods start to > silently fail. > > One factor that makes this even harder is that, under normal operation, > many of these messages are never expected to be hit. For example, there > may be a rare hardware bug which one wants to detect if it was to ever > happen again, but its recurrence is not likely or anticipated. This > precludes using something like checking whether the printk in question > was printed somewhere fleetwide recently to determine whether the > message in question is still present or not, since we don't anticipate > that it should be printed anywhere, but still need to monitor for its > future presence in the long-term. > > This class of issue has happened on a number of occasions, causing > unhealthy machines with hardware issues to remain in production for > longer than ideal. As a recent example, some monitoring around > blk_update_request fell out of date and caused semi-broken machines to > remain in production for longer than would be desirable. > > Searching through the codebase to find the message is also extremely > fragile, because many of the messages are further constructed beyond > their callsite (eg. btrfs_printk and other module-specific wrappers, > each with their own functionality). Even if they aren't, guessing the > format and formulation of the underlying message based on the aesthetics > of the message emitted is not a recipe for success at scale, and our > previous issues with fleetwide machine health checking demonstrate as > much. > > This provides a solution to the issue of silently changed or deleted > printks: we record pointers to all printk format strings known at > compile time into a new .printk_index section, both in vmlinux and > modules. At runtime, this can then be iterated by looking at > <debugfs>/printk/index/<module>, which emits the following format, both > readable by humans and able to be parsed by machines: > > $ head -1 vmlinux; shuf -n 5 vmlinux > # <level[,flags]> filename:line function "format" > <5> block/blk-settings.c:661 disk_stack_limits "%s: Warning: Device %s is misaligned\n" > <4> kernel/trace/trace.c:8296 trace_create_file "Could not create tracefs '%s' entry\n" > <6> arch/x86/kernel/hpet.c:144 _hpet_print_config "hpet: %s(%d):\n" > <6> init/do_mounts.c:605 prepare_namespace "Waiting for root device %s...\n" > <6> drivers/acpi/osl.c:1410 acpi_no_auto_serialize_setup "ACPI: auto-serialization disabled\n" > > This mitigates the majority of cases where we have a highly-specific > printk which we want to match on, as we can now enumerate and check > whether the format changed or the printk callsite disappeared entirely > in userspace. This allows us to catch changes to printks we monitor > earlier and decide what to do about it before it becomes problematic. > > There is no additional runtime cost for printk callers or printk itself, > and the assembly generated is exactly the same. > > Signed-off-by: Chris Down <chris@chrisdown.name> > Cc: Petr Mladek <pmladek@suse.com> > Cc: Jessica Yu <jeyu@kernel.org> > Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> > Cc: John Ogness <john.ogness@linutronix.de> > Cc: Steven Rostedt <rostedt@goodmis.org> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Kees Cook <keescook@chromium.org> > Reviewed-by: Petr Mladek <pmladek@suse.com> > Tested-by: Petr Mladek <pmladek@suse.com> > Reported-by: kernel test robot <lkp@intel.com> > Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> > Acked-by: Jessica Yu <jeyu@kernel.org> # for module.{c,h} > Signed-off-by: Petr Mladek <pmladek@suse.com> > Link: https://lore.kernel.org/r/e42070983637ac5e384f17fbdbe86d19c7b212a5.1623775748.git.chris@chrisdown.name >
On Wed, May 3, 2023 at 12:28 PM Vlastimil Babka <vbabka@suse.cz> wrote: > > On 5/3/23 00:50, Dave Chinner wrote: > > On Tue, May 02, 2023 at 07:42:59AM -0400, James Bottomley wrote: > >> On Mon, 2023-05-01 at 23:17 -0400, Kent Overstreet wrote: > >> > On Mon, May 01, 2023 at 10:22:18PM -0400, James Bottomley wrote: > >> > > It is not used just for debug. It's used all over the kernel for > >> > > printing out device sizes. The output mostly goes to the kernel > >> > > print buffer, so it's anyone's guess as to what, if any, tools are > >> > > parsing it, but the concern about breaking log parsers seems to be > >> > > a valid one. > >> > > >> > Ok, there is sd_print_capacity() - but who in their right mind would > >> > be trying to scrape device sizes, in human readable units, > >> > >> If you bother to google "kernel log parser", you'll discover it's quite > >> an active area which supports a load of company business models. > > > > That doesn't mean log messages are unchangable ABI. Indeed, we had > > the whole "printk_index_emit()" addition recently to create > > an external index of printk message formats for such applications to > > use. [*] > > > >> > from log messages when it's available in sysfs/procfs (actually, is > >> > it in sysfs? if not, that's an oversight) in more reasonable units? > >> > >> It's not in sysfs, no. As aren't a lot of things, which is why log > >> parsing for system monitoring is big business. > > > > And that big business is why printk_index_emit() exists to allow > > them to easily determine how log messages change format and come and > > go across different kernel versions. > > > >> > Correct me if I'm wrong, but I've yet to hear about kernel log > >> > messages being consider a stable interface, and this seems a bit out > >> > there. > >> > >> It might not be listed as stable, but when it's known there's a large > >> ecosystem out there consuming it we shouldn't break it just because you > >> feel like it. > > > > But we've solved this problem already, yes? > > > > If the userspace applications are not using the kernel printk format > > index to detect such changes between kernel version, then they > > should be. This makes trivial issues like whether we have a space or > > not between units is completely irrelevant because the entry in the > > printk format index for the log output we emit will match whatever > > is output by the kernel.... > > If I understand that correctly from the commit changelog, this would have > indeed helped, but if the change was reflected in format string. But with > string_get_size() it's always an %s and the change of the helper's or a > switch to another variant of the helper that would omit the space, wouldn't > be reflected in the format string at all? I guess that would be an argument > for Andy's suggestion for adding a new %pt / %pT which would then be (Note, there is no respective %p extension for string_get_size() yet. %pt is for time and was used as an example when its evolution included a change like this) > reflected in the format string. And also more concise to use than using the > helper, fwiw.
On Wed, 2023-05-03 at 08:50 +1000, Dave Chinner wrote: > On Tue, May 02, 2023 at 07:42:59AM -0400, James Bottomley wrote: > > On Mon, 2023-05-01 at 23:17 -0400, Kent Overstreet wrote: > > > On Mon, May 01, 2023 at 10:22:18PM -0400, James Bottomley wrote: > > > > It is not used just for debug. It's used all over the kernel > > > > for printing out device sizes. The output mostly goes to the > > > > kernel print buffer, so it's anyone's guess as to what, if any, > > > > tools are parsing it, but the concern about breaking log > > > > parsers seems to be a valid one. > > > > > > Ok, there is sd_print_capacity() - but who in their right mind > > > would be trying to scrape device sizes, in human readable units, > > > > If you bother to google "kernel log parser", you'll discover it's > > quite an active area which supports a load of company business > > models. > > That doesn't mean log messages are unchangable ABI. Indeed, we had > the whole "printk_index_emit()" addition recently to create > an external index of printk message formats for such applications to > use. [*] I didn't say they were. > > > > from log messages when it's available in sysfs/procfs (actually, > > > is it in sysfs? if not, that's an oversight) in more reasonable > > > units? > > > > It's not in sysfs, no. As aren't a lot of things, which is why log > > parsing for system monitoring is big business. > > And that big business is why printk_index_emit() exists to allow > them to easily determine how log messages change format and come and > go across different kernel versions. > > > > Correct me if I'm wrong, but I've yet to hear about kernel log > > > messages being consider a stable interface, and this seems a bit > > > out there. > > > > It might not be listed as stable, but when it's known there's a > > large ecosystem out there consuming it we shouldn't break it just > > because you feel like it. > > But we've solved this problem already, yes? Well, yes; since it's a simple bit of extra thought and a couple of lines addition not to afflict everyone with the change, that's the simplest course. It also gets us out of arguing about whether the space reads better and is SI consistent. > If the userspace applications are not using the kernel printk format > index to detect such changes between kernel version, then they > should be. This makes trivial issues like whether we have a space or > not between units is completely irrelevant because the entry in the > printk format index for the log output we emit will match whatever > is output by the kernel.... Just because we have better tools to fix a problem when it happens doesn't mean we should actively cause the problem when its easily avoidable. In the same way we shouldn't drive less carefully just because cars are built safer today. James
diff --git a/lib/string_helpers.c b/lib/string_helpers.c index 230020a2e076..593b29fece32 100644 --- a/lib/string_helpers.c +++ b/lib/string_helpers.c @@ -126,8 +126,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units, else unit = units_str[units][i]; - snprintf(buf, len, "%u%s %s", (u32)size, - tmp, unit); + snprintf(buf, len, "%u%s%s", (u32)size, tmp, unit); } EXPORT_SYMBOL(string_get_size);