Message ID | 20220318153432.3984b871@gandalf.local.home (mailing list archive) |
---|---|
State | New, archived |
Series | tracing: Have type enum modifications copy the strings |

On Fri, 18 Mar 2022 19:34:32 +0000,
Steven Rostedt <rostedt@goodmis.org> wrote:
>
> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
>
> When an enum is used in the visible parts of a trace event that is
> exported to user space, the user space applications like perf and
> trace-cmd do not have a way to know what the value of the enum is. To
> solve this, at boot up (or module load) the printk formats are modified to
> replace the enum with their numeric value in the string output.
>
> Array fields of the event are defined by [<nr-elements>] in the type
> portion of the format file so that the user space parsers can correctly
> parse the array into the appropriate size chunks. But in some trace
> events, an enum is used in defining the size of the array, which once
> again breaks the parsing of user space tooling.
>
> This was solved the same way as the print formats were, but it modified
> the type strings of the trace event. This caused crashes in some
> architectures because, as supposed to the print string, is a const string
> value. This was not detected on x86, as it appears that const strings are
> still writable (at least in boot up), but other architectures this is not
> the case, and writing to a const string will cause a kernel fault.
>
> To fix this, use kstrdup() to copy the type before modifying it. If the
> trace event is for the core kernel there's no need to free it because the
> string will be in use for the life of the machine being on line. For
> modules, create a link list to store all the strings being allocated for
> modules and when the module is removed, free them.
>
> Link: https://lore.kernel.org/all/yt9dr1706b4i.fsf@linux.ibm.com/
>
> Fixes: b3bc8547d3be ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well")
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

This fixes booting on arm64 with ext4 as a module, so FWIW:

Tested-by: Marc Zyngier <maz@kernel.org>

	M.

Hi Steve,

Steven Rostedt <rostedt@goodmis.org> writes:

> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
>
> When an enum is used in the visible parts of a trace event that is
> exported to user space, the user space applications like perf and
> trace-cmd do not have a way to know what the value of the enum is. To
> solve this, at boot up (or module load) the printk formats are modified to
> replace the enum with their numeric value in the string output.
>
> Array fields of the event are defined by [<nr-elements>] in the type
> portion of the format file so that the user space parsers can correctly
> parse the array into the appropriate size chunks. But in some trace
> events, an enum is used in defining the size of the array, which once
> again breaks the parsing of user space tooling.
>
> This was solved the same way as the print formats were, but it modified
> the type strings of the trace event. This caused crashes in some
> architectures because, as supposed to the print string, is a const string
> value. This was not detected on x86, as it appears that const strings are
> still writable (at least in boot up), but other architectures this is not
> the case, and writing to a const string will cause a kernel fault.
>
> To fix this, use kstrdup() to copy the type before modifying it. If the
> trace event is for the core kernel there's no need to free it because the
> string will be in use for the life of the machine being on line. For
> modules, create a link list to store all the strings being allocated for
> modules and when the module is removed, free them.
>
> Link: https://lore.kernel.org/all/yt9dr1706b4i.fsf@linux.ibm.com/
>
> Fixes: b3bc8547d3be ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well")
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

This fixes the crash seen on s390. Thanks!

Tested-by: Sven Schnelle <svens@linux.ibm.com>
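
The copy-before-modify idea in the commit message can be illustrated outside the kernel. The following is a minimal user-space sketch, not the kernel implementation: `copy_with_enum_replaced()` is a hypothetical helper, it uses `malloc()` where the patch uses `kstrdup()`, and it builds the new string directly instead of editing a duplicate in place as `eval_replace()` does. It also assumes the enum name is the entire array size, which is stricter than the kernel check. The point it shows is that the original (possibly read-only) type string is never written to.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space helper (not kernel code): return a newly
 * allocated copy of `type` in which an array size written as the enum
 * name `name` is replaced by its numeric `value`. The input string is
 * only read, so it may live in read-only memory. Returns NULL if the
 * enum name is not the array size or if allocation fails.
 */
static char *copy_with_enum_replaced(const char *type, const char *name,
				     long value)
{
	const char *start;
	size_t name_len, prefix_len, size;
	char num[32];
	char *out;

	assert(type != NULL && name != NULL);

	start = strchr(type, '[');
	if (!start)
		return NULL;
	start++;			/* step past '[' */

	/* Assumption: the enum name is followed directly by ']' */
	name_len = strlen(name);
	if (strncmp(start, name, name_len) != 0 || start[name_len] != ']')
		return NULL;

	snprintf(num, sizeof(num), "%ld", value);

	/* prefix (up to and including '[') + number + remainder + NUL */
	prefix_len = (size_t)(start - type);
	size = prefix_len + strlen(num) + strlen(start + name_len) + 1;

	out = malloc(size);
	if (!out)
		return NULL;

	memcpy(out, type, prefix_len);
	snprintf(out + prefix_len, size - prefix_len, "%s%s",
		 num, start + name_len);
	return out;
}
```

With this sketch, a call such as `copy_with_enum_replaced("char[MAX_LEN]", "MAX_LEN", 16)` yields a heap string the caller owns and must free, while the string literal passed in stays untouched. `MAX_LEN` is an illustrative name only.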

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index ae9a3b8481f5..0d91152172c9 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -40,6 +40,14 @@ static LIST_HEAD(ftrace_generic_fields);
 static LIST_HEAD(ftrace_common_fields);
 static bool eventdir_initialized;
 
+static LIST_HEAD(module_strings);
+
+struct module_string {
+	struct list_head	next;
+	struct module		*module;
+	char			*str;
+};
+
 #define GFP_TRACE (GFP_KERNEL | __GFP_ZERO)
 
 static struct kmem_cache *field_cachep;
@@ -2633,14 +2641,40 @@ static void update_event_printk(struct trace_event_call *call,
 	}
 }
 
+static void add_str_to_module(struct module *module, char *str)
+{
+	struct module_string *modstr;
+
+	modstr = kmalloc(sizeof(*modstr), GFP_KERNEL);
+
+	/*
+	 * If we failed to allocate memory here, then we'll just
+	 * let the str memory leak when the module is removed.
+	 * If this fails to allocate, there's worse problems than
+	 * a leaked string on module removal.
+	 */
+	if (WARN_ON_ONCE(!modstr))
+		return;
+
+	modstr->module = module;
+	modstr->str = str;
+
+	list_add(&modstr->next, &module_strings);
+}
+
 static void update_event_fields(struct trace_event_call *call,
 				struct trace_eval_map *map)
 {
 	struct ftrace_event_field *field;
 	struct list_head *head;
 	char *ptr;
+	char *str;
 	int len = strlen(map->eval_string);
 
+	/* Dynamic events should never have field maps */
+	if (WARN_ON_ONCE(call->flags & TRACE_EVENT_FL_DYNAMIC))
+		return;
+
 	head = trace_get_fields(call);
 	list_for_each_entry(field, head, link) {
 		ptr = strchr(field->type, '[');
@@ -2654,9 +2688,26 @@ static void update_event_fields(struct trace_event_call *call,
 		if (strncmp(map->eval_string, ptr, len) != 0)
 			continue;
 
+		str = kstrdup(field->type, GFP_KERNEL);
+		if (WARN_ON_ONCE(!str))
+			return;
+		ptr = str + (ptr - field->type);
 		ptr = eval_replace(ptr, map, len);
 		/* enum/sizeof string smaller than value */
-		WARN_ON_ONCE(!ptr);
+		if (WARN_ON_ONCE(!ptr)) {
+			kfree(str);
+			continue;
+		}
+
+		/*
+		 * If the event is part of a module, then we need to free the string
+		 * when the module is removed. Otherwise, it will stay allocated
+		 * until a reboot.
+		 */
+		if (call->module)
+			add_str_to_module(call->module, str);
+
+		field->type = str;
 	}
 }
 
@@ -2883,6 +2934,7 @@ static void trace_module_add_events(struct module *mod)
 static void trace_module_remove_events(struct module *mod)
 {
 	struct trace_event_call *call, *p;
+	struct module_string *modstr, *m;
 
 	down_write(&trace_event_sem);
 	list_for_each_entry_safe(call, p, &ftrace_events, list) {
@@ -2891,6 +2943,14 @@ static void trace_module_remove_events(struct module *mod)
 		if (call->module == mod)
 			__trace_remove_event_call(call);
 	}
+	/* Check for any strings allocade for this module */
+	list_for_each_entry_safe(modstr, m, &module_strings, next) {
+		if (modstr->module != mod)
+			continue;
+		list_del(&modstr->next);
+		kfree(modstr->str);
+		kfree(modstr);
+	}
 	up_write(&trace_event_sem);
 
 	/*
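
The per-module bookkeeping in this patch (record each allocated string with its owning module, then free all of a module's strings on removal) can be modelled in plain user-space C. This is only a rough sketch under stated assumptions: `struct owned_string`, `track_string()`, `release_owner()`, `tracked_count()` and `dup_string()` are hypothetical names, an integer `owner_id` stands in for `struct module *`, a hand-rolled singly linked list replaces the kernel's `list_head`, and no locking is modelled (the patch does the removal under `trace_event_sem`).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One tracked allocation, tagged with the owner that must free it. */
struct owned_string {
	struct owned_string *next;
	int owner_id;		/* stand-in for the kernel's struct module * */
	char *str;
};

static struct owned_string *owned_strings;

/* Portable helper: heap-allocate a copy of `src` (NULL on failure). */
static char *dup_string(const char *src)
{
	size_t len = strlen(src) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, src, len);
	return copy;
}

/*
 * Record `str` as owned by `owner_id`. Returns 0 on success, or -1 if
 * the tracking node cannot be allocated; in that case `str` is simply
 * not tracked, mirroring the "let it leak" choice in the patch.
 */
static int track_string(int owner_id, char *str)
{
	struct owned_string *node;

	assert(str != NULL);

	node = malloc(sizeof(*node));
	if (!node)
		return -1;

	node->owner_id = owner_id;
	node->str = str;
	node->next = owned_strings;
	owned_strings = node;
	return 0;
}

/* Free every string owned by `owner_id`; return how many were freed. */
static int release_owner(int owner_id)
{
	struct owned_string **link = &owned_strings;
	int freed = 0;

	while (*link) {
		struct owned_string *node = *link;

		if (node->owner_id != owner_id) {
			link = &node->next;
			continue;
		}
		*link = node->next;	/* unlink before freeing */
		free(node->str);
		free(node);
		freed++;
	}
	return freed;
}

/* Number of strings currently tracked, across all owners. */
static int tracked_count(void)
{
	const struct owned_string *node;
	int count = 0;

	for (node = owned_strings; node; node = node->next)
		count++;
	return count;
}
```

A single global list with an owner tag keeps the common path (core kernel events, which are never freed) free of any bookkeeping, at the cost of a full list walk on each removal. That trade-off seems reasonable here because module unloads are rare.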