Message ID | 20240416224102.734-2-beaub@linux.microsoft.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | tracing/user_events: Fix non-spaced field matching | expand |
On Tue, 16 Apr 2024 22:41:01 +0000 Beau Belgrave <beaub@linux.microsoft.com> wrote: > When the ABI was updated to prevent same name w/different args, it > missed an important corner case when fields don't end with a space. > Typically, space is used for fields to help separate them, like > "u8 field1; u8 field2". If no spaces are used, like > "u8 field1;u8 field2", then the parsing works for the first time. > However, the match check fails on a subsequent register, leading to > confusion. > > This is because the match check uses argv_split() and assumes that all > fields will be split upon the space. When spaces are used, we get back > { "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }. > This causes a mismatch, and the user program gets back -EADDRINUSE. > > Add a method to detect this case before calling argv_split(). If found > force a space after the field separator character ';'. This ensures all > cases work properly for matching. > > With this fix, the following are all treated as matching: > u8 field1;u8 field2 > u8 field1; u8 field2 > u8 field1;\tu8 field2 > u8 field1;\nu8 field2 Sounds good to me. I just have some nits. > > Fixes: ba470eebc2f6 ("tracing/user_events: Prevent same name but different args event") > Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> > --- > kernel/trace/trace_events_user.c | 88 +++++++++++++++++++++++++++++++- > 1 file changed, 87 insertions(+), 1 deletion(-) > > diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c > index 70d428c394b6..9184d3962b2a 100644 > --- a/kernel/trace/trace_events_user.c > +++ b/kernel/trace/trace_events_user.c > @@ -1989,6 +1989,92 @@ static int user_event_set_tp_name(struct user_event *user) > return 0; > } > > +/* > + * Counts how many ';' without a trailing space are in the args. > + */ > +static int count_semis_no_space(char *args) > +{ > + int count = 0; > + > + while ((args = strchr(args, ';'))) { > + args++; > + > + if (!isspace(*args)) > + count++; > + } > + > + return count; > +} > + > +/* > + * Copies the arguments while ensuring all ';' have a trailing space. > + */ > +static char *fix_semis_no_space(char *args, int count) nit: This name does not represent what it does. 'insert_space_after_semis()' is more self-described. > +{ > + char *fixed, *pos; > + char c, last; > + int len; > + > + len = strlen(args) + count; > + fixed = kmalloc(len + 1, GFP_KERNEL); > + > + if (!fixed) > + return NULL; > + > + pos = fixed; > + last = '\0'; > + > + while (len > 0) { > + c = *args++; > + > + if (last == ';' && !isspace(c)) { > + *pos++ = ' '; > + len--; > + } > + > + if (len > 0) { > + *pos++ = c; > + len--; > + } > + > + last = c; > + } nit: This loop can be simpler, because we are sure fixed has enough length; /* insert a space after ';' if there is no space. */ while(*args) { *pos = *args++; if (*pos++ == ';' && !isspace(*args)) *pos++ = ' '; } > + > + /* > + * len is the length of the copy excluding the null. > + * This ensures we always have room for a null. > + */ > + *pos = '\0'; > + > + return fixed; > +} > + > +static char **user_event_argv_split(char *args, int *argc) > +{ > + /* Count how many ';' without a trailing space */ > + int count = count_semis_no_space(args); > + > + if (count) { nit: it is better to exit fast, so if (!count) return argv_split(GFP_KERNEL, args, argc); ... Thank you, OT: BTW, can this also simplify synthetic events? > + /* We must fixup 'field;field' to 'field; field' */ > + char *fixed = fix_semis_no_space(args, count); > + char **split; > + > + if (!fixed) > + return NULL; > + > + /* We do a normal split afterwards */ > + split = argv_split(GFP_KERNEL, fixed, argc); > + > + /* We can free since argv_split makes a copy */ > + kfree(fixed); > + > + return split; > + } > + > + /* No fixup is required */ > + return argv_split(GFP_KERNEL, args, argc); > +} > + > /* > * Parses the event name, arguments and flags then registers if successful. > * The name buffer lifetime is owned by this method for success cases only. > @@ -2012,7 +2098,7 @@ static int user_event_parse(struct user_event_group *group, char *name, > return -EPERM; > > if (args) { > - argv = argv_split(GFP_KERNEL, args, &argc); > + argv = user_event_argv_split(args, &argc); > > if (!argv) > return -ENOMEM; > -- > 2.34.1 >
On Fri, Apr 19, 2024 at 11:33:05AM +0900, Masami Hiramatsu wrote: > On Tue, 16 Apr 2024 22:41:01 +0000 > Beau Belgrave <beaub@linux.microsoft.com> wrote: > > > When the ABI was updated to prevent same name w/different args, it > > missed an important corner case when fields don't end with a space. > > Typically, space is used for fields to help separate them, like > > "u8 field1; u8 field2". If no spaces are used, like > > "u8 field1;u8 field2", then the parsing works for the first time. > > However, the match check fails on a subsequent register, leading to > > confusion. > > > > This is because the match check uses argv_split() and assumes that all > > fields will be split upon the space. When spaces are used, we get back > > { "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }. > > This causes a mismatch, and the user program gets back -EADDRINUSE. > > > > Add a method to detect this case before calling argv_split(). If found > > force a space after the field separator character ';'. This ensures all > > cases work properly for matching. > > > > With this fix, the following are all treated as matching: > > u8 field1;u8 field2 > > u8 field1; u8 field2 > > u8 field1;\tu8 field2 > > u8 field1;\nu8 field2 > > Sounds good to me. I just have some nits. > > > > > Fixes: ba470eebc2f6 ("tracing/user_events: Prevent same name but different args event") > > Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> > > --- > > kernel/trace/trace_events_user.c | 88 +++++++++++++++++++++++++++++++- > > 1 file changed, 87 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c > > index 70d428c394b6..9184d3962b2a 100644 > > --- a/kernel/trace/trace_events_user.c > > +++ b/kernel/trace/trace_events_user.c > > @@ -1989,6 +1989,92 @@ static int user_event_set_tp_name(struct user_event *user) > > return 0; > > } > > > > +/* > > + * Counts how many ';' without a trailing space are in the args. > > + */ > > +static int count_semis_no_space(char *args) > > +{ > > + int count = 0; > > + > > + while ((args = strchr(args, ';'))) { > > + args++; > > + > > + if (!isspace(*args)) > > + count++; > > + } > > + > > + return count; > > +} > > + > > +/* > > + * Copies the arguments while ensuring all ';' have a trailing space. > > + */ > > +static char *fix_semis_no_space(char *args, int count) > > nit: This name does not represent what it does. 'insert_space_after_semis()' > is more self-described. > Sure, will fix in a v2. > > +{ > > + char *fixed, *pos; > > + char c, last; > > + int len; > > + > > + len = strlen(args) + count; > > + fixed = kmalloc(len + 1, GFP_KERNEL); > > + > > + if (!fixed) > > + return NULL; > > + > > + pos = fixed; > > + last = '\0'; > > + > > + while (len > 0) { > > + c = *args++; > > + > > + if (last == ';' && !isspace(c)) { > > + *pos++ = ' '; > > + len--; > > + } > > + > > + if (len > 0) { > > + *pos++ = c; > > + len--; > > + } > > + > > + last = c; > > + } > > nit: This loop can be simpler, because we are sure fixed has enough length; > > /* insert a space after ';' if there is no space. */ > while(*args) { > *pos = *args++; > if (*pos++ == ';' && !isspace(*args)) > *pos++ = ' '; > } > I was worried that if count_semis_no_space() ever had different logic (maybe after this commit) that it could cause an overflow if the count was wrong, etc. I don't have an issue making it shorter, but I was trying to be more on the safe side, since this isn't a fast path (event register). > > + > > + /* > > + * len is the length of the copy excluding the null. > > + * This ensures we always have room for a null. > > + */ > > + *pos = '\0'; > > + > > + return fixed; > > +} > > + > > +static char **user_event_argv_split(char *args, int *argc) > > +{ > > + /* Count how many ';' without a trailing space */ > > + int count = count_semis_no_space(args); > > + > > + if (count) { > > nit: it is better to exit fast, so > > if (!count) > return argv_split(GFP_KERNEL, args, argc); > > ... Sure, will fix in a v2. > > Thank you, > > OT: BTW, can this also simplify synthetic events? > I'm not sure, I'll check when I have some time. I want to get this fix in sooner rather than later. Thanks, -Beau > > + /* We must fixup 'field;field' to 'field; field' */ > > + char *fixed = fix_semis_no_space(args, count); > > + char **split; > > + > > + if (!fixed) > > + return NULL; > > + > > + /* We do a normal split afterwards */ > > + split = argv_split(GFP_KERNEL, fixed, argc); > > + > > + /* We can free since argv_split makes a copy */ > > + kfree(fixed); > > + > > + return split; > > + } > > + > > + /* No fixup is required */ > > + return argv_split(GFP_KERNEL, args, argc); > > +} > > + > > /* > > * Parses the event name, arguments and flags then registers if successful. > > * The name buffer lifetime is owned by this method for success cases only. > > @@ -2012,7 +2098,7 @@ static int user_event_parse(struct user_event_group *group, char *name, > > return -EPERM; > > > > if (args) { > > - argv = argv_split(GFP_KERNEL, args, &argc); > > + argv = user_event_argv_split(args, &argc); > > > > if (!argv) > > return -ENOMEM; > > -- > > 2.34.1 > > > > > -- > Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Fri, 19 Apr 2024 14:13:34 -0700 Beau Belgrave <beaub@linux.microsoft.com> wrote: > On Fri, Apr 19, 2024 at 11:33:05AM +0900, Masami Hiramatsu wrote: > > On Tue, 16 Apr 2024 22:41:01 +0000 > > Beau Belgrave <beaub@linux.microsoft.com> wrote: > > > > > When the ABI was updated to prevent same name w/different args, it > > > missed an important corner case when fields don't end with a space. > > > Typically, space is used for fields to help separate them, like > > > "u8 field1; u8 field2". If no spaces are used, like > > > "u8 field1;u8 field2", then the parsing works for the first time. > > > However, the match check fails on a subsequent register, leading to > > > confusion. > > > > > > This is because the match check uses argv_split() and assumes that all > > > fields will be split upon the space. When spaces are used, we get back > > > { "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }. > > > This causes a mismatch, and the user program gets back -EADDRINUSE. > > > > > > Add a method to detect this case before calling argv_split(). If found > > > force a space after the field separator character ';'. This ensures all > > > cases work properly for matching. > > > > > > With this fix, the following are all treated as matching: > > > u8 field1;u8 field2 > > > u8 field1; u8 field2 > > > u8 field1;\tu8 field2 > > > u8 field1;\nu8 field2 > > > > Sounds good to me. I just have some nits. > > > > > > > > Fixes: ba470eebc2f6 ("tracing/user_events: Prevent same name but different args event") > > > Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> > > > --- > > > kernel/trace/trace_events_user.c | 88 +++++++++++++++++++++++++++++++- > > > 1 file changed, 87 insertions(+), 1 deletion(-) > > > > > > diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c > > > index 70d428c394b6..9184d3962b2a 100644 > > > --- a/kernel/trace/trace_events_user.c > > > +++ b/kernel/trace/trace_events_user.c > > > @@ -1989,6 +1989,92 @@ static int user_event_set_tp_name(struct user_event *user) > > > return 0; > > > } > > > > > > +/* > > > + * Counts how many ';' without a trailing space are in the args. > > > + */ > > > +static int count_semis_no_space(char *args) > > > +{ > > > + int count = 0; > > > + > > > + while ((args = strchr(args, ';'))) { > > > + args++; > > > + > > > + if (!isspace(*args)) > > > + count++; > > > + } > > > + > > > + return count; > > > +} > > > + > > > +/* > > > + * Copies the arguments while ensuring all ';' have a trailing space. > > > + */ > > > +static char *fix_semis_no_space(char *args, int count) > > > > nit: This name does not represent what it does. 'insert_space_after_semis()' > > is more self-described. > > > > Sure, will fix in a v2. > > > > +{ > > > + char *fixed, *pos; > > > + char c, last; > > > + int len; > > > + > > > + len = strlen(args) + count; > > > + fixed = kmalloc(len + 1, GFP_KERNEL); > > > + > > > + if (!fixed) > > > + return NULL; > > > + > > > + pos = fixed; > > > + last = '\0'; > > > + > > > + while (len > 0) { > > > + c = *args++; > > > + > > > + if (last == ';' && !isspace(c)) { > > > + *pos++ = ' '; > > > + len--; > > > + } > > > + > > > + if (len > 0) { > > > + *pos++ = c; > > > + len--; > > > + } > > > + > > > + last = c; > > > + } > > > > nit: This loop can be simpler, because we are sure fixed has enough length; > > > > /* insert a space after ';' if there is no space. */ > > while(*args) { > > *pos = *args++; > > if (*pos++ == ';' && !isspace(*args)) > > *pos++ = ' '; > > } > > > > I was worried that if count_semis_no_space() ever had different logic > (maybe after this commit) that it could cause an overflow if the count > was wrong, etc. > > I don't have an issue making it shorter, but I was trying to be more on > the safe side, since this isn't a fast path (event register). OK, anyway current code looks correct. But note that I don't think "pos++; len--;" is safer, since it is not atomic. This pattern easily loose "len--;" in my experience. So please carefully use it ;) > > > > + > > > + /* > > > + * len is the length of the copy excluding the null. > > > + * This ensures we always have room for a null. > > > + */ > > > + *pos = '\0'; > > > + > > > + return fixed; > > > +} > > > + > > > +static char **user_event_argv_split(char *args, int *argc) > > > +{ > > > + /* Count how many ';' without a trailing space */ > > > + int count = count_semis_no_space(args); > > > + > > > + if (count) { > > > > nit: it is better to exit fast, so > > > > if (!count) > > return argv_split(GFP_KERNEL, args, argc); > > > > ... > > Sure, will fix in a v2. > > > > > Thank you, > > > > OT: BTW, can this also simplify synthetic events? > > > > I'm not sure, I'll check when I have some time. I want to get this fix > in sooner rather than later. Ah, nevermind. Synthetic event parses the field by strsep(';') first and argv_split(). So it does not have this issue. Thank you, > > Thanks, > -Beau > > > > + /* We must fixup 'field;field' to 'field; field' */ > > > + char *fixed = fix_semis_no_space(args, count); > > > + char **split; > > > + > > > + if (!fixed) > > > + return NULL; > > > + > > > + /* We do a normal split afterwards */ > > > + split = argv_split(GFP_KERNEL, fixed, argc); > > > + > > > + /* We can free since argv_split makes a copy */ > > > + kfree(fixed); > > > + > > > + return split; > > > + } > > > + > > > + /* No fixup is required */ > > > + return argv_split(GFP_KERNEL, args, argc); > > > +} > > > + > > > /* > > > * Parses the event name, arguments and flags then registers if successful. > > > * The name buffer lifetime is owned by this method for success cases only. > > > @@ -2012,7 +2098,7 @@ static int user_event_parse(struct user_event_group *group, char *name, > > > return -EPERM; > > > > > > if (args) { > > > - argv = argv_split(GFP_KERNEL, args, &argc); > > > + argv = user_event_argv_split(args, &argc); > > > > > > if (!argv) > > > return -ENOMEM; > > > -- > > > 2.34.1 > > > > > > > > > -- > > Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Sat, Apr 20, 2024 at 09:50:52PM +0900, Masami Hiramatsu wrote: > On Fri, 19 Apr 2024 14:13:34 -0700 > Beau Belgrave <beaub@linux.microsoft.com> wrote: > > > On Fri, Apr 19, 2024 at 11:33:05AM +0900, Masami Hiramatsu wrote: > > > On Tue, 16 Apr 2024 22:41:01 +0000 > > > Beau Belgrave <beaub@linux.microsoft.com> wrote: *SNIP* > > > nit: This loop can be simpler, because we are sure fixed has enough length; > > > > > > /* insert a space after ';' if there is no space. */ > > > while(*args) { > > > *pos = *args++; > > > if (*pos++ == ';' && !isspace(*args)) > > > *pos++ = ' '; > > > } > > > > > > > I was worried that if count_semis_no_space() ever had different logic > > (maybe after this commit) that it could cause an overflow if the count > > was wrong, etc. > > > > I don't have an issue making it shorter, but I was trying to be more on > > the safe side, since this isn't a fast path (event register). > > OK, anyway current code looks correct. But note that I don't think > "pos++; len--;" is safer, since it is not atomic. This pattern > easily loose "len--;" in my experience. So please carefully use it ;) > I'll stick with your loop. Perhaps others will chime in on the v2 and state a stronger opinion. You scared me with the atomic comment, I went back and looked at all the paths for this. In the user_events IOCTL the buffer is copied from user to kernel, so it cannot change (and no other threads access it). I also checked trace_parse_run_command() which is the same. So at least in this context the non-atomic part is OK. > > > > > > + > > > > + /* > > > > + * len is the length of the copy excluding the null. > > > > + * This ensures we always have room for a null. > > > > + */ > > > > + *pos = '\0'; > > > > + > > > > + return fixed; > > > > +} > > > > + > > > > +static char **user_event_argv_split(char *args, int *argc) > > > > +{ > > > > + /* Count how many ';' without a trailing space */ > > > > + int count = count_semis_no_space(args); > > > > + > > > > + if (count) { > > > > > > nit: it is better to exit fast, so > > > > > > if (!count) > > > return argv_split(GFP_KERNEL, args, argc); > > > > > > ... > > > > Sure, will fix in a v2. > > > > > > > > Thank you, > > > > > > OT: BTW, can this also simplify synthetic events? > > > > > > > I'm not sure, I'll check when I have some time. I want to get this fix > > in sooner rather than later. > > Ah, nevermind. Synthetic event parses the field by strsep(';') first > and argv_split(). So it does not have this issue. > Ok, seems unrelated. Thanks for checking. Thanks, -Beau > Thank you, > > > > > Thanks, > > -Beau > > *SNIP* > > > Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > -- > Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Mon, 22 Apr 2024 14:55:25 -0700 Beau Belgrave <beaub@linux.microsoft.com> wrote: > On Sat, Apr 20, 2024 at 09:50:52PM +0900, Masami Hiramatsu wrote: > > On Fri, 19 Apr 2024 14:13:34 -0700 > > Beau Belgrave <beaub@linux.microsoft.com> wrote: > > > > > On Fri, Apr 19, 2024 at 11:33:05AM +0900, Masami Hiramatsu wrote: > > > > On Tue, 16 Apr 2024 22:41:01 +0000 > > > > Beau Belgrave <beaub@linux.microsoft.com> wrote: > > *SNIP* > > > > > nit: This loop can be simpler, because we are sure fixed has enough length; > > > > > > > > /* insert a space after ';' if there is no space. */ > > > > while(*args) { > > > > *pos = *args++; > > > > if (*pos++ == ';' && !isspace(*args)) > > > > *pos++ = ' '; > > > > } > > > > > > > > > > I was worried that if count_semis_no_space() ever had different logic > > > (maybe after this commit) that it could cause an overflow if the count > > > was wrong, etc. > > > > > > I don't have an issue making it shorter, but I was trying to be more on > > > the safe side, since this isn't a fast path (event register). > > > > OK, anyway current code looks correct. But note that I don't think > > "pos++; len--;" is safer, since it is not atomic. This pattern > > easily loose "len--;" in my experience. So please carefully use it ;) > > > > I'll stick with your loop. Perhaps others will chime in on the v2 and > state a stronger opinion. > > You scared me with the atomic comment, I went back and looked at all the > paths for this. In the user_events IOCTL the buffer is copied from user > to kernel, so it cannot change (and no other threads access it). I also > checked trace_parse_run_command() which is the same. So at least in this > context the non-atomic part is OK. Oh, sorry if I scared you. I've seen bugs get introduced into loops like this many times (while updating the code), so I try to keep it simple. I'm sure that your code has no bugs. Thank you,
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c index 70d428c394b6..9184d3962b2a 100644 --- a/kernel/trace/trace_events_user.c +++ b/kernel/trace/trace_events_user.c @@ -1989,6 +1989,92 @@ static int user_event_set_tp_name(struct user_event *user) return 0; } +/* + * Counts how many ';' without a trailing space are in the args. + */ +static int count_semis_no_space(char *args) +{ + int count = 0; + + while ((args = strchr(args, ';'))) { + args++; + + if (!isspace(*args)) + count++; + } + + return count; +} + +/* + * Copies the arguments while ensuring all ';' have a trailing space. + */ +static char *fix_semis_no_space(char *args, int count) +{ + char *fixed, *pos; + char c, last; + int len; + + len = strlen(args) + count; + fixed = kmalloc(len + 1, GFP_KERNEL); + + if (!fixed) + return NULL; + + pos = fixed; + last = '\0'; + + while (len > 0) { + c = *args++; + + if (last == ';' && !isspace(c)) { + *pos++ = ' '; + len--; + } + + if (len > 0) { + *pos++ = c; + len--; + } + + last = c; + } + + /* + * len is the length of the copy excluding the null. + * This ensures we always have room for a null. + */ + *pos = '\0'; + + return fixed; +} + +static char **user_event_argv_split(char *args, int *argc) +{ + /* Count how many ';' without a trailing space */ + int count = count_semis_no_space(args); + + if (count) { + /* We must fixup 'field;field' to 'field; field' */ + char *fixed = fix_semis_no_space(args, count); + char **split; + + if (!fixed) + return NULL; + + /* We do a normal split afterwards */ + split = argv_split(GFP_KERNEL, fixed, argc); + + /* We can free since argv_split makes a copy */ + kfree(fixed); + + return split; + } + + /* No fixup is required */ + return argv_split(GFP_KERNEL, args, argc); +} + /* * Parses the event name, arguments and flags then registers if successful. * The name buffer lifetime is owned by this method for success cases only. @@ -2012,7 +2098,7 @@ static int user_event_parse(struct user_event_group *group, char *name, return -EPERM; if (args) { - argv = argv_split(GFP_KERNEL, args, &argc); + argv = user_event_argv_split(args, &argc); if (!argv) return -ENOMEM;
When the ABI was updated to prevent same name w/different args, it missed an important corner case when fields don't end with a space. Typically, space is used for fields to help separate them, like "u8 field1; u8 field2". If no spaces are used, like "u8 field1;u8 field2", then the parsing works for the first time. However, the match check fails on a subsequent register, leading to confusion. This is because the match check uses argv_split() and assumes that all fields will be split upon the space. When spaces are used, we get back { "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }. This causes a mismatch, and the user program gets back -EADDRINUSE. Add a method to detect this case before calling argv_split(). If found force a space after the field separator character ';'. This ensures all cases work properly for matching. With this fix, the following are all treated as matching: u8 field1;u8 field2 u8 field1; u8 field2 u8 field1;\tu8 field2 u8 field1;\nu8 field2 Fixes: ba470eebc2f6 ("tracing/user_events: Prevent same name but different args event") Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> --- kernel/trace/trace_events_user.c | 88 +++++++++++++++++++++++++++++++- 1 file changed, 87 insertions(+), 1 deletion(-)