Message ID | 20211020013153.4106001-4-kaleshsingh@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | tracing: Extend histogram triggers expression parsing | expand |
On Tue, 19 Oct 2021 18:31:40 -0700 Kalesh Singh <kaleshsingh@google.com> wrote: > + minus_op = strrchr(str, '-'); > + if (minus_op) { > + /* Unfortunately, the modifier ".sym-offset" can confuse things. */ > + if (minus_op - str >= 4 && !strncmp(minus_op - 4, ".sym-offset", 11)) > + goto out; > I was thinking about this, and perhaps we can add this later, but we could just replace all ".sym-offset" with ".symXoffset" after receiving it from the user. Then it won't be an issue during prasing. -- Steve
On Wed, Oct 20, 2021 at 7:28 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Tue, 19 Oct 2021 18:31:40 -0700 > Kalesh Singh <kaleshsingh@google.com> wrote: > > > + minus_op = strrchr(str, '-'); > > + if (minus_op) { > > + /* Unfortunately, the modifier ".sym-offset" can confuse things. */ > > + if (minus_op - str >= 4 && !strncmp(minus_op - 4, ".sym-offset", 11)) > > + goto out; > > > > I was thinking about this, and perhaps we can add this later, but we could > just replace all ".sym-offset" with ".symXoffset" after receiving it from > the user. Then it won't be an issue during prasing. That's a good idea. It would clean things up a bit and avoid bailing out if the user has a sym-offest in an expression string. I can send a separate patch for this. > > -- Steve
On Wed, 20 Oct 2021 08:06:26 -0700 Kalesh Singh <kaleshsingh@google.com> wrote: > That's a good idea. It would clean things up a bit and avoid bailing > out if the user has a sym-offest in an expression string. I can send a > separate patch for this. +1 -- Steve
On Tue, 19 Oct 2021 18:31:40 -0700 Kalesh Singh <kaleshsingh@google.com> wrote: > @@ -2391,60 +2460,61 @@ static int check_expr_operands(struct trace_array *tr, > static struct hist_field *parse_expr(struct hist_trigger_data *hist_data, > struct trace_event_file *file, > char *str, unsigned long flags, > - char *var_name, unsigned int level) > + char *var_name, unsigned int *n_subexprs) > { > struct hist_field *operand1 = NULL, *operand2 = NULL, *expr = NULL; > unsigned long operand_flags; > int field_op, ret = -EINVAL; > char *sep, *operand1_str; > > - if (level > 3) { > + if (*n_subexprs > 3) { Why limit the sub expressions, and not just keep the limit of the level of recursion. We allow 3 levels of recursion, but we could have more than 3 sub expressions. If we have: a * b + c / d - e * f / h It would break down into: - + / * / * h a b c d e f Which I believe is 6 "sub expressions", but never goes more than three deep in recursion: "a * b + c / d - e * f / h" Step 1: op = "-" operand1 = "a * b + c / d" operand2 = "e * f / h" Process operand1: (recursion level 1) op = "+" operand1a = "a * b" operand2a = "c / d" Process operand1a: (recursion level 2) op = "*" operand1b = "a" operand2b = "b" return; Process operand1b: (recursion level 2) op = "/" operand1b = "c" operand2b = "d" return; return; Process operand2: (recursion level 1) op = "/" operand1c = "e * f" operand2c = "h" Process operand1c: (recursion level 2) op = "*" operand1c = "e" operand2c = "f" return; return; > + > + /* LHS of string is an expression e.g. a+b in a+b+c */ > + operand1 = parse_expr(hist_data, file, operand1_str, operand_flags, NULL, n_subexprs); > if (IS_ERR(operand1)) { > ret = PTR_ERR(operand1); > operand1 = NULL; I wonder if we should look for optimizations, in case of operand1 and operand2 are both constants? Just perform the function, and convert it into a constant as well. -- Steve
On Wed, Oct 20, 2021 at 8:48 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Tue, 19 Oct 2021 18:31:40 -0700 > Kalesh Singh <kaleshsingh@google.com> wrote: > > > @@ -2391,60 +2460,61 @@ static int check_expr_operands(struct trace_array *tr, > > static struct hist_field *parse_expr(struct hist_trigger_data *hist_data, > > struct trace_event_file *file, > > char *str, unsigned long flags, > > - char *var_name, unsigned int level) > > + char *var_name, unsigned int *n_subexprs) > > { > > struct hist_field *operand1 = NULL, *operand2 = NULL, *expr = NULL; > > unsigned long operand_flags; > > int field_op, ret = -EINVAL; > > char *sep, *operand1_str; > > > > - if (level > 3) { > > + if (*n_subexprs > 3) { > > Why limit the sub expressions, and not just keep the limit of the level of > recursion. We allow 3 levels of recursion, but we could have more than 3 > sub expressions. The main reason for this is that it's predictable behavior form the user's perspective. Before this patch the recursion always walked down a single branch so limiting by level worked out the same as limiting by sub expressions and is in line with the error the user would see ("Too many sub-expressions (3 max)"). Now that we take multiple paths in the recursion, using the level to reflect the number of sub-expressions would lead to only seeing the error in some of the cases (Sometimes we allow 4, 5, 6 sub-expressions depending on how balanced the tree is, and sometimes we error out on 4 - when the tree is list-like). Limiting by sub-expression keeps this consistent (always error out if we have more than 3 sub-expressions) and is in line with the previous behavior. - Kalesh > > > If we have: a * b + c / d - e * f / h > > It would break down into: > - > + / > * / * h > a b c d e f > > > Which I believe is 6 "sub expressions", but never goes more than three deep > in recursion: > > "a * b + c / d - e * f / h" > > Step 1: > > op = "-" > operand1 = "a * b + c / d" > operand2 = "e * f / h" > > Process operand1: (recursion level 1) > > op = "+" > operand1a = "a * b" > operand2a = "c / d" > > Process operand1a: (recursion level 2) > > op = "*" > operand1b = "a" > operand2b = "b" > > return; > > Process operand1b: (recursion level 2) > > op = "/" > operand1b = "c" > operand2b = "d" > > return; > > return; > > Process operand2: (recursion level 1) > > op = "/" > operand1c = "e * f" > operand2c = "h" > > Process operand1c: (recursion level 2) > > op = "*" > operand1c = "e" > operand2c = "f" > > return; > > return; > > > > > + > > + /* LHS of string is an expression e.g. a+b in a+b+c */ > > + operand1 = parse_expr(hist_data, file, operand1_str, operand_flags, NULL, n_subexprs); > > if (IS_ERR(operand1)) { > > ret = PTR_ERR(operand1); > > operand1 = NULL; > > I wonder if we should look for optimizations, in case of operand1 and > operand2 are both constants? > > Just perform the function, and convert it into a constant as well. > > -- Steve
On Wed, 20 Oct 2021 09:11:27 -0700 Kalesh Singh <kaleshsingh@google.com> wrote: > The main reason for this is that it's predictable behavior form the > user's perspective. Before this patch the recursion always walked down > a single branch so limiting by level worked out the same as limiting > by sub expressions and is in line with the error the user would see > ("Too many sub-expressions (3 max)"). Now that we take multiple paths > in the recursion, using the level to reflect the number of > sub-expressions would lead to only seeing the error in some of the > cases (Sometimes we allow 4, 5, 6 sub-expressions depending on how > balanced the tree is, and sometimes we error out on 4 - when the tree > is list-like). Limiting by sub-expression keeps this consistent > (always error out if we have more than 3 sub-expressions) and is in > line with the previous behavior. I'm fine with that. If we want to improve this in the future then fine. We can always extend, as that doesn't break user API. So we can keep it as is. -- Steve
On Wed, Oct 20, 2021 at 8:48 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Tue, 19 Oct 2021 18:31:40 -0700 > Kalesh Singh <kaleshsingh@google.com> wrote: > > > @@ -2391,60 +2460,61 @@ static int check_expr_operands(struct trace_array *tr, > > static struct hist_field *parse_expr(struct hist_trigger_data *hist_data, > > struct trace_event_file *file, > > char *str, unsigned long flags, > > - char *var_name, unsigned int level) > > + char *var_name, unsigned int *n_subexprs) > > { > > struct hist_field *operand1 = NULL, *operand2 = NULL, *expr = NULL; > > unsigned long operand_flags; > > int field_op, ret = -EINVAL; > > char *sep, *operand1_str; > > > > - if (level > 3) { > > + if (*n_subexprs > 3) { > > Why limit the sub expressions, and not just keep the limit of the level of > recursion. We allow 3 levels of recursion, but we could have more than 3 > sub expressions. > > > If we have: a * b + c / d - e * f / h > > It would break down into: > - > + / > * / * h > a b c d e f > > > Which I believe is 6 "sub expressions", but never goes more than three deep > in recursion: > > "a * b + c / d - e * f / h" > > Step 1: > > op = "-" > operand1 = "a * b + c / d" > operand2 = "e * f / h" > > Process operand1: (recursion level 1) > > op = "+" > operand1a = "a * b" > operand2a = "c / d" > > Process operand1a: (recursion level 2) > > op = "*" > operand1b = "a" > operand2b = "b" > > return; > > Process operand1b: (recursion level 2) > > op = "/" > operand1b = "c" > operand2b = "d" > > return; > > return; > > Process operand2: (recursion level 1) > > op = "/" > operand1c = "e * f" > operand2c = "h" > > Process operand1c: (recursion level 2) > > op = "*" > operand1c = "e" > operand2c = "f" > > return; > > return; > > > > > + > > + /* LHS of string is an expression e.g. a+b in a+b+c */ > > + operand1 = parse_expr(hist_data, file, operand1_str, operand_flags, NULL, n_subexprs); > > if (IS_ERR(operand1)) { > > ret = PTR_ERR(operand1); > > operand1 = NULL; > > I wonder if we should look for optimizations, in case of operand1 and > operand2 are both constants? > > Just perform the function, and convert it into a constant as well. I think we achieve something like this by propagating up the HIST_FIELD_FL_CONST flag. Thanks for the suggestions. I'll update this in the next version. Thanks, Kalesh > > -- Steve
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index 9415ee65acc0..9205cfe120e8 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -68,7 +68,9 @@ C(INVALID_SORT_FIELD, "Sort field must be a key or a val"), \ C(INVALID_STR_OPERAND, "String type can not be an operand in expression"), \ C(TOO_MANY_CONSTS, "Too many constants defined"), \ - C(EXPECT_NUMBER, "Expecting numeric literal"), + C(EXPECT_NUMBER, "Expecting numeric literal"), \ + C(UNARY_MINUS_SUBEXPR, "Unary minus not supported in sub-expressions"), \ + C(SYM_OFFSET_SUBEXPR, ".sym-offset not supported in sub-expressions"), #undef C #define C(a, b) HIST_ERR_##a @@ -1647,40 +1649,96 @@ static char *expr_str(struct hist_field *field, unsigned int level) return expr; } -static int contains_operator(char *str) +/* + * If field_op != FIELD_OP_NONE, *sep points to the root operator + * of the expression tree to be evaluated. + */ +static int contains_operator(char *str, char **sep) { enum field_op_id field_op = FIELD_OP_NONE; - char *op; + char *minus_op, *plus_op, *div_op, *mult_op; + + + /* + * Report the last occurrence of the operators first, so that the + * expression is evaluated left to right. This is important since + * subtraction and division are not associative. + * + * e.g + * 64/8/4/2 is 1, i.e 64/8/4/2 = ((64/8)/4)/2 + * 14-7-5-2 is 0, i.e 14-7-5-2 = ((14-7)-5)-2 + */ - op = strpbrk(str, "+-/*"); - if (!op) - return FIELD_OP_NONE; + /* + * First, find lower precedence addition and subtraction + * since the expression will be evaluated recursively. + */ + minus_op = strrchr(str, '-'); + if (minus_op) { + /* Unfortunately, the modifier ".sym-offset" can confuse things. */ + if (minus_op - str >= 4 && !strncmp(minus_op - 4, ".sym-offset", 11)) + goto out; - switch (*op) { - case '-': /* - * Unfortunately, the modifier ".sym-offset" - * can confuse things. + * Unary minus is not supported in sub-expressions. If + * present, it is always the next root operator. */ - if (op - str >= 4 && !strncmp(op - 4, ".sym-offset", 11)) - return FIELD_OP_NONE; - - if (*str == '-') + if (minus_op == str) { field_op = FIELD_OP_UNARY_MINUS; - else - field_op = FIELD_OP_MINUS; - break; - case '+': - field_op = FIELD_OP_PLUS; - break; - case '/': + goto out; + } + + field_op = FIELD_OP_MINUS; + } + + plus_op = strrchr(str, '+'); + if (plus_op || minus_op) { + /* + * For operators of the same precedence use to rightmost as the + * root, so that the expression is evaluated left to right. + */ + if (plus_op > minus_op) + field_op = FIELD_OP_PLUS; + goto out; + } + + /* + * Multiplication and division have higher precedence than addition and + * subtraction. + */ + div_op = strrchr(str, '/'); + if (div_op) field_op = FIELD_OP_DIV; - break; - case '*': + + mult_op = strrchr(str, '*'); + /* + * For operators of the same precedence use to rightmost as the + * root, so that the expression is evaluated left to right. + */ + if (mult_op > div_op) field_op = FIELD_OP_MULT; - break; - default: - break; + +out: + if (sep) { + switch (field_op) { + case FIELD_OP_UNARY_MINUS: + case FIELD_OP_MINUS: + *sep = minus_op; + break; + case FIELD_OP_PLUS: + *sep = plus_op; + break; + case FIELD_OP_DIV: + *sep = div_op; + break; + case FIELD_OP_MULT: + *sep = mult_op; + break; + case FIELD_OP_NONE: + default: + *sep = NULL; + break; + } } return field_op; @@ -2006,7 +2064,7 @@ static char *field_name_from_var(struct hist_trigger_data *hist_data, if (strcmp(var_name, name) == 0) { field = hist_data->attrs->var_defs.expr[i]; - if (contains_operator(field) || is_var_ref(field)) + if (contains_operator(field, NULL) || is_var_ref(field)) continue; return field; } @@ -2275,21 +2333,24 @@ static struct hist_field *parse_atom(struct hist_trigger_data *hist_data, static struct hist_field *parse_expr(struct hist_trigger_data *hist_data, struct trace_event_file *file, char *str, unsigned long flags, - char *var_name, unsigned int level); + char *var_name, unsigned int *n_subexprs); static struct hist_field *parse_unary(struct hist_trigger_data *hist_data, struct trace_event_file *file, char *str, unsigned long flags, - char *var_name, unsigned int level) + char *var_name, unsigned int *n_subexprs) { struct hist_field *operand1, *expr = NULL; unsigned long operand_flags; int ret = 0; char *s; + /* Unary minus operator, increment n_subexprs */ + ++*n_subexprs; + /* we support only -(xxx) i.e. explicit parens required */ - if (level > 3) { + if (*n_subexprs > 3) { hist_err(file->tr, HIST_ERR_TOO_MANY_SUBEXPR, errpos(str)); ret = -EINVAL; goto free; @@ -2306,8 +2367,16 @@ static struct hist_field *parse_unary(struct hist_trigger_data *hist_data, } s = strrchr(str, ')'); - if (s) + if (s) { + /* unary minus not supported in sub-expressions */ + if (*(s+1) != '\0') { + hist_err(file->tr, HIST_ERR_UNARY_MINUS_SUBEXPR, + errpos(str)); + ret = -EINVAL; + goto free; + } *s = '\0'; + } else { ret = -EINVAL; /* no closing ')' */ goto free; @@ -2321,7 +2390,7 @@ static struct hist_field *parse_unary(struct hist_trigger_data *hist_data, } operand_flags = 0; - operand1 = parse_expr(hist_data, file, str, operand_flags, NULL, ++level); + operand1 = parse_expr(hist_data, file, str, operand_flags, NULL, n_subexprs); if (IS_ERR(operand1)) { ret = PTR_ERR(operand1); goto free; @@ -2391,60 +2460,61 @@ static int check_expr_operands(struct trace_array *tr, static struct hist_field *parse_expr(struct hist_trigger_data *hist_data, struct trace_event_file *file, char *str, unsigned long flags, - char *var_name, unsigned int level) + char *var_name, unsigned int *n_subexprs) { struct hist_field *operand1 = NULL, *operand2 = NULL, *expr = NULL; unsigned long operand_flags; int field_op, ret = -EINVAL; char *sep, *operand1_str; - if (level > 3) { + if (*n_subexprs > 3) { hist_err(file->tr, HIST_ERR_TOO_MANY_SUBEXPR, errpos(str)); return ERR_PTR(-EINVAL); } - field_op = contains_operator(str); + /* + * ".sym-offset" in expressions has no effect on their evaluation, + * but can confuse operator parsing. + */ + if (*n_subexprs == 0) { + sep = strstr(str, ".sym-offset"); + if (sep) { + *sep = '\0'; + if (strpbrk(str, "+-/*") || strpbrk(sep + 11, "+-/*")) { + *sep = '.'; + hist_err(file->tr, HIST_ERR_SYM_OFFSET_SUBEXPR, + errpos(sep)); + return ERR_PTR(-EINVAL); + } + *sep = '.'; + } + } + + field_op = contains_operator(str, &sep); if (field_op == FIELD_OP_NONE) return parse_atom(hist_data, file, str, &flags, var_name); if (field_op == FIELD_OP_UNARY_MINUS) - return parse_unary(hist_data, file, str, flags, var_name, ++level); + return parse_unary(hist_data, file, str, flags, var_name, n_subexprs); - switch (field_op) { - case FIELD_OP_MINUS: - sep = "-"; - break; - case FIELD_OP_PLUS: - sep = "+"; - break; - case FIELD_OP_DIV: - sep = "/"; - break; - case FIELD_OP_MULT: - sep = "*"; - break; - default: - goto free; - } + /* Binary operator found, increment n_subexprs */ + ++*n_subexprs; - /* - * Multiplication and division are only supported in single operator - * expressions, since the expression is always evaluated from right - * to left. - */ - if ((field_op == FIELD_OP_DIV || field_op == FIELD_OP_MULT) && level > 0) { - hist_err(file->tr, HIST_ERR_TOO_MANY_SUBEXPR, errpos(str)); - return ERR_PTR(-EINVAL); - } + /* Split the expression string at the root operator */ + if (!sep) + goto free; + *sep = '\0'; + operand1_str = str; + str = sep+1; - operand1_str = strsep(&str, sep); if (!operand1_str || !str) goto free; operand_flags = 0; - operand1 = parse_atom(hist_data, file, operand1_str, - &operand_flags, NULL); + + /* LHS of string is an expression e.g. a+b in a+b+c */ + operand1 = parse_expr(hist_data, file, operand1_str, operand_flags, NULL, n_subexprs); if (IS_ERR(operand1)) { ret = PTR_ERR(operand1); operand1 = NULL; @@ -2456,9 +2526,9 @@ static struct hist_field *parse_expr(struct hist_trigger_data *hist_data, goto free; } - /* rest of string could be another expression e.g. b+c in a+b+c */ + /* RHS of string is another expression e.g. c in a+b+c */ operand_flags = 0; - operand2 = parse_expr(hist_data, file, str, operand_flags, NULL, ++level); + operand2 = parse_expr(hist_data, file, str, operand_flags, NULL, n_subexprs); if (IS_ERR(operand2)) { ret = PTR_ERR(operand2); operand2 = NULL; @@ -3892,9 +3962,9 @@ static int __create_val_field(struct hist_trigger_data *hist_data, unsigned long flags) { struct hist_field *hist_field; - int ret = 0; + int ret = 0, n_subexprs = 0; - hist_field = parse_expr(hist_data, file, field_str, flags, var_name, 0); + hist_field = parse_expr(hist_data, file, field_str, flags, var_name, &n_subexprs); if (IS_ERR(hist_field)) { ret = PTR_ERR(hist_field); goto out; @@ -4035,7 +4105,7 @@ static int create_key_field(struct hist_trigger_data *hist_data, struct hist_field *hist_field = NULL; unsigned long flags = 0; unsigned int key_size; - int ret = 0; + int ret = 0, n_subexprs = 0; if (WARN_ON(key_idx >= HIST_FIELDS_MAX)) return -EINVAL; @@ -4048,7 +4118,7 @@ static int create_key_field(struct hist_trigger_data *hist_data, hist_field = create_hist_field(hist_data, NULL, flags, NULL); } else { hist_field = parse_expr(hist_data, file, field_str, flags, - NULL, 0); + NULL, &n_subexprs); if (IS_ERR(hist_field)) { ret = PTR_ERR(hist_field); goto out;