diff mbox series

[2/2] grep.c: tolerate NULL grep_expr in free_pattern_expr()

Message ID 7e094882c2a71894416089f894557a9eae07e8f8.1665423686.git.me@ttaylorr.com (mailing list archive)
State New, archived
Headers show
Series grep: tolerate NULL argument to free_grep_expr() | expand

Commit Message

Taylor Blau Oct. 10, 2022, 5:41 p.m. UTC
As demonstrated in the previous commit, `git log --invert-grep` without
a `--grep` argument causes a segfault after f41fb662f5 (revisions API:
have release_revisions() release "grep_filter", 2022-04-13).

The segfault occurs in `free_pattern_expr()`, which crashes on trying to
switch on `x->node` when given a NULL pointer. Usually we avoid calling
`free_pattern_expr()` without a pattern as indicated by the `extended`
bit being zero.

But it is possible to get into a state where the `extended` bit is
non-zero, but the `pattern_expression` is still NULL. This happens
because the `--invert-grep` option sets the `no_body_match` bit. When we
call `compile_grep_patterns()`, we set `opt->extended = 1`. But the
`pattern_expression` is left as NULL, since we return with a NULL
`header_expr`.

So when we try to call `free_pattern_expr()`, things go awry, since
`free_grep_patterns()` expects a non-NULL argument.

Instead, teach `free_grep_patterns()` to tolerate a NULL argument
(treating it as a noop), and avoid checking whether or not the
`extended` bit is set, since `free_pattern_expr()` will handle its
argument regardless.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 grep.c         | 5 +++--
 t/t4202-log.sh | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

Comments

Junio C Hamano Oct. 10, 2022, 5:54 p.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> diff --git a/grep.c b/grep.c
> index 52a894c989..bcc6e63365 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -752,6 +752,9 @@ void compile_grep_patterns(struct grep_opt *opt)
>  
>  static void free_pattern_expr(struct grep_expr *x)
>  {
> +	if (!x)
> +		return;
> +
>  	switch (x->node) {
>  	case GREP_NODE_TRUE:
>  	case GREP_NODE_ATOM:

This hunk makes sense, but

> @@ -790,8 +793,6 @@ void free_grep_patterns(struct grep_opt *opt)
>  		free(p);
>  	}
>  
> -	if (!opt->extended)
> -		return;
>  	free_pattern_expr(opt->pattern_expression);
>  }

I do not know about this one.  We used to avoid freeing, even when
the .pattern_expression member is set, as long as the .extended bit
is not set.  Now we unconditionally try to free it even when the bit
says it does not want to.  Why?

> diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> index e3ec5f5661..44f7ef0ea2 100755
> --- a/t/t4202-log.sh
> +++ b/t/t4202-log.sh
> @@ -297,7 +297,7 @@ test_expect_success 'log --invert-grep --grep -i' '
>  	fi
>  '
>  
> -test_expect_failure 'log --invert-grep (no --grep)' '
> +test_expect_success 'log --invert-grep (no --grep)' '
>  	git log --pretty="tformat:%s" >expect &&
>  	git log --invert-grep --pretty="tformat:%s" >actual &&
>  	test_cmp expect actual

Especially for something this small, doing the "failing test first
and then fix with flipping the test to success" is very much
unwelcome.  For whoever gets curious (me included when accepting
posted patch), it is easy to revert only the part of the commit
outside t/ tentatively to see how the original code breaks.  Keeping
the fix and protection of the fix together will avoid mistakes.  In
this case, the whole test fits inside the post context of the patch,
but in general, this "flip failure to success" will hide the body of
the test that changes behaviour while reviewing the patch text,
which is another downside.

Thanks.
Taylor Blau Oct. 10, 2022, 6:10 p.m. UTC | #2
On Mon, Oct 10, 2022 at 10:54:23AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
> > @@ -790,8 +793,6 @@ void free_grep_patterns(struct grep_opt *opt)
> >  		free(p);
> >  	}
> >
> > -	if (!opt->extended)
> > -		return;
> >  	free_pattern_expr(opt->pattern_expression);
> >  }
>
> I do not know about this one.  We used to avoid freeing, even when
> the .pattern_expression member is set, as long as the .extended bit
> is not set.  Now we unconditionally try to free it even when the bit
> says it does not want to.  Why?

It's not "does not want to" be freed. As best I can tell, we conflate
`opt->extended` with "there is something in `opt->pattern_expression`".
So checking whether or not `opt->extended` is non-zero isn't "keep this
around because I'm going to use it later", but instead "there is
nothing to free, don't bother calling `free_pattern_expr()`".

A more direct way of saying the latter would have been to replace the
if-statement with `if (opt->pattern_expression)`.

I hinted at this in the commit message, but I will make it more direct
to avoid future readers' confusion.

> > diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> > index e3ec5f5661..44f7ef0ea2 100755
> > --- a/t/t4202-log.sh
> > +++ b/t/t4202-log.sh
> > @@ -297,7 +297,7 @@ test_expect_success 'log --invert-grep --grep -i' '
> >  	fi
> >  '
> >
> > -test_expect_failure 'log --invert-grep (no --grep)' '
> > +test_expect_success 'log --invert-grep (no --grep)' '
> >  	git log --pretty="tformat:%s" >expect &&
> >  	git log --invert-grep --pretty="tformat:%s" >actual &&
> >  	test_cmp expect actual
>
> Especially for something this small, doing the "failing test first
> and then fix with flipping the test to success" is very much
> unwelcome.  For whoever gets curious (me included when accepting
> posted patch), it is easy to revert only the part of the commit
> outside t/ tentatively to see how the original code breaks.  Keeping
> the fix and protection of the fix together will avoid mistakes.  In
> this case, the whole test fits inside the post context of the patch,
> but in general, this "flip failure to success" will hide the body of
> the test that changes behaviour while reviewing the patch text,
> which is another downside.

Good to know. I had considered it good practice, even for a small fix,
as a way to show your work and prove that you had a legitimately broken
test case demonstrating the bug. But if it creates an extra hassle, I
don't mind squashing it down.

I can send a squashed version of these two patches, but let's see if
there are any other comments, first.

Thanks,
Taylor
Junio C Hamano Oct. 10, 2022, 6:11 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

>>  static void free_pattern_expr(struct grep_expr *x)
>>  {
>> +	if (!x)
>> +		return;
>> +
>>  	switch (x->node) {
>>  	case GREP_NODE_TRUE:
>>  	case GREP_NODE_ATOM:
>
> This hunk makes sense, but
>
>> @@ -790,8 +793,6 @@ void free_grep_patterns(struct grep_opt *opt)
>>  		free(p);
>>  	}
>>  
>> -	if (!opt->extended)
>> -		return;
>>  	free_pattern_expr(opt->pattern_expression);
>>  }
>
> I do not know about this one.  We used to avoid freeing, even when
> the .pattern_expression member is set, as long as the .extended bit
> is not set.  Now we unconditionally try to free it even when the bit
> says it does not want to.  Why?

Ah, grep.c::compile_grep_patterns() has the answer.  We only
populate the .pattern_expression member when we are doing a complex
query and leave it empty otherwise.  The .pattern_list member is
used instead as a list of OR'ed patterns in grep.c::match_line()
when .extended is not set.

The !opt->extended guard assumes that opt->pattern_expression exists
only when extended is set, which is correct, but forgets that even
when extended is set, pattern_expression is not necessarily non-NULL.

So I think the right thing to do may be to allow free_pattern_expr()
to take and ignore NULL silently?  Ah, that is already what you are
doing in the first hunk.  Is this second hunk even necessary?

I wonder how calls to grep.c::match_line() with opt->extended true
and opt->pattern_expression NULL, though.  It should die() at the
beginning of match_expr_eval(), which probably is OK, but somehow
feels unsatisfactory.
Taylor Blau Oct. 10, 2022, 6:14 p.m. UTC | #4
On Mon, Oct 10, 2022 at 11:11:05AM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> So I think the right thing to do may be to allow free_pattern_expr()
> to take and ignore NULL silently?  Ah, that is already what you are
> doing in the first hunk.  Is this second hunk even necessary?

Right. The fix in my patch is to have `free_pattern_expr()` treat its
arguments like `free()` does, where NULL is a silent noop.

We could do with or without the second hunk. I dropped it to avoid
confusion, but it seems to have had the opposite effect ;-). I'll keep
the if-statement there and instead drop the latter hunk.

> I wonder how calls to grep.c::match_line() with opt->extended true
> and opt->pattern_expression NULL, though.  It should die() at the
> beginning of match_expr_eval(), which probably is OK, but somehow
> feels unsatisfactory.

It does, and we can thank Linus for that: c922b01f54 (grep: fix segfault
when "git grep '('" is given, 2009-04-27).

Thanks,
Taylor
diff mbox series

Patch

diff --git a/grep.c b/grep.c
index 52a894c989..bcc6e63365 100644
--- a/grep.c
+++ b/grep.c
@@ -752,6 +752,9 @@  void compile_grep_patterns(struct grep_opt *opt)
 
 static void free_pattern_expr(struct grep_expr *x)
 {
+	if (!x)
+		return;
+
 	switch (x->node) {
 	case GREP_NODE_TRUE:
 	case GREP_NODE_ATOM:
@@ -790,8 +793,6 @@  void free_grep_patterns(struct grep_opt *opt)
 		free(p);
 	}
 
-	if (!opt->extended)
-		return;
 	free_pattern_expr(opt->pattern_expression);
 }
 
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index e3ec5f5661..44f7ef0ea2 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -297,7 +297,7 @@  test_expect_success 'log --invert-grep --grep -i' '
 	fi
 '
 
-test_expect_failure 'log --invert-grep (no --grep)' '
+test_expect_success 'log --invert-grep (no --grep)' '
 	git log --pretty="tformat:%s" >expect &&
 	git log --invert-grep --pretty="tformat:%s" >actual &&
 	test_cmp expect actual