diff mbox series

rev-parse: Detect missing opt-spec

Message ID 20220902050621.94381-1-oystwa@gmail.com (mailing list archive)
State Superseded
Headers show
Series rev-parse: Detect missing opt-spec | expand

Commit Message

Øystein Walle Sept. 2, 2022, 5:06 a.m. UTC
If a line in parseopts's input starts with one of the flag characters it
is erroneously parsed as a opt-spec where the short name of the option
is the flag character itself and the long name is after the end of the
string. This makes Git want to allocate SIZE_MAX bytes of memory at this
line:

    o->long_name = xmemdupz(sb.buf + 2, s - sb.buf - 2);

Since s and sb.buf are equal the second argument is -2 (except unsigned)
and xmemdupz allocates len + 1 bytes, ie. -1 meaning SIZE_MAX.

Avoid this by checking whether a flag character was found in the zeroth
position.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
---
 builtin/rev-parse.c           | 3 +++
 t/t1502-rev-parse-parseopt.sh | 9 +++++++++
 2 files changed, 12 insertions(+)

Comments

Eric Sunshine Sept. 2, 2022, 5:46 a.m. UTC | #1
On Fri, Sep 2, 2022 at 1:10 AM Øystein Walle <oystwa@gmail.com> wrote:
> If a line in parseopts's input starts with one of the flag characters it
> is erroneously parsed as a opt-spec where the short name of the option
> is the flag character itself and the long name is after the end of the
> string. This makes Git want to allocate SIZE_MAX bytes of memory at this
> line:
>
>     o->long_name = xmemdupz(sb.buf + 2, s - sb.buf - 2);
>
> Since s and sb.buf are equal the second argument is -2 (except unsigned)
> and xmemdupz allocates len + 1 bytes, ie. -1 meaning SIZE_MAX.
>
> Avoid this by checking whether a flag character was found in the zeroth
> position.
>
> Signed-off-by: Øystein Walle <oystwa@gmail.com>

Perhaps add a:

    Reported-by: Ingy dot Net <ingy@ingy.net>

trailer?

> diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
> @@ -479,6 +479,9 @@ static int cmd_parseopt(int argc, const char **argv, const char *prefix)
> +               if (s == sb.buf)
> +                       die(_("Missing opt-spec before option flags"));

There is a bit of a mix in this file already, but these days, we tend
to start error messages with lowercase:

    die(_("missing opt-spec before option flags"));

> diff --git a/t/t1502-rev-parse-parseopt.sh b/t/t1502-rev-parse-parseopt.sh
> @@ -306,6 +306,13 @@ test_expect_success 'test --parseopt help output: "wrapped" options normal "or:"
> +test_expect_success 'test --parseopt invalid opt-spec' '
> +       test_write_lines x -- "=, x" >spec &&
> +       echo "fatal: Missing opt-spec before option flags" >expect &&
> +       test_must_fail git rev-parse --parseopt -- >out <spec >actual 2>&1 &&
> +       test_cmp expect actual
> +'
> +
> @@ -337,3 +344,5 @@ test_expect_success 'test --parseopt help output: multi-line blurb after empty l
>
>  test_done
> +
> +test_done

Um? Debugging leftover?
SZEDER Gábor Sept. 2, 2022, 6:47 a.m. UTC | #2
On Fri, Sep 02, 2022 at 07:06:21AM +0200, Øystein Walle wrote:
> If a line in parseopts's input starts with one of the flag characters it
> is erroneously parsed as a opt-spec where the short name of the option
> is the flag character itself and the long name is after the end of the
> string. This makes Git want to allocate SIZE_MAX bytes of memory at this
> line:
> 
>     o->long_name = xmemdupz(sb.buf + 2, s - sb.buf - 2);
> 
> Since s and sb.buf are equal the second argument is -2 (except unsigned)
> and xmemdupz allocates len + 1 bytes, ie. -1 meaning SIZE_MAX.

I suspect (but didn't actually check) that this bug was added in
2d893dff4c (rev-parse --parseopt: allow [*=?!] in argument hints,
2015-07-14).

> Avoid this by checking whether a flag character was found in the zeroth
> position.
> 
> Signed-off-by: Øystein Walle <oystwa@gmail.com>
> ---
>  builtin/rev-parse.c           | 3 +++
>  t/t1502-rev-parse-parseopt.sh | 9 +++++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
> index b259d8990a..04958cf9a9 100644
> --- a/builtin/rev-parse.c
> +++ b/builtin/rev-parse.c
> @@ -479,6 +479,9 @@ static int cmd_parseopt(int argc, const char **argv, const char *prefix)
>  		if (!s)
>  			s = help;
>  
> +		if (s == sb.buf)
> +			die(_("Missing opt-spec before option flags"));
> +
>  		if (s - sb.buf == 1) /* short option only */
>  			o->short_name = *sb.buf;
>  		else if (sb.buf[1] != ',') /* long option only */
> diff --git a/t/t1502-rev-parse-parseopt.sh b/t/t1502-rev-parse-parseopt.sh
> index 284fe18e72..15bc240027 100755
> --- a/t/t1502-rev-parse-parseopt.sh
> +++ b/t/t1502-rev-parse-parseopt.sh
> @@ -306,6 +306,13 @@ test_expect_success 'test --parseopt help output: "wrapped" options normal "or:"
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'test --parseopt invalid opt-spec' '
> +	test_write_lines x -- "=, x" >spec &&
> +	echo "fatal: Missing opt-spec before option flags" >expect &&
> +	test_must_fail git rev-parse --parseopt -- >out <spec >actual 2>&1 &&

When checking an error message please don't look for it on standard
output; i.e. the redirection at the end should be '2>actual', or
perheps even better '2>err'.

> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'test --parseopt help output: multi-line blurb after empty line' '
>  	sed -e "s/^|//" >spec <<-\EOF &&
>  	|cmd [--some-option]
> @@ -337,3 +344,5 @@ test_expect_success 'test --parseopt help output: multi-line blurb after empty l
>  '
>  
>  test_done
> +
> +test_done
> -- 
> 2.34.1
>
Junio C Hamano Sept. 2, 2022, 4:27 p.m. UTC | #3
SZEDER Gábor <szeder.dev@gmail.com> writes:

> On Fri, Sep 02, 2022 at 07:06:21AM +0200, Øystein Walle wrote:
>> If a line in parseopts's input starts with one of the flag characters it
>> is erroneously parsed as a opt-spec where the short name of the option
>> is the flag character itself and the long name is after the end of the
>> string. This makes Git want to allocate SIZE_MAX bytes of memory at this
>> line:
>> 
>>     o->long_name = xmemdupz(sb.buf + 2, s - sb.buf - 2);
>> 
>> Since s and sb.buf are equal the second argument is -2 (except unsigned)
>> and xmemdupz allocates len + 1 bytes, ie. -1 meaning SIZE_MAX.
>
> I suspect (but didn't actually check) that this bug was added in
> 2d893dff4c (rev-parse --parseopt: allow [*=?!] in argument hints,
> 2015-07-14).

Good thing to add to the proposed log message.  Thanks.

Also, Øystein "Detect" -> "detect" on the title (you can see the
convention in the output from "git shortlog --no-merges").

>>  		if (!s)
>>  			s = help;
>>  
>> +		if (s == sb.buf)
>> +			die(_("Missing opt-spec before option flags"));
>> +

OK.

>> +test_expect_success 'test --parseopt invalid opt-spec' '
>> +	test_write_lines x -- "=, x" >spec &&
>> +	echo "fatal: Missing opt-spec before option flags" >expect &&
>> +	test_must_fail git rev-parse --parseopt -- >out <spec >actual 2>&1 &&
>
> When checking an error message please don't look for it on standard
> output; i.e. the redirection at the end should be '2>actual', or
> perheps even better '2>err'.

Again, very good point.

Thanks.
Jeff King Sept. 2, 2022, 5:13 p.m. UTC | #4
On Fri, Sep 02, 2022 at 07:06:21AM +0200, Øystein Walle wrote:

> diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
> index b259d8990a..04958cf9a9 100644
> --- a/builtin/rev-parse.c
> +++ b/builtin/rev-parse.c
> @@ -479,6 +479,9 @@ static int cmd_parseopt(int argc, const char **argv, const char *prefix)
>  		if (!s)
>  			s = help;
>  
> +		if (s == sb.buf)
> +			die(_("Missing opt-spec before option flags"));
> +
>  		if (s - sb.buf == 1) /* short option only */
>  			o->short_name = *sb.buf;
>  		else if (sb.buf[1] != ',') /* long option only */

I think this is the right thing to do, at least for now. Certainly it
catches the bug. It doesn't allow short or long option names to contain
any flag characters, but that's probably OK in practice.

I think one could make an argument that cmd_parseopt() should do a
better job of parsing left-to-right. The reason it missed this case is
that it calls strpbrk(), expecting to jump past the short/long option
names, but it jumps less far than expected.

If the parsing were more left-to-right, like:

  - start with pointer at beginning of sb.buf

  - look for acceptable character for short option, or "," for none;
    advance pointer if found, otherwise bail

  - look for syntactically valid long option name; advance pointer,
    otherwise bail

  - look for valid flags

then I think it becomes much easier to reason about what is valid for
each item. And we _could_ do things like allowing a short-option that is
also a flag-character, if we wanted to.

But IMHO such a refactoring can come later, or not at all. While it
might make the code a bit more clear, I don't think it meaningfully
improves the behavior. And either way, we should start with a minimal
and easy-to-verify fix like you have here.

-Peff
diff mbox series

Patch

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index b259d8990a..04958cf9a9 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -479,6 +479,9 @@  static int cmd_parseopt(int argc, const char **argv, const char *prefix)
 		if (!s)
 			s = help;
 
+		if (s == sb.buf)
+			die(_("Missing opt-spec before option flags"));
+
 		if (s - sb.buf == 1) /* short option only */
 			o->short_name = *sb.buf;
 		else if (sb.buf[1] != ',') /* long option only */
diff --git a/t/t1502-rev-parse-parseopt.sh b/t/t1502-rev-parse-parseopt.sh
index 284fe18e72..15bc240027 100755
--- a/t/t1502-rev-parse-parseopt.sh
+++ b/t/t1502-rev-parse-parseopt.sh
@@ -306,6 +306,13 @@  test_expect_success 'test --parseopt help output: "wrapped" options normal "or:"
 	test_cmp expect actual
 '
 
+test_expect_success 'test --parseopt invalid opt-spec' '
+	test_write_lines x -- "=, x" >spec &&
+	echo "fatal: Missing opt-spec before option flags" >expect &&
+	test_must_fail git rev-parse --parseopt -- >out <spec >actual 2>&1 &&
+	test_cmp expect actual
+'
+
 test_expect_success 'test --parseopt help output: multi-line blurb after empty line' '
 	sed -e "s/^|//" >spec <<-\EOF &&
 	|cmd [--some-option]
@@ -337,3 +344,5 @@  test_expect_success 'test --parseopt help output: multi-line blurb after empty l
 '
 
 test_done
+
+test_done