[v2] test_cmp: diagnose incorrect arguments

Message ID	20200809174209.15466-1-sunshine@sunshineco.com (mailing list archive)
State	Accepted
Commit	d572f52a64c6a69990f72ad6a09504b9b615d2e4
Headers	show Return-Path: <SRS0=dguO=BT=vger.kernel.org=git-owner@kernel.org> From: Eric Sunshine <sunshine@sunshineco.com> To: git@vger.kernel.org Cc: Elijah Newren <newren@gmail.com>, Junio C Hamano <gitster@pobox.com>, Shourya Shukla <shouryashukla.oo@gmail.com>, Eric Sunshine <sunshine@sunshineco.com> Subject: [PATCH v2] test_cmp: diagnose incorrect arguments Date: Sun, 9 Aug 2020 13:42:09 -0400 Message-Id: <20200809174209.15466-1-sunshine@sunshineco.com> In-Reply-To: <20200809060810.31370-1-sunshine@sunshineco.com> References: <20200809060810.31370-1-sunshine@sunshineco.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: git-owner@vger.kernel.org Precedence: bulk
Series	[v2] test_cmp: diagnose incorrect arguments \| expand [v2] test_cmp: diagnose incorrect arguments

Eric Sunshine Aug. 9, 2020, 5:42 p.m. UTC

Under normal circumstances, if a test author misspells a filename passed
to test_cmp(), the error is quickly discovered when the test fails
unexpectedly due to test_cmp() being unable to find the file. However,
if the test is expected to fail, as with test_expect_failure(), a
misspelled filename as argument to test_cmp() will go unnoticed since
the test will indeed fail, but for the wrong reason. Make it easier for
test authors to discover such problems early by sanity-checking the
arguments to test_cmp(). To avoid penalizing all clients of test_cmp()
in the general case, only check for missing files if the comparison
fails.

While at it, make test_cmp_bin() sanity-check its arguments, as well.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---

Notes:
    This is a re-roll of [1] which was motivated by seeing Elijah fix[2]
    several cases of bogus test_cmp() calls which perhaps could have
    been detected earlier.
    
    Changes since v1:
    
    * take into account that some callers pass "-" (meaning standard
      input) as an argument to test_cmp() (pointed out privately by
      Junio)
    
    * show the name of the missing file rather than a placeholder
      (Shourya[3])
    
    [1]: https://lore.kernel.org/git/20200809060810.31370-1-sunshine@sunshineco.com/
    [2]: https://lore.kernel.org/git/7f408b7d4069403b969d334f4940ebf87f1dc797.1596906081.git.gitgitgadget@gmail.com/
    [3]: https://lore.kernel.org/git/20200809083227.GA11219@konoha/

Interdiff against v1:
  diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
  index 8d77deebd2..a12d6a3fc9 100644
  --- a/t/test-lib-functions.sh
  +++ b/t/test-lib-functions.sh
  @@ -955,8 +955,8 @@ test_cmp() {
   	test $# -eq 2 || BUG "test_cmp requires two arguments"
   	if ! eval "$GIT_TEST_CMP" '"$@"'
   	then
  -		test -e "$1" || BUG "test_cmp 'expect' file missing"
  -		test -e "$2" || BUG "test_cmp 'actual' file missing"
  +		test "x$1" = x- || test -e "$1" || BUG "test_cmp '$1' missing"
  +		test "x$2" = x- || test -e "$2" || BUG "test_cmp '$2' missing"
   		return 1
   	fi
   }
  @@ -990,8 +990,8 @@ test_cmp_bin() {
   	test $# -eq 2 || BUG "test_cmp_bin requires two arguments"
   	if ! cmp "$@"
   	then
  -		test -e "$1" || BUG "test_cmp_bin 'expect' file missing"
  -		test -e "$2" || BUG "test_cmp_bin 'actual' file missing"
  +		test "x$1" = x- || test -e "$1" || BUG "test_cmp_bin '$1' missing"
  +		test "x$2" = x- || test -e "$2" || BUG "test_cmp_bin '$2' missing"
   		return 1
   	fi
   }

 t/test-lib-functions.sh | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Junio C Hamano Aug. 9, 2020, 7:12 p.m. UTC | #1

Eric Sunshine <sunshine@sunshineco.com> writes:

> Under normal circumstances, if a test author misspells a filename passed
> to test_cmp(), the error is quickly discovered when the test fails
> unexpectedly due to test_cmp() being unable to find the file. However,
> if the test is expected to fail, as with test_expect_failure(), a
> misspelled filename as argument to test_cmp() will go unnoticed since
> the test will indeed fail, but for the wrong reason. Make it easier for
> test authors to discover such problems early by sanity-checking the
> arguments to test_cmp(). To avoid penalizing all clients of test_cmp()
> in the general case, only check for missing files if the comparison
> fails.
>
> While at it, make test_cmp_bin() sanity-check its arguments, as well.
>
> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> ---

The motivation makes quite a lot of sense.

I suspect that it is a bug to use "-", i.e. "read from the standard
input", for "$2", because

 (1) if it is used for the expected output, we got the comparison in
     the wrong direction;

 (2) if it is used for the actual output, we have a command whose
     output we are interested in on the upstream side of a pipe to
     discard its exit status.

but I'll let it pass.

> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index b791933ffd..a12d6a3fc9 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -952,7 +952,13 @@ test_expect_code () {
>  # - not all diff versions understand "-u"
>  
>  test_cmp() {
> -	eval "$GIT_TEST_CMP" '"$@"'
> +	test $# -eq 2 || BUG "test_cmp requires two arguments"
> +	if ! eval "$GIT_TEST_CMP" '"$@"'
> +	then
> +		test "x$1" = x- || test -e "$1" || BUG "test_cmp '$1' missing"
> +		test "x$2" = x- || test -e "$2" || BUG "test_cmp '$2' missing"
> +		return 1
> +	fi
>  }
>  
>  # Check that the given config key has the expected value.
> @@ -981,7 +987,13 @@ test_cmp_config() {
>  # test_cmp_bin - helper to compare binary files
>  
>  test_cmp_bin() {
> -	cmp "$@"
> +	test $# -eq 2 || BUG "test_cmp_bin requires two arguments"
> +	if ! cmp "$@"
> +	then
> +		test "x$1" = x- || test -e "$1" || BUG "test_cmp_bin '$1' missing"
> +		test "x$2" = x- || test -e "$2" || BUG "test_cmp_bin '$2' missing"
> +		return 1
> +	fi
>  }
>  
>  # Use this instead of test_cmp to compare files that contain expected and

Eric Sunshine Aug. 9, 2020, 7:34 p.m. UTC | #2

On Sun, Aug 9, 2020 at 3:12 PM Junio C Hamano <gitster@pobox.com> wrote:
> The motivation makes quite a lot of sense.
>
> I suspect that it is a bug to use "-", i.e. "read from the standard
> input", for "$2", because
>
>  (1) if it is used for the expected output, we got the comparison in
>      the wrong direction;

I had the same concern that it didn't necessarily make sense to allow
"-" for $2, but eventually thought of a case such as:

    sed '...' actual | test_cmp expect - &&

which massages 'actual' before test_cmp() sees it. True, we don't
actually have such callers, but it theoretically is legitimate.

>  (2) if it is used for the actual output, we have a command whose
>      output we are interested in on the upstream side of a pipe to
>      discard its exit status.

If the command upstream is a Git command, that could indeed be a
reason for concern, but, as illustrated above, it could just be a
system command.

Junio C Hamano Aug. 10, 2020, 3:22 p.m. UTC | #3

Eric Sunshine <sunshine@sunshineco.com> writes:

> I had the same concern that it didn't necessarily make sense to allow
> "-" for $2, but eventually thought of a case such as:
>
>     sed '...' actual | test_cmp expect - &&
>
> which massages 'actual' before test_cmp() sees it. True, we don't
> actually have such callers, but it theoretically is legitimate.

Yup, that looks a bit too stretching [*] to me, but that was what I
had in mind when I said "I'll let it pass".

    Side note.  Presumably that 'actual' was written by Git, to
    avoid losing its exit code, e.g.

	git frotz >actual &&
	sed '...' actual | test_cmp expect -

    but then it becomes more natural to write the second one like
    so:

	git frotz >raw &&
	sed '...' raw >actual &&
	test_cmp expect actual

Taylor Blau Aug. 11, 2020, 6:32 p.m. UTC | #4

On Sun, Aug 09, 2020 at 01:42:09PM -0400, Eric Sunshine wrote:
> Under normal circumstances, if a test author misspells a filename passed
> to test_cmp(), the error is quickly discovered when the test fails
> unexpectedly due to test_cmp() being unable to find the file. However,
> if the test is expected to fail, as with test_expect_failure(), a
> misspelled filename as argument to test_cmp() will go unnoticed since
> the test will indeed fail, but for the wrong reason. Make it easier for
> test authors to discover such problems early by sanity-checking the
> arguments to test_cmp(). To avoid penalizing all clients of test_cmp()
> in the general case, only check for missing files if the comparison
> fails.
>
> While at it, make test_cmp_bin() sanity-check its arguments, as well.
>
> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> ---
>
> Notes:
>     This is a re-roll of [1] which was motivated by seeing Elijah fix[2]
>     several cases of bogus test_cmp() calls which perhaps could have
>     been detected earlier.
>
>     Changes since v1:
>
>     * take into account that some callers pass "-" (meaning standard
>       input) as an argument to test_cmp() (pointed out privately by
>       Junio)
>
>     * show the name of the missing file rather than a placeholder
>       (Shourya[3])
>
>     [1]: https://lore.kernel.org/git/20200809060810.31370-1-sunshine@sunshineco.com/
>     [2]: https://lore.kernel.org/git/7f408b7d4069403b969d334f4940ebf87f1dc797.1596906081.git.gitgitgadget@gmail.com/
>     [3]: https://lore.kernel.org/git/20200809083227.GA11219@konoha/
>
> Interdiff against v1:
>   diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
>   index 8d77deebd2..a12d6a3fc9 100644
>   --- a/t/test-lib-functions.sh
>   +++ b/t/test-lib-functions.sh
>   @@ -955,8 +955,8 @@ test_cmp() {
>    	test $# -eq 2 || BUG "test_cmp requires two arguments"
>    	if ! eval "$GIT_TEST_CMP" '"$@"'
>    	then
>   -		test -e "$1" || BUG "test_cmp 'expect' file missing"
>   -		test -e "$2" || BUG "test_cmp 'actual' file missing"
>   +		test "x$1" = x- || test -e "$1" || BUG "test_cmp '$1' missing"
>   +		test "x$2" = x- || test -e "$2" || BUG "test_cmp '$2' missing"
>    		return 1
>    	fi
>    }
>   @@ -990,8 +990,8 @@ test_cmp_bin() {
>    	test $# -eq 2 || BUG "test_cmp_bin requires two arguments"
>    	if ! cmp "$@"
>    	then
>   -		test -e "$1" || BUG "test_cmp_bin 'expect' file missing"
>   -		test -e "$2" || BUG "test_cmp_bin 'actual' file missing"
>   +		test "x$1" = x- || test -e "$1" || BUG "test_cmp_bin '$1' missing"
>   +		test "x$2" = x- || test -e "$2" || BUG "test_cmp_bin '$2' missing"
>    		return 1
>    	fi
>    }
>
>  t/test-lib-functions.sh | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index b791933ffd..a12d6a3fc9 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -952,7 +952,13 @@ test_expect_code () {
>  # - not all diff versions understand "-u"
>
>  test_cmp() {
> -	eval "$GIT_TEST_CMP" '"$@"'
> +	test $# -eq 2 || BUG "test_cmp requires two arguments"
> +	if ! eval "$GIT_TEST_CMP" '"$@"'
> +	then
> +		test "x$1" = x- || test -e "$1" || BUG "test_cmp '$1' missing"
> +		test "x$2" = x- || test -e "$2" || BUG "test_cmp '$2' missing"

Not related to your patch, but I've seen this style of "x$1" in a few
places in test-lib-functions.sh. Why can't this be written as 'test "$1"
= -'?

> +		return 1
> +	fi
>  }
>
>  # Check that the given config key has the expected value.
> @@ -981,7 +987,13 @@ test_cmp_config() {
>  # test_cmp_bin - helper to compare binary files
>
>  test_cmp_bin() {
> -	cmp "$@"
> +	test $# -eq 2 || BUG "test_cmp_bin requires two arguments"
> +	if ! cmp "$@"
> +	then
> +		test "x$1" = x- || test -e "$1" || BUG "test_cmp_bin '$1' missing"
> +		test "x$2" = x- || test -e "$2" || BUG "test_cmp_bin '$2' missing"
> +		return 1
> +	fi
>  }
>
>  # Use this instead of test_cmp to compare files that contain expected and
> --
> 2.28.0.220.ged08abb693
>
Thanks,
Taylor

Eric Sunshine Aug. 11, 2020, 7:25 p.m. UTC | #5

On Tue, Aug 11, 2020 at 2:33 PM Taylor Blau <me@ttaylorr.com> wrote:
> On Sun, Aug 09, 2020 at 01:42:09PM -0400, Eric Sunshine wrote:
> > +             test "x$2" = x- || test -e "$2" || BUG "test_cmp '$2' missing"
>
> Not related to your patch, but I've seen this style of "x$1" in a few
> places in test-lib-functions.sh. Why can't this be written as 'test "$1"
> = -'?

Short answer: To prevent 'test' from thinking that the argument is a switch.

Longer answer:

'test' can accept both switches (i.e. "-e") and non-switch arguments.
Keep in mind, too, that all the quoting is stripped by the shell
_before_ 'test' ever sees its arguments. Let's say that the caller has
a filename whose name actually is "-e" and passes that in as $1. So,
what does 'test' see?

    test -e = -

Rather than comparing literal string "-e" to literal string "-", it's
instead (almost) asking if the file named "=" exists; I say "almost"
because it's actually an error since switch -e only accepts one
argument, but it's being given two arguments, "=" and "-".

You might say that having a file named "-e" (or similar) is unlikely,
however, what is not unlikely is a caller passing "-" for
standard-input as $1. In this case, 'test' sees:

    test - = -

which may or may not be an error in a particular implementation of
'test'. Some implementations may understand that "-" is not a valid
switch, thus infer that you're actually asking for an equality
comparison between arguments, but other implementations may complain
either that there is no switch named "-" or that those arguments
simply make no sense.

This is why it's a very common idiom in shell programming with 'test'
to see "x" prepended, thus ensuring that the argument can't be
confused with a switch.

Taylor Blau Aug. 11, 2020, 9:03 p.m. UTC | #6

On Tue, Aug 11, 2020 at 03:25:03PM -0400, Eric Sunshine wrote:
> On Tue, Aug 11, 2020 at 2:33 PM Taylor Blau <me@ttaylorr.com> wrote:
> > On Sun, Aug 09, 2020 at 01:42:09PM -0400, Eric Sunshine wrote:
> > > +             test "x$2" = x- || test -e "$2" || BUG "test_cmp '$2' missing"
> >
> > Not related to your patch, but I've seen this style of "x$1" in a few
> > places in test-lib-functions.sh. Why can't this be written as 'test "$1"
> > = -'?
>
> Short answer: To prevent 'test' from thinking that the argument is a switch.

Makes sense. Now I feel silly for asking :).

> Longer answer:
>
> 'test' can accept both switches (i.e. "-e") and non-switch arguments.
> Keep in mind, too, that all the quoting is stripped by the shell
> _before_ 'test' ever sees its arguments. Let's say that the caller has
> a filename whose name actually is "-e" and passes that in as $1. So,
> what does 'test' see?
>
>     test -e = -
>
> Rather than comparing literal string "-e" to literal string "-", it's
> instead (almost) asking if the file named "=" exists; I say "almost"
> because it's actually an error since switch -e only accepts one
> argument, but it's being given two arguments, "=" and "-".
>
> You might say that having a file named "-e" (or similar) is unlikely,
> however, what is not unlikely is a caller passing "-" for
> standard-input as $1. In this case, 'test' sees:
>
>     test - = -
>
> which may or may not be an error in a particular implementation of
> 'test'. Some implementations may understand that "-" is not a valid
> switch, thus infer that you're actually asking for an equality
> comparison between arguments, but other implementations may complain
> either that there is no switch named "-" or that those arguments
> simply make no sense.
>
> This is why it's a very common idiom in shell programming with 'test'
> to see "x" prepended, thus ensuring that the argument can't be
> confused with a switch.

Thanks for a careful and helpful explanation, as always :-). Makes sense
to me.

Thanks,
Taylor

Jeff King Aug. 12, 2020, 3:37 p.m. UTC | #7

On Tue, Aug 11, 2020 at 03:25:03PM -0400, Eric Sunshine wrote:

> 'test' can accept both switches (i.e. "-e") and non-switch arguments.
> Keep in mind, too, that all the quoting is stripped by the shell
> _before_ 'test' ever sees its arguments. Let's say that the caller has
> a filename whose name actually is "-e" and passes that in as $1. So,
> what does 'test' see?
> 
>     test -e = -
> 
> Rather than comparing literal string "-e" to literal string "-", it's
> instead (almost) asking if the file named "=" exists; I say "almost"
> because it's actually an error since switch -e only accepts one
> argument, but it's being given two arguments, "=" and "-".

I don't think this is an error. The program can tell which you meant by
the number of arguments. POSIX lays out some rules here (from "man
1posix test" on my system, but I'm sure you can find it online):

  3 arguments:
    *  If $2 is a binary primary, perform the binary test of $1 and $3.

    *  If $1 is '!', negate the two-argument test of $2 and $3.

    *  If $1 is '(' and $3 is ')', perform the unary test of $2.  On
       systems that do not support the XSI option, the results are
       unspecified if $1 is '(' and $3 is ')'.

    *  Otherwise, produce unspecified results.

So we'd see that "=" is a binary primary (the complete set is defined
earlier). Likewise "! -e -" would hit the second rule. We wouldn't get
fooled by trying to compare the string "!" because it knows that "=" is
a binary operator and "-e" is a unary operator.

It gets weird with "-a" joining expressions. There's some discussion in
the Rationale section of the posix page, and it even warns explicitly
against "-a" (in favor of "test expr1 && test expr1").

> which may or may not be an error in a particular implementation of
> 'test'. Some implementations may understand that "-" is not a valid
> switch, thus infer that you're actually asking for an equality
> comparison between arguments, but other implementations may complain
> either that there is no switch named "-" or that those arguments
> simply make no sense.

Yeah, historically I think there were shells that were not quite so
clever. I have no idea which ones, or whether any are still around. I
don't think we've generally been that concerned with this case in Git,
and nobody has complained. I'd be totally unsurprised to hear that SunOS
/bin/sh doesn't get this right, but we've already written it off as
unusable (it doesn't even support $() expansion).

-Peff

Eric Sunshine Aug. 12, 2020, 4:15 p.m. UTC | #8

On Wed, Aug 12, 2020 at 11:37 AM Jeff King <peff@peff.net> wrote:
> On Tue, Aug 11, 2020 at 03:25:03PM -0400, Eric Sunshine wrote:
> >     test -e = -
> > Rather than comparing literal string "-e" to literal string "-", it's
> > instead (almost) asking if the file named "=" exists; I say "almost"
> > because it's actually an error since switch -e only accepts one
> > argument, but it's being given two arguments, "=" and "-".
>
> I don't think this is an error. The program can tell which you meant by
> the number of arguments. POSIX lays out some rules here (from "man
> 1posix test" on my system, but I'm sure you can find it online):

I intentionally didn't focus on or mention POSIX in my response
because I wanted to represent the Real World reason why "x$var" is
such a common idiom. As a person who regularly uses ancient operating
systems and old computers (it's common for me to be stuck on computers
10-25 years old; my most up-to-date computer is 9.5 years old), Real
World is something I focus on more than POSIX. Okay, I may be an
outlier, but still...

> > which may or may not be an error in a particular implementation of
> > 'test'. Some implementations may understand that "-" is not a valid
> > switch, thus infer that you're actually asking for an equality
> > comparison between arguments, but other implementations may complain
> > either that there is no switch named "-" or that those arguments
> > simply make no sense.
>
> Yeah, historically I think there were shells that were not quite so
> clever. I have no idea which ones, or whether any are still around. I
> don't think we've generally been that concerned with this case in Git,
> and nobody has complained. I'd be totally unsurprised to hear that SunOS
> /bin/sh doesn't get this right, but we've already written it off as
> unusable (it doesn't even support $() expansion).

I'm probably showing my age, but having had to deal with the many
inconsistencies of implementation over the years, it's hard to _not_
worry about such things even now. Adding the "x" to "$1" or to making
other minor portability concessions is reflex at this point. These
portability concerns are also always on mind when dealing with Mac OS
since so many of its utilities are ancient and retain the old
behaviors.

Eric Sunshine Aug. 12, 2020, 4:39 p.m. UTC | #9

On Wed, Aug 12, 2020 at 12:15 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Aug 12, 2020 at 11:37 AM Jeff King <peff@peff.net> wrote:
> > I don't think this is an error. The program can tell which you meant by
> > the number of arguments. POSIX lays out some rules here (from "man
> > 1posix test" on my system, but I'm sure you can find it online):
>
> I intentionally didn't focus on or mention POSIX in my response
> because I wanted to represent the Real World reason why "x$var" is
> such a common idiom. [...]

I probably should have done a better job in my original response to
Taylor to make it clear that I was talking about Real World (even if
old) rather than POSIX.

Jeff King Aug. 12, 2020, 5:10 p.m. UTC | #10

On Wed, Aug 12, 2020 at 12:39:14PM -0400, Eric Sunshine wrote:

> On Wed, Aug 12, 2020 at 12:15 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > On Wed, Aug 12, 2020 at 11:37 AM Jeff King <peff@peff.net> wrote:
> > > I don't think this is an error. The program can tell which you meant by
> > > the number of arguments. POSIX lays out some rules here (from "man
> > > 1posix test" on my system, but I'm sure you can find it online):
> >
> > I intentionally didn't focus on or mention POSIX in my response
> > because I wanted to represent the Real World reason why "x$var" is
> > such a common idiom. [...]
> 
> I probably should have done a better job in my original response to
> Taylor to make it clear that I was talking about Real World (even if
> old) rather than POSIX.

Yeah. I guess I'm questioning how current that Real World view is. It
hasn't bitten us yet, though we do seem to do the "x" thing in some
places. And most of our shell code is in the test suite, which sees
pretty tame filenames.

-Peff

Jeff King Oct. 16, 2020, 12:17 a.m. UTC | #11

On Sun, Aug 09, 2020 at 01:42:09PM -0400, Eric Sunshine wrote:

> Under normal circumstances, if a test author misspells a filename passed
> to test_cmp(), the error is quickly discovered when the test fails
> unexpectedly due to test_cmp() being unable to find the file. However,
> if the test is expected to fail, as with test_expect_failure(), a
> misspelled filename as argument to test_cmp() will go unnoticed since
> the test will indeed fail, but for the wrong reason. Make it easier for
> test authors to discover such problems early by sanity-checking the
> arguments to test_cmp(). To avoid penalizing all clients of test_cmp()
> in the general case, only check for missing files if the comparison
> fails.

This patch caused some interesting confusion for me today.

I was looking at the patch from [1] which causes a test failure (and I
wanted to see where it failed, etc). And I got:

  $ ./t5601-clone.sh
  ok 1 - setup
  ok 2 - clone with excess parameters (1)
  ok 3 - clone with excess parameters (2)
  ok 4 - output from clone
  ok 5 - clone does not keep pack
  ok 6 - clone checks out files
  ok 7 - clone respects GIT_WORK_TREE
  error: bug in the test script: test_cmp 'r2/.git/HEAD' missing

which was somewhat unhelpful (or at least less helpful than a regular
test failure). The test in question does this:

	test_cmp r0/.git/HEAD r2/.git/HEAD &&

and expects to fail if an earlier step didn't correctly create r2. Is it
a bug or misuse of test_cmp for it to do so? I could see an argument
that it is, but I'm also not sure if there's a convenient alternative.
The best I could come up with is:

  test_path_is_file r2/.git/HEAD &&
  test_cmp r0/.git/HEAD r2/.git/HEAD

which isn't that great.

-Peff

[1] https://lore.kernel.org/git/20200130102933.GE840531@coredump.intra.peff.net/

Eric Sunshine Oct. 16, 2020, 2:18 a.m. UTC | #12

On Thu, Oct 15, 2020 at 8:17 PM Jeff King <peff@peff.net> wrote:
> On Sun, Aug 09, 2020 at 01:42:09PM -0400, Eric Sunshine wrote:
> > [...] Make it easier for test authors to discover such problems early
> > by sanity-checking the arguments to test_cmp(). [...]
>
> This patch caused some interesting confusion for me today.
>   error: bug in the test script: test_cmp 'r2/.git/HEAD' missing
> which was somewhat unhelpful (or at least less helpful than a regular
> test failure). The test in question does this:
>         test_cmp r0/.git/HEAD r2/.git/HEAD &&
> and expects to fail if an earlier step didn't correctly create r2. Is it
> a bug or misuse of test_cmp for it to do so? I could see an argument
> that it is, but I'm also not sure if there's a convenient alternative.

I can see the argument going both ways as to whether it's a misuse of
'test_cmp'.

> The best I could come up with is:
>
>   test_path_is_file r2/.git/HEAD &&
>   test_cmp r0/.git/HEAD r2/.git/HEAD
>
> which isn't that great.

Ævar ran into the same issue recently[1] and came up with the same
workaround. Despite its good intention (trying to catch bugs in
'test_expect_failure' tests), this change[2] doesn't seem to have
caught any genuine bugs (it wouldn't even have caught the bug which
served as its inspiration[3]), but has nevertheless caused a couple
hiccups already. As such, I would not be opposed to seeing the change
reverted.

[1]: https://lore.kernel.org/git/20200921104000.2304-15-avarab@gmail.com/
[2]: https://lore.kernel.org/git/20200809060810.31370-1-sunshine@sunshineco.com/
[3]: https://lore.kernel.org/git/7f408b7d4069403b969d334f4940ebf87f1dc797.1596906081.git.gitgitgadget@gmail.com/

Junio C Hamano Oct. 16, 2020, 6:36 p.m. UTC | #13

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Thu, Oct 15, 2020 at 8:17 PM Jeff King <peff@peff.net> wrote:
>> On Sun, Aug 09, 2020 at 01:42:09PM -0400, Eric Sunshine wrote:
>> > [...] Make it easier for test authors to discover such problems early
>> > by sanity-checking the arguments to test_cmp(). [...]
>>
>> This patch caused some interesting confusion for me today.
>>   error: bug in the test script: test_cmp 'r2/.git/HEAD' missing
>> which was somewhat unhelpful (or at least less helpful than a regular
>> test failure). The test in question does this:
>>         test_cmp r0/.git/HEAD r2/.git/HEAD &&
>> and expects to fail if an earlier step didn't correctly create r2. Is it
>> a bug or misuse of test_cmp for it to do so? I could see an argument
>> that it is, but I'm also not sure if there's a convenient alternative.
>
> I can see the argument going both ways as to whether it's a misuse of
> 'test_cmp'.
>
>> The best I could come up with is:
>>
>>   test_path_is_file r2/.git/HEAD &&
>>   test_cmp r0/.git/HEAD r2/.git/HEAD
>>
>> which isn't that great.

Hmph, I agree that the "both must be file" is a bit too eager and
ignores that "they must match, but the possible reasons they may not
include one of them may be missing" use case.

> Ævar ran into the same issue recently[1] and came up with the same
> workaround. Despite its good intention (trying to catch bugs in
> 'test_expect_failure' tests), this change[2] doesn't seem to have
> caught any genuine bugs (it wouldn't even have caught the bug which
> served as its inspiration[3]), but has nevertheless caused a couple
> hiccups already. As such, I would not be opposed to seeing the change
> reverted.

Sounds good.  Anybody wants to do the honors?

Junio C Hamano Oct. 16, 2020, 8:56 p.m. UTC | #14

Junio C Hamano <gitster@pobox.com> writes:

> Hmph, I agree that the "both must be file" is a bit too eager and
> ignores that "they must match, but the possible reasons they may not
> include one of them may be missing" use case.
>
>> Ævar ran into the same issue recently[1] and came up with the same
>> workaround. Despite its good intention (trying to catch bugs in
>> 'test_expect_failure' tests), this change[2] doesn't seem to have
>> caught any genuine bugs (it wouldn't even have caught the bug which
>> served as its inspiration[3]), but has nevertheless caused a couple
>> hiccups already. As such, I would not be opposed to seeing the change
>> reverted.
>
> Sounds good.  Anybody wants to do the honors?

For now I've queued this directly on top of the offending change and
merged the result in 'seen', perhaps to be advanced to 'next' and
'master' after the 2.29 final unless a better version comes.

Thanks.

-- >8 --
commit b0b2d8b4e00fd34c0031b6dbcd67b4bcf22d864c
Author: Junio C Hamano <gitster@pobox.com>
Date:   Fri Oct 16 13:51:04 2020 -0700

    Revert "test_cmp: diagnose incorrect arguments"
    
    This reverts commit d572f52a64c6a69990f72ad6a09504b9b615d2e4; the
    idea to detect that "test_cmp expect actual" was fed a misspelt
    filename meant well, but when the version of Git tested exhibits a
    bug, the reason why these two files do not match may be because one
    of them did not get created as expected, in which case missing file
    is not a sign of misspelt filename but is a genuine test failure.
---
 t/test-lib-functions.sh | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 21225330c2..3103be8a32 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -905,13 +905,7 @@ test_expect_code () {
 # - not all diff versions understand "-u"
 
 test_cmp() {
-	test $# -eq 2 || BUG "test_cmp requires two arguments"
-	if ! eval "$GIT_TEST_CMP" '"$@"'
-	then
-		test "x$1" = x- || test -e "$1" || BUG "test_cmp '$1' missing"
-		test "x$2" = x- || test -e "$2" || BUG "test_cmp '$2' missing"
-		return 1
-	fi
+	eval "$GIT_TEST_CMP" '"$@"'
 }
 
 # Check that the given config key has the expected value.
@@ -940,13 +934,7 @@ test_cmp_config() {
 # test_cmp_bin - helper to compare binary files
 
 test_cmp_bin() {
-	test $# -eq 2 || BUG "test_cmp_bin requires two arguments"
-	if ! cmp "$@"
-	then
-		test "x$1" = x- || test -e "$1" || BUG "test_cmp_bin '$1' missing"
-		test "x$2" = x- || test -e "$2" || BUG "test_cmp_bin '$2' missing"
-		return 1
-	fi
+	cmp "$@"
 }
 
 # Use this instead of test_cmp to compare files that contain expected and

Junio C Hamano Oct. 16, 2020, 11:14 p.m. UTC | #15

Junio C Hamano <gitster@pobox.com> writes:

>  test_cmp() {
> -	test $# -eq 2 || BUG "test_cmp requires two arguments"
> -	if ! eval "$GIT_TEST_CMP" '"$@"'
> -	then
> -		test "x$1" = x- || test -e "$1" || BUG "test_cmp '$1' missing"
> -		test "x$2" = x- || test -e "$2" || BUG "test_cmp '$2' missing"
> -		return 1
> -	fi
> +	eval "$GIT_TEST_CMP" '"$@"'
>  }
>  
>  # Check that the given config key has the expected value.
> @@ -940,13 +934,7 @@ test_cmp_config() {
>  # test_cmp_bin - helper to compare binary files
>  
>  test_cmp_bin() {
> -	test $# -eq 2 || BUG "test_cmp_bin requires two arguments"
> -	if ! cmp "$@"
> -	then
> -		test "x$1" = x- || test -e "$1" || BUG "test_cmp_bin '$1' missing"
> -		test "x$2" = x- || test -e "$2" || BUG "test_cmp_bin '$2' missing"
> -		return 1
> -	fi
> +	cmp "$@"
>  }

Looking at this again, I think we could keep the "we should have two
arguments, no more than, no less than, but exactly two".  But I think
those who write new tests are working to eventually make them pass,
so hopefully they'll notice and investigate test_cmp that yields false
anyway, I guess.

Eric Sunshine Oct. 17, 2020, 6:06 a.m. UTC | #16

On Fri, Oct 16, 2020 at 7:14 PM Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> >  test_cmp() {
> > -     test $# -eq 2 || BUG "test_cmp requires two arguments"
> > -     if ! eval "$GIT_TEST_CMP" '"$@"'
>
> Looking at this again, I think we could keep the "we should have two
> arguments, no more than, no less than, but exactly two".  But I think
> those who write new tests are working to eventually make them pass,
> so hopefully they'll notice and investigate test_cmp that yields false
> anyway, I guess.

The purpose of the change in question[1] was specifically to help
catch bugs in the test_expect_failure() case, which is quite rare in
the Git test suite (and hopefully becomes rarer over time). So, I
think it's fine to revert the change fully, including the 2-argument
check since, as you say, test writers are normally trying to make
their tests pass, and the 2-argument check doesn't provide much, if
any, value for the test_expect_success() case.

[1]: d572f52a64 (test_cmp: diagnose incorrect arguments, 2020-08-09)

[v2] test_cmp: diagnose incorrect arguments

Commit Message

Comments

Patch