[5/5] test-lib-functions: restrict test_must_fail usage

Message ID	01e29450fe51a4ba13e07c611d8795ffd0282b9e.1593529394.git.liu.denton@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=HTZL=AL=vger.kernel.org=git-owner@kernel.org> From: Denton Liu <liu.denton@gmail.com> To: Git Mailing List <git@vger.kernel.org> Subject: [PATCH 5/5] test-lib-functions: restrict test_must_fail usage Date: Tue, 30 Jun 2020 11:03:19 -0400 Message-Id: <01e29450fe51a4ba13e07c611d8795ffd0282b9e.1593529394.git.liu.denton@gmail.com> In-Reply-To: <cover.1593529394.git.liu.denton@gmail.com> References: <cover.1593529394.git.liu.denton@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: git-owner@vger.kernel.org Precedence: bulk
Series	t: replace incorrect test_must_fail usage (part 6) \| expand [0/5] t: replace incorrect test_must_fail usage (part 6) [1/5] t3701: stop using `env` in force_color() [2/5] t5324: reorder `run_with_limited_open_files test_might_fail` [3/5] t7107: don't use test_must_fail() [4/5] t9834: remove use of `test_might_fail p4` [5/5] test-lib-functions: restrict test_must_fail usage

Message ID

01e29450fe51a4ba13e07c611d8795ffd0282b9e.1593529394.git.liu.denton@gmail.com (mailing list archive)

State

New, archived

Headers

From: Denton Liu <liu.denton@gmail.com>
To: Git Mailing List <git@vger.kernel.org>
Subject: [PATCH 5/5] test-lib-functions: restrict test_must_fail usage
Date: Tue, 30 Jun 2020 11:03:19 -0400
Message-Id: 
 <01e29450fe51a4ba13e07c611d8795ffd0282b9e.1593529394.git.liu.denton@gmail.com>
In-Reply-To: <cover.1593529394.git.liu.denton@gmail.com>
References: <cover.1593529394.git.liu.denton@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: git-owner@vger.kernel.org
Precedence: bulk

Series

t: replace incorrect test_must_fail usage (part 6) | expand

Commit Message

Denton Liu June 30, 2020, 3:03 p.m. UTC

In previous commits, we removed the usage of test_must_fail() for most
commands except for a set of pre-approved commands. Since that's done,
only allow test_must_fail() to run those pre-approved commands.

Obviously, we should allow `git`.

We allow `__git*` as some completion functions return an error code that
comes from a git invocation. It's good to avoid using test_must_fail
unnecessarily but it wouldn't hurt to err on the side of caution when
we're potentially wrapping a git command (like in these case).

We also allow `test-tool` and `test-svn-fe` because these are helper
commands that are written by us and we want to catch their failure.

Finally, we allow `test_terminal` because `test_terminal` just wraps
around git commands. Also, we cannot rewrite
`test_must_fail test_terminal` as `test_terminal test_must_fail` because
test_must_fail() is a shell function and as a result, it cannot be
invoked from the test-terminal Perl script.

We opted to explicitly list the above tools instead of using a catch-all
such as `test[-_]*` because we want to be as restrictive as possible so
that in the future, someone would not accidentally introduce an
unrelated usage of test_must_fail() on an "unapproved" command.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 t/t0000-basic.sh        | 18 ++++++++++++++++++
 t/test-lib-functions.sh | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

Comments

Eric Sunshine June 30, 2020, 8:56 p.m. UTC | #1

On Tue, Jun 30, 2020 at 11:03 AM Denton Liu <liu.denton@gmail.com> wrote:
> In previous commits, we removed the usage of test_must_fail() for most
> commands except for a set of pre-approved commands. Since that's done,
> only allow test_must_fail() to run those pre-approved commands.
>
> Obviously, we should allow `git`.
>
> We allow `__git*` as some completion functions return an error code that
> comes from a git invocation. It's good to avoid using test_must_fail
> unnecessarily but it wouldn't hurt to err on the side of caution when
> we're potentially wrapping a git command (like in these case).

s/case/cases/

> Signed-off-by: Denton Liu <liu.denton@gmail.com>
> ---
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> +# Returns success if the arguments indicate that a command should be
> +# accepted by test_must_fail(). If the command is run with env, the env
> +# and its corresponding variable settings will be stripped before we
> +# test the command being run.
> +test_must_fail_acceptable () {
> +       while test "$1" = "env"

I was surprised to see a 'while' loop for stripping 'env'. Did you
actually run across cases in the test suite in which 'env' was
invoking 'env'? If so, were such cases legitimate (as opposed to
accidental)? Perhaps the commit message or an in-code comment could
help readers understand why it needs to strip multiple 'env's.

> +       do
> +               shift
> +               while test $# -gt 0
> +               do
> +                       case "$1" in *?=*) ;; *) break ;; esac
> +                       shift
> +               done
> +       done

Isn't '*?=*' the same as '?=', or am I misunderstanding the intention?
Also, I wonder how important it is to insist that there must be at
least one character before the '=' sign. (It doesn't necessarily hurt,
but I'm curious if it is protecting against legitimate weird cases.)

This logic would be easier to follow written this way:

    case "$1" in
        =) shift ;;
        *) break ;;
    esac

That is, place the 'shift' in the appropriate case-arm rather than
suspending it below all cases.

> +       case "$1" in
> +       git|__git*|test-tool|test-svn-fe|test_terminal)
> +               return 0
> +               ;;
> +       *)
> +               return 1
> +               ;;
> +       esac
> +}

Would it make sense to error out if "$1" has no value? That is, if the
author wrote:

    test_must_fail &&

or

    test_must_fail env foo=bar &&

then that surely is a programmer error, which could be diagnosed here
(though the original 'test_must_fail' didn't bother diagnosing that
problem so it may be overkill and outside the scope of this series to
do so here).

> @@ -817,6 +842,15 @@ list_contains () {
> +# Do not use this to run anything but "git" and other specific testable
> +# commands (see test_must_fail_acceptable()).  We are not in the
> +# business of vetting system supplied commands -- in other words, this
> +# is wrong:
> +#
> +#    test_must_fail grep pattern output
> +#
> +# Just use '!' instead.

I find this somewhat ambiguous; it's not clear at first sight what I'm
supposed to do with '!'. t/README is slightly clearer by saying "use
'! cmd' instead". It might be even clearer to spell it out explicitly
with an example:

    Instead use '!':

        ! grep pattern output

Eric Sunshine June 30, 2020, 9:59 p.m. UTC | #2

On Tue, Jun 30, 2020 at 4:56 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Tue, Jun 30, 2020 at 11:03 AM Denton Liu <liu.denton@gmail.com> wrote:
> > +                       case "$1" in *?=*) ;; *) break ;; esac
>
> Isn't '*?=*' the same as '?=', or am I misunderstanding the intention?

Nevermind. My brain blipped out while reading that. Obviously '?=' is
not the same as '*?=*' here.

The rest of the review comments still stand (I think).

Denton Liu June 30, 2020, 11:39 p.m. UTC | #3

Hi Eric,

On Tue, Jun 30, 2020 at 04:56:22PM -0400, Eric Sunshine wrote:
> > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> > +# Returns success if the arguments indicate that a command should be
> > +# accepted by test_must_fail(). If the command is run with env, the env
> > +# and its corresponding variable settings will be stripped before we
> > +# test the command being run.
> > +test_must_fail_acceptable () {
> > +       while test "$1" = "env"
> 
> I was surprised to see a 'while' loop for stripping 'env'. Did you
> actually run across cases in the test suite in which 'env' was
> invoking 'env'? If so, were such cases legitimate (as opposed to
> accidental)? Perhaps the commit message or an in-code comment could
> help readers understand why it needs to strip multiple 'env's.

There are no cases of nested envs. When I was writing this, I wanted to
be as permissive as possible in case someone wrote something like this.
Now that you bring it up, though, I don't think it makes sense to
support this use case because I can't come up with any legitimate reason
to have nested envs. (If something comes up in the future, we can
reinstate this behaviour.)

> > +       do
> > +               shift
> > +               while test $# -gt 0
> > +               do
> > +                       case "$1" in *?=*) ;; *) break ;; esac
> > +                       shift
> > +               done
> > +       done

> Would it make sense to error out if "$1" has no value? That is, if the
> author wrote:
> 
>     test_must_fail &&
> 
> or
> 
>     test_must_fail env foo=bar &&
> 
> then that surely is a programmer error, which could be diagnosed here
> (though the original 'test_must_fail' didn't bother diagnosing that
> problem so it may be overkill and outside the scope of this series to
> do so here).

This is caught by the 

	case "$1" in
	git|__git*|test-tool|test-svn-fe|test_terminal)
		return 0
		;;
	*)
		return 1
		;;
	esac

part. It only emits

	test_must_fail: only 'git' is allowed:

if this happens but it's probably fine as I don't think we should do
much hand-holding for this case.

And I'll use the remainder of your suggestions.

Thanks,

Denton

diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 2ff176cd5d..f5e4fb515d 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -1271,4 +1271,22 @@  test_expect_success 'very long name in the index handled sanely' '
 	test $len = 4098
 '
 
+test_expect_success 'test_must_fail on a failing git command' '
+	test_must_fail git notacommand
+'
+
+test_expect_success 'test_must_fail on a failing git command with env' '
+	test_must_fail env var1=a var2=b env var3=c git notacommand
+'
+
+test_expect_success 'test_must_fail rejects a non-git command' '
+	! test_must_fail grep ^$ notafile 2>err &&
+	grep -F "test_must_fail: only '"'"'git'"'"' is allowed" err
+'
+
+test_expect_success 'test_must_fail rejects a non-git command with env' '
+	! test_must_fail env var1=a var2=b env var3=c grep ^$ notafile 2>err &&
+	grep -F "test_must_fail: only '"'"'git'"'"' is allowed" err
+'
+
 test_done
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 3103be8a32..16596b28ba 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -798,6 +798,31 @@  list_contains () {
 	return 1
 }
 
+# Returns success if the arguments indicate that a command should be
+# accepted by test_must_fail(). If the command is run with env, the env
+# and its corresponding variable settings will be stripped before we
+# test the command being run.
+test_must_fail_acceptable () {
+	while test "$1" = "env"
+	do
+		shift
+		while test $# -gt 0
+		do
+			case "$1" in *?=*) ;; *) break ;; esac
+			shift
+		done
+	done
+
+	case "$1" in
+	git|__git*|test-tool|test-svn-fe|test_terminal)
+		return 0
+		;;
+	*)
+		return 1
+		;;
+	esac
+}
+
 # This is not among top-level (test_expect_success | test_expect_failure)
 # but is a prefix that can be used in the test script, like:
 #
@@ -817,6 +842,15 @@  list_contains () {
 #     Multiple signals can be specified as a comma separated list.
 #     Currently recognized signal names are: sigpipe, success.
 #     (Don't use 'success', use 'test_might_fail' instead.)
+#
+# Do not use this to run anything but "git" and other specific testable
+# commands (see test_must_fail_acceptable()).  We are not in the
+# business of vetting system supplied commands -- in other words, this
+# is wrong:
+#
+#    test_must_fail grep pattern output
+#
+# Just use '!' instead.
 
 test_must_fail () {
 	case "$1" in
@@ -828,6 +862,11 @@  test_must_fail () {
 		_test_ok=
 		;;
 	esac
+	if ! test_must_fail_acceptable "$@"
+	then
+	    echo >&7 "test_must_fail: only 'git' is allowed: $*"
+	    return 1
+	fi
 	"$@" 2>&7
 	exit_code=$?
 	if test $exit_code -eq 0 && ! list_contains "$_test_ok" success

[5/5] test-lib-functions: restrict test_must_fail usage

Commit Message

Comments

Patch