diff mbox series

[3/9] ci: make it easier to find failed tests' logs in the GitHub workflow

Message ID 98891b0d3f7927086350cc9523736ae2eb7c1ee2.1643050574.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series ci: make Git's GitHub workflow output much more helpful | expand

Commit Message

Johannes Schindelin Jan. 24, 2022, 6:56 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

You currently have to know a lot of implementation details when
investigating test failures in the CI runs. The first step is easy: the
failed job is marked quite clearly, but when opening it, the failed step
is expanded, which in our case is the one running
`ci/run-build-and-tests.sh`. This step, most notably, only offers a
high-level view of what went wrong: it prints the output of `prove`
which merely tells the reader which test script failed.

The actually interesting part is in the detailed log of said failed
test script. But that log is shown in the CI run's step that runs
`ci/print-test-failures.sh`. And that step is _not_ expanded in the web
UI by default.

Let's help the reader by showing the failed tests' detailed logs in the
step that is expanded automatically, i.e. directly after the test suite
failed.

This also helps the situation where the _build_ failed and the
`print-test-failures` step was executed under the assumption that the
_test suite_ failed, and consequently failed to find any failed tests.

An alternative way to implement this patch would be to source
`ci/print-test-failures.sh` in the `handle_test_failures` function to
show these logs. However, over the course of the next few commits, we
want to introduce some grouping which would be harder to achieve that
way (for example, we do want a leaner, and colored, preamble for each
failed test script, and it would be trickier to accommodate the lack of
nested groupings in GitHub workflows' output).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .github/workflows/main.yml | 12 ------------
 ci/lib.sh                  | 23 +++++++++++++++++++++++
 ci/run-build-and-tests.sh  |  3 ++-
 ci/run-test-slice.sh       |  3 ++-
 4 files changed, 27 insertions(+), 14 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Jan. 25, 2022, 11:48 p.m. UTC | #1
On Mon, Jan 24 2022, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> You currently have to know a lot of implementation details when
> investigating test failures in the CI runs. The first step is easy: the
> failed job is marked quite clearly, but when opening it, the failed step
> is expanded, which in our case is the one running
> `ci/run-build-and-tests.sh`. This step, most notably, only offers a
> high-level view of what went wrong: it prints the output of `prove`
> which merely tells the reader which test script failed.
>
> The actually interesting part is in the detailed log of said failed
> test script. But that log is shown in the CI run's step that runs
> `ci/print-test-failures.sh`. And that step is _not_ expanded in the web
> UI by default.
>
> Let's help the reader by showing the failed tests' detailed logs in the
> step that is expanded automatically, i.e. directly after the test suite
> failed.
>
> This also helps the situation where the _build_ failed and the
> `print-test-failures` step was executed under the assumption that the
> _test suite_ failed, and consequently failed to find any failed tests.
>
> An alternative way to implement this patch would be to source
> `ci/print-test-failures.sh` in the `handle_test_failures` function to
> show these logs. However, over the course of the next few commits, we
> want to introduce some grouping which would be harder to achieve that
> way (for example, we do want a leaner, and colored, preamble for each
> failed test script, and it would be trickier to accommodate the lack of
> nested groupings in GitHub workflows' output).

Is it really better to have the first thing you see in a failing job be
this level of detail?

To take the "before" demo job from your CL, if you click on a failing
job you'll currently end up with ~1600 lines of "prove" setup and
output, culminating in (the browser auto-scrolls to the end):

    [...]
    Test Summary Report
    -------------------
    t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
      Failed test:  49
      Non-zero exit status: 1
    t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
      TODO passed:   45, 47
    Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
    Result: FAIL

Is it ideal? No. But I think that folding the ci/print-test-failures.sh
output into that step makes it much worse. Now I'll be sent to the very
bottom of ~16000 lines (yes, that's an extra zero at the end) of output,
ending in:

    [...]
    + test_cmp expect sparse-checkout-out
    + test 2 -ne 2
    + GIT_ALLOC_LIMIT=0 eval diff -u "$@"
    + diff -u expect sparse-checkout-out
    + test_cmp full-checkout-err sparse-checkout-err
    + test 2 -ne 2
    + GIT_ALLOC_LIMIT=0 eval diff -u "$@"
    + diff -u full-checkout-err sparse-checkout-err
    + test_cmp full-checkout-err sparse-index-err
    + test 2 -ne 2
    + GIT_ALLOC_LIMIT=0 eval diff -u "$@"
    + diff -u full-checkout-err sparse-index-err
    
    ok 53 - checkout behaves oddly with df-conflict-2
    # failed 1 among 53 test(s)
    1..53

Now you'll need to scroll up or search just to see what test failed.

Usually when these fail I might only look at the failing test name (at
that point already knowing why it failed). I think it's a feature that
we only expand the verbose output later.

I realize that:

1) This isn't the exact output you emit in the post-image here, since you're not
actually using ci/print-test-failures.sh, but from eyeballing the script
it seems to do basically the same thing, i.e. it'll emit the full *.out
file.

B.t.w. why isn't this using ci/print-test-failures.sh. Your "an
alternative way" paragraph doesn't really explain it. Sure, it'll be
further tweaked later, but in the meantime do we have to re-invent
ci/print-test-failures.sh? Anyway...

2) The end-state at the end of this series looks somewhat different, but I think
that end-state shares the UX problem noted above, and to some extent
makes it worse.

That one has 28 thousand lines of output!

Now I know it's elided so you're only supposed to see a few screenfulls
of it, but at least in my browser that end-state is *very slow*, much
slower than the version that shows me the full ~16 thousand lines at
once.

Presumably it's doing some very expensive JavaScript/CSS behind the
scenes.

I mean so slow that when I press page up and down I can see 3-8 lines of
that folded output appear at once, then the next 3-8 lines etc. The
current output meanwhile (and this more verbose one) is
near-instant. This is in Firefox 91.4, if it matters.
diff mbox series

Patch

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index c35200defb9..3fa88b78b6d 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -119,10 +119,6 @@  jobs:
     - name: test
       shell: bash
       run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
-    - name: ci/print-test-failures.sh
-      if: failure()
-      shell: bash
-      run: ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -204,10 +200,6 @@  jobs:
       env:
         NO_SVN_TESTS: 1
       run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
-    - name: ci/print-test-failures.sh
-      if: failure()
-      shell: bash
-      run: ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -261,8 +253,6 @@  jobs:
     - uses: actions/checkout@v2
     - run: ci/install-dependencies.sh
     - run: ci/run-build-and-tests.sh
-    - run: ci/print-test-failures.sh
-      if: failure()
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v2
@@ -292,8 +282,6 @@  jobs:
     - uses: actions/checkout@v1
     - run: ci/install-docker-dependencies.sh
     - run: ci/run-build-and-tests.sh
-    - run: ci/print-test-failures.sh
-      if: failure()
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v1
diff --git a/ci/lib.sh b/ci/lib.sh
index ebb502640fa..2b2c0932320 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -78,6 +78,10 @@  check_unignored_build_artifacts () {
 	}
 }
 
+handle_failed_tests () {
+	return 1
+}
+
 # GitHub Action doesn't set TERM, which is required by tput
 export TERM=${TERM:-dumb}
 
@@ -123,6 +127,25 @@  then
 	CI_JOB_ID="$GITHUB_RUN_ID"
 	CC="${CC:-gcc}"
 	DONT_SKIP_TAGS=t
+	handle_failed_tests () {
+		mkdir -p t/failed-test-artifacts
+		echo "FAILED_TEST_ARTIFACTS=t/failed-test-artifacts" >>$GITHUB_ENV
+
+		for test_exit in t/test-results/*.exit
+		do
+			test 0 != "$(cat "$test_exit")" || continue
+
+			test_name="${test_exit%.exit}"
+			test_name="${test_name##*/}"
+			printf "\\e[33m\\e[1m=== Failed test: ${test_name} ===\\e[m\\n"
+			cat "t/test-results/$test_name.out"
+
+			trash_dir="t/trash directory.$test_name"
+			cp "t/test-results/$test_name.out" t/failed-test-artifacts/
+			tar czf t/failed-test-artifacts/"$test_name".trash.tar.gz "$trash_dir"
+		done
+		return 1
+	}
 
 	cache_dir="$HOME/none"
 
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index b70373c172f..e49f9eaa8c0 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -51,7 +51,8 @@  esac
 make
 if test -n "$run_tests"
 then
-	make test
+	make test ||
+	handle_failed_tests
 fi
 check_unignored_build_artifacts
 
diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh
index f8c2c3106a2..63358c23e11 100755
--- a/ci/run-test-slice.sh
+++ b/ci/run-test-slice.sh
@@ -12,6 +12,7 @@  esac
 
 make --quiet -C t T="$(cd t &&
 	./helper/test-tool path-utils slice-tests "$1" "$2" t[0-9]*.sh |
-	tr '\n' ' ')"
+	tr '\n' ' ')" ||
+handle_failed_tests
 
 check_unignored_build_artifacts