Message ID | pull.1613.git.1699894837844.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Series | ci: avoid running the test suite _twice_ |
On Mon, Nov 13, 2023 at 05:00:37PM +0000, Johannes Schindelin via GitGitGadget wrote:

> This is a late amendment of 19ec39aab54 (ci: stop linking the `prove`
> cache, 2022-07-10), fixing a bug that had been hidden so far.

We don't seem to have that commit in Junio's tree; it is only in
git-for-windows.

Not that we should not fix things if they are broken, but I am trying
to understand if git/git is experiencing the same bug. It sounds like
not yet, though from looking at 19ec39aab54, I would expect to get these
doubled runs any time we store the prove state. But maybe without that
commit our state-file symlink is going somewhere invalid, and prove
fails to actually store anything?

> But starting with that commit, we run `prove` _twice_ in CI, and with
> completely different sets of tests to run. Due to the bug, the second
> invocation re-runs all of the tests that were already run as part of the
> first invocation. This not only wastes build minutes, it also frequently
> causes the `osx-*` jobs to fail because they already take a long time
> and now are likely to run into a timeout.
>
> The worst part about it is that there is actually no benefit to keep
> running with `--state=slow,save`, ever since we decided no longer to
> try to reuse the Prove cache between CI runs.
>
> So let's just drop that Prove option and live happily ever after.

Yes, I think this is the right thing to do regardless. If we are not
saving the state to use between two related runs, there is no point
storing it in the first place.

I do have to wonder, though, as somebody who did not follow the
unit-test topic closely: why are the unit tests totally separate from
the rest of the suite? I would think we'd want them run from one or more
t/t*.sh scripts. That would make bugs like this impossible, but also:

  1. They'd be run via "make test", so developers don't have to remember
     to run them separately.

  2. They can be run in parallel with all of the other tests when using
     "prove -j", etc.

-Peff
Jeff King <peff@peff.net> writes:

> I do have to wonder, though, as somebody who did not follow the
> unit-test topic closely: why are the unit tests totally separate from
> the rest of the suite? I would think we'd want them run from one or more
> t/t*.sh scripts. That would make bugs like this impossible, but also:
>
>   1. They'd be run via "make test", so developers don't have to remember
>      to run them separately.
>
>   2. They can be run in parallel with all of the other tests when using
>      "prove -j", etc.

Very good points. Josh?
On 2023.11.14 08:55, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > I do have to wonder, though, as somebody who did not follow the
> > unit-test topic closely: why are the unit tests totally separate from
> > the rest of the suite? I would think we'd want them run from one or more
> > t/t*.sh scripts. That would make bugs like this impossible, but also:
> >
> >   1. They'd be run via "make test", so developers don't have to remember
> >      to run them separately.
> >
> >   2. They can be run in parallel with all of the other tests when using
> >      "prove -j", etc.
>
> Very good points. Josh?

In short, the last time I tried to add something to CI, it was not well
received, so I've been perhaps overly cautious in keeping the unit-tests
well-separated from other targets. But I can send a follow-up patch to
fold them into `make test`. Or would you prefer that I send a v11 of
js/doc-unit-tests instead?
Josh Steadmon <steadmon@google.com> writes:

> On 2023.11.14 08:55, Junio C Hamano wrote:
>> Jeff King <peff@peff.net> writes:
>>
>> > I do have to wonder, though, as somebody who did not follow the
>> > unit-test topic closely: why are the unit tests totally separate from
>> > the rest of the suite? I would think we'd want them run from one or more
>> > t/t*.sh scripts. That would make bugs like this impossible, but also:
>> >
>> >   1. They'd be run via "make test", so developers don't have to remember
>> >      to run them separately.
>> >
>> >   2. They can be run in parallel with all of the other tests when using
>> >      "prove -j", etc.
>>
>> Very good points. Josh?
>
> In short, the last time I tried to add something to CI, it was not well
> received, so I've been perhaps overly cautious in keeping the unit-tests
> well-separated from other targets. But I can send a follow-up patch to
> fold them into `make test`. Or would you prefer that I send a v11 of
> js/doc-unit-tests instead?

Incremental patches to update what is in 'next' would let us try out
the new arrangement to drive the tests from the main "make test"
earlier. Post release, a new iteration could replace the series
wholesale as we will have an opportunity to rebuild 'next', but it
would be nice for the end states to match, if you were to do both.

Thanks.
On 2023.11.13 13:49, Jeff King wrote:
> On Mon, Nov 13, 2023 at 05:00:37PM +0000, Johannes Schindelin via GitGitGadget wrote:
>
> > This is a late amendment of 19ec39aab54 (ci: stop linking the `prove`
> > cache, 2022-07-10), fixing a bug that had been hidden so far.
>
> We don't seem to have that commit in Junio's tree; it is only in
> git-for-windows.
>
> Not that we should not fix things if they are broken, but I am trying
> to understand if git/git is experiencing the same bug. It sounds like
> not yet, though from looking at 19ec39aab54, I would expect to get these
> doubled runs any time we store the prove state. But maybe without that
> commit our state-file symlink is going somewhere invalid, and prove
> fails to actually store anything?
>
> > But starting with that commit, we run `prove` _twice_ in CI, and with
> > completely different sets of tests to run. Due to the bug, the second
> > invocation re-runs all of the tests that were already run as part of the
> > first invocation. This not only wastes build minutes, it also frequently
> > causes the `osx-*` jobs to fail because they already take a long time
> > and now are likely to run into a timeout.
> >
> > The worst part about it is that there is actually no benefit to keep
> > running with `--state=slow,save`, ever since we decided no longer to
> > try to reuse the Prove cache between CI runs.
> >
> > So let's just drop that Prove option and live happily ever after.
>
> Yes, I think this is the right thing to do regardless. If we are not
> saving the state to use between two related runs, there is no point
> storing it in the first place.
>
> I do have to wonder, though, as somebody who did not follow the
> unit-test topic closely: why are the unit tests totally separate from
> the rest of the suite? I would think we'd want them run from one or more
> t/t*.sh scripts. That would make bugs like this impossible, but also:
>
>   1. They'd be run via "make test", so developers don't have to remember
>      to run them separately.
>
>   2. They can be run in parallel with all of the other tests when using
>      "prove -j", etc.

The first part is easy, but I don't see a good way to get both shell
tests and unit tests executing under the same `prove` process. For shell
tests, we pass `--exec '$(TEST_SHELL_PATH_SQ)'` to prove, meaning that
we use the specified shell as an interpreter for the test files. That
will not work for unit test executables.

We could bundle all the unit tests into a single shell script, but then
we lose parallelization and add hoops to jump through to determine what
breaks. Or we could autogenerate a corresponding shell script to run
each individual unit test, but that seems gross. Of course, these are
hypothetical concerns for now, since we only have a single unit test at
the moment.

There's also the issue that the shell test arguments we pass on from
prove would be shared with the unit tests. That's fine for now, as
t-strbuf doesn't accept any runtime arguments, but it's possible that
either the framework or individual unit tests might grow to need
arguments, and it might not be convenient to stay compatible with the
shell tests.

Personally, I lean towards keeping things simple and just running a
second `prove` process as part of `make test`.

If I was forced to pick a way to get everything under one process, I'd
lean towards autogenerating individual shell script wrappers for each
unit test. But I'm open to discussion, especially if people have other
approaches I haven't thought of.
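The "autogenerate a wrapper per unit test" option mentioned above can be pictured roughly as follows. This is only a sketch under stated assumptions: `gen_unit_wrapper`, the `<name>.sh` naming, and the directory layout are hypothetical and do not come from the series, and the generated script assumes the binary path contains no double quotes or `$`.

```shell
#!/bin/sh
# Hypothetical sketch: emit one thin shell wrapper per unit-test binary, so
# that a single `prove --exec <shell>` run could drive both kinds of tests.
# gen_unit_wrapper and the "<name>.sh" naming are illustrative assumptions.

gen_unit_wrapper () {
	bin=$1      # path to the compiled unit test (absolute, for simplicity)
	outdir=$2   # existing directory that receives the generated wrapper
	name=$(basename "$bin")
	wrapper="$outdir/$name.sh"
	{
		printf '#!/bin/sh\n'
		printf '# generated wrapper: hand control to the unit-test binary\n'
		printf 'exec "%s" "$@"\n' "$bin"
	} >"$wrapper" &&
	chmod +x "$wrapper" &&
	# report the path of the wrapper that was written
	printf '%s\n' "$wrapper"
}
```

Each wrapper forwards its arguments unchanged, which is exactly where the concern about sharing shell-test options with unit tests would show up.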
Hi Josh,

On Wed, 15 Nov 2023, Josh Steadmon wrote:

> On 2023.11.13 13:49, Jeff King wrote:
> >
> > why are the unit tests totally separate from the rest of the suite? I
> > would think we'd want them run from one or more t/t*.sh scripts. That
> > would make bugs like this impossible, but also:
> >
> >   1. They'd be run via "make test", so developers don't have to remember
> >      to run them separately.
> >
> >   2. They can be run in parallel with all of the other tests when using
> >      "prove -j", etc.
>
> The first part is easy, but I don't see a good way to get both shell
> tests and unit tests executing under the same `prove` process. For shell
> tests, we pass `--exec '$(TEST_SHELL_PATH_SQ)'` to prove, meaning that
> we use the specified shell as an interpreter for the test files. That
> will not work for unit test executables.

Probably my favorite aspect about the new unit tests is that they avoid
using the error-prone, unintuitive and slow shell scripts and stay
within the programming language of the code that is to be tested: C.

> We could bundle all the unit tests into a single shell script, but then
> we lose parallelization and add hoops to jump through to determine what
> breaks. Or we could autogenerate a corresponding shell script to run
> each individual unit test, but that seems gross. Of course, these are
> hypothetical concerns for now, since we only have a single unit test at
> the moment.

I totally agree with you, Josh, that it makes little sense to
try to contort the unit tests to be run in the same `prove` run as the
regression tests that need to be invoked so totally differently.

> There's also the issue that the shell test arguments we pass on from
> prove would be shared with the unit tests. That's fine for now, as
> t-strbuf doesn't accept any runtime arguments, but it's possible that
> either the framework or individual unit tests might grow to need
> arguments, and it might not be convenient to stay compatible with the
> shell tests.
>
> Personally, I lean towards keeping things simple and just running a
> second `prove` process as part of `make test`.

Agreed.

> If I was forced to pick a way to get everything under one process, I'd
> lean towards autogenerating individual shell script wrappers for each
> unit test. But I'm open to discussion, especially if people have other
> approaches I haven't thought of.

One alternative would be to avoid running the unit tests via `prove` in
the first place.

For example, we could use the helper from be5d88e11280 (test-tool
run-command: learn to run (parts of) the testsuite, 2019-10-04) [*1*]. It
would probably need a few improvements, but certainly no wizardry nor
witchcraft would be required. It would also help on Windows, where running
a simple test helper written in C is vastly faster than running a complex
Perl script (which `prove` is).

Ciao,
Johannes

Footnote *1*: I had always wanted to improve that test helper to the point
where it could replace our use of `prove`, at least on Windows. It seems,
however, that as of 4c2c38e800f3 (ci: modification of main.yml to use
cmake for vs-build job, 2020-06-26) we do not use the helper at all
anymore. Hopefully it can still be useful.
On 16/11/2023 08:42, Johannes Schindelin wrote:
> On Wed, 15 Nov 2023, Josh Steadmon wrote:
>> On 2023.11.13 13:49, Jeff King wrote:
>> We could bundle all the unit tests into a single shell script, but then
>> we lose parallelization and add hoops to jump through to determine what
>> breaks. Or we could autogenerate a corresponding shell script to run
>> each individual unit test, but that seems gross. Of course, these are
>> hypothetical concerns for now, since we only have a single unit test at
>> the moment.
>
> I totally agree with you, Josh, that it makes little sense to
> try to contort the unit tests to be run in the same `prove` run as the
> regression tests that need to be invoked so totally differently.

FWIW that's my feeling too. It makes sense for "make test" to run the
unit tests, but wrapping the unit tests in one or more shell scripts
adds unnecessary complexity.

Best Wishes

Phillip
On Wed, Nov 15, 2023 at 01:28:49PM -0800, Josh Steadmon wrote:

> The first part is easy, but I don't see a good way to get both shell
> tests and unit tests executing under the same `prove` process. For shell
> tests, we pass `--exec '$(TEST_SHELL_PATH_SQ)'` to prove, meaning that
> we use the specified shell as an interpreter for the test files. That
> will not work for unit test executables.

Yes, it's unfortunate that you can't set the "exec" flag per-script
(especially because without --exec it will auto-detect the right thing,
but then of course it won't use TEST_SHELL_PATH).

But we can intercept and do it ourselves, like:

diff --git a/t/Makefile b/t/Makefile
index 225aaf78ed..0b7c028eea 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -61,7 +61,7 @@ failed:
 	test -z "$$failed" || $(MAKE) $$failed
 
 prove: pre-clean check-chainlint $(TEST_LINT)
-	@echo "*** prove ***"; $(CHAINLINTSUPPRESS) $(PROVE) --exec '$(TEST_SHELL_PATH_SQ)' $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)
+	@echo "*** prove ***"; TEST_SHELL_PATH='$(TEST_SHELL_PATH_SQ)' $(CHAINLINTSUPPRESS) $(PROVE) --exec ./run-test.sh $(GIT_PROVE_OPTS) $(T) $(UNIT_TESTS) :: $(GIT_TEST_OPTS)
 	$(MAKE) clean-except-prove-cache
 
 $(T):
diff --git a/t/run-test.sh b/t/run-test.sh
new file mode 100755
index 0000000000..69944029c8
--- /dev/null
+++ b/t/run-test.sh
@@ -0,0 +1,10 @@
+#!/bin/sh
+
+case "$1" in
+*.sh)
+	exec ${TEST_SHELL_PATH:-/bin/sh} "$@"
+	;;
+*)
+	exec "$@"
+	;;
+esac

You can actually do this inside the prove script using their plugin
interface, but the necessary bits are somewhat arcane.

> We could bundle all the unit tests into a single shell script, but then
> we lose parallelization and add hoops to jump through to determine what
> breaks. Or we could autogenerate a corresponding shell script to run
> each individual unit test, but that seems gross. Of course, these are
> hypothetical concerns for now, since we only have a single unit test at
> the moment.

We can't just stick them all in a single script; there must be exactly
one "plan" line in the TAP output from a given source. I had imagined
just manually adding a thin wrapper for each ("t9970-unit-strbuf" or
something). But it would also be easy to autogenerate them while
compiling. (Although all of that is moot with the wrapper I showed
above).

> There's also the issue that the shell test arguments we pass on from
> prove would be shared with the unit tests. That's fine for now, as
> t-strbuf doesn't accept any runtime arguments, but it's possible that
> either the framework or individual unit tests might grow to need
> arguments, and it might not be convenient to stay compatible with the
> shell tests.

Sharing the options between the two seems like a benefit to me. I'd
think that "-v" and "-i" would be useful, at least. Options which don't
apply (e.g., "--root") could be quietly ignored.

-Peff
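The "exactly one plan line" constraint is easy to see by counting plan lines in a TAP stream. The helpers below are a minimal sketch; `count_tap_plans` and `has_single_plan` are hypothetical names used only for illustration and are not part of Git's test framework or of `prove`.

```shell
#!/bin/sh
# Hypothetical helpers: read TAP on stdin and inspect its plan lines.
# A plan line looks like "1..N" (optionally followed by "# SKIP ...").

count_tap_plans () {
	# "n + 0" forces a numeric 0 when no plan line was seen
	awk '/^1\.\.[0-9]+/ { n++ } END { print n + 0 }'
}

has_single_plan () {
	# succeeds only if the stream carries exactly one plan line
	test "$(count_tap_plans)" -eq 1
}
```

Concatenating the output of two unit-test programs into one stream would yield two plan lines, so a check like `has_single_plan` fails for it, which is why each unit test needs to be its own TAP source.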
On 2023.11.16 09:42, Johannes Schindelin wrote:
> Hi Josh,
>
> On Wed, 15 Nov 2023, Josh Steadmon wrote:
[snip]
> > If I was forced to pick a way to get everything under one process, I'd
> > lean towards autogenerating individual shell script wrappers for each
> > unit test. But I'm open to discussion, especially if people have other
> > approaches I haven't thought of.
>
> One alternative would be to avoid running the unit tests via `prove` in
> the first place.
>
> For example, we could use the helper from be5d88e11280 (test-tool
> run-command: learn to run (parts of) the testsuite, 2019-10-04) [*1*]. It
> would probably need a few improvements, but certainly no wizardry nor
> witchcraft would be required. It would also help on Windows, where running
> a simple test helper written in C is vastly faster than running a complex
> Perl script (which `prove` is).
>
> Ciao,
> Johannes
>
> Footnote *1*: I had always wanted to improve that test helper to the point
> where it could replace our use of `prove`, at least on Windows. It seems,
> however, that as of 4c2c38e800f3 (ci: modification of main.yml to use
> cmake for vs-build job, 2020-06-26) we do not use the helper at all
> anymore. Hopefully it can still be useful.
diff --git a/ci/lib.sh b/ci/lib.sh
index 6dfc90d7f53..307a8df0b5a 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -281,7 +281,7 @@ else
 fi
 
 MAKEFLAGS="$MAKEFLAGS --jobs=$JOBS"
-GIT_PROVE_OPTS="--timer --jobs $JOBS --state=failed,slow,save"
+GIT_PROVE_OPTS="--timer --jobs $JOBS"
 
 GIT_TEST_OPTS="$GIT_TEST_OPTS --verbose-log -x"
 case "$CI_OS_NAME" in
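The patch edits the default directly. For a setup that inherits its `prove` options from elsewhere, the same effect could be approximated by filtering out any `--state=...` option before use. The sketch below assumes options are passed as separate words; `drop_prove_state` is a hypothetical helper, not something the patch adds.

```shell
#!/bin/sh
# Hypothetical helper: print the given prove options with every
# --state=... option removed, leaving the other options in order.

drop_prove_state () {
	out=
	for opt in "$@"
	do
		case "$opt" in
		--state=*)
			# dropped: no state is read or saved between runs
			;;
		*)
			out="${out:+$out }$opt"
			;;
		esac
	done
	printf '%s\n' "$out"
}
```

For example, `drop_prove_state --timer --jobs 8 --state=failed,slow,save` prints `--timer --jobs 8`, matching the new value of `GIT_PROVE_OPTS` for eight jobs.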