Message ID | 160b026e69547739a526fb6276a895904a4d33a8.1712555682.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | t: exercise Git/JGit reftable compatibility | expand |
On Mon, Apr 8, 2024 at 2:47 AM Patrick Steinhardt <ps@pks.im> wrote: > While the reftable format is a recent introduction in Git, JGit already > knows to read and write reftables since 2017. Given the complexity of > the format there is a very real risk of incompatibilities between those > two implementations, which is something that we really want to avoid. > > Add some basic tests that verify that reftables written by Git and JGit > can be read by the respective other implementation. For now this test > suite is rather small, only covering basic functionality. But it serves > as a good starting point and can be extended over time. > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh > @@ -0,0 +1,132 @@ > +test_expect_success 'JGit can read multi-level index' ' > + ... > + awk " > + BEGIN { > + print \"start\"; > + for (i = 0; i < 10000; i++) > + printf \"create refs/heads/branch-%d HEAD\n\", i; > + print \"commit\"; > + } > + " >input && I was going to suggest that you could accomplish this more easily directly in shell (without employing `awk`): { echo start && printf "create refs/heads/branch-%d HEAD\n" $(test_seq 0 9999) && echo commit } >input && but then I realized that that could potentially run afoul of command-line length limit on some platform due to the 0-9999 sequence.
Eric Sunshine <sunshine@sunshineco.com> writes: > I was going to suggest that you could accomplish this more easily > directly in shell (without employing `awk`): > > { > echo start && > printf "create refs/heads/branch-%d HEAD\n" $(test_seq 0 9999) && > echo commit > } >input && > > but then I realized that that could potentially run afoul of > command-line length limit on some platform due to the 0-9999 sequence. As xargs is supposed to know the system limit, perhaps test_seq 0 9999 | xargs printf "...%d...\n" should work?
On Mon, Apr 8, 2024 at 12:07 PM Junio C Hamano <gitster@pobox.com> wrote: > Eric Sunshine <sunshine@sunshineco.com> writes: > > I was going to suggest that you could accomplish this more easily > > directly in shell (without employing `awk`): > > > > { > > echo start && > > printf "create refs/heads/branch-%d HEAD\n" $(test_seq 0 9999) && > > echo commit > > } >input && > > > > but then I realized that that could potentially run afoul of > > command-line length limit on some platform due to the 0-9999 sequence. > > As xargs is supposed to know the system limit, perhaps > > test_seq 0 9999 | xargs printf "...%d...\n" > > should work? Hmm, yes, that should work nicely. Whether or not such a change is worthwhile is a different matter. Although it is perhaps simpler and easier to read, Windows folk might not appreciate it since it spawns at least three processes (and perhaps more depending upon how test_seq is implemented), whereas the `awk` approach spawns a single process.
On Mon, Apr 08, 2024 at 09:07:32AM -0700, Junio C Hamano wrote: > Eric Sunshine <sunshine@sunshineco.com> writes: > > > I was going to suggest that you could accomplish this more easily > > directly in shell (without employing `awk`): > > > > { > > echo start && > > printf "create refs/heads/branch-%d HEAD\n" $(test_seq 0 9999) && > > echo commit > > } >input && > > > > but then I realized that that could potentially run afoul of > > command-line length limit on some platform due to the 0-9999 sequence. > > As xargs is supposed to know the system limit, perhaps > > test_seq 0 9999 | xargs printf "...%d...\n" > > should work? Is there a reason why we want to avoid using awk(1) in the first place? We use it in several other tests and I don't think that the resulting code is all that bad. Patrick
On Mon, Apr 8, 2024 at 12:19 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > On Mon, Apr 8, 2024 at 12:07 PM Junio C Hamano <gitster@pobox.com> wrote: > > Eric Sunshine <sunshine@sunshineco.com> writes: > > > I was going to suggest that you could accomplish this more easily > > > directly in shell (without employing `awk`): > > > > > > { > > > echo start && > > > printf "create refs/heads/branch-%d HEAD\n" $(test_seq 0 9999) && > > > echo commit > > > } >input && > > > > > > but then I realized that that could potentially run afoul of > > > command-line length limit on some platform due to the 0-9999 sequence. > > > > As xargs is supposed to know the system limit, perhaps > > > > test_seq 0 9999 | xargs printf "...%d...\n" > > > > should work? > > Hmm, yes, that should work nicely. > > Whether or not such a change is worthwhile is a different matter. > Although it is perhaps simpler and easier to read, Windows folk might > not appreciate it since it spawns at least three processes (and > perhaps more depending upon how test_seq is implemented), whereas the > `awk` approach spawns a single process. And, to be clear, I have no objection to this patch's use of `awk`. It is "good enough" as is; I was not asking Patrick to make the change, but rather made the comment in case the idea hadn't occurred to him.
On Mon, Apr 08, 2024 at 06:22:10PM +0200, Patrick Steinhardt wrote: > On Mon, Apr 08, 2024 at 09:07:32AM -0700, Junio C Hamano wrote: > > Eric Sunshine <sunshine@sunshineco.com> writes: > > > > > I was going to suggest that you could accomplish this more easily > > > directly in shell (without employing `awk`): > > > > > > { > > > echo start && > > > printf "create refs/heads/branch-%d HEAD\n" $(test_seq 0 9999) && > > > echo commit > > > } >input && > > > > > > but then I realized that that could potentially run afoul of > > > command-line length limit on some platform due to the 0-9999 sequence. > > > > As xargs is supposed to know the system limit, perhaps > > > > test_seq 0 9999 | xargs printf "...%d...\n" > > > > should work? > > Is there a reason why we want to avoid using awk(1) in the first place? > We use it in several other tests and I don't think that the resulting > code is all that bad. Using awk is fine, but I think in general if we can do something just as easily without an extra process, that is preferable. Using xargs here does not to me count as "just as easily". However, using a shell loop might actually be more readable because you'd avoid the extra quoting. I.e. either: for i in $(test_seq 10000) do echo "create refs/heads/branch-$i HEAD" done or: i=0 while test $i -lt 10000 do echo "create refs/heads/branch-$i HEAD" i=$((i+1)) done I think the first one actually incurs an extra process anyway because of the $(). The second one would not. Of course, the second one probably needs &&-chaining and a "|| return 1" to work in our test snippet. IMHO the nicest thing would be if our test_seq could take a format parameter, like: diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 2eccf100c0..c8f32eb409 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -1404,6 +1404,13 @@ test_cmp_fspath () { # from 1. test_seq () { + local fmt="%d" + case "$1" in + -f) + fmt="$2" + shift 2 + ;; + esac case $# in 1) set 1 "$@" ;; 2) ;; @@ -1412,7 +1419,7 @@ test_seq () { test_seq_counter__=$1 while test "$test_seq_counter__" -le "$2" do - echo "$test_seq_counter__" + printf "$fmt\n" "$test_seq_counter__" test_seq_counter__=$(( $test_seq_counter__ + 1 )) done } Then you could just write: test_seq -f "create refs/heads/branch-%d HEAD" 0 9999 and I suspect there are several other spots which could be simplified as well. Anyway, I don't think it is worth derailing your series for this, but just general food for thought. -Peff
Patrick Steinhardt <ps@pks.im> writes: >> As xargs is supposed to know the system limit, perhaps >> >> test_seq 0 9999 | xargs printf "...%d...\n" >> >> should work? > > Is there a reason why we want to avoid using awk(1) in the first place? > We use it in several other tests and I don't think that the resulting > code is all that bad. There is no reason. It was a canned reaction against "gee we will bust shell's limit" to "use xargs then". Of course, if we can do the loop in the shell and everything we do in the loop with shell builtins (printf is often builtin in modern shells but not always, if I recall correctly), that would be best, and if the job is _that_ simple that we could do in the shell, it would make "awk" a horrible choice ;-)
Jeff King <peff@peff.net> writes: > I.e. either: > > for i in $(test_seq 10000) > do > echo "create refs/heads/branch-$i HEAD" > done > > or: > > i=0 > while test $i -lt 10000 > do > echo "create refs/heads/branch-$i HEAD" > i=$((i+1)) > done > > I think the first one actually incurs an extra process anyway because of > the $(). The second one would not. Of course, the second one probably > needs &&-chaining and a "|| return 1" to work in our test snippet. > ... > Then you could just write: > > test_seq -f "create refs/heads/branch-%d HEAD" 0 9999 > > and I suspect there are several other spots which could be simplified as > well. > > Anyway, I don't think it is worth derailing your series for this, but > just general food for thought. Yup, I agree with everything you said. Thanks.
On 24/04/08 08:47AM, Patrick Steinhardt wrote: > While the reftable format is a recent introduction in Git, JGit already > knows to read and write reftables since 2017. Given the complexity of > the format there is a very real risk of incompatibilities between those > two implementations, which is something that we really want to avoid. > > Add some basic tests that verify that reftables written by Git and JGit > can be read by the respective other implementation. For now this test > suite is rather small, only covering basic functionality. But it serves > as a good starting point and can be extended over time. > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > t/t0612-reftable-jgit-compatibility.sh | 132 +++++++++++++++++++++++++ > 1 file changed, 132 insertions(+) > create mode 100755 t/t0612-reftable-jgit-compatibility.sh > > diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh > new file mode 100755 > index 0000000000..222464e360 > --- /dev/null > +++ b/t/t0612-reftable-jgit-compatibility.sh > @@ -0,0 +1,132 @@ > +#!/bin/sh > + > +test_description='reftables are compatible with JGit' > + > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > +GIT_TEST_DEFAULT_REF_FORMAT=reftable > +export GIT_TEST_DEFAULT_REF_FORMAT > + > +# JGit does not support the 'link' DIRC extension. > +GIT_TEST_SPLIT_INDEX=0 > +export GIT_TEST_SPLIT_INDEX > + > +. ./test-lib.sh > + > +if ! test_have_prereq JGIT Do we want these tests to run in CI? As far as I can tell these tests would always be skipped. -Justin > +then > + skip_all='skipping reftable JGit tests' > + test_done > +fi ...
On Wed, Apr 10, 2024 at 03:43:58PM -0500, Justin Tobler wrote: > On 24/04/08 08:47AM, Patrick Steinhardt wrote: > > While the reftable format is a recent introduction in Git, JGit already > > knows to read and write reftables since 2017. Given the complexity of > > the format there is a very real risk of incompatibilities between those > > two implementations, which is something that we really want to avoid. > > > > Add some basic tests that verify that reftables written by Git and JGit > > can be read by the respective other implementation. For now this test > > suite is rather small, only covering basic functionality. But it serves > > as a good starting point and can be extended over time. > > > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > > --- > > t/t0612-reftable-jgit-compatibility.sh | 132 +++++++++++++++++++++++++ > > 1 file changed, 132 insertions(+) > > create mode 100755 t/t0612-reftable-jgit-compatibility.sh > > > > diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh > > new file mode 100755 > > index 0000000000..222464e360 > > --- /dev/null > > +++ b/t/t0612-reftable-jgit-compatibility.sh > > @@ -0,0 +1,132 @@ > > +#!/bin/sh > > + > > +test_description='reftables are compatible with JGit' > > + > > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > > +GIT_TEST_DEFAULT_REF_FORMAT=reftable > > +export GIT_TEST_DEFAULT_REF_FORMAT > > + > > +# JGit does not support the 'link' DIRC extension. > > +GIT_TEST_SPLIT_INDEX=0 > > +export GIT_TEST_SPLIT_INDEX > > + > > +. ./test-lib.sh > > + > > +if ! test_have_prereq JGIT > > Do we want these tests to run in CI? As far as I can tell these tests > would always be skipped. They aren't skipped on Linux-based jobs at least. The JGIT prerequisite is defined in "t/test-lib.sh" like so: ``` test_lazy_prereq JGIT ' jgit --version ' ``` And because we have adapted "install-dependencies.sh" to install the "jgit" executable into our custom PATH directory it is available, meaning that the prereq is fulfilled. But you're actually onto something: while the tests run on GitHub, they don't run on GitLab. And this is caused by the unprivileged build that we use on GitLab, where we install dependencies as a different user than what we run tests with. Consequently, as the custom path is derived from HOME, and HOME changes, we cannot find these binaries. This is not a new issue, it's been there since the inception of the GitLab pipelines. I'll include another patch on top to fix this. Patrick
diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh new file mode 100755 index 0000000000..222464e360 --- /dev/null +++ b/t/t0612-reftable-jgit-compatibility.sh @@ -0,0 +1,132 @@ +#!/bin/sh + +test_description='reftables are compatible with JGit' + +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +GIT_TEST_DEFAULT_REF_FORMAT=reftable +export GIT_TEST_DEFAULT_REF_FORMAT + +# JGit does not support the 'link' DIRC extension. +GIT_TEST_SPLIT_INDEX=0 +export GIT_TEST_SPLIT_INDEX + +. ./test-lib.sh + +if ! test_have_prereq JGIT +then + skip_all='skipping reftable JGit tests' + test_done +fi + +if ! test_have_prereq SHA1 +then + skip_all='skipping reftable JGit tests; JGit does not support SHA256 reftables' + test_done +fi + +test_commit_jgit () { + touch "$1" && + jgit add "$1" && + jgit commit -m "$1" +} + +test_same_refs () { + git show-ref --head >cgit.actual && + jgit show-ref >jgit-tabs.actual && + tr "\t" " " <jgit-tabs.actual >jgit.actual && + test_cmp cgit.actual jgit.actual +} + +test_same_ref () { + git rev-parse "$1" >cgit.actual && + jgit rev-parse "$1" >jgit.actual && + test_cmp cgit.actual jgit.actual +} + +test_same_reflog () { + git reflog "$*" >cgit.actual && + jgit reflog "$*" >jgit-newline.actual && + sed '/^$/d' <jgit-newline.actual >jgit.actual && + test_cmp cgit.actual jgit.actual +} + +test_expect_success 'CGit repository can be read by JGit' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit A && + test_same_refs && + test_same_ref HEAD && + test_same_reflog HEAD + ) +' + +test_expect_success 'JGit repository can be read by CGit' ' + test_when_finished "rm -rf repo" && + jgit init repo && + ( + cd repo && + + touch file && + jgit add file && + jgit commit -m "initial commit" && + + # Note that we must convert the ref storage after we have + # written the default branch. Otherwise JGit will end up with + # no HEAD at all. + jgit convert-ref-storage --format=reftable && + + test_same_refs && + test_same_ref HEAD && + # Interestingly, JGit cannot read its own reflog here. CGit can + # though. + printf "%s HEAD@{0}: commit (initial): initial commit" "$(git rev-parse --short HEAD)" >expect && + git reflog HEAD >actual && + test_cmp expect actual + ) +' + +test_expect_success 'mixed writes from JGit and CGit' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + + test_commit A && + test_commit_jgit B && + test_commit C && + test_commit_jgit D && + + test_same_refs && + test_same_ref HEAD && + test_same_reflog HEAD + ) +' + +test_expect_success 'JGit can read multi-level index' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + + test_commit A && + awk " + BEGIN { + print \"start\"; + for (i = 0; i < 10000; i++) + printf \"create refs/heads/branch-%d HEAD\n\", i; + print \"commit\"; + } + " >input && + git update-ref --stdin <input && + + test_same_refs && + test_same_ref refs/heads/branch-1 && + test_same_ref refs/heads/branch-5738 && + test_same_ref refs/heads/branch-9999 + ) +' + +test_done
While the reftable format is a recent introduction in Git, JGit already knows to read and write reftables since 2017. Given the complexity of the format there is a very real risk of incompatibilities between those two implementations, which is something that we really want to avoid. Add some basic tests that verify that reftables written by Git and JGit can be read by the respective other implementation. For now this test suite is rather small, only covering basic functionality. But it serves as a good starting point and can be extended over time. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- t/t0612-reftable-jgit-compatibility.sh | 132 +++++++++++++++++++++++++ 1 file changed, 132 insertions(+) create mode 100755 t/t0612-reftable-jgit-compatibility.sh