Message ID | 20190130114736.30357-2-worldhello.net@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | pack-redundant: new algorithm to find min packs | expand |
Jiang Xin <worldhello.net@gmail.com> writes: > diff --git a/t/t5323-pack-redundant.sh b/t/t5323-pack-redundant.sh > new file mode 100755 > index 0000000000..710fe9884c > --- /dev/null > +++ b/t/t5323-pack-redundant.sh > @@ -0,0 +1,322 @@ > +#!/bin/sh > +# > +# Copyright (c) 2018 Jiang Xin > +# > + > +test_description='git pack-redundant test' > + > +. ./test-lib.sh > + > +create_commits () { > + parent= > + for name in A B C D E F G H I J K L M N O P Q R > + do > + test_tick && > + T=$(git write-tree) && Move this outside loop, not for efficiency but for clarity. This helper function creates a single empty tree and bunch of commits that hold the same empty tree, arranged as a single strand of pearls. By the way, I had to draw a table like this to figure out ... T A B C D E F G H I J K L M N O P Q R 1 x x x x x x x x 2 x x x x x x x 3 x x x x x x 4 x x x x x 5 x x x x 6 x x x 7 x x 8 x ... what is going on. Perhaps something like this would help other readers near the top of the file (or in test_description)? > +format_packfiles () { > + sed \ > + -e "s#.*/pack-\(.*\)\.idx#\1#" \ > + -e "s#.*/pack-\(.*\)\.pack#\1#" | > + sort -u | > + while read p > + do > + if test -z "$(eval echo \${P$p})" > + then > + echo $p All the "expected output" below will expect P$n:${P$n} prepared by various create_pack_$n helpers we saw earlier, so an unknown packfile would be detected as a line that this emits. Is that the idea? > + else > + eval echo "\${P$p}" > + fi > + done | > + sort > +} > + > +test_expect_success 'setup master.git' ' > + git init --bare master.git && > + cd master.git && > + create_commits > +' Everything below will be done inside master.git? Avoid cd'ing around in random places in the test script, as a failure in any of the steps that does cd would start later tests in an unexpected place, if you can. > +test_expect_success 'no redundant for pack 1, 2, 3' ' > + create_pack_1 && create_pack_2 && create_pack_3 && > + git pack-redundant --all >out && > + test_must_be_empty out > +' > + > +test_expect_success 'create pack 4, 5' ' > + create_pack_4 && create_pack_5 > +' > + > +cat >expected <<EOF > +P2:$P2 > +EOF > + > +test_expect_success 'one of pack-2/pack-3 is redundant' ' > + git pack-redundant --all >out && > + format_packfiles <out >actual && > + test_cmp expected actual > +' Do the preparation of file "expect" (most of the tests compare 'expect' vs 'actual', not 'expected') _inside_ the next test that uses it. i.e. test_expect_success 'with 1 4 and 5, either 2 or 3 can be omitted' ' cat >expect <<-EOF && P2:$P2 EOF git pack-redundant --all >out && format ... >actual && test_cmp expect actual ' Again, I needed to draw this to see if the "one of ... is redundant" in the title is a valid claim. Something like it would help future readers. T A B C D E F G H I J K L M N O P Q R 1245 x x x x x x x x x x x x x x x x x 3 x x x x x x T A B C D E F G H I J K L M N O P Q R 1345 x x x x x x x x x x x x x x x x x 2 x x x x x x x I won't repeat the same for tests that appear later in this file, but they share the same issue. > +test_expect_success 'setup shared.git' ' > + cd "$TRASH_DIRECTORY" && > + git clone -q --mirror master.git shared.git && Why "-q"? > + cd shared.git && > + printf "../../master.git/objects" >objects/info/alternates > +' Why not echo? I recall designing the alternates file to be a plain text file. Is it necessary to leave the line incomplete? > +test_expect_success 'remove redundant packs by alt-odb, no packs left' ' > + git pack-redundant --all --alt-odb | xargs rm && > + git fsck --no-progress && Why "--no-progress"? > + test_must_fail git pack-redundant --all --alt-odb >actual 2>&1 && > + test_cmp expected actual > +' > + > +create_commits_others () { > + parent=$(git rev-parse HEAD) If this fails, you'd still go ahead and enter the loop, which is not what you want. > + for name in X Y Z > + do > + test_tick && > + T=$(git write-tree) && Lift this outside the loop. > + if test -z "$parent" > + then > + oid=$(echo $name | git commit-tree $T) > + else > + oid=$(echo $name | git commit-tree -p $parent $T) > + fi && > + eval $name=$oid && > + parent=$oid || > + return 1 > + done > + git update-ref refs/heads/master $Z > +}
Junio C Hamano <gitster@pobox.com> 于2019年2月1日周五 上午5:44写道: > > +create_commits () { > > + parent= > > + for name in A B C D E F G H I J K L M N O P Q R > > + do > > + test_tick && > > + T=$(git write-tree) && > > Move this outside loop, not for efficiency but for clarity. This > helper function creates a single empty tree and bunch of commits > that hold the same empty tree, arranged as a single strand of > pearls. Will rewrite as: create_commits () { parent= T=$(git write-tree) && for name in A B C D E F G H I J K L M N O P Q R > > By the way, I had to draw a table like this to figure out ... > > T A B C D E F G H I J K L M N O P Q R > 1 x x x x x x x x > 2 x x x x x x x > 3 x x x x x x > 4 x x x x x > 5 x x x x > 6 x x x > 7 x x > 8 x > > ... what is going on. Perhaps something like this would help other > readers near the top of the file (or in test_description)? Nice chart, will edit test_description as follows: test_description='git pack-redundant test In order to test git-pack-redundant, we will create a number of redundant packs in the repository `master.git`. The relationship between packs (P1-P8) and objects (T,A-R) is show in the following chart: | T A B C D E F G H I J K L M N O P Q R ---+-------------------------------------- P1 | x x x x x x x x P2 | x x x x x x x P3 | x x x x x x P4 | x x x x x P5 | x x x x P6 | x x x P7 | x x P8 | x Another repoisitory `shared.git` has unique objects (X-Z), while share others objects through alt-odb (of `master.git`). The relationship between packs and objects is as follows: | T A B C D E F G H I J K L M N O P Q R X Y Z ---+---------------------------------------------- Px1| x x x x x x Px2| x x x x x x ' > > > > +format_packfiles () { > > + sed \ > > + -e "s#.*/pack-\(.*\)\.idx#\1#" \ > > + -e "s#.*/pack-\(.*\)\.pack#\1#" | > > + sort -u | > > + while read p > > + do > > + if test -z "$(eval echo \${P$p})" > > + then > > + echo $p > > All the "expected output" below will expect P$n:${P$n} prepared by > various create_pack_$n helpers we saw earlier, so an unknown > packfile would be detected as a line that this emits. Is that the > idea? Right. During the reroll, a typo makes an empty output, so I decide to make this change. > > + else > > + eval echo "\${P$p}" > > + fi > > + done | > > + sort > > +} > > + > > +test_expect_success 'setup master.git' ' > > + git init --bare master.git && > > + cd master.git && > > + create_commits > > +' > > Everything below will be done inside master.git? Avoid cd'ing > around in random places in the test script, as a failure in any of > the steps that does cd would start later tests in an unexpected > place, if you can. The first 10 test cases will run inside master.git, and others will run inside shared.git. Only run cd inside the two `setup` test cases. > > +cat >expected <<EOF > > +P2:$P2 > > +EOF > > + > > +test_expect_success 'one of pack-2/pack-3 is redundant' ' > > + git pack-redundant --all >out && > > + format_packfiles <out >actual && > > + test_cmp expected actual > > +' > > Do the preparation of file "expect" (most of the tests compare > 'expect' vs 'actual', not 'expected') _inside_ the next test that > uses it. i.e. > > test_expect_success 'with 1 4 and 5, either 2 or 3 can be omitted' ' > cat >expect <<-EOF && > P2:$P2 > EOF > git pack-redundant --all >out && > format ... >actual && > test_cmp expect actual > ' Will do. > > +test_expect_success 'setup shared.git' ' > > + cd "$TRASH_DIRECTORY" && > > + git clone -q --mirror master.git shared.git && > > Why "-q"? To make verbose output cleaner. > > + cd shared.git && > > + printf "../../master.git/objects" >objects/info/alternates > > +' > > Why not echo? I recall designing the alternates file to be a plain > text file. Is it necessary to leave the line incomplete? Forgot "\n", will append. > > > +test_expect_success 'remove redundant packs by alt-odb, no packs left' ' > > + git pack-redundant --all --alt-odb | xargs rm && > > + git fsck --no-progress && > > Why "--no-progress"? To make verbose output cleaner. > > > + test_must_fail git pack-redundant --all --alt-odb >actual 2>&1 && > > + test_cmp expected actual > > +' > > + > > +create_commits_others () { > > + parent=$(git rev-parse HEAD) Will append "&&".
On Fri, Feb 1, 2019 at 12:44 AM Jiang Xin <worldhello.net@gmail.com> wrote: >> Junio C Hamano <gitster@pobox.com> 于2019年2月1日周五 上午5:44写道: > > Move this outside loop, not for efficiency but for clarity. This > > helper function creates a single empty tree and bunch of commits > > that hold the same empty tree, arranged as a single strand of > > pearls. > > Will rewrite as: > > create_commits () { > parent= > T=$(git write-tree) && > for name in A B C D E F G H I J K L M N O P Q R Don't forget the && at the end of the 'parent=' line to protect against someone later adding code above that line. So: create_commits () { parent= && T=$(git write-tree) && ... > Nice chart, will edit test_description as follows: > > test_description='git pack-redundant test > > In order to test git-pack-redundant, we will create a number of > redundant > packs in the repository `master.git`. The relationship between > packs (P1-P8) > and objects (T,A-R) is show in the following chart: > > | T A B C D E F G H I J K L M N O P Q R > ---+-------------------------------------- > P1 | x x x x x x x x > P2 | x x x x x x x > P3 | x x x x x x > P4 | x x x x x > P5 | x x x x > P6 | x x x > P7 | x x > P8 | x test_description should be a meaningful one-liner; it should not contain this other information, but this information should appear as comments in the test script. > Another repoisitory `shared.git` has unique objects (X-Z), while > share others s/repoisitory/repository/ > > > +test_expect_success 'setup master.git' ' > > > + git init --bare master.git && > > > + cd master.git && > > > + create_commits > > > +' > > > > Everything below will be done inside master.git? Avoid cd'ing > > around in random places in the test script, as a failure in any of > > the steps that does cd would start later tests in an unexpected > > place, if you can. > > The first 10 test cases will run inside master.git, and others will > run inside shared.git. Only run cd inside the two `setup` test cases. That's not what Junio meant. It's okay for tests to 'cd', but each test which does so _must_ ensure that the 'cd' is undone at the end of the test, even if the test fails. The correct way to do this within each test is by using 'cd' in a subhsell, like this: test_expect_success 'setup master.git' ' git init --bare master.git && ( cd master.git && create_commits ) ' Then, each test which needs to use "master.git" would 'cd' itself, like this: test_expect_success 'some test' ' ( cd master.git && ... ) ' > > > +test_expect_success 'setup shared.git' ' > > > + cd "$TRASH_DIRECTORY" && > > > + git clone -q --mirror master.git shared.git && > > > > Why "-q"? > > To make verbose output cleaner. What Junio really meant by asking that question was that you should not do this. When something goes wrong with a test, we want as much output as possible to help diagnose the problem, so suppressing output is undesirable. To summarize, don't use -q, --no-progress, or any other such option and don't redirect to /dev/null.
Eric Sunshine <sunshine@sunshineco.com> 于2019年2月1日周五 下午2:11写道: > > On Fri, Feb 1, 2019 at 12:44 AM Jiang Xin <worldhello.net@gmail.com> wrote: > >> Junio C Hamano <gitster@pobox.com> 于2019年2月1日周五 上午5:44写道: > > > Move this outside loop, not for efficiency but for clarity. This > > > helper function creates a single empty tree and bunch of commits > > > that hold the same empty tree, arranged as a single strand of > > > pearls. > > > > Will rewrite as: > > > > create_commits () { > > parent= > > T=$(git write-tree) && > > for name in A B C D E F G H I J K L M N O P Q R > > Don't forget the && at the end of the 'parent=' line to protect > against someone later adding code above that line. So: > > create_commits () { > parent= && > T=$(git write-tree) && > ... Will do. > > Nice chart, will edit test_description as follows: > > > > test_description='git pack-redundant test > > > > In order to test git-pack-redundant, we will create a number of > > redundant > > packs in the repository `master.git`. The relationship between > > packs (P1-P8) > > and objects (T,A-R) is show in the following chart: > > > > | T A B C D E F G H I J K L M N O P Q R > > ---+-------------------------------------- > > P1 | x x x x x x x x > > P2 | x x x x x x x > > P3 | x x x x x x > > P4 | x x x x x > > P5 | x x x x > > P6 | x x x > > P7 | x x > > P8 | x > > test_description should be a meaningful one-liner; it should not > contain this other information, but this information should appear as > comments in the test script. In 't/t0000-basic.sh', there is also a very long test_description. After read 't/test-lib.sh', the only usage of test_description is showing it as help, when runing: sh ./t0000-basic.sh So write a long test_description is ok, I think. > > Another repoisitory `shared.git` has unique objects (X-Z), while > > share others > > s/repoisitory/repository/ Thanks, will fix. > > > > +test_expect_success 'setup master.git' ' > > > > + git init --bare master.git && > > > > + cd master.git && > > > > + create_commits > > > > +' > > > > > > Everything below will be done inside master.git? Avoid cd'ing > > > around in random places in the test script, as a failure in any of > > > the steps that does cd would start later tests in an unexpected > > > place, if you can. > > > > The first 10 test cases will run inside master.git, and others will > > run inside shared.git. Only run cd inside the two `setup` test cases. > > That's not what Junio meant. It's okay for tests to 'cd', but each > test which does so _must_ ensure that the 'cd' is undone at the end of > the test, even if the test fails. The correct way to do this within > each test is by using 'cd' in a subhsell, like this: > > test_expect_success 'setup master.git' ' > git init --bare master.git && > ( > cd master.git && > create_commits > ) > ' > > Then, each test which needs to use "master.git" would 'cd' itself, like this: > > test_expect_success 'some test' ' > ( > cd master.git && > ... > ) > ' Nice explaination, will do. > > > > +test_expect_success 'setup shared.git' ' > > > > + cd "$TRASH_DIRECTORY" && > > > > + git clone -q --mirror master.git shared.git && > > > > > > Why "-q"? > > > > To make verbose output cleaner. > > What Junio really meant by asking that question was that you should > not do this. When something goes wrong with a test, we want as much > output as possible to help diagnose the problem, so suppressing output > is undesirable. To summarize, don't use -q, --no-progress, or any > other such option and don't redirect to /dev/null. Thanks.
Jiang Xin <worldhello.net@gmail.com> 于2019年2月1日周五 下午3:23写道: > > Eric Sunshine <sunshine@sunshineco.com> 于2019年2月1日周五 下午2:11写道: > > > Nice chart, will edit test_description as follows: > > > > > > test_description='git pack-redundant test > > > > > > In order to test git-pack-redundant, we will create a number of > > > redundant > > > packs in the repository `master.git`. The relationship between > > > packs (P1-P8) > > > and objects (T,A-R) is show in the following chart: > > > > > > | T A B C D E F G H I J K L M N O P Q R > > > ---+-------------------------------------- > > > P1 | x x x x x x x x > > > P2 | x x x x x x x > > > P3 | x x x x x x > > > P4 | x x x x x > > > P5 | x x x x > > > P6 | x x x > > > P7 | x x > > > P8 | x > > > > test_description should be a meaningful one-liner; it should not > > contain this other information, but this information should appear as > > comments in the test script. > > In 't/t0000-basic.sh', there is also a very long test_description. > After read 't/test-lib.sh', the only usage of test_description > is showing it as help, when runing: > > sh ./t0000-basic.sh sh ./t0000-basic.sh --help
Eric Sunshine <sunshine@sunshineco.com> 于2019年2月1日周五 下午2:11写道: > > Everything below will be done inside master.git? Avoid cd'ing > > > around in random places in the test script, as a failure in any of > > > the steps that does cd would start later tests in an unexpected > > > place, if you can. > > > > The first 10 test cases will run inside master.git, and others will > > run inside shared.git. Only run cd inside the two `setup` test cases. > > That's not what Junio meant. It's okay for tests to 'cd', but each > test which does so _must_ ensure that the 'cd' is undone at the end of > the test, even if the test fails. The correct way to do this within > each test is by using 'cd' in a subhsell, like this: > > test_expect_success 'setup master.git' ' > git init --bare master.git && > ( > cd master.git && > create_commits > ) > ' create_commits should not run in sub-shell, or variables set are lost. I write a commit_commits_in function : # Usage: create_commits_in <repo> A B C ... # Note: DO NOT run it in sub shell, or variables are not set create_commits_in () { repo="$1" && parent=$(git -C "$repo" rev-parse HEAD^{} 2>/dev/null) || parent= T=$(git -C "$repo" write-tree) && shift && while test $# -gt 0 do name=$1 && test_tick && if test -z "$parent" then oid=$(echo $name | git -C "$repo" commit-tree $T) else oid=$(echo $name | git -C "$repo" commit-tree -p $parent $T) fi && eval $name=$oid && parent=$oid && shift || return 1 done git -C "$repo" update-ref refs/heads/master $oid } and use it to create commits like: create_commits_in master.git A B C D E F G ...
diff --git a/t/t5323-pack-redundant.sh b/t/t5323-pack-redundant.sh new file mode 100755 index 0000000000..710fe9884c --- /dev/null +++ b/t/t5323-pack-redundant.sh @@ -0,0 +1,322 @@ +#!/bin/sh +# +# Copyright (c) 2018 Jiang Xin +# + +test_description='git pack-redundant test' + +. ./test-lib.sh + +create_commits () { + parent= + for name in A B C D E F G H I J K L M N O P Q R + do + test_tick && + T=$(git write-tree) && + if test -z "$parent" + then + oid=$(echo $name | git commit-tree $T) + else + oid=$(echo $name | git commit-tree -p $parent $T) + fi && + eval $name=$oid && + parent=$oid || + return 1 + done + git update-ref refs/heads/master $R +} + +create_pack_1 () { + P1=$(git -C objects/pack pack-objects -q pack <<-EOF + $T + $A + $B + $C + $D + $E + $F + $R + EOF + ) && + eval P$P1=P1:$P1 +} + +create_pack_2 () { + P2=$(git -C objects/pack pack-objects -q pack <<-EOF + $B + $C + $D + $E + $G + $H + $I + EOF + ) && + eval P$P2=P2:$P2 +} + +create_pack_3 () { + P3=$(git -C objects/pack pack-objects -q pack <<-EOF + $F + $I + $J + $K + $L + $M + EOF + ) && + eval P$P3=P3:$P3 +} + +create_pack_4 () { + P4=$(git -C objects/pack pack-objects -q pack <<-EOF + $J + $K + $L + $M + $P + EOF + ) && + eval P$P4=P4:$P4 +} + +create_pack_5 () { + P5=$(git -C objects/pack pack-objects -q pack <<-EOF + $G + $H + $N + $O + EOF + ) && + eval P$P5=P5:$P5 +} + +create_pack_6 () { + P6=$(git -C objects/pack pack-objects -q pack <<-EOF + $N + $O + $Q + EOF + ) && + eval P$P6=P6:$P6 +} + +create_pack_7 () { + P7=$(git -C objects/pack pack-objects -q pack <<-EOF + $P + $Q + EOF + ) && + eval P$P7=P7:$P7 +} + +create_pack_8 () { + P8=$(git -C objects/pack pack-objects -q pack <<-EOF + $A + EOF + ) && + eval P$P8=P8:$P8 +} + +format_packfiles () { + sed \ + -e "s#.*/pack-\(.*\)\.idx#\1#" \ + -e "s#.*/pack-\(.*\)\.pack#\1#" | + sort -u | + while read p + do + if test -z "$(eval echo \${P$p})" + then + echo $p + else + eval echo "\${P$p}" + fi + done | + sort +} + +test_expect_success 'setup master.git' ' + git init --bare master.git && + cd master.git && + create_commits +' + +test_expect_success 'no redundant for pack 1, 2, 3' ' + create_pack_1 && create_pack_2 && create_pack_3 && + git pack-redundant --all >out && + test_must_be_empty out +' + +test_expect_success 'create pack 4, 5' ' + create_pack_4 && create_pack_5 +' + +cat >expected <<EOF +P2:$P2 +EOF + +test_expect_success 'one of pack-2/pack-3 is redundant' ' + git pack-redundant --all >out && + format_packfiles <out >actual && + test_cmp expected actual +' + +test_expect_success 'create pack 6, 7' ' + create_pack_6 && create_pack_7 +' + +# Only after calling create_pack_6, we can use $P6 variable. +cat >expected <<EOF +P2:$P2 +P4:$P4 +P6:$P6 +EOF + +test_expect_success 'pack 2, 4, and 6 are redundant' ' + git pack-redundant --all >out && + format_packfiles <out >actual && + test_cmp expected actual +' + +test_expect_success 'create pack 8' ' + create_pack_8 +' + +cat >expected <<EOF +P2:$P2 +P4:$P4 +P6:$P6 +P8:$P8 +EOF + +test_expect_success 'pack-8 (subset of pack-1) is also redundant' ' + git pack-redundant --all >out && + format_packfiles <out >actual && + test_cmp expected actual +' + +test_expect_success 'clean loose objects' ' + git prune-packed && + find objects -type f | sed -e "/objects\/pack\//d" >out && + test_must_be_empty out +' + +test_expect_success 'remove redundant packs and pass fsck' ' + git pack-redundant --all | xargs rm && + git fsck --no-progress && + git pack-redundant --all >out && + test_must_be_empty out +' + +test_expect_success 'setup shared.git' ' + cd "$TRASH_DIRECTORY" && + git clone -q --mirror master.git shared.git && + cd shared.git && + printf "../../master.git/objects" >objects/info/alternates +' + +test_expect_success 'no redundant packs without --alt-odb' ' + git pack-redundant --all >out && + test_must_be_empty out +' + +cat >expected <<EOF +P1:$P1 +P3:$P3 +P5:$P5 +P7:$P7 +EOF + +test_expect_success 'pack-redundant --verbose: show duplicate packs in stderr' ' + git pack-redundant --all --verbose >out 2>out.err && + test_must_be_empty out && + grep "pack$" out.err | format_packfiles >actual && + test_cmp expected actual +' + +cat >expected <<EOF +fatal: Zero packs found! +EOF + +test_expect_success 'remove redundant packs by alt-odb, no packs left' ' + git pack-redundant --all --alt-odb | xargs rm && + git fsck --no-progress && + test_must_fail git pack-redundant --all --alt-odb >actual 2>&1 && + test_cmp expected actual +' + +create_commits_others () { + parent=$(git rev-parse HEAD) + for name in X Y Z + do + test_tick && + T=$(git write-tree) && + if test -z "$parent" + then + oid=$(echo $name | git commit-tree $T) + else + oid=$(echo $name | git commit-tree -p $parent $T) + fi && + eval $name=$oid && + parent=$oid || + return 1 + done + git update-ref refs/heads/master $Z +} + +create_pack_x1 () { + Px1=$(git -C objects/pack pack-objects -q pack <<-EOF + $X + $Y + $Z + $A + $B + $C + EOF + ) && + eval P${Px1}=Px1:${Px1} +} + +create_pack_x2 () { + Px2=$(git -C objects/pack pack-objects -q pack <<-EOF + $X + $Y + $Z + $D + $E + $F + EOF + ) && + eval P${Px2}=Px2:${Px2} +} + +test_expect_success 'new objects and packs in shared.git' ' + create_commits_others && + create_pack_x1 && + create_pack_x2 && + git pack-redundant --all >out && + test_must_be_empty out +' + +test_expect_success 'one pack is redundant' ' + git pack-redundant --all --alt-odb >out && + format_packfiles <out >actual && + test_line_count = 1 actual +' + +cat >expected <<EOF +Px1:$Px1 +Px2:$Px2 +EOF + +test_expect_success 'set ignore objects and all two packs are redundant' ' + git pack-redundant --all --alt-odb >out <<-EOF && + $X + $Y + $Z + EOF + format_packfiles <out >actual && + test_cmp expected actual +' + +test_done