Message ID | 7cf16485cccccf365101d30138d9ee8b00d705d0.1611617820.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | pack-revindex: introduce on-disk '.rev' format | expand |
On Mon, Jan 25, 2021 at 06:37:38PM -0500, Taylor Blau wrote: > diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh > index 297de502a9..2fc3aadbd1 100755 > --- a/t/t5319-multi-pack-index.sh > +++ b/t/t5319-multi-pack-index.sh > @@ -710,8 +710,9 @@ test_expect_success 'expire respects .keep files' ' > PACKA=$(ls .git/objects/pack/a-pack*\.pack | sed s/\.pack\$//) && > touch $PACKA.keep && > git multi-pack-index expire && > - ls -S .git/objects/pack/a-pack* | grep $PACKA >a-pack-files && > - test_line_count = 3 a-pack-files && > + test_path_is_file $PACKA.idx && > + test_path_is_file $PACKA.keep && > + test_path_is_file $PACKA.pack && I find the post-image here more readable than the original. It probably runs faster, too (zero processes instead of three). > --- a/t/t5325-reverse-index.sh > +++ b/t/t5325-reverse-index.sh > @@ -3,6 +3,10 @@ > test_description='on-disk reverse index' > . ./test-lib.sh > > +# The below tests want control over the 'pack.writeReverseIndex' setting > +# themselves to assert various combinations of it with other options. > +sane_unset GIT_TEST_WRITE_REV_INDEX I think we had a discussion a while ago about sane_unset outside of an &&-chain, where it does nothing that regular "unset" would not. But I don't know if we reached any kind of conclusion. I actually argued that it was fine in: https://lore.kernel.org/git/20200630185536.GA1888406@coredump.intra.peff.net/ So I guess I should probably stand by that. ;) > --- a/t/t5604-clone-reference.sh > +++ b/t/t5604-clone-reference.sh > @@ -329,7 +329,7 @@ test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at obje > for raw in $(ls T*.raw) > do > sed -e "s!/../!/Y/!; s![0-9a-f]\{38,\}!Z!" -e "/commit-graph/d" \ > - -e "/multi-pack-index/d" <$raw >$raw.de-sha-1 && > + -e "/multi-pack-index/d" -e "/rev/d" <$raw >$raw.de-sha-1 && This one is less readable than before. :) I'm not sure how to really improve that, though. > --- a/t/t5702-protocol-v2.sh > +++ b/t/t5702-protocol-v2.sh > @@ -851,8 +851,10 @@ test_expect_success 'part of packfile response provided as URI' ' > test -f h2found && > > # Ensure that there are exactly 6 files (3 .pack and 3 .idx). > - ls http_child/.git/objects/pack/* >filelist && > - test_line_count = 6 filelist > + ls http_child/.git/objects/pack/*.pack >packlist && > + ls http_child/.git/objects/pack/*.idx >idxlist && > + test_line_count = 3 idxlist && > + test_line_count = 3 packlist > ' Hmm. Too bad we can't rely on shell brace expansion, as: ls http_child/.git/objects/pack/*.{pack,idx} would be more readable. You could still do it in a single "ls" by writing out both arguments manually, but it's probably not that important. > test_expect_success 'fetching with valid packfile URI but invalid hash fails' ' > @@ -905,8 +907,10 @@ test_expect_success 'packfile-uri with transfer.fsckobjects' ' > clone "$HTTPD_URL/smart/http_parent" http_child && > > # Ensure that there are exactly 4 files (2 .pack and 2 .idx). > - ls http_child/.git/objects/pack/* >filelist && > - test_line_count = 4 filelist > + ls http_child/.git/objects/pack/*.pack >packlist && > + ls http_child/.git/objects/pack/*.idx >idxlist && > + test_line_count = 2 idxlist && > + test_line_count = 2 packlist > ' Likewise. But... > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > index 4a3b8f48ac..f76586f808 100755 > --- a/t/t6500-gc.sh > +++ b/t/t6500-gc.sh > @@ -106,17 +106,17 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre > test_commit "$(test_oid obj2)" && > # Our first gc will create a pack; our second will create a second pack > git gc --auto && > - ls .git/objects/pack | sort >existing_packs && > + ls .git/objects/pack/pack-*.pack | sort >existing_packs && > test_commit "$(test_oid obj3)" && > test_commit "$(test_oid obj4)" && > > git gc --auto 2>err && > test_i18ngrep ! "^warning:" err && > - ls .git/objects/pack/ | sort >post_packs && > + ls .git/objects/pack/pack-*.pack | sort >post_packs && > comm -1 -3 existing_packs post_packs >new && > comm -2 -3 existing_packs post_packs >del && > test_line_count = 0 del && # No packs are deleted > - test_line_count = 2 new # There is one new pack and its .idx > + test_line_count = 1 new # There is one new pack > ' This one is making the test a bit looser (it would miss a case where we somehow failed to generate the .idx). That seems like an unlikely bug, but I wonder if we can keep the original behavior. I guess: ls .git/objects/pack/*.pack \ .git/objects/pack/*.idx | sort >post_packs would work? > test_expect_success 'gc --no-quiet' ' > diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh > index 3d17e932a0..8f1caf8025 100755 > --- a/t/t9300-fast-import.sh > +++ b/t/t9300-fast-import.sh > @@ -1632,7 +1632,10 @@ test_expect_success 'O: blank lines not necessary after other commands' ' > INPUT_END > > git fast-import <input && > - test 8 = $(find .git/objects/pack -type f | grep -v multi-pack-index | wc -l) && > + ls -la .git/objects/pack/pack-*.pack >packlist && > + ls -la .git/objects/pack/pack-*.pack >idxlist && > + test_line_count = 4 idxlist && > + test_line_count = 4 packlist && Another case where I don't think we're losing anything (actually, we are gaining just slightly; if a bug somehow created 6 packs and 2 idx files, we'd now notice!). -Peff
On Thu, Jan 28, 2021 at 7:49 PM Jeff King <peff@peff.net> wrote: > On Mon, Jan 25, 2021 at 06:37:38PM -0500, Taylor Blau wrote: > > +sane_unset GIT_TEST_WRITE_REV_INDEX > > I think we had a discussion a while ago about sane_unset outside of an > &&-chain, where it does nothing that regular "unset" would not. But I > don't know if we reached any kind of conclusion. I actually argued that > it was fine in: > https://lore.kernel.org/git/20200630185536.GA1888406@coredump.intra.peff.net/ > So I guess I should probably stand by that. ;) I recall that discussion. The fact that sane_unset() outside of a test caused a hiccup in your reading of the patch which led you to dig through the mail archive and spend time writing about it in your review could be interpreted as yet another argument against its use outside of tests (specifically, it wastes reviewer time). Of course, it's such a minor thing, though, not at all worth a re-roll.
On Thu, Jan 28, 2021 at 07:45:37PM -0500, Jeff King wrote: > On Mon, Jan 25, 2021 at 06:37:38PM -0500, Taylor Blau wrote: > > > diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh > > index 297de502a9..2fc3aadbd1 100755 > > --- a/t/t5319-multi-pack-index.sh > > +++ b/t/t5319-multi-pack-index.sh > > @@ -710,8 +710,9 @@ test_expect_success 'expire respects .keep files' ' > > PACKA=$(ls .git/objects/pack/a-pack*\.pack | sed s/\.pack\$//) && > > touch $PACKA.keep && > > git multi-pack-index expire && > > - ls -S .git/objects/pack/a-pack* | grep $PACKA >a-pack-files && > > - test_line_count = 3 a-pack-files && > > + test_path_is_file $PACKA.idx && > > + test_path_is_file $PACKA.keep && > > + test_path_is_file $PACKA.pack && > > I find the post-image here more readable than the original. It probably > runs faster, too (zero processes instead of three). > > > --- a/t/t5325-reverse-index.sh > > +++ b/t/t5325-reverse-index.sh > > @@ -3,6 +3,10 @@ > > test_description='on-disk reverse index' > > . ./test-lib.sh > > > > +# The below tests want control over the 'pack.writeReverseIndex' setting > > +# themselves to assert various combinations of it with other options. > > +sane_unset GIT_TEST_WRITE_REV_INDEX > > I think we had a discussion a while ago about sane_unset outside of an > &&-chain, where it does nothing that regular "unset" would not. But I > don't know if we reached any kind of conclusion. I actually argued that > it was fine in: > > https://lore.kernel.org/git/20200630185536.GA1888406@coredump.intra.peff.net/ > > So I guess I should probably stand by that. ;) I think I probably took this from the trace2 tests? Not sure. I'm glad that it's not wrong, strictly speaking. This is another instance that I'd be happy to send a follow-up patch to get rid of all of these at once, unless you feel strongly that it should be changed in this series before applying. > > --- a/t/t5604-clone-reference.sh > > +++ b/t/t5604-clone-reference.sh > > @@ -329,7 +329,7 @@ test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at obje > > for raw in $(ls T*.raw) > > do > > sed -e "s!/../!/Y/!; s![0-9a-f]\{38,\}!Z!" -e "/commit-graph/d" \ > > - -e "/multi-pack-index/d" <$raw >$raw.de-sha-1 && > > + -e "/multi-pack-index/d" -e "/rev/d" <$raw >$raw.de-sha-1 && > > This one is less readable than before. :) I'm not sure how to really > improve that, though. Yeah, this follows from a suggestion that Ævar gave in: https://lore.kernel.org/git/878s8y5bdn.fsf@evledraar.gmail.com/ FWIW, I found that his suggestion was *clearer* than what was in v2. But I get that others may feel differently, too. > > --- a/t/t5702-protocol-v2.sh > > +++ b/t/t5702-protocol-v2.sh > > @@ -851,8 +851,10 @@ test_expect_success 'part of packfile response provided as URI' ' > > test -f h2found && > > > > # Ensure that there are exactly 6 files (3 .pack and 3 .idx). > > - ls http_child/.git/objects/pack/* >filelist && > > - test_line_count = 6 filelist > > + ls http_child/.git/objects/pack/*.pack >packlist && > > + ls http_child/.git/objects/pack/*.idx >idxlist && > > + test_line_count = 3 idxlist && > > + test_line_count = 3 packlist > > ' > > Hmm. Too bad we can't rely on shell brace expansion, as: > > ls http_child/.git/objects/pack/*.{pack,idx} > > would be more readable. You could still do it in a single "ls" by > writing out both arguments manually, but it's probably not that > important. Agreed on all of that. > This one is making the test a bit looser (it would miss a case where we > somehow failed to generate the .idx). That seems like an unlikely bug, > but I wonder if we can keep the original behavior. I guess: > > ls .git/objects/pack/*.pack \ > .git/objects/pack/*.idx | > sort >post_packs > > would work? Sure, I see what you're saying. To be honest, I'm skeptical that we'd have a bug which failed only this one test, so I'm hesitant to send a replacement/reroll for this alone. If you feel strongly, though, I'm happy to change it. (But, I'll err on the side of leaving it as-is, since you indicated in your response to v3's cover letter that you'd be OK if I discarded some or all of your suggestions). > > test_expect_success 'gc --no-quiet' ' > > diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh > > index 3d17e932a0..8f1caf8025 100755 > > --- a/t/t9300-fast-import.sh > > +++ b/t/t9300-fast-import.sh > > @@ -1632,7 +1632,10 @@ test_expect_success 'O: blank lines not necessary after other commands' ' > > INPUT_END > > > > git fast-import <input && > > - test 8 = $(find .git/objects/pack -type f | grep -v multi-pack-index | wc -l) && > > + ls -la .git/objects/pack/pack-*.pack >packlist && > > + ls -la .git/objects/pack/pack-*.pack >idxlist && > > + test_line_count = 4 idxlist && > > + test_line_count = 4 packlist && > > Another case where I don't think we're losing anything (actually, we are > gaining just slightly; if a bug somehow created 6 packs and 2 idx files, > we'd now notice!). Yay! > -Peff Thanks, Taylor
Jeff King <peff@peff.net> writes: >> # Ensure that there are exactly 6 files (3 .pack and 3 .idx). >> - ls http_child/.git/objects/pack/* >filelist && >> - test_line_count = 6 filelist >> + ls http_child/.git/objects/pack/*.pack >packlist && >> + ls http_child/.git/objects/pack/*.idx >idxlist && >> + test_line_count = 3 idxlist && >> + test_line_count = 3 packlist >> ' > > Hmm. Too bad we can't rely on shell brace expansion, as: > > ls http_child/.git/objects/pack/*.{pack,idx} > > would be more readable. You could still do it in a single "ls" by > writing out both arguments manually, but it's probably not that > important. This part looks a bit familiar as I had to fix the interaction with Jonathan's topic, IIRC. We need to update the comment. It is not ensuring "exact 6"---it merely is interested in having three pack/idx pair, and carefully expressing that by preparing for the presence of other cruft in objects/pack/ directory other people may create (like ".rev", but we may gain more). I wonder if we even _care_ about .idx. Why not just count .pack, to prepare for a possible distant future where we do not even write .idx but append to existing multi-pack-index as we download a new pack stream and store it in a .pack, or something? >> - ls .git/objects/pack | sort >existing_packs && >> + ls .git/objects/pack/pack-*.pack | sort >existing_packs && >> test_commit "$(test_oid obj3)" && >> test_commit "$(test_oid obj4)" && >> >> git gc --auto 2>err && >> test_i18ngrep ! "^warning:" err && >> - ls .git/objects/pack/ | sort >post_packs && >> + ls .git/objects/pack/pack-*.pack | sort >post_packs && >> comm -1 -3 existing_packs post_packs >new && >> comm -2 -3 existing_packs post_packs >del && >> test_line_count = 0 del && # No packs are deleted >> - test_line_count = 2 new # There is one new pack and its .idx >> + test_line_count = 1 new # There is one new pack >> ' > > This one is making the test a bit looser (it would miss a case where we > somehow failed to generate the .idx). That seems like an unlikely bug, > but I wonder if we can keep the original behavior. I guess: > > ls .git/objects/pack/*.pack \ > .git/objects/pack/*.idx | > sort >post_packs > > would work? I guess we are looking at the same issue from opposite angle. I suspect that it might even be a good thing to only care about .pack and ignore everything else in the longer run.
On Thu, Jan 28, 2021 at 08:21:50PM -0500, Taylor Blau wrote: > [sane_unset outside of a test] > > I think I probably took this from the trace2 tests? Not sure. I'm glad > that it's not wrong, strictly speaking. > > This is another instance that I'd be happy to send a follow-up patch to > get rid of all of these at once, unless you feel strongly that it should > be changed in this series before applying. Nope, I don't feel strongly. > > This one is making the test a bit looser (it would miss a case where we > > somehow failed to generate the .idx). That seems like an unlikely bug, > > but I wonder if we can keep the original behavior. I guess: > > > > ls .git/objects/pack/*.pack \ > > .git/objects/pack/*.idx | > > sort >post_packs > > > > would work? > > Sure, I see what you're saying. To be honest, I'm skeptical that we'd > have a bug which failed only this one test, so I'm hesitant to send a > replacement/reroll for this alone. > > If you feel strongly, though, I'm happy to change it. (But, I'll err on > the side of leaving it as-is, since you indicated in your response to > v3's cover letter that you'd be OK if I discarded some or all of your > suggestions). Nope, I don't feel strongly. Junio even argued elsewhere that these tests may be better off just looking for pack files (which is looser than what they do now, but as we've said, unlikely to matter for any realistic bug). So I'm happy enough with what you have. -Peff
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 297de502a9..2fc3aadbd1 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -710,8 +710,9 @@ test_expect_success 'expire respects .keep files' ' PACKA=$(ls .git/objects/pack/a-pack*\.pack | sed s/\.pack\$//) && touch $PACKA.keep && git multi-pack-index expire && - ls -S .git/objects/pack/a-pack* | grep $PACKA >a-pack-files && - test_line_count = 3 a-pack-files && + test_path_is_file $PACKA.idx && + test_path_is_file $PACKA.keep && + test_path_is_file $PACKA.pack && test-tool read-midx .git/objects | grep idx >midx-list && test_line_count = 2 midx-list ) diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index 87040263b7..be452bb343 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -3,6 +3,10 @@ test_description='on-disk reverse index' . ./test-lib.sh +# The below tests want control over the 'pack.writeReverseIndex' setting +# themselves to assert various combinations of it with other options. +sane_unset GIT_TEST_WRITE_REV_INDEX + packdir=.git/objects/pack test_expect_success 'setup' ' diff --git a/t/t5604-clone-reference.sh b/t/t5604-clone-reference.sh index 5d682706ae..e845d621f6 100755 --- a/t/t5604-clone-reference.sh +++ b/t/t5604-clone-reference.sh @@ -329,7 +329,7 @@ test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at obje for raw in $(ls T*.raw) do sed -e "s!/../!/Y/!; s![0-9a-f]\{38,\}!Z!" -e "/commit-graph/d" \ - -e "/multi-pack-index/d" <$raw >$raw.de-sha-1 && + -e "/multi-pack-index/d" -e "/rev/d" <$raw >$raw.de-sha-1 && sort $raw.de-sha-1 >$raw.de-sha || return 1 done && diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh index 3d994e0b1b..e8f0b4a299 100755 --- a/t/t5702-protocol-v2.sh +++ b/t/t5702-protocol-v2.sh @@ -851,8 +851,10 @@ test_expect_success 'part of packfile response provided as URI' ' test -f h2found && # Ensure that there are exactly 6 files (3 .pack and 3 .idx). - ls http_child/.git/objects/pack/* >filelist && - test_line_count = 6 filelist + ls http_child/.git/objects/pack/*.pack >packlist && + ls http_child/.git/objects/pack/*.idx >idxlist && + test_line_count = 3 idxlist && + test_line_count = 3 packlist ' test_expect_success 'fetching with valid packfile URI but invalid hash fails' ' @@ -905,8 +907,10 @@ test_expect_success 'packfile-uri with transfer.fsckobjects' ' clone "$HTTPD_URL/smart/http_parent" http_child && # Ensure that there are exactly 4 files (2 .pack and 2 .idx). - ls http_child/.git/objects/pack/* >filelist && - test_line_count = 4 filelist + ls http_child/.git/objects/pack/*.pack >packlist && + ls http_child/.git/objects/pack/*.idx >idxlist && + test_line_count = 2 idxlist && + test_line_count = 2 packlist ' test_expect_success 'packfile-uri with transfer.fsckobjects fails on bad object' ' diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 4a3b8f48ac..f76586f808 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -106,17 +106,17 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre test_commit "$(test_oid obj2)" && # Our first gc will create a pack; our second will create a second pack git gc --auto && - ls .git/objects/pack | sort >existing_packs && + ls .git/objects/pack/pack-*.pack | sort >existing_packs && test_commit "$(test_oid obj3)" && test_commit "$(test_oid obj4)" && git gc --auto 2>err && test_i18ngrep ! "^warning:" err && - ls .git/objects/pack/ | sort >post_packs && + ls .git/objects/pack/pack-*.pack | sort >post_packs && comm -1 -3 existing_packs post_packs >new && comm -2 -3 existing_packs post_packs >del && test_line_count = 0 del && # No packs are deleted - test_line_count = 2 new # There is one new pack and its .idx + test_line_count = 1 new # There is one new pack ' test_expect_success 'gc --no-quiet' ' diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 3d17e932a0..8f1caf8025 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -1632,7 +1632,10 @@ test_expect_success 'O: blank lines not necessary after other commands' ' INPUT_END git fast-import <input && - test 8 = $(find .git/objects/pack -type f | grep -v multi-pack-index | wc -l) && + ls -la .git/objects/pack/pack-*.pack >packlist && + ls -la .git/objects/pack/pack-*.pack >idxlist && + test_line_count = 4 idxlist && + test_line_count = 4 packlist && test $(git rev-parse refs/tags/O3-2nd) = $(git rev-parse O3^) && git log --reverse --pretty=oneline O3 | sed s/^.*z// >actual && test_cmp expect actual
In the next patch, we'll add support for unconditionally enabling the 'pack.writeReverseIndex' setting with a new GIT_TEST_WRITE_REV_INDEX environment variable. This causes a little bit of fallout with tests that, for example, compare the list of files in the pack directory being unprepared to see .rev files in its output. Those locations can be cleaned up to look for specific file extensions, rather than take everything in the pack directory (for instance) and then grep out unwanted items. Once the pack.writeReverseIndex option has been thoroughly tested, we will default it to 'true', removing GIT_TEST_WRITE_REV_INDEX, and making it possible to revert this patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- t/t5319-multi-pack-index.sh | 5 +++-- t/t5325-reverse-index.sh | 4 ++++ t/t5604-clone-reference.sh | 2 +- t/t5702-protocol-v2.sh | 12 ++++++++---- t/t6500-gc.sh | 6 +++--- t/t9300-fast-import.sh | 5 ++++- 6 files changed, 23 insertions(+), 11 deletions(-)