Message ID | 20190312031303.5tutut7zzvxne5dw@dcvr (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | repack: enable bitmaps by default on bare repos | expand |
On Tue, Mar 12 2019, Eric Wong wrote: > Jeff King <peff@peff.net> wrote: >> On Sat, Mar 09, 2019 at 02:49:44AM +0000, Eric Wong wrote: >> > It would make life easier for people new to hosting git servers >> > (and hopefully reduce centralization :) >> >> I do think they're a net win for people hosting git servers. But if >> that's the goal, I think at most you'd want to make bitmaps the default >> for bare repos. They're really not much help for normal end-user repos >> at this point. > > Fair enough, hopefully this can make life easier for admins > new to hosting git: > > ----------8<--------- > Subject: [PATCH] repack: enable bitmaps by default on bare repos > > A typical use case for bare repos is for serving clones and > fetches to clients. Enable bitmaps by default on bare repos to > make it easier for admins to host git repos in a performant way. > > Signed-off-by: Eric Wong <e@80x24.org> > --- > builtin/repack.c | 16 ++++++++++------ > t/t7700-repack.sh | 14 +++++++++++++- > 2 files changed, 23 insertions(+), 7 deletions(-) > > diff --git a/builtin/repack.c b/builtin/repack.c > index 67f8978043..5d4758b515 100644 > --- a/builtin/repack.c > +++ b/builtin/repack.c > @@ -14,7 +14,7 @@ > > static int delta_base_offset = 1; > static int pack_kept_objects = -1; > -static int write_bitmaps; > +static int write_bitmaps = -1; > static int use_delta_islands; > static char *packdir, *packtmp; > > @@ -343,11 +343,15 @@ int cmd_repack(int argc, const char **argv, const char *prefix) > (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE))) > die(_("--keep-unreachable and -A are incompatible")); > > + if (!(pack_everything & ALL_INTO_ONE)) { > + if (write_bitmaps > 0) > + die(_(incremental_bitmap_conflict_error)); > + } else if (write_bitmaps < 0) { > + write_bitmaps = is_bare_repository(); > + } > + > if (pack_kept_objects < 0) > - pack_kept_objects = write_bitmaps; > - > - if (write_bitmaps && !(pack_everything & ALL_INTO_ONE)) > - die(_(incremental_bitmap_conflict_error)); > + pack_kept_objects = write_bitmaps > 0; > > packdir = mkpathdup("%s/pack", get_object_directory()); > packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid()); > @@ -368,7 +372,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) > argv_array_push(&cmd.args, "--indexed-objects"); > if (repository_format_partial_clone) > argv_array_push(&cmd.args, "--exclude-promisor-objects"); > - if (write_bitmaps) > + if (write_bitmaps > 0) > argv_array_push(&cmd.args, "--write-bitmap-index"); > if (use_delta_islands) > argv_array_push(&cmd.args, "--delta-islands"); > diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh > index 6162e2a8e6..3e0b5c40e4 100755 > --- a/t/t7700-repack.sh > +++ b/t/t7700-repack.sh > @@ -221,5 +221,17 @@ test_expect_success 'repack --keep-pack' ' > ) > ' > > +test_expect_success 'bitmaps are created by default in bare repos' ' > + git clone --bare .git bare.git && > + cd bare.git && > + mkdir old && > + mv objects/pack/* old && > + pack=$(ls old/*.pack) && > + test_path_is_file "$pack" && > + git unpack-objects -q <"$pack" && > + git repack -ad && > + bitmap=$(ls objects/pack/*.bitmap) && > + test_path_is_file "$bitmap" > +' > + > test_done > - Looks sensible in principle, but now the git-repack and git-config docs talking about repack.writeBitmaps are out-of-date. A v2 should update those. Also worth testing that -c repack.writeBitmaps=false on a bare repository still overrides it, even though glancing at the code it looks like that case is handled, but without being tested for.
On Tue, Mar 12, 2019 at 03:13:03AM +0000, Eric Wong wrote: > > I do think they're a net win for people hosting git servers. But if > > that's the goal, I think at most you'd want to make bitmaps the default > > for bare repos. They're really not much help for normal end-user repos > > at this point. > > Fair enough, hopefully this can make life easier for admins > new to hosting git: > > ----------8<--------- > Subject: [PATCH] repack: enable bitmaps by default on bare repos > > A typical use case for bare repos is for serving clones and > fetches to clients. Enable bitmaps by default on bare repos to > make it easier for admins to host git repos in a performant way. OK. I still think of bitmaps as something that might need manual care and feeding, but I think that may be leftover superstition. I can't offhand think of any real downsides to this. > static int delta_base_offset = 1; > static int pack_kept_objects = -1; > -static int write_bitmaps; > +static int write_bitmaps = -1; So we'll have "-1" be "not decided yet". Makes sense. > @@ -343,11 +343,15 @@ int cmd_repack(int argc, const char **argv, const char *prefix) > (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE))) > die(_("--keep-unreachable and -A are incompatible")); > > + if (!(pack_everything & ALL_INTO_ONE)) { > + if (write_bitmaps > 0) > + die(_(incremental_bitmap_conflict_error)); > + } else if (write_bitmaps < 0) { > + write_bitmaps = is_bare_repository(); > + } Might it be easier here to always resolve "-1" into a 0/1? I.e., like: if (write_bitmaps < 0) write_bitmaps = (pack_everything & ALL_INTO_ONE) && is_bare_repository(); and then the rest of the logic can stay the same, and does not need to be modified to handle "write_bitmaps < 0"? > +test_expect_success 'bitmaps are created by default in bare repos' ' > + git clone --bare .git bare.git && > + cd bare.git && Please don't "cd" outside of a subshell, since it impacts further tests that are added. > + mkdir old && > + mv objects/pack/* old && > + pack=$(ls old/*.pack) && Are we sure we have just done $pack here? Our repo came from a local-disk clone, which would have just hard-linked whatever was in the source repo. So we're subtly relying on the state that other tests have left. I'm not sure what we're trying to accomplish with this unpacking, though. Running "git repack -ad" should generate bitmaps whether the objects were already in a single pack or not. So I think this test can just be: git clone --bare . bare.git && git -C bare.git repack -ad && bitmap=$(ls objects/pack/*.bitmap) test_path_is_file "$bitmap" I do agree with Ævar it might also be worth testing that disabling bitmaps explicitly still works. And also that repacking _without_ "-a" (i.e., an incremental) does not complain about being unable to generate bitmaps. -Peff
On Tue, Mar 12, 2019 at 06:49:54AM -0400, Jeff King wrote: > I'm not sure what we're trying to accomplish with this unpacking, > though. Running "git repack -ad" should generate bitmaps whether the > objects were already in a single pack or not. So I think this test can > just be: > > git clone --bare . bare.git && > git -C bare.git repack -ad && > bitmap=$(ls objects/pack/*.bitmap) > test_path_is_file "$bitmap" Of course that should be "bare.git/objects/pack/*.bitmap" in the third line, since we skipped the "cd" entirely. -Peff
Jeff King <peff@peff.net> wrote: > OK. I still think of bitmaps as something that might need manual care > and feeding, but I think that may be leftover superstition. I can't > offhand think of any real downsides to this. It's a _relatively_ new feature to long-time users like us, so maybe the "new == immature" mindset sets in[*]. I skimmed some mail archives but couldn't find any reason not to... But I did find Ævar's forgotten gitperformance doc and thread where the topic was brought up: https://public-inbox.org/git/20170403211644.26814-1-avarab@gmail.com/ > On Tue, Mar 12, 2019 at 03:13:03AM +0000, Eric Wong wrote: > > @@ -343,11 +343,15 @@ int cmd_repack(int argc, const char **argv, const char *prefix) > > (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE))) > > die(_("--keep-unreachable and -A are incompatible")); > > > > + if (!(pack_everything & ALL_INTO_ONE)) { > > + if (write_bitmaps > 0) > > + die(_(incremental_bitmap_conflict_error)); > > + } else if (write_bitmaps < 0) { > > + write_bitmaps = is_bare_repository(); > > + } > > Might it be easier here to always resolve "-1" into a 0/1? I.e., like: > > if (write_bitmaps < 0) > write_bitmaps = (pack_everything & ALL_INTO_ONE) && is_bare_repository(); > > and then the rest of the logic can stay the same, and does not need to > be modified to handle "write_bitmaps < 0"? Good point, changed in v2. > > +test_expect_success 'bitmaps are created by default in bare repos' ' > > + git clone --bare .git bare.git && > > + cd bare.git && > > Please don't "cd" outside of a subshell, since it impacts further tests > that are added. Oops, I got it into my head that each test was already run in a separate subshell :x Fixed. > > + mkdir old && > > + mv objects/pack/* old && > > + pack=$(ls old/*.pack) && > > Are we sure we have just done $pack here? Our repo came from a > local-disk clone, which would have just hard-linked whatever was in the > source repo. So we're subtly relying on the state that other tests have > left. > > I'm not sure what we're trying to accomplish with this unpacking, > though. Running "git repack -ad" should generate bitmaps whether the > objects were already in a single pack or not. So I think this test can > just be: Right, I forgot "repack -a" didn't need loose objects to operate :x > I do agree with Ævar it might also be worth testing that disabling > bitmaps explicitly still works. And also that repacking _without_ "-a" > (i.e., an incremental) does not complain about being unable to generate > bitmaps. Yup, now with additional tests for repack.writeBitmaps=false, repack (w/o "-a") and a config/repack.txt update ------------8<--------- Subject: [PATCH] repack: enable bitmaps by default on bare repos A typical use case for bare repos is for serving clones and fetches to clients. Enable bitmaps by default on bare repos to make it easier for admins to host git repos in a performant way. Signed-off-by: Eric Wong <e@80x24.org> Helped-by: Jeff King <peff@peff.net> --- Documentation/config/repack.txt | 2 +- builtin/repack.c | 5 ++++- t/t7700-repack.sh | 19 ++++++++++++++++++- 3 files changed, 23 insertions(+), 3 deletions(-) diff --git a/Documentation/config/repack.txt b/Documentation/config/repack.txt index a5c37813fd..9c413e177e 100644 --- a/Documentation/config/repack.txt +++ b/Documentation/config/repack.txt @@ -24,4 +24,4 @@ repack.writeBitmaps:: packs created for clones and fetches, at the cost of some disk space and extra time spent on the initial repack. This has no effect if multiple packfiles are created. - Defaults to false. + Defaults to true on bare repos, false otherwise. diff --git a/builtin/repack.c b/builtin/repack.c index 67f8978043..caca113927 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -14,7 +14,7 @@ static int delta_base_offset = 1; static int pack_kept_objects = -1; -static int write_bitmaps; +static int write_bitmaps = -1; static int use_delta_islands; static char *packdir, *packtmp; @@ -343,6 +343,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix) (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE))) die(_("--keep-unreachable and -A are incompatible")); + if (write_bitmaps < 0) + write_bitmaps = (pack_everything & ALL_INTO_ONE) && + is_bare_repository(); if (pack_kept_objects < 0) pack_kept_objects = write_bitmaps; diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 6162e2a8e6..6458286dcf 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -221,5 +221,22 @@ test_expect_success 'repack --keep-pack' ' ) ' -test_done +test_expect_success 'bitmaps are created by default in bare repos' ' + git clone --bare .git bare.git && + git -C bare.git repack -ad && + bitmap=$(ls bare.git/objects/pack/*.bitmap) && + test_path_is_file "$bitmap" +' + +test_expect_success 'incremental repack does not complain' ' + git -C bare.git repack -q 2>repack.err && + ! test -s repack.err +' +test_expect_success 'bitmaps can be disabled on bare repos' ' + git -c repack.writeBitmaps=false -C bare.git repack -ad && + bitmap=$(ls bare.git/objects/pack/*.bitmap 2>/dev/null || :) && + test -z "$bitmap" +' + +test_done
On Wed, Mar 13, 2019 at 01:51:33AM +0000, Eric Wong wrote: > But I did find Ævar's forgotten gitperformance doc and thread > where the topic was brought up: > > https://public-inbox.org/git/20170403211644.26814-1-avarab@gmail.com/ One thing that thread reminded me of: we probably also want to default pack.writebitmaphashcache on. Otherwise the time saved during the object enumeration can backfire when we spend much more time trying delta compression (because we don't know the pathnames of any objects). The reason it defaults to off is for on-disk compatibility with JGit. But I have very little experience running without the hash-cache on. We added it very early on because we found performance was not great without it (I don't know if people running JGit have run into the same problem and if not, why not). > ------------8<--------- > Subject: [PATCH] repack: enable bitmaps by default on bare repos > > A typical use case for bare repos is for serving clones and > fetches to clients. Enable bitmaps by default on bare repos to > make it easier for admins to host git repos in a performant way. This looks good to me, with one minor nit: > +test_expect_success 'incremental repack does not complain' ' > + git -C bare.git repack -q 2>repack.err && > + ! test -s repack.err > +' This last line could use "test_must_be_empty". -Peff
diff --git a/builtin/repack.c b/builtin/repack.c index 67f8978043..5d4758b515 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -14,7 +14,7 @@ static int delta_base_offset = 1; static int pack_kept_objects = -1; -static int write_bitmaps; +static int write_bitmaps = -1; static int use_delta_islands; static char *packdir, *packtmp; @@ -343,11 +343,15 @@ int cmd_repack(int argc, const char **argv, const char *prefix) (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE))) die(_("--keep-unreachable and -A are incompatible")); + if (!(pack_everything & ALL_INTO_ONE)) { + if (write_bitmaps > 0) + die(_(incremental_bitmap_conflict_error)); + } else if (write_bitmaps < 0) { + write_bitmaps = is_bare_repository(); + } + if (pack_kept_objects < 0) - pack_kept_objects = write_bitmaps; - - if (write_bitmaps && !(pack_everything & ALL_INTO_ONE)) - die(_(incremental_bitmap_conflict_error)); + pack_kept_objects = write_bitmaps > 0; packdir = mkpathdup("%s/pack", get_object_directory()); packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid()); @@ -368,7 +372,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) argv_array_push(&cmd.args, "--indexed-objects"); if (repository_format_partial_clone) argv_array_push(&cmd.args, "--exclude-promisor-objects"); - if (write_bitmaps) + if (write_bitmaps > 0) argv_array_push(&cmd.args, "--write-bitmap-index"); if (use_delta_islands) argv_array_push(&cmd.args, "--delta-islands"); diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 6162e2a8e6..3e0b5c40e4 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -221,5 +221,17 @@ test_expect_success 'repack --keep-pack' ' ) ' +test_expect_success 'bitmaps are created by default in bare repos' ' + git clone --bare .git bare.git && + cd bare.git && + mkdir old && + mv objects/pack/* old && + pack=$(ls old/*.pack) && + test_path_is_file "$pack" && + git unpack-objects -q <"$pack" && + git repack -ad && + bitmap=$(ls objects/pack/*.bitmap) && + test_path_is_file "$bitmap" +' + test_done -