Message ID | pull.1263.v2.git.1655350617442.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v2] send-pack.c: add config push.useBitmaps | expand |
On 6/15/2022 11:36 PM, Kyle Zhao via GitGitGadget wrote: > From: Kyle Zhao <kylezhao@tencent.com> > > This allows you to disable bitmaps for "git push". Default is false. > > Reachability bitmaps are designed to speed up the "counting objects" > phase of generating a pack during a clone or fetch. They are not > optimized for Git clients sending a small topic branch via "git push". > In some cases (see [1]), using reachability bitmaps during "git push" > can cause significant performance regressions. > > Add a new "push.useBitmaps" config option to disable reachability > bitmaps during "git push". This allows reachability bitmaps to still > be used in other areas, such as "git rev-list --use-bitmap-index". > > [1]: https://lore.kernel.org/git/87zhoz8b9o.fsf@evledraar.gmail.com/ > > Signed-off-by: Kyle Zhao <kylezhao@tencent.com> > --- > send-pack.c: add config push.useBitmaps > > This patch add config push.useBitmaps to prevent git push using bitmap. > > The origin discussion is here: > https://lore.kernel.org/git/b940e705fbe9454685757f2e3055e2ce@tencent.com/ > > Thanks, -Kyle > > Changes since v1: > > * changed the commit message > * modified and added missing \n to push.txt > * used test_subcommand for test > * modified "if" statement for "git_config_get_bool()" in send-pack.c This version satisfies the recommended changes I had from the last version. As long as we think there is a benefit to having this additional knob over just pack.useBitmaps, then this is good to go. For my part, I think having the additional knob is less complicated than requiring all users in this situation to shift from "push" to a "fetch" (from the other side of the connection). Thanks, -Stolee
On Thu, Jun 16 2022, Kyle Zhao via GitGitGadget wrote: > From: Kyle Zhao <kylezhao@tencent.com> > > This allows you to disable bitmaps for "git push". Default is false. > > Reachability bitmaps are designed to speed up the "counting objects" > phase of generating a pack during a clone or fetch. They are not > optimized for Git clients sending a small topic branch via "git push". > In some cases (see [1]), using reachability bitmaps during "git push" > can cause significant performance regressions. > > Add a new "push.useBitmaps" config option to disable reachability > bitmaps during "git push". This allows reachability bitmaps to still > be used in other areas, such as "git rev-list --use-bitmap-index". > > [1]: https://lore.kernel.org/git/87zhoz8b9o.fsf@evledraar.gmail.com/ > [...] > + mk_test testrepo heads/main && > + git checkout main && > + GIT_TRACE2_EVENT="$PWD/default" \ > + git push testrepo main:test && > + test_subcommand git pack-objects --all-progress-implied --revs --stdout \ > + --thin --delta-base-offset -q --no-use-bitmap-index <default && Nit: We tend to indent these ase we wrap, so e.g.: test_subcommand git ... \ --thin --delta [...] The rest all looks good as far as the diff goes, if what we want to do is to disable this by default, and this isn't worth a re-roll in itself. But I still think that completely disabling bitmaps might be premature here, especially per Taylor's comment on v1 (which I understand to mean that they should help some of the time, even with push). And since this is linking to that old thread I started in 2019 I really think the commit message should be exploring and justifying why this might be slower in all cases, or at least a sufficient number of cases to flip the default. if it's not are we throwing out a generally useful optimization (by default) due to some edge cases we've found? E.g. have you tried to follow-up on this comment by Jeff King? https://lore.kernel.org/git/20190523113031.GA17448@sigill.intra.peff.net/ At the time I didn't, because as noted in a follow-up I'd lost my test case by the time I read that, but it seems you haven't, and have a current test case. If you apply that patch on that old git version at the time (as it probably won't apply cleanly), does it make a difference for your case? To be clear I'm *not* confident that this isn't the right thing to do as far as the diff is concerned. But I *am* confident that the proposed commit message isn't selling me on the idea. I think a really good start would be to come up with some benchmarks for a few common cases, e.g. how do bitmaps do if we have a big repo but few refs, how about a lot of refs? Does their stale-ness make a difference or not (per previous discussions) etc. etc. Or, if you don't have time for any of that I think a good change for now would be to add the flag, but not flip the default, and say "we don't have enough data to say if bitmaps are overall worse for pushes, but here's an opt-opt". But the change as it stands is effectively saying "bitmaps on push hinder more than they help, and turning them on for push was a mistake". Maybe that's true, but I don't think we've got any data to support that. Even though I've got one of those anecdotes (and from occasional investigations of slow "git push" I'm pretty sure this would help my use-cases more than it would harm them) the plural of anecdote isn't data. I've only personally run into this with repos with a large number (>30k at least) number of refs, and where the bitmaps were in some state of staleness (this usually being my working copy, where I rely on "git gc --auto").
> And since this is linking to that old thread I started in 2019 I really >think the commit message should be exploring and justifying why this > might be slower in all cases, or at least a sufficient number of cases > to flip the default. > > if it's not are we throwing out a generally useful optimization (by > default) due to some edge cases we've found? I'm not sure if it's an edge case. However, if a git repo has this problem, it's "git push" will always be very slow instead of going fast and slow. > E.g. have you tried to follow-up on this comment by Jeff King? > https://lore.kernel.org/git/20190523113031.GA17448@sigill.intra.peff.net/ Not yet, I'll try it later. > At the time I didn't, because as noted in a follow-up I'd lost my test > case by the time I read that, but it seems you haven't, and have a > current test case. I tried to find a test case in open-source projects, and finally found one. $ git clone https://github.com/JetBrains/intellij-community.git --bare $ cd intellij-community.git $ git repack -adb $ GIT_TRACE=1 git push . master:test1 Then it would get stuck in "git pack-objects" for about 10s. (OS: Windows) After I deleted the bitmap, "git push" took less than 1s. > I think a really good start would be to come up with some benchmarks for > a few common cases, e.g. how do bitmaps do if we have a big repo but few > refs, how about a lot of refs? Does their stale-ness make a difference > or not (per previous discussions) etc. etc. According to my observation, the problem always occur on git repos which have a lot of refs and objects. Also, "git push" takes much time on "find _objects(..) for haves_bitmap". Hope the information above can help find the cause of the problem. Thanks, - Kyle
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> + mk_test testrepo heads/main && >> + git checkout main && >> + GIT_TRACE2_EVENT="$PWD/default" \ >> + git push testrepo main:test && >> + test_subcommand git pack-objects --all-progress-implied --revs --stdout \ >> + --thin --delta-base-offset -q --no-use-bitmap-index <default && > > Nit: We tend to indent these ase we wrap, so e.g.: > > test_subcommand git ... \ > --thin --delta [...] > > The rest all looks good as far as the diff goes, if what we want to do > is to disable this by default, and this isn't worth a re-roll in itself. > > But I still think that completely disabling bitmaps might be premature > here, especially per Taylor's comment on v1 (which I understand to mean > that they should help some of the time, even with push). The usual way to move is to move slowly and carefully. It may well be the case that disabling bitmaps gives users a better default, but that is not even proven and is hard to prove. How many users of Git do we have? Those silently using it happily will by definition complain here or elsewhere, and the complaints "X is slow with Y, so Y should be disabled when doing X" we hear tend to be louder than "I am happily doing X with Y". I have different problems with this patch, though. It can use a bit more honesty. If you introduce a new knob and sell it as a knob that allows disabling, be honest and keep its behaviour as advertised. As posted, IIUC, the patch does something quite different. It disables by default, and have a knob to allow it enabled again. So, perhaps make it default on to keep the historical behaviour, and document it as "setting it false may improve push performance without affecting use of the reachability bitmaps for other operations. Default is true." Thanks.
On Thu, Jun 16, 2022 at 03:38:36PM +0200, Ævar Arnfjörð Bjarmason wrote: > But the change as it stands is effectively saying "bitmaps on push > hinder more than they help, and turning them on for push was a mistake". > > Maybe that's true, but I don't think we've got any data to support > that. Even though I've got one of those anecdotes (and from occasional > investigations of slow "git push" I'm pretty sure this would help my > use-cases more than it would harm them) the plural of anecdote isn't > data. Yeah, I'm not convinced they hinder more than they help on net. And any problems on pushing are also things that _could_ happen on fetching, so I think this is really a band-aid. But: - even if the average is improved, it's reasonable to want to avoid the most pathological cases - it's reasonable for it to be the user's choice to make, and right now the config isn't expressive enough to allow the choice they want My biggest fear would be that the pathological cases are improved (perhaps with the approach of mine that you linked earlier; thanks for that), but that people erroneously continue to think that turning off push.useBitmaps is a good idea due to inertia or superstition. But that seems like a minor problem that can be addressed later if need be. -Peff
On Thu, Jun 16, 2022 at 03:11:09PM +0000, kylezhao(赵柯宇) wrote: > > At the time I didn't, because as noted in a follow-up I'd lost my test > > case by the time I read that, but it seems you haven't, and have a > > current test case. > > I tried to find a test case in open-source projects, and finally found one. > $ git clone https://github.com/JetBrains/intellij-community.git --bare > $ cd intellij-community.git > $ git repack -adb > $ GIT_TRACE=1 git push . master:test1 I wouldn't expect this to push any objects at all, since you're pushing to a repository that already has all of the objects contained in `master`. A more representative test might be something like: $ git clone https://github.com/JetBrains/intellij-community.git --bare $ cd intellij-community.git $ git repack -adb $ git rev-parse HEAD >in $ time git pack-objects --revs --stdout <in >/dev/null # move the bitmap away so we don't use it $ mv objects/pack/pack-*.bitmap{,.bak} $ time git pack-objects --revs --stdout <in >/dev/null Thanks, Taylor
diff --git a/Documentation/config/push.txt b/Documentation/config/push.txt index e32801e6c91..3f3ff66fe7c 100644 --- a/Documentation/config/push.txt +++ b/Documentation/config/push.txt @@ -137,3 +137,8 @@ push.negotiate:: server attempt to find commits in common. If "false", Git will rely solely on the server's ref advertisement to find commits in common. + +push.useBitmaps:: + If this config and `pack.useBitmaps` are both `true`, then Git will + use reachability bitmaps during `git push`, if available (disabled + by default). diff --git a/send-pack.c b/send-pack.c index bc0fcdbb000..627e79d7623 100644 --- a/send-pack.c +++ b/send-pack.c @@ -84,6 +84,8 @@ static int pack_objects(int fd, struct ref *refs, struct oid_array *advertised, strvec_push(&po.args, "--progress"); if (is_repository_shallow(the_repository)) strvec_push(&po.args, "--shallow"); + if (!args->use_bitmaps) + strvec_push(&po.args, "--no-use-bitmap-index"); po.in = -1; po.out = args->stateless_rpc ? -1 : fd; po.git_cmd = 1; @@ -487,6 +489,7 @@ int send_pack(struct send_pack_args *args, struct async demux; const char *push_cert_nonce = NULL; struct packet_reader reader; + int use_bitmaps; if (!remote_refs) { fprintf(stderr, "No refs in common and none specified; doing nothing.\n" @@ -498,6 +501,9 @@ int send_pack(struct send_pack_args *args, if (push_negotiate) get_commons_through_negotiation(args->url, remote_refs, &commons); + if (!git_config_get_bool("push.usebitmaps", &use_bitmaps)) + args->use_bitmaps = use_bitmaps; + git_config_get_bool("transfer.advertisesid", &advertise_sid); /* Does the other end support the reporting? */ diff --git a/send-pack.h b/send-pack.h index e148fcd9609..f7af1b0353e 100644 --- a/send-pack.h +++ b/send-pack.h @@ -26,7 +26,8 @@ struct send_pack_args { /* One of the SEND_PACK_PUSH_CERT_* constants. */ push_cert:2, stateless_rpc:1, - atomic:1; + atomic:1, + use_bitmaps:1; const struct string_list *push_options; }; diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh index dedca106a7a..0d416d1474f 100755 --- a/t/t5516-fetch-push.sh +++ b/t/t5516-fetch-push.sh @@ -1865,4 +1865,25 @@ test_expect_success 'push warns or fails when using username:password' ' test_line_count = 1 warnings ' +test_expect_success 'push with config push.useBitmaps' ' + mk_test testrepo heads/main && + git checkout main && + GIT_TRACE2_EVENT="$PWD/default" \ + git push testrepo main:test && + test_subcommand git pack-objects --all-progress-implied --revs --stdout \ + --thin --delta-base-offset -q --no-use-bitmap-index <default && + + git config push.useBitmaps true && + GIT_TRACE2_EVENT="$PWD/true" \ + git push testrepo main:test2 && + test_subcommand git pack-objects --all-progress-implied --revs --stdout \ + --thin --delta-base-offset -q <true && + + git config push.useBitmaps false && + GIT_TRACE2_EVENT="$PWD/false" \ + git push testrepo main:test3 && + test_subcommand git pack-objects --all-progress-implied --revs --stdout \ + --thin --delta-base-offset -q --no-use-bitmap-index <false +' + test_done