Message ID | 40c385048a023dbd447c5f0b4c95ff32485e1e23.1629906005.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | fetch: skip formatting updated refs with `--quiet` | expand |
On Wed, Aug 25 2021, Patrick Steinhardt wrote: > [[PGP Signed Part:Undecided]] > When fetching, Git will by default print a list of all updated refs in a > nicely formatted table. In order to come up with this table, Git needs > to iterate refs twice: first to determine the maximum column width, and > a second time to actually format these changed refs. > > While this table will not be printed in case the user passes `--quiet`, > we still go out of our way and do all these steps. In fact, we even do > more work compared to not passing `--quiet`: without the flag, we will > skip all references in the column width computation which have not been > updated, but if it is set we will now compute widths for all refs. > > Fix this issue by completely skipping both preparation of the format and > formatting data for display in case the user passes `--quiet`, improving > performance especially with many refs. The following benchmark shows a > nice speedup for a quiet mirror-fetch in a repository with 2.3M refs: > > Benchmark #1: HEAD~: git-fetch > Time (mean ± σ): 26.929 s ± 0.145 s [User: 24.194 s, System: 4.656 s] > Range (min … max): 26.692 s … 27.068 s 5 runs > > Benchmark #2: HEAD: git-fetch > Time (mean ± σ): 25.189 s ± 0.094 s [User: 22.556 s, System: 4.606 s] > Range (min … max): 25.070 s … 25.314 s 5 runs > > Summary > 'HEAD: git-fetch' ran > 1.07 ± 0.01 times faster than 'HEAD~: git-fetch' > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > builtin/fetch.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/builtin/fetch.c b/builtin/fetch.c > index e064687dbd..d064b66512 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -757,6 +757,9 @@ static void prepare_format_display(struct ref *ref_map) > die(_("configuration fetch.output contains invalid value %s"), > format); > > + if (verbosity < 0) > + return; > + > for (rm = ref_map; rm; rm = rm->next) { > if (rm->status == REF_STATUS_REJECT_SHALLOW || > !rm->peer_ref || > @@ -827,7 +830,12 @@ static void format_display(struct strbuf *display, char code, > const char *remote, const char *local, > int summary_width) > { > - int width = (summary_width + strlen(summary) - gettext_width(summary)); > + int width; > + > + if (verbosity < 0) > + return; > + > + width = (summary_width + strlen(summary) - gettext_width(summary)); > > strbuf_addf(display, "%c %-*s ", code, width, summary); > if (!compact_format) I wonder what you think of the below, which is a bit more of a verbose search/replacement change, but I think ultimately cleaner. We just pass the "ref_map" down to all the formatting functions, and the first function that needs to know the "compact_format" or "refcol_width" lazily computes it (stored in a static variable). So we avoid the tricky action at a distance of not preparing certain things because we know we're in the quiet mode. But if I understand your change correctly we're on the one hand not preparing the format change, but then also not printing this at all now under --quiet, correct? I think it would be good to split up that proposed functional change from the optimization of the output, it also seems like if that were done the git-fetch docs on --quiet need to be changed. But I wonder if once we have the change below, whether the small change on top of it to just add a fetch.outputRefWidth=123 wouldn't also largely solve the optimization problem you have, i.e. adding a config variable to adjust the width, instead of us auto-discovering it by looping over all refs. Or maybe not, but I think it would be interesting to see how much of the slowdown you have is that v.s. that we end up emitting this output to stderr. I.e. is it the I/O of the output or the extra loop that's the expensive part? diff --git a/builtin/fetch.c b/builtin/fetch.c index e064687dbdc..88a8b3660d4 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -707,6 +707,24 @@ static int s_update_ref(const char *action, static int refcol_width = 10; static int compact_format; +static int prepared_compact_format; +static void prepare_compact_format(void) +{ + const char *format = "full"; + + if (prepared_compact_format++) + return; + + git_config_get_string_tmp("fetch.output", &format); + if (!strcasecmp(format, "full")) + compact_format = 0; + else if (!strcasecmp(format, "compact")) + compact_format = 1; + else + die(_("configuration fetch.output contains invalid value %s"), + format); +} + static void adjust_refcol_width(const struct ref *ref) { int max, rlen, llen, len; @@ -726,6 +744,7 @@ static void adjust_refcol_width(const struct ref *ref) * anyway because we don't know if the error explanation part * will be printed in update_local_ref) */ + prepare_compact_format(); if (compact_format) { llen = 0; max = max * 2 / 3; @@ -743,19 +762,13 @@ static void adjust_refcol_width(const struct ref *ref) refcol_width = rlen; } -static void prepare_format_display(struct ref *ref_map) +static int prepared_refcol_width; +static void prepare_refcol_width(struct ref *ref_map) { struct ref *rm; - const char *format = "full"; - git_config_get_string_tmp("fetch.output", &format); - if (!strcasecmp(format, "full")) - compact_format = 0; - else if (!strcasecmp(format, "compact")) - compact_format = 1; - else - die(_("configuration fetch.output contains invalid value %s"), - format); + if (prepared_refcol_width++) + return; for (rm = ref_map; rm; rm = rm->next) { if (rm->status == REF_STATUS_REJECT_SHALLOW || @@ -767,9 +780,10 @@ static void prepare_format_display(struct ref *ref_map) } } -static void print_remote_to_local(struct strbuf *display, +static void print_remote_to_local(struct ref *ref_map, struct strbuf *display, const char *remote, const char *local) { + prepare_refcol_width(ref_map); strbuf_addf(display, "%-*s -> %s", refcol_width, remote, local); } @@ -800,7 +814,7 @@ static int find_and_replace(struct strbuf *haystack, return 1; } -static void print_compact(struct strbuf *display, +static void print_compact(struct ref *ref_map, struct strbuf *display, const char *remote, const char *local) { struct strbuf r = STRBUF_INIT; @@ -816,13 +830,14 @@ static void print_compact(struct strbuf *display, if (!find_and_replace(&r, local, "*")) find_and_replace(&l, remote, "*"); - print_remote_to_local(display, r.buf, l.buf); + print_remote_to_local(ref_map, display, r.buf, l.buf); strbuf_release(&r); strbuf_release(&l); } -static void format_display(struct strbuf *display, char code, +static void format_display(struct ref *ref_map, + struct strbuf *display, char code, const char *summary, const char *error, const char *remote, const char *local, int summary_width) @@ -830,15 +845,17 @@ static void format_display(struct strbuf *display, char code, int width = (summary_width + strlen(summary) - gettext_width(summary)); strbuf_addf(display, "%c %-*s ", code, width, summary); + prepare_compact_format(); if (!compact_format) - print_remote_to_local(display, remote, local); + print_remote_to_local(ref_map, display, remote, local); else - print_compact(display, remote, local); + print_compact(ref_map, display, remote, local); if (error) strbuf_addf(display, " (%s)", error); } -static int update_local_ref(struct ref *ref, +static int update_local_ref(struct ref *ref_map, + struct ref *ref, struct ref_transaction *transaction, const char *remote, const struct ref *remote_ref, @@ -857,7 +874,7 @@ static int update_local_ref(struct ref *ref, if (oideq(&ref->old_oid, &ref->new_oid)) { if (verbosity > 0) - format_display(display, '=', _("[up to date]"), NULL, + format_display(ref_map, display, '=', _("[up to date]"), NULL, remote, pretty_ref, summary_width); return 0; } @@ -870,7 +887,7 @@ static int update_local_ref(struct ref *ref, * If this is the head, and it's not okay to update * the head, and the old value of the head isn't empty... */ - format_display(display, '!', _("[rejected]"), + format_display(ref_map, display, '!', _("[rejected]"), _("can't fetch in current branch"), remote, pretty_ref, summary_width); return 1; @@ -881,12 +898,12 @@ static int update_local_ref(struct ref *ref, if (force || ref->force) { int r; r = s_update_ref("updating tag", ref, transaction, 0); - format_display(display, r ? '!' : 't', _("[tag update]"), + format_display(ref_map, display, r ? '!' : 't', _("[tag update]"), r ? _("unable to update local ref") : NULL, remote, pretty_ref, summary_width); return r; } else { - format_display(display, '!', _("[rejected]"), _("would clobber existing tag"), + format_display(ref_map, display, '!', _("[rejected]"), _("would clobber existing tag"), remote, pretty_ref, summary_width); return 1; } @@ -918,7 +935,7 @@ static int update_local_ref(struct ref *ref, } r = s_update_ref(msg, ref, transaction, 0); - format_display(display, r ? '!' : '*', what, + format_display(ref_map, display, r ? '!' : '*', what, r ? _("unable to update local ref") : NULL, remote, pretty_ref, summary_width); return r; @@ -940,7 +957,7 @@ static int update_local_ref(struct ref *ref, strbuf_addstr(&quickref, ".."); strbuf_add_unique_abbrev(&quickref, &ref->new_oid, DEFAULT_ABBREV); r = s_update_ref("fast-forward", ref, transaction, 1); - format_display(display, r ? '!' : ' ', quickref.buf, + format_display(ref_map, display, r ? '!' : ' ', quickref.buf, r ? _("unable to update local ref") : NULL, remote, pretty_ref, summary_width); strbuf_release(&quickref); @@ -952,13 +969,13 @@ static int update_local_ref(struct ref *ref, strbuf_addstr(&quickref, "..."); strbuf_add_unique_abbrev(&quickref, &ref->new_oid, DEFAULT_ABBREV); r = s_update_ref("forced-update", ref, transaction, 1); - format_display(display, r ? '!' : '+', quickref.buf, + format_display(ref_map, display, r ? '!' : '+', quickref.buf, r ? _("unable to update local ref") : _("forced update"), remote, pretty_ref, summary_width); strbuf_release(&quickref); return r; } else { - format_display(display, '!', _("[rejected]"), _("non-fast-forward"), + format_display(ref_map, display, '!', _("[rejected]"), _("non-fast-forward"), remote, pretty_ref, summary_width); return 1; } @@ -1111,8 +1128,6 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, } } - prepare_format_display(ref_map); - /* * We do a pass for each fetch_head_status type in their enum order, so * merged entries are written before not-for-merge. That lets readers @@ -1187,8 +1202,9 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, strbuf_reset(¬e); if (ref) { - rc |= update_local_ref(ref, transaction, what, - rm, ¬e, summary_width); + rc |= update_local_ref(ref_map, ref, + transaction, what, rm, + ¬e, summary_width); free(ref); } else if (write_fetch_head || dry_run) { /* @@ -1196,7 +1212,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, * would be written to FETCH_HEAD, if --dry-run * is set). */ - format_display(¬e, '*', + format_display(ref_map, ¬e, '*', *kind ? kind : "branch", NULL, *what ? what : "HEAD", "FETCH_HEAD", summary_width); @@ -1357,7 +1373,7 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map, fprintf(stderr, _("From %.*s\n"), url_len, url); shown_url = 1; } - format_display(&sb, '-', _("[deleted]"), NULL, + format_display(ref_map, &sb, '-', _("[deleted]"), NULL, _("(none)"), prettify_refname(ref->name), summary_width); fprintf(stderr, " %s\n",sb.buf);
Patrick Steinhardt <ps@pks.im> writes: > When fetching, Git will by default print a list of all updated refs in a > nicely formatted table. In order to come up with this table, Git needs > to iterate refs twice: first to determine the maximum column width, and > a second time to actually format these changed refs. > > While this table will not be printed in case the user passes `--quiet`, > we still go out of our way and do all these steps. In fact, we even do > more work compared to not passing `--quiet`: without the flag, we will > skip all references in the column width computation which have not been > updated, but if it is set we will now compute widths for all refs. Interesting. This line /* uptodate lines are only shown on high verbosity level */ if (!verbosity && oideq(&ref->peer_ref->old_oid, &ref->old_oid)) return; at the beginning of the adjust_refcol_width() function indeed does not skip if verbosity is negative, so the comment is wrong---it is not only computed on high verbosity level. Why doesn't this patch include a change like this then? if (verbosity <= 0 || oideq(&ref->peer_ref->old_oid, &ref->old_oid)) return; Another thing I notice is this part from store_updated_refs(): if (note.len) { if (verbosity >= 0 && !shown_url) { fprintf(stderr, _("From %.*s\n"), url_len, url); shown_url = 1; } if (verbosity >= 0) fprintf(stderr, " %s\n", note.buf); } We no longer need to check for verbosity, right? Other than these two, I like the approach a lot.
On Wed, Aug 25, 2021 at 10:51:55AM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > When fetching, Git will by default print a list of all updated refs in a > > nicely formatted table. In order to come up with this table, Git needs > > to iterate refs twice: first to determine the maximum column width, and > > a second time to actually format these changed refs. > > > > While this table will not be printed in case the user passes `--quiet`, > > we still go out of our way and do all these steps. In fact, we even do > > more work compared to not passing `--quiet`: without the flag, we will > > skip all references in the column width computation which have not been > > updated, but if it is set we will now compute widths for all refs. > > Interesting. This line > > /* uptodate lines are only shown on high verbosity level */ > if (!verbosity && oideq(&ref->peer_ref->old_oid, &ref->old_oid)) > return; > > at the beginning of the adjust_refcol_width() function indeed does > not skip if verbosity is negative, so the comment is wrong---it is > not only computed on high verbosity level. Why doesn't this patch > include a change like this then? > > if (verbosity <= 0 || oideq(&ref->peer_ref->old_oid, &ref->old_oid)) > return; This was indeed my first iteration. But if we just fix the condition like you do here, then we still iterate over all refs even though we know that we're not going to do anything with them. So my version just skips over the iteration completely. > Another thing I notice is this part from store_updated_refs(): > > if (note.len) { > if (verbosity >= 0 && !shown_url) { > fprintf(stderr, _("From %.*s\n"), > url_len, url); > shown_url = 1; > } > if (verbosity >= 0) > fprintf(stderr, " %s\n", note.buf); > } > > We no longer need to check for verbosity, right? Right. It would be less obvious though that we indeed never print to the buffer if `verbosity < 0`, which is why I bailed on that refactoring. I wonder whether we can just refactor this such that the buffer is caller-provided. If `--quiet`, the caller just passes a `NULL` pointer so it's explicit that it cannot be written to. This would also address Ævar's feedback about the "tricky action at a distance", which is valid criticism in my eyes. Patrick
Patrick Steinhardt <ps@pks.im> writes: >> Interesting. This line >> >> /* uptodate lines are only shown on high verbosity level */ >> if (!verbosity && oideq(&ref->peer_ref->old_oid, &ref->old_oid)) >> return; >> >> at the beginning of the adjust_refcol_width() function indeed does >> not skip if verbosity is negative, so the comment is wrong---it is >> not only computed on high verbosity level. Why doesn't this patch >> include a change like this then? >> >> if (verbosity <= 0 || oideq(&ref->peer_ref->old_oid, &ref->old_oid)) >> return; > > This was indeed my first iteration. But if we just fix the condition > like you do here, then we still iterate over all refs even though we > know that we're not going to do anything with them. So my version just > skips over the iteration completely. I didn't ask "why isn't this patch the above change and nothing else?" With or without the "don't bother counting columns under --quiet" fix, the condition used in the above if statement is wrong. With the fix, only because the function is never called with negative verbosity, "if (!verbosity)" means the same thing as "only shown on high verbosity level", but the reader must have followed the flow of logic to realize it. If you fix the condition to exclude all non-verbose cases (including --quiet), the readers do not have to do so to realize that the code is doing what the comment claims that it is doing. >> Another thing I notice is this part from store_updated_refs(): >> >> if (note.len) { >> if (verbosity >= 0 && !shown_url) { >> fprintf(stderr, _("From %.*s\n"), >> url_len, url); >> shown_url = 1; >> } >> if (verbosity >= 0) >> fprintf(stderr, " %s\n", note.buf); >> } >> >> We no longer need to check for verbosity, right? > > Right. It would be less obvious though that we indeed never print to the > buffer if `verbosity < 0`, which is why I bailed on that refactoring. I actually think that the resulting code will still be obvious, even with the (apparently unrelated) URL display, and actually even be better, if we drop the condition on verbosity from this code. The code that led to this part would have stuffed bytes in the note strbuf only when we wanted to show something based on the verbosity setting, and we show what is in note.len only when we have something to say (the URL thing is to give the header for the message). Decision to give what contents to show (or not to show at all) is made elsewhere---it happens to involve verbosity and perhaps in the future it may use some other input, but this part of the code does not have to know. Thanks.
diff --git a/builtin/fetch.c b/builtin/fetch.c index e064687dbd..d064b66512 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -757,6 +757,9 @@ static void prepare_format_display(struct ref *ref_map) die(_("configuration fetch.output contains invalid value %s"), format); + if (verbosity < 0) + return; + for (rm = ref_map; rm; rm = rm->next) { if (rm->status == REF_STATUS_REJECT_SHALLOW || !rm->peer_ref || @@ -827,7 +830,12 @@ static void format_display(struct strbuf *display, char code, const char *remote, const char *local, int summary_width) { - int width = (summary_width + strlen(summary) - gettext_width(summary)); + int width; + + if (verbosity < 0) + return; + + width = (summary_width + strlen(summary) - gettext_width(summary)); strbuf_addf(display, "%c %-*s ", code, width, summary); if (!compact_format)
When fetching, Git will by default print a list of all updated refs in a nicely formatted table. In order to come up with this table, Git needs to iterate refs twice: first to determine the maximum column width, and a second time to actually format these changed refs. While this table will not be printed in case the user passes `--quiet`, we still go out of our way and do all these steps. In fact, we even do more work compared to not passing `--quiet`: without the flag, we will skip all references in the column width computation which have not been updated, but if it is set we will now compute widths for all refs. Fix this issue by completely skipping both preparation of the format and formatting data for display in case the user passes `--quiet`, improving performance especially with many refs. The following benchmark shows a nice speedup for a quiet mirror-fetch in a repository with 2.3M refs: Benchmark #1: HEAD~: git-fetch Time (mean ± σ): 26.929 s ± 0.145 s [User: 24.194 s, System: 4.656 s] Range (min … max): 26.692 s … 27.068 s 5 runs Benchmark #2: HEAD: git-fetch Time (mean ± σ): 25.189 s ± 0.094 s [User: 22.556 s, System: 4.606 s] Range (min … max): 25.070 s … 25.314 s 5 runs Summary 'HEAD: git-fetch' ran 1.07 ± 0.01 times faster than 'HEAD~: git-fetch' Signed-off-by: Patrick Steinhardt <ps@pks.im> --- builtin/fetch.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)