diff mbox series

[v2,1/7] fetch: speed up lookup of want refs via commit-graph

Message ID 4a819a68309bf03db2d9a5e5be070e52c3542af8.1629800774.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series Speed up mirror-fetches with many refs | expand

Commit Message

Patrick Steinhardt Aug. 24, 2021, 10:36 a.m. UTC
When updating our local refs based on the refs fetched from the remote,
we need to iterate through all requested refs and load their respective
commits such that we can determine whether they need to be appended to
FETCH_HEAD or not. In cases where we're fetching from a remote with
exceedingly many refs, resolving these refs can be quite expensive given
that we repeatedly need to unpack object headers for each of the
referenced objects.

Speed this up by opportunistcally trying to resolve object IDs via the
commit graph. We only do so for any refs which are not in "refs/tags":
more likely than not, these are going to be a commit anyway, and this
lets us avoid having to unpack object headers completely in case the
object is a commit that is part of the commit-graph. This significantly
speeds up mirror-fetches in a real-world repository with
2.3M refs:

    Benchmark #1: HEAD~: git-fetch
      Time (mean ± σ):     56.482 s ±  0.384 s    [User: 53.340 s, System: 5.365 s]
      Range (min … max):   56.050 s … 57.045 s    5 runs

    Benchmark #2: HEAD: git-fetch
      Time (mean ± σ):     33.727 s ±  0.170 s    [User: 30.252 s, System: 5.194 s]
      Range (min … max):   33.452 s … 33.871 s    5 runs

    Summary
      'HEAD: git-fetch' ran
        1.67 ± 0.01 times faster than 'HEAD~: git-fetch'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fetch.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

Comments

Derrick Stolee Aug. 25, 2021, 2:16 p.m. UTC | #1
On 8/24/2021 6:36 AM, Patrick Steinhardt wrote:
> Speed this up by opportunistcally trying to resolve object IDs via the

s/opportunistcally/opportunistically/ 

> +			/*
> +			 * References in "refs/tags/" are often going to point
> +			 * to annotated tags, which are not part of the
> +			 * commit-graph. We thus only try to look up refs in
> +			 * the graph which are not in that namespace to not
> +			 * regress performance in repositories with many
> +			 * annotated tags.
> +			 */
> +			if (!starts_with(rm->name, "refs/tags/"))
> +				commit = lookup_commit_in_graph(the_repository, &rm->old_oid);

This new logic looks good.

> +			if (!commit) {
> +				commit = lookup_commit_reference_gently(the_repository,
> +									&rm->old_oid,
> +									1);
> +				if (!commit)
> +					rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;
> +			}

Thanks,
-Stolee
diff mbox series

Patch

diff --git a/builtin/fetch.c b/builtin/fetch.c
index e064687dbd..91d1301613 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1074,7 +1074,6 @@  static int store_updated_refs(const char *raw_url, const char *remote_name,
 			      int connectivity_checked, struct ref *ref_map)
 {
 	struct fetch_head fetch_head;
-	struct commit *commit;
 	int url_len, i, rc = 0;
 	struct strbuf note = STRBUF_INIT, err = STRBUF_INIT;
 	struct ref_transaction *transaction = NULL;
@@ -1122,6 +1121,7 @@  static int store_updated_refs(const char *raw_url, const char *remote_name,
 	     want_status <= FETCH_HEAD_IGNORE;
 	     want_status++) {
 		for (rm = ref_map; rm; rm = rm->next) {
+			struct commit *commit = NULL;
 			struct ref *ref = NULL;
 
 			if (rm->status == REF_STATUS_REJECT_SHALLOW) {
@@ -1131,11 +1131,23 @@  static int store_updated_refs(const char *raw_url, const char *remote_name,
 				continue;
 			}
 
-			commit = lookup_commit_reference_gently(the_repository,
-								&rm->old_oid,
-								1);
-			if (!commit)
-				rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;
+			/*
+			 * References in "refs/tags/" are often going to point
+			 * to annotated tags, which are not part of the
+			 * commit-graph. We thus only try to look up refs in
+			 * the graph which are not in that namespace to not
+			 * regress performance in repositories with many
+			 * annotated tags.
+			 */
+			if (!starts_with(rm->name, "refs/tags/"))
+				commit = lookup_commit_in_graph(the_repository, &rm->old_oid);
+			if (!commit) {
+				commit = lookup_commit_reference_gently(the_repository,
+									&rm->old_oid,
+									1);
+				if (!commit)
+					rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;
+			}
 
 			if (rm->fetch_head_status != want_status)
 				continue;