[v4] remote: prefetch config

Message ID pull.1779.v4.git.1725565398681.gitgitgadget@gmail.com (mailing list archive)
State New

Commit Message

Shubham Kanodia Sept. 5, 2024, 7:43 p.m. UTC
From: Shubham Kanodia <shubham.kanodia10@gmail.com>

Large repositories often contain numerous branches and refs, many of
which individual users may not need. This commit introduces a new
per-remote configuration option (`remote.<remote>.prefetch`) that lets
users choose which remotes are prefetched by the maintenance
`prefetch` task.

Key behaviors:
1. If `remote.<remote>.prefetch` is unset or true, the `prefetch` task
   of `git maintenance` will prefetch all refs for the remote.
2. If `remote.<remote>.prefetch` is set to false, the remote
   will be ignored for prefetching.
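
The two behaviors above can be sketched in a few lines of C. This is a
minimal, standalone illustration and not Git's code: `struct remote_sketch`,
`parse_bool_sketch()`, `remote_sketch_init()`, `remote_sketch_config()` and
`should_prefetch()` are hypothetical names, and the boolean parser is a
simplified assumption that accepts fewer spellings than Git's own
`git_config_bool()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strcasecmp (POSIX) */

/* Hypothetical, simplified stand-in for Git's struct remote. */
struct remote_sketch {
	const char *name;
	int prefetch_enabled;	/* 1 = prefetch, 0 = skip */
};

/*
 * Simplified boolean parser (an assumption for this sketch).
 * A key with no value at all is treated as true.
 */
static int parse_bool_sketch(const char *value)
{
	if (!value)
		return 1;
	return !strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	       !strcasecmp(value, "on") || !strcmp(value, "1");
}

/* A freshly created remote defaults to prefetching (the "unset" case). */
static void remote_sketch_init(struct remote_sketch *r, const char *name)
{
	assert(r && name);
	r->name = name;
	r->prefetch_enabled = 1;
}

/* Apply one "remote.<name>.<subkey> = <value>" setting. */
static void remote_sketch_config(struct remote_sketch *r,
				 const char *subkey, const char *value)
{
	if (!strcmp(subkey, "prefetch"))
		r->prefetch_enabled = parse_bool_sketch(value);
}

/* Returns nonzero if the prefetch task should fetch from this remote. */
static int should_prefetch(const struct remote_sketch *r)
{
	return r->prefetch_enabled;
}
```

With one remote configured as "false", one as "true", and one left
untouched after initialization, `should_prefetch()` returns 0, 1, and 1
respectively, which mirrors the three cases covered by the new test.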

In a future change, we could also allow restricting the refs that are
prefetched per remote using the `prefetchref` config option per remote.

Both of these options in unison would allow users to optimize their
prefetch operations, reducing network traffic and disk usage.

Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
---
    remote: prefetch config

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1779%2Fpastelsky%2Fsk%2Fmaintenance-prefetch-remote-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1779/pastelsky/sk/maintenance-prefetch-remote-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1779

Range-diff vs v3:

 1:  c348f8efd33 ! 1:  80af121f835 remote: prefetch config
     @@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
       	if (remote->skip_default_update)
       		return 0;
       
     -+	if (!remote->prefetch)
     ++	if (!remote->prefetch_enabled)
      +		return 0;
      +
       	child.git_cmd = 1;
     @@ remote.c: static struct remote *make_remote(struct remote_state *remote_state,
       	CALLOC_ARRAY(ret, 1);
       	ret->prune = -1;  /* unspecified */
       	ret->prune_tags = -1;  /* unspecified */
     -+	ret->prefetch = 1;
     ++	ret->prefetch_enabled = 1;
       	ret->name = xstrndup(name, len);
       	refspec_init(&ret->push, REFSPEC_PUSH);
       	refspec_init(&ret->fetch, REFSPEC_FETCH);
     @@ remote.c: static int handle_config(const char *key, const char *value,
       	else if (!strcmp(subkey, "prunetags"))
       		remote->prune_tags = git_config_bool(key, value);
      +	else if (!strcmp(subkey, "prefetch"))
     -+		remote->prefetch = git_config_bool(key, value);
     ++		remote->prefetch_enabled = git_config_bool(key, value);
       	else if (!strcmp(subkey, "url")) {
       		if (!value)
       			return config_error_nonbool(key);
     @@ remote.h: struct remote {
       
       	struct refspec fetch;
       
     -+	int prefetch;
     ++	int prefetch_enabled;
      +
       	/*
       	 * The setting for whether to fetch tags (as a separate rule from the


 Documentation/config/remote.txt   |  5 ++++
 Documentation/git-maintenance.txt |  7 +++---
 builtin/gc.c                      |  3 +++
 remote.c                          |  3 +++
 remote.h                          |  2 ++
 t/t7900-maintenance.sh            | 42 +++++++++++++++++++++++++++++++
 6 files changed, 59 insertions(+), 3 deletions(-)


base-commit: 2e7b89e038c0c888acf61f1b4ee5a43d4dd5e94c

Comments

Junio C Hamano Sept. 5, 2024, 8:57 p.m. UTC | #1
"Shubham Kanodia via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Shubham Kanodia <shubham.kanodia10@gmail.com>
> ...
> In a future change, we could also allow restricting the refs that are
> prefetched per remote using the `prefetchref` config option per remote.
>
> Both of these options in unison would allow users to optimize their
> prefetch operations, reducing network traffic and disk usage.
>
> Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
> ---

Looking good.  Thanks.

Shubham Kanodia Sept. 6, 2024, 9:42 a.m. UTC | #2
On Fri, Sep 6, 2024 at 2:28 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Shubham Kanodia via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Shubham Kanodia <shubham.kanodia10@gmail.com>
> > ...
> > In a future change, we could also allow restricting the refs that are
> > prefetched per remote using the `prefetchref` config option per remote.
> >
> > Both of these options in unison would allow users to optimize their
> > prefetch operations, reducing network traffic and disk usage.
> >
> > Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
> > ---
>
> Looking good.  Thanks.

How long do changes like this typically remain in "seen" before they
are merged upstream?
I'm preparing part 2 of this change separately, so it would be good to
know when I can submit that.

Patch

diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
index 8efc53e836d..c2b3876192c 100644
--- a/Documentation/config/remote.txt
+++ b/Documentation/config/remote.txt
@@ -33,6 +33,11 @@  remote.<name>.fetch::
 	The default set of "refspec" for linkgit:git-fetch[1]. See
 	linkgit:git-fetch[1].
 
+remote.<name>.prefetch::
+	If false, refs from the remote are not fetched by the
+	`prefetch` task in linkgit:git-maintenance[1]. If not set,
+	the value is assumed to be true.
+
 remote.<name>.push::
 	The default set of "refspec" for linkgit:git-push[1]. See
 	linkgit:git-push[1].
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 51d0f7e94b6..2fd38706ea2 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -97,9 +97,10 @@  commit-graph::
 
 prefetch::
 	The `prefetch` task updates the object directory with the latest
-	objects from all registered remotes. For each remote, a `git fetch`
-	command is run. The configured refspec is modified to place all
-	requested refs within `refs/prefetch/`. Also, tags are not updated.
+	objects from all registered remotes unless they've disabled prefetch
+	using `remote.<remote>.prefetch` set to `false`. For each such remote,
+	a `git fetch` command is run. The configured refspec is modified to place
+	all requested refs within `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  However,
diff --git a/builtin/gc.c b/builtin/gc.c
index 427faf1cfe1..8da78290929 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1027,6 +1027,9 @@  static int fetch_remote(struct remote *remote, void *cbdata)
 	if (remote->skip_default_update)
 		return 0;
 
+	if (!remote->prefetch_enabled)
+		return 0;
+
 	child.git_cmd = 1;
 	strvec_pushl(&child.args, "fetch", remote->name,
 		     "--prefetch", "--prune", "--no-tags",
diff --git a/remote.c b/remote.c
index 8f3dee13186..fc6eee21408 100644
--- a/remote.c
+++ b/remote.c
@@ -140,6 +140,7 @@  static struct remote *make_remote(struct remote_state *remote_state,
 	CALLOC_ARRAY(ret, 1);
 	ret->prune = -1;  /* unspecified */
 	ret->prune_tags = -1;  /* unspecified */
+	ret->prefetch_enabled = 1;
 	ret->name = xstrndup(name, len);
 	refspec_init(&ret->push, REFSPEC_PUSH);
 	refspec_init(&ret->fetch, REFSPEC_FETCH);
@@ -456,6 +457,8 @@  static int handle_config(const char *key, const char *value,
 		remote->prune = git_config_bool(key, value);
 	else if (!strcmp(subkey, "prunetags"))
 		remote->prune_tags = git_config_bool(key, value);
+	else if (!strcmp(subkey, "prefetch"))
+		remote->prefetch_enabled = git_config_bool(key, value);
 	else if (!strcmp(subkey, "url")) {
 		if (!value)
 			return config_error_nonbool(key);
diff --git a/remote.h b/remote.h
index b901b56746d..c448e5e6f9d 100644
--- a/remote.h
+++ b/remote.h
@@ -77,6 +77,8 @@  struct remote {
 
 	struct refspec fetch;
 
+	int prefetch_enabled;
+
 	/*
 	 * The setting for whether to fetch tags (as a separate rule from the
 	 * configured refspecs);
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index abae7a97546..7484e1f1d46 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -245,6 +245,48 @@  test_expect_success 'prefetch multiple remotes' '
 	test_subcommand git fetch remote2 $fetchargs <skip-remote1.txt
 '
 
+test_expect_success 'prefetch respects remote.*.prefetch config' '
+	test_create_repo prefetch-test-config &&
+	(
+		cd prefetch-test-config &&
+		test_commit initial &&
+		test_create_repo clone1 &&
+		test_create_repo clone2 &&
+		test_create_repo clone3 &&
+
+		git remote add remote1 "file://$(pwd)/clone1" &&
+		git remote add remote2 "file://$(pwd)/clone2" &&
+		git remote add remote3 "file://$(pwd)/clone3" &&
+
+		git config remote.remote1.prefetch false &&
+		git config remote.remote2.prefetch true &&
+		# remote3 is left unset
+
+		# Make changes in all clones
+		git -C clone1 switch -c one &&
+		git -C clone2 switch -c two &&
+		git -C clone3 switch -c three &&
+		test_commit -C clone1 one &&
+		test_commit -C clone2 two &&
+		test_commit -C clone3 three &&
+
+		# Run maintenance prefetch task
+		GIT_TRACE2_EVENT="$(pwd)/prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
+
+		# Check that the remotes were prefetched as configured
+		fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
+		test_subcommand ! git fetch remote1 $fetchargs <prefetch.txt &&
+		test_subcommand git fetch remote2 $fetchargs <prefetch.txt &&
+		test_subcommand git fetch remote3 $fetchargs <prefetch.txt &&
+
+		# Verify that changes are in the prefetch refs for remote2 and remote3, but not remote1
+		test_must_fail git rev-parse refs/prefetch/remotes/remote1/one &&
+		git fetch --all &&
+		test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
+		test_cmp_rev refs/remotes/remote3/three refs/prefetch/remotes/remote3/three
+	)
+'
+
 test_expect_success 'loose-objects task' '
 	# Repack everything so we know the state of the object dir
 	git repack -adk &&