diff mbox series

[1/2] maintenance: add pack-refs task

Message ID 33b7a74af4eb45756c3648eb7b002d06e4ec3563.1612795943.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Maintenance: add pack-refs task | expand

Commit Message

Derrick Stolee Feb. 8, 2021, 2:52 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

It is valuable to collect loose refs into a more compressed form. This
is typically the packed-refs file, although this could be the reftable
in the future. Having packed refs can be extremely valuable in repos
with many tags or remote branches that are not modified by the local
user, but still are necessary for other queries.

For instance, with many exploded refs, commands such as

	git describe --tags --exact-match HEAD

can be very slow (multiple seconds). This command in particular is used
by terminal prompts to show when a detatched HEAD is pointing to an
existing tag, so having it be slow causes significant delays for users.

Add a new 'pack-refs' maintenance task. This is already a sub-step of
the 'gc' task, but users could run this at other intervals if they are
interested. Also, if users opt-in to the default background maintenance
schedule, then the 'gc' task is disabled.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  6 ++++++
 builtin/gc.c                      | 21 +++++++++++++++++----
 t/t7900-maintenance.sh            | 10 ++++++++++
 3 files changed, 33 insertions(+), 4 deletions(-)

Comments

Taylor Blau Feb. 8, 2021, 10:53 p.m. UTC | #1
On Mon, Feb 08, 2021 at 02:52:22PM +0000, Derrick Stolee via GitGitGadget wrote:
> +pack-refs::
> +	The `pack-refs` task collects the loose reference files and
> +	collects them into a single file. This speeds up operations that
> +	need to iterate across many refereences. See linkgit:git-pack-refs[1]
> +	for more information.
> +

Do you think it's worth documenting here or in git-gc(1) that running
this has the effect of disabling the same step during gc?

>  OPTIONS
>  -------
>  --auto::
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 4c40594d660e..8f13c3bb8607 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -54,7 +54,6 @@ static const char *prune_worktrees_expire = "3.months.ago";
>  static unsigned long big_pack_threshold;
>  static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
>
> -static struct strvec pack_refs_cmd = STRVEC_INIT;
>  static struct strvec reflog = STRVEC_INIT;
>  static struct strvec repack = STRVEC_INIT;
>  static struct strvec prune = STRVEC_INIT;
> @@ -163,6 +162,15 @@ static void gc_config(void)
>  	git_config(git_default_config, NULL);
>  }
>
> +struct maintenance_run_opts;
> +static int maintenance_task_pack_refs(struct maintenance_run_opts *opts)

It may be worth calling this "unused", since you explicitly pass NULL
below.

> +{
> +	struct strvec pack_refs_cmd = STRVEC_INIT;
> +	strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL);
> +
> +	return run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD);
> +}
> +
>  static int too_many_loose_objects(void)
>  {
>  	/*
> @@ -518,8 +526,8 @@ static void gc_before_repack(void)
>  	if (done++)
>  		return;
>
> -	if (pack_refs && run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD))
> -		die(FAILED_RUN, pack_refs_cmd.v[0]);
> +	if (pack_refs && maintenance_task_pack_refs(NULL))
> +		die(FAILED_RUN, "pack-refs");

Hmm. Am I missing where opting into the maintenance step disables this
in gc? You suggest that it does in the commit message, but I can't seem
to see it here.

> +test_expect_success 'pack-refs task' '
> +	for n in $(test_seq 1 5)
> +	do
> +		git branch -f to-pack/$n HEAD || return 1
> +	done &&
> +	git maintenance run --task=pack-refs &&
> +	ls .git/refs/heads/ >after &&
> +	test_must_be_empty after

Worth testing that $GIT_DIR/packed-refs exists, too?

Thanks,
Taylor
Eric Sunshine Feb. 8, 2021, 11:06 p.m. UTC | #2
On Mon, Feb 8, 2021 at 9:52 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> +pack-refs::
> +       The `pack-refs` task collects the loose reference files and
> +       collects them into a single file. This speeds up operations that
> +       need to iterate across many refereences. See linkgit:git-pack-refs[1]
> +       for more information.

s/refereences/references/
Derrick Stolee Feb. 9, 2021, 12:42 p.m. UTC | #3
On 2/8/2021 5:53 PM, Taylor Blau wrote:
> On Mon, Feb 08, 2021 at 02:52:22PM +0000, Derrick Stolee via GitGitGadget wrote:
>> +pack-refs::
>> +	The `pack-refs` task collects the loose reference files and
>> +	collects them into a single file. This speeds up operations that
>> +	need to iterate across many refereences. See linkgit:git-pack-refs[1]
>> +	for more information.
>> +
> 
> Do you think it's worth documenting here or in git-gc(1) that running
> this has the effect of disabling the same step during gc?

It doesn't disable the step during gc.

Perhaps I should use a better term than "extract". The 'gc' task shouldn't
change as we make some of its steps also available as independent tasks.

Instead, users can select a subset of its steps by enabling them directly.
Other such tasks could include:

 * prune-reflog
 * full-repack (as opposed to the existing incremental-repack task)

>> +struct maintenance_run_opts;
>> +static int maintenance_task_pack_refs(struct maintenance_run_opts *opts)
> 
> It may be worth calling this "unused", since you explicitly pass NULL
> below.

Good idea.

>> +	if (pack_refs && maintenance_task_pack_refs(NULL))
>> +		die(FAILED_RUN, "pack-refs");
> 
> Hmm. Am I missing where opting into the maintenance step disables this
> in gc? You suggest that it does in the commit message, but I can't seem
> to see it here.

My commit message suggests wrong.
 
>> +test_expect_success 'pack-refs task' '
>> +	for n in $(test_seq 1 5)
>> +	do
>> +		git branch -f to-pack/$n HEAD || return 1
>> +	done &&
>> +	git maintenance run --task=pack-refs &&
>> +	ls .git/refs/heads/ >after &&
>> +	test_must_be_empty after
> 
> Worth testing that $GIT_DIR/packed-refs exists, too?

That would start breaking if we change the backing store, right? I
want to be sure this doesn't create test breakages with reftable.

I _could_ add a 'test_subcommand' check here, though.

Thanks,
-Stolee
Derrick Stolee Feb. 9, 2021, 12:42 p.m. UTC | #4
On 2/8/2021 6:06 PM, Eric Sunshine wrote:
> On Mon, Feb 8, 2021 at 9:52 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> +pack-refs::
>> +       The `pack-refs` task collects the loose reference files and
>> +       collects them into a single file. This speeds up operations that
>> +       need to iterate across many refereences. See linkgit:git-pack-refs[1]
>> +       for more information.
> 
> s/refereences/references/
 
Thanks!
-Stolee
diff mbox series

Patch

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 3b432171d608..8cc0225b3304 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -145,6 +145,12 @@  incremental-repack::
 	which is a special case that attempts to repack all pack-files
 	into a single pack-file.
 
+pack-refs::
+	The `pack-refs` task collects the loose reference files and
+	collects them into a single file. This speeds up operations that
+	need to iterate across many refereences. See linkgit:git-pack-refs[1]
+	for more information.
+
 OPTIONS
 -------
 --auto::
diff --git a/builtin/gc.c b/builtin/gc.c
index 4c40594d660e..8f13c3bb8607 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -54,7 +54,6 @@  static const char *prune_worktrees_expire = "3.months.ago";
 static unsigned long big_pack_threshold;
 static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
-static struct strvec pack_refs_cmd = STRVEC_INIT;
 static struct strvec reflog = STRVEC_INIT;
 static struct strvec repack = STRVEC_INIT;
 static struct strvec prune = STRVEC_INIT;
@@ -163,6 +162,15 @@  static void gc_config(void)
 	git_config(git_default_config, NULL);
 }
 
+struct maintenance_run_opts;
+static int maintenance_task_pack_refs(struct maintenance_run_opts *opts)
+{
+	struct strvec pack_refs_cmd = STRVEC_INIT;
+	strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL);
+
+	return run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD);
+}
+
 static int too_many_loose_objects(void)
 {
 	/*
@@ -518,8 +526,8 @@  static void gc_before_repack(void)
 	if (done++)
 		return;
 
-	if (pack_refs && run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD))
-		die(FAILED_RUN, pack_refs_cmd.v[0]);
+	if (pack_refs && maintenance_task_pack_refs(NULL))
+		die(FAILED_RUN, "pack-refs");
 
 	if (prune_reflogs && run_command_v_opt(reflog.v, RUN_GIT_CMD))
 		die(FAILED_RUN, reflog.v[0]);
@@ -556,7 +564,6 @@  int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_gc_usage, builtin_gc_options);
 
-	strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL);
 	strvec_pushl(&reflog, "reflog", "expire", "--all", NULL);
 	strvec_pushl(&repack, "repack", "-d", "-l", NULL);
 	strvec_pushl(&prune, "prune", "--expire", NULL);
@@ -1224,6 +1231,7 @@  enum maintenance_task_label {
 	TASK_INCREMENTAL_REPACK,
 	TASK_GC,
 	TASK_COMMIT_GRAPH,
+	TASK_PACK_REFS,
 
 	/* Leave as final value */
 	TASK__COUNT
@@ -1255,6 +1263,11 @@  static struct maintenance_task tasks[] = {
 		maintenance_task_commit_graph,
 		should_write_commit_graph,
 	},
+	[TASK_PACK_REFS] = {
+		"pack-refs",
+		maintenance_task_pack_refs,
+		NULL,
+	},
 };
 
 static int compare_tasks_by_selection(const void *a_, const void *b_)
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 78ccf4b33f87..ef4527823d29 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -343,6 +343,16 @@  test_expect_success 'maintenance.incremental-repack.auto' '
 	test_subcommand git multi-pack-index write --no-progress <trace-B
 '
 
+test_expect_success 'pack-refs task' '
+	for n in $(test_seq 1 5)
+	do
+		git branch -f to-pack/$n HEAD || return 1
+	done &&
+	git maintenance run --task=pack-refs &&
+	ls .git/refs/heads/ >after &&
+	test_must_be_empty after
+'
+
 test_expect_success '--auto and --schedule incompatible' '
 	test_must_fail git maintenance run --auto --schedule=daily 2>err &&
 	test_i18ngrep "at most one" err