builtin/repack.c: prune unreachable objects with `--expire-to`

Message ID 48438876fb42a889110e100a6c42ca84e93aac49.1733011259.git.me@ttaylorr.com (mailing list archive)
State New

Commit Message

Taylor Blau Dec. 1, 2024, 12:01 a.m. UTC
When invoked with '--expire-to', 'git repack' will move unreachable
objects beyond the grace period to a separate repository outside of the
main object store.

Later on, 'git repack' will remove any existing packs which were made
redundant by the 'repack' operation, before then pruning loose objects
which were packed. Ordinarily, unreachable objects which have expired
were already packed via some earlier 'repack' operation, and so are
removed from the main repository in the first step.

But if a repository has unreachable objects which:

  - have an mtime older than the --cruft-expiration cutoff,
  - are loose, and
  - have never been packed

then we'll create a pack containing those objects to store in the
repository specified by the '--expire-to' option, but never prune the
loose copies of those objects from the main repository. That's because
no pack in the main repository contains those objects, so
prune_packed_objects() skips over them.

(As an aside, this can be quite painful for repositories that have a
large number of unreachable objects which were never packed and are old
enough to be expired: we expect the repack to prune those objects, but
per the above it leaves their loose copies behind.)

Teach repack to add the repository specified by '--expire-to' as an
alternate of the main object store so that 'prune_packed_objects()' can
"see" the packed copy of those objects, and remove them appropriately.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/repack.c        | 15 +++++++++++++++
 t/t7704-repack-cruft.sh | 12 ++++++++++++
 2 files changed, 27 insertions(+)


base-commit: cc01bad4a9f566cf4453c7edd6b433851b0835e2
Patch

diff --git a/builtin/repack.c b/builtin/repack.c
index d6bb37e84ae..57cab72dcf5 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -1553,6 +1553,21 @@  int cmd_repack(int argc,
 							&existing);
 		if (show_progress)
 			opts |= PRUNE_PACKED_VERBOSE;
+
+		if (expire_to && *expire_to) {
+			char *alt = dirname(xstrdup(expire_to));
+			size_t len = strlen(alt);
+
+			if (strip_suffix(alt, "pack", &len) &&
+			    is_dir_sep(alt[len - 1])) {
+				alt[len - 1] = '\0';
+
+				add_to_alternates_memory(alt);
+				reprepare_packed_git(the_repository);
+			}
+
+			free(alt);
+		}
 		prune_packed_objects(opts);
 
 		if (!keep_unreachable &&
diff --git a/t/t7704-repack-cruft.sh b/t/t7704-repack-cruft.sh
index 5db9f4e10f7..ee1ffcdae3c 100755
--- a/t/t7704-repack-cruft.sh
+++ b/t/t7704-repack-cruft.sh
@@ -30,6 +30,12 @@  test_expect_success '--expire-to stores pruned objects (now)' '
 		git branch -D cruft &&
 		git reflog expire --all --expire=all &&
 
+		for obj in $(cat moved.want)
+		do
+			path="$objdir/$(test_oid_to_path $obj)" &&
+			test_path_is_file "$path" || return 1
+		done &&
+
 		git init --bare expired.git &&
 		git repack -d \
 			--cruft --cruft-expiration="now" \
@@ -38,6 +44,12 @@  test_expect_success '--expire-to stores pruned objects (now)' '
 		expired="$(ls expired.git/objects/pack/pack-*.idx)" &&
 		test_path_is_file "${expired%.idx}.mtimes" &&
 
+		for obj in $(cat moved.want)
+		do
+			path="$objdir/$(test_oid_to_path $obj)" &&
+			test_path_is_missing "$path" || return 1
+		done &&
+
 		# Since the `--cruft-expiration` is "now", the effective
 		# behavior is to move _all_ unreachable objects out to
 		# the location in `--expire-to`.