From patchwork Wed Dec 21 04:04:44 2022
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13078413
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Christian Couder
Subject: [PATCH v4 1/3] pack-objects: allow --filter without --stdout
Date: Wed, 21 Dec 2022 05:04:44 +0100
Message-Id: <20221221040446.2860985-2-christian.couder@gmail.com>
In-Reply-To: <20221221040446.2860985-1-christian.couder@gmail.com>
References: <20221122175150.366828-1-christian.couder@gmail.com> <20221221040446.2860985-1-christian.couder@gmail.com>

From: Christian Couder

9535ce7337 (pack-objects: add list-objects filtering, 2017-11-21)
taught pack-objects to use --filter, but required the use of
--stdout since a partial clone mechanism was not yet in place to
handle missing objects.
Since then, changes like 9e27beaa23 (promisor-remote: implement
promisor_remote_get_direct(), 2019-06-25) and others added support to
dynamically fetch objects that were missing.

Remove the --stdout requirement so that in a follow-up commit, repack
can pass --filter to pack-objects to omit certain objects from the
resulting packfile.

Signed-off-by: John Cai
Signed-off-by: Christian Couder
---
 builtin/pack-objects.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 2193f80b89..aa0b13d015 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4371,12 +4371,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (!rev_list_all || !rev_list_reflog || !rev_list_index)
 		unpack_unreachable_expiration = 0;
 
-	if (filter_options.choice) {
-		if (!pack_to_stdout)
-			die(_("cannot use --filter without --stdout"));
-		if (stdin_packs)
-			die(_("cannot use --filter with --stdin-packs"));
-	}
+	if (stdin_packs && filter_options.choice)
+		die(_("cannot use --filter with --stdin-packs"));
 
 	if (stdin_packs && use_internal_rev_list)
 		die(_("cannot use internal rev list with --stdin-packs"));

From patchwork Wed Dec 21 04:04:45 2022
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13078414
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Christian Couder
Subject: [PATCH v4 2/3] repack: add --filter= option
Date: Wed, 21 Dec 2022 05:04:45 +0100
Message-Id: <20221221040446.2860985-3-christian.couder@gmail.com>
In-Reply-To: <20221221040446.2860985-1-christian.couder@gmail.com>
References: <20221122175150.366828-1-christian.couder@gmail.com> <20221221040446.2860985-1-christian.couder@gmail.com>

From: Christian Couder

After cloning with --filter=<filter-spec>, for example to avoid
getting unneeded large files on a user machine, it's possible that
some of these large files still get fetched over time for some reason
(like checking out old branches).

In this case the repo size could grow too much for no good reason, and
`git repack --filter=<filter-spec>` would be useful to remove the
unneeded large files.

This command could be dangerous to use though, as it might remove
local objects that haven't been pushed, which would lose data and
corrupt the repo. On a server, this command could also corrupt a repo
unless ALL the removed objects are already available in another
remote that clients can access.

To mitigate that risk, we check that a promisor remote has at least
been configured.
Signed-off-by: John Cai
Signed-off-by: Christian Couder
---
 Documentation/git-repack.txt |  8 ++++++++
 builtin/repack.c             | 28 +++++++++++++++++++++-------
 t/t7700-repack.sh            | 15 +++++++++++++++
 3 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index 4017157949..2539ee0a02 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -143,6 +143,14 @@ depth is 4095.
 	a larger and slower repository; see the discussion in
 	`pack.packSizeLimit`.
 
+--filter=<filter-spec>::
+	Omits certain objects (usually blobs) from the resulting
+	packfile. WARNING: this could easily corrupt the current repo
+	and lose data if ANY of the omitted objects hasn't been already
+	pushed to a remote. Be very careful about objects that might
+	have been created locally! See linkgit:git-rev-list[1] for valid
+	`<filter-spec>` forms.
+
 -b::
 --write-bitmap-index::
 	Write a reachability bitmap index as part of the repack. This
diff --git a/builtin/repack.c b/builtin/repack.c
index c1402ad038..8e5ac9c171 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -49,6 +49,7 @@ struct pack_objects_args {
 	const char *depth;
 	const char *threads;
 	const char *max_pack_size;
+	const char *filter;
 	int no_reuse_delta;
 	int no_reuse_object;
 	int quiet;
@@ -163,6 +164,8 @@ static void prepare_pack_objects(struct child_process *cmd,
 		strvec_pushf(&cmd->args, "--threads=%s", args->threads);
 	if (args->max_pack_size)
 		strvec_pushf(&cmd->args, "--max-pack-size=%s", args->max_pack_size);
+	if (args->filter)
+		strvec_pushf(&cmd->args, "--filter=%s", args->filter);
 	if (args->no_reuse_delta)
 		strvec_pushf(&cmd->args, "--no-reuse-delta");
 	if (args->no_reuse_object)
@@ -234,6 +237,13 @@ static struct generated_pack_data *populate_pack_exts(const char *name)
 	return data;
 }
 
+static void write_promisor_file_1(char *p)
+{
+	char *promisor_name = mkpathdup("%s-%s.promisor", packtmp, p);
+	write_promisor_file(promisor_name, NULL, 0);
+	free(promisor_name);
+}
+
 static void repack_promisor_objects(const struct pack_objects_args *args,
 				    struct string_list *names)
 {
@@ -265,7 +275,6 @@ static void repack_promisor_objects(const struct pack_objects_args *args,
 	out = xfdopen(cmd.out, "r");
 	while (strbuf_getline_lf(&line, out) != EOF) {
 		struct string_list_item *item;
-		char *promisor_name;
 
 		if (line.len != the_hash_algo->hexsz)
 			die(_("repack: Expecting full hex object ID lines only from pack-objects."));
@@ -282,13 +291,8 @@ static void repack_promisor_objects(const struct pack_objects_args *args,
 		 * concatenate the contents of all .promisor files instead of
 		 * just creating a new empty file.
 		 */
-		promisor_name = mkpathdup("%s-%s.promisor", packtmp,
-					  line.buf);
-		write_promisor_file(promisor_name, NULL, 0);
-
+		write_promisor_file_1(line.buf);
 		item->util = populate_pack_exts(item->string);
-
-		free(promisor_name);
 	}
 	fclose(out);
 	if (finish_command(&cmd))
@@ -800,6 +804,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				N_("limits the maximum number of threads")),
 		OPT_STRING(0, "max-pack-size", &po_args.max_pack_size, N_("bytes"),
 				N_("maximum size of each packfile")),
+		OPT_STRING(0, "filter", &po_args.filter, N_("args"),
+				N_("object filtering")),
 		OPT_BOOL(0, "pack-kept-objects", &pack_kept_objects,
 				N_("repack objects in packs marked with .keep")),
 		OPT_STRING_LIST(0, "keep-pack", &keep_pack_list, N_("name"),
@@ -834,6 +840,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 			die(_("options '%s' and '%s' cannot be used together"), "--cruft", "-k");
 	}
 
+	if (po_args.filter && !has_promisor_remote())
+		die("a promisor remote must be setup\n"
+		    "Also please push all the objects "
+		    "that might be filtered to that remote!\n"
+		    "Otherwise they will be lost!");
+
 	if (write_bitmaps < 0) {
 		if (!write_midx &&
 		    (!(pack_everything & ALL_INTO_ONE) || !is_bare_repository()))
@@ -971,6 +983,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 			if (line.len != the_hash_algo->hexsz)
 				die(_("repack: Expecting full hex object ID lines only from pack-objects."));
 			item = string_list_append(&names, line.buf);
+			if (po_args.filter)
+				write_promisor_file_1(line.buf);
 			item->util = populate_pack_exts(item->string);
 		}
 		strbuf_release(&line);
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 4aabe98139..3a6ad9f623 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -253,6 +253,21 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'repacking with a filter works' '
+	test_when_finished "rm -rf server client" &&
+	test_create_repo server &&
+	git -C server config uploadpack.allowFilter true &&
+	git -C server config uploadpack.allowAnySHA1InWant true &&
+	test_commit -C server 1 &&
+	git clone --bare --no-local server client &&
+	git -C client config remote.origin.promisor true &&
+	git -C client rev-list --objects --all --missing=print >objects &&
+	test $(grep -c "^?" objects) = 0 &&
+	git -C client -c repack.writebitmaps=false repack -a -d --filter=blob:none &&
+	git -C client rev-list --objects --all --missing=print >objects &&
+	test $(grep -c "^?" objects) = 1
+'
+
 objdir=.git/objects
 midx=$objdir/pack/multi-pack-index

From patchwork Wed Dec 21 04:04:46 2022
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13078415
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Christian Couder, Christian Couder
Subject: [PATCH v4 3/3] gc: add gc.repackFilter config option
Date: Wed, 21 Dec 2022 05:04:46 +0100
Message-Id: <20221221040446.2860985-4-christian.couder@gmail.com>
In-Reply-To: <20221221040446.2860985-1-christian.couder@gmail.com>
References: <20221122175150.366828-1-christian.couder@gmail.com> <20221221040446.2860985-1-christian.couder@gmail.com>

A previous commit has implemented `git repack --filter=<filter-spec>` to
allow users to remove objects that are available on a promisor remote
but that they don't want locally.

Users might want to perform such a cleanup regularly, at the same time
as they perform other repacks and cleanups, that is, as part of
`git gc`.

Let's allow them to configure a <filter-spec> for that purpose using a
new gc.repackFilter config option.

Now when `git gc` performs a repack and a non-empty <filter-spec> is
configured through this option, the repack process is passed a
corresponding `--filter=<filter-spec>` argument.

Signed-off-by: Christian Couder
---
 Documentation/config/gc.txt |  9 +++++++++
 builtin/gc.c                |  6 ++++++
 t/t6500-gc.sh               | 19 +++++++++++++++++++
 3 files changed, 34 insertions(+)

diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index 38fea076a2..9359136f14 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -130,6 +130,15 @@ or rebase occurring. Since these changes are not part of the current
 project most users will want to expire them sooner, which is why the
 default is more aggressive than `gc.reflogExpire`.
 
+gc.repackFilter::
+	When repacking, use the specified filter to omit certain
+	objects from the resulting packfile. WARNING: this could
+	easily corrupt the current repo and lose data if ANY of the
+	omitted objects hasn't been already pushed to a remote. Be
+	very careful about objects that might have been created
+	locally! See the `--filter=<filter-spec>` option of
+	linkgit:git-repack[1].
+
 gc.rerereResolved::
 	Records of conflicted merge you resolved earlier are
 	kept for this many days when 'git rerere gc' is run.
diff --git a/builtin/gc.c b/builtin/gc.c
index 02455fdcd7..bf28619723 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -52,6 +52,7 @@ static timestamp_t gc_log_expire_time;
 static const char *gc_log_expire = "1.day.ago";
 static const char *prune_expire = "2.weeks.ago";
 static const char *prune_worktrees_expire = "3.months.ago";
+static char *repack_filter;
 static unsigned long big_pack_threshold;
 static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
@@ -161,6 +162,8 @@ static void gc_config(void)
 	git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold);
 	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
 
+	git_config_get_string("gc.repackfilter", &repack_filter);
+
 	git_config(git_default_config, NULL);
 }
 
@@ -346,6 +349,9 @@ static void add_repack_all_option(struct string_list *keep_pack)
 
 	if (keep_pack)
 		for_each_string_list(keep_pack, keep_one_pack, NULL);
+
+	if (repack_filter && *repack_filter)
+		strvec_pushf(&repack, "--filter=%s", repack_filter);
 }
 
 static void add_repack_incremental_option(void)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index d9acb63951..b1492b521a 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -56,6 +56,7 @@ test_expect_success 'gc -h with invalid configuration' '
 '
 
 test_expect_success 'gc is not aborted due to a stale symref' '
+	test_when_finished "rm -rf remote client" &&
 	git init remote &&
 	(
 		cd remote &&
@@ -202,6 +203,24 @@ test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "e
 	grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
 '
 
+test_expect_success 'gc.repackFilter launches repack with a filter' '
+	test_when_finished "rm -rf server client" &&
+	test_create_repo server &&
+	git -C server config uploadpack.allowFilter true &&
+	git -C server config uploadpack.allowAnySHA1InWant true &&
+	test_commit -C server 1 &&
+	git clone --bare --no-local server client &&
+	git -C client config remote.origin.promisor true &&
+	git -C client rev-list --objects --all --missing=print >objects &&
+	test $(grep -c "^?" objects) = 0 &&
+
+	GIT_TRACE=$(pwd)/trace.out git -C client -c gc.repackFilter=blob:none -c repack.writeBitmaps=false -c gc.pruneExpire=now gc &&
+
+	grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out &&
+	git -C client rev-list --objects --all --missing=print >objects &&
+	test $(grep -c "^?" objects) = 1
+'
+
 prepare_cruft_history () {
 	test_commit base &&