From patchwork Mon Oct 24 18:43:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13018104
Date: Mon, 24 Oct 2022 14:43:03 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Derrick Stolee, Jeff King, Jonathan Tan, Junio C Hamano, Victoria Dye, Ævar Arnfjörð Bjarmason
Subject: [PATCH 1/4] builtin/repack.c: pass "out" to `prepare_pack_objects`
Message-ID: <1dd4136f6199ac050cec5eb671c36ae05fbf3bdd.1666636974.git.me@ttaylorr.com>
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

`builtin/repack.c`'s `prepare_pack_objects()` is used to prepare a set of
arguments to a `pack-objects` process which will generate a desired pack.

A future patch will add an `--expire-to` option which allows `git repack`
to write a cruft pack containing the pruned objects out to a separate
repository.

Prepare for this by teaching that function to write packs to an arbitrary
location specified by the caller. All existing callers of
`prepare_pack_objects()` will pass `packtmp` for `out`, retaining the
existing behavior.

Signed-off-by: Taylor Blau
---
 builtin/repack.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index a5bacc7797..0a7bd57636 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -188,7 +188,8 @@ static void remove_redundant_pack(const char *dir_name, const char *base_name)
 }
 
 static void prepare_pack_objects(struct child_process *cmd,
-				 const struct pack_objects_args *args)
+				 const struct pack_objects_args *args,
+				 const char *out)
 {
 	strvec_push(&cmd->args, "pack-objects");
 	if (args->window)
@@ -211,7 +212,7 @@ static void prepare_pack_objects(struct child_process *cmd,
 		strvec_push(&cmd->args, "--quiet");
 	if (delta_base_offset)
 		strvec_push(&cmd->args, "--delta-base-offset");
-	strvec_push(&cmd->args, packtmp);
+	strvec_push(&cmd->args, out);
 	cmd->git_cmd = 1;
 	cmd->out = -1;
 }
@@ -275,7 +276,7 @@ static void repack_promisor_objects(const struct pack_objects_args *args,
 	FILE *out;
 	struct strbuf line = STRBUF_INIT;
 
-	prepare_pack_objects(&cmd, args);
+	prepare_pack_objects(&cmd, args, packtmp);
 	cmd.in = -1;
 
 	/*
@@ -673,7 +674,7 @@ static int write_cruft_pack(const struct pack_objects_args *args,
 	FILE *in, *out;
 	int ret;
 
-	prepare_pack_objects(&cmd, args);
+	prepare_pack_objects(&cmd, args, packtmp);
 
 	strvec_push(&cmd.args, "--cruft");
 	if (cruft_expiration)
@@ -861,7 +862,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 
 	sigchain_push_common(remove_pack_on_signal);
 
-	prepare_pack_objects(&cmd, &po_args);
+	prepare_pack_objects(&cmd, &po_args, packtmp);
 
 	show_progress = !po_args.quiet && isatty(2);
 

From patchwork Mon Oct 24 18:43:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13018105
Date: Mon, 24 Oct 2022 14:43:06 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Derrick Stolee, Jeff King, Jonathan Tan, Junio C Hamano, Victoria Dye, Ævar Arnfjörð Bjarmason
Subject: [PATCH 2/4] builtin/repack.c: pass "cruft_expiration" to `write_cruft_pack`
Message-ID: <7d731d8dd5ebe0570a5dd8a88b3dd3104a79592a.1666636974.git.me@ttaylorr.com>
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

`builtin/repack.c`'s `write_cruft_pack()` is used to generate the cruft
pack when `--cruft` is supplied. It uses a static variable
"cruft_expiration" which is filled in by option parsing.

A future patch will add an `--expire-to` option which allows `git repack`
to write a cruft pack containing the pruned objects out to a separate
repository. In order to implement this functionality, some callers will
have to pass a value for `cruft_expiration` different than the one filled
out by option parsing.

Prepare for this by teaching `write_cruft_pack` to take a
"cruft_expiration" parameter, instead of reading a single static variable.

The (sole) existing caller of `write_cruft_pack()` will pass the value for
"cruft_expiration" filled in by option parsing, retaining existing
behavior. This means that we can make the variable local to
`cmd_repack()`, and eliminate the static declaration.

Signed-off-by: Taylor Blau
---
 builtin/repack.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 0a7bd57636..1184e8c257 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -32,7 +32,6 @@ static int write_bitmaps = -1;
 static int use_delta_islands;
 static int run_update_server_info = 1;
 static char *packdir, *packtmp_name, *packtmp;
-static char *cruft_expiration;
 
 static const char *const git_repack_usage[] = {
 	N_("git repack []"),
@@ -664,6 +663,7 @@ static int write_midx_included_packs(struct string_list *include,
 
 static int write_cruft_pack(const struct pack_objects_args *args,
 			    const char *pack_prefix,
+			    const char *cruft_expiration,
 			    struct string_list *names,
 			    struct string_list *existing_packs,
 			    struct string_list *existing_kept_packs)
@@ -746,6 +746,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	struct pack_objects_args cruft_po_args = {NULL};
 	int geometric_factor = 0;
 	int write_midx = 0;
+	const char *cruft_expiration = NULL;
 
 	struct option builtin_repack_options[] = {
 		OPT_BIT('a', NULL, &pack_everything,
@@ -985,7 +986,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		cruft_po_args.local = po_args.local;
 		cruft_po_args.quiet = po_args.quiet;
 
-		ret = write_cruft_pack(&cruft_po_args, pack_prefix, &names,
+		ret = write_cruft_pack(&cruft_po_args, pack_prefix,
+				       cruft_expiration, &names,
 				       &existing_nonkept_packs,
 				       &existing_kept_packs);
 		if (ret)

From patchwork Mon Oct 24 18:43:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13018107
Date: Mon, 24 Oct 2022 14:43:09 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Derrick Stolee, Jeff King, Jonathan Tan, Junio C Hamano, Victoria Dye, Ævar Arnfjörð Bjarmason
Subject: [PATCH 3/4] builtin/repack.c: write cruft packs to arbitrary locations
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

In the following commit, a new write_cruft_pack() caller will be added
which wants to write a cruft pack to an arbitrary location. Prepare for
this by adding a parameter which controls the destination of the cruft
pack.

For now, provide "packtmp" so that this commit does not change any
behavior.
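
As a rough illustration of the "local" check in the diff below: a pack
name is appended to `names` only when the destination starts with
`packdir`. The sketch uses a stand-in helper, `has_prefix_sketch()`,
instead of Git's own `skip_prefix()`, and `destination_is_local()` and
the example paths are made up for this note.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the prefix test used in write_cruft_pack();
 * this is not Git's own helper. Returns 1 if `str` begins with `prefix`
 * and, in that case, points `*rest` at the remainder of `str`.
 */
int has_prefix_sketch(const char *str, const char *prefix, const char **rest)
{
	size_t len;

	assert(str && prefix && rest);
	len = strlen(prefix);
	if (strncmp(str, prefix, len) != 0)
		return 0;
	*rest = str + len;
	return 1;
}

/*
 * A destination counts as "local" when it lives under the repository's
 * pack directory, which is the case for packtmp-style paths.
 */
int destination_is_local(const char *destination, const char *packdir)
{
	const char *scratch;

	return has_prefix_sketch(destination, packdir, &scratch);
}
```

With `packdir` assumed to be `/repo/.git/objects/pack`, a `packtmp`-style
destination such as `/repo/.git/objects/pack/.tmp-123-pack` is local,
while `expired.git/objects/pack/pack` is not, so a pack written there
stays out of `names`.
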

Signed-off-by: Taylor Blau
---
 builtin/repack.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 1184e8c257..a5386ac893 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -662,6 +662,7 @@ static int write_midx_included_packs(struct string_list *include,
 }
 
 static int write_cruft_pack(const struct pack_objects_args *args,
+			    const char *destination,
 			    const char *pack_prefix,
 			    const char *cruft_expiration,
 			    struct string_list *names,
@@ -673,8 +674,10 @@ static int write_cruft_pack(const struct pack_objects_args *args,
 	struct string_list_item *item;
 	FILE *in, *out;
 	int ret;
+	const char *scratch;
+	int local = skip_prefix(destination, packdir, &scratch);
 
-	prepare_pack_objects(&cmd, args, packtmp);
+	prepare_pack_objects(&cmd, args, destination);
 
 	strvec_push(&cmd.args, "--cruft");
 	if (cruft_expiration)
@@ -714,7 +717,12 @@ static int write_cruft_pack(const struct pack_objects_args *args,
 		if (line.len != the_hash_algo->hexsz)
 			die(_("repack: Expecting full hex object ID lines only "
 			      "from pack-objects."));
-		string_list_append(names, line.buf);
+		/*
+		 * avoid putting packs written outside of the repository in the
+		 * list of names
+		 */
+		if (local)
+			string_list_append(names, line.buf);
 	}
 	fclose(out);
 
@@ -986,7 +994,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		cruft_po_args.local = po_args.local;
 		cruft_po_args.quiet = po_args.quiet;
 
-		ret = write_cruft_pack(&cruft_po_args, pack_prefix,
+		ret = write_cruft_pack(&cruft_po_args, packtmp, pack_prefix,
 				       cruft_expiration, &names,
 				       &existing_nonkept_packs,
 				       &existing_kept_packs);

From patchwork Mon Oct 24 18:43:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13018106
Date: Mon, 24 Oct 2022 14:43:12 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Derrick Stolee, Jeff King, Jonathan Tan, Junio C Hamano, Victoria Dye, Ævar Arnfjörð Bjarmason
Subject: [PATCH 4/4] builtin/repack.c: implement `--expire-to` for storing pruned objects
Message-ID: <6376d15c9c9adce883dba86ef5e5219f803aa9bf.1666636974.git.me@ttaylorr.com>
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

When pruning objects with `--cruft`, `git repack` offers some flexibility
in selecting which objects are pruned, via the `--cruft-expiration`
option. This is useful for expiring only objects which are older than the
grace period. Doing so makes one kind of race substantially less likely
[1]: a race where to-be-pruned objects become reachable as ancestors of
freshly pushed objects, and pruning them then leaves the repository in a
corrupt state.

But in practice, such races are impossible to avoid entirely, no matter
how long the grace period is. To prevent this race, it is often advisable
to temporarily put a repository into a read-only state.
But this is not always practical, and so some middle ground would be
nice.

This patch introduces a new option, `--expire-to`, which teaches
`git repack` to write an additional cruft pack containing just the
objects which were pruned from the repository. The caller can specify a
directory outside of the current repository as the destination for this
second cruft pack.

This makes it possible to prune objects from a repository, while still
holding onto a supplemental copy of them outside of the original
repository. Having this copy on-disk makes it substantially easier to
recover objects when the aforementioned race is encountered.

`--expire-to` is implemented in a somewhat convoluted manner: it takes
advantage of the fact that the first time `write_cruft_pack()` is called,
it adds the name of the cruft pack to the `names` string list. That means
that the second time we call `write_cruft_pack()`, objects in the
previously-written cruft pack will be excluded.

As long as the caller ensures that no objects are expired during the
second pass, this is sufficient to generate a cruft pack containing all
objects which don't appear in any of the new packs written by
`git repack`, including the cruft pack. In other words, all of the
objects which are about to be pruned from the repository.

It is important to note that the destination in `--expire-to` does not
necessarily need to be a Git repository (though it can be). Notably, the
expired packs do not contain all ancestors of expired objects. So if the
source repository contains something like:

             /
    C1 --- C2
      \
       refs/heads/master

where C2 is unreachable, but has a parent (C1) which is reachable, and C2
would be pruned, then the expiry pack will contain only C2, not C1.
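
A toy model of the two passes, offered only as a sketch: objects are
plain strings, "packs" are arrays, and `in_set()` and
`collect_expired()` are hypothetical helpers, not functions in
`builtin/repack.c`. The second pass keeps every object from the old
packs that is absent from all newly written packs, which is the set
about to be pruned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `oid` appears in `set[0..n)`. */
int in_set(const char *oid, const char *const *set, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!strcmp(oid, set[i]))
			return 1;
	return 0;
}

/*
 * Hypothetical model of the second write_cruft_pack() call: copy into
 * `out` every object from the pre-repack packs (`existing`) that is not
 * in any pack written by this repack (`kept`, which includes the first
 * cruft pack). Returns the number of objects copied; `out` must have
 * room for `n_existing` entries.
 */
size_t collect_expired(const char *const *existing, size_t n_existing,
		       const char *const *kept, size_t n_kept,
		       const char **out)
{
	size_t i, n_out = 0;

	assert(out);
	for (i = 0; i < n_existing; i++)
		if (!in_set(existing[i], kept, n_kept))
			out[n_out++] = existing[i];
	return n_out;
}
```

With existing objects {C1, C2, recent}, where C1 is reachable (so it is
in the new main pack) and `recent` is in the first cruft pack, only C2
lands in the expiry pack, matching the C1/C2 example above.
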

[1]: https://lore.kernel.org/git/20190319001829.GL29661@sigill.intra.peff.net/

Signed-off-by: Taylor Blau
---
 Documentation/git-repack.txt |   6 ++
 builtin/repack.c             |  40 ++++++++++++
 t/t7700-repack.sh            | 121 +++++++++++++++++++++++++++++++++++
 3 files changed, 167 insertions(+)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index 0bf13893d8..4017157949 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -74,6 +74,12 @@ to the new separate pack will be written.
 	immediately instead of waiting for the next `git gc` invocation.
 	Only useful with `--cruft -d`.
 
+--expire-to=::
+	Write a cruft pack containing pruned objects (if any) to the
+	directory ``. This option is useful for keeping a copy of
+	any pruned objects in a separate directory as a backup. Only
+	useful with `--cruft -d`.
+
 -l::
 	Pass the `--local` option to 'git pack-objects'. See
 	linkgit:git-pack-objects[1].
diff --git a/builtin/repack.c b/builtin/repack.c
index a5386ac893..3bc18e0b2f 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -702,6 +702,10 @@ static int write_cruft_pack(const struct pack_objects_args *args,
 	 * By the time it is read here, it contains only the pack(s)
 	 * that were just written, which is exactly the set of packs we
 	 * want to consider kept.
+	 *
+	 * If `--expire-to` is given, the double-use served by `names`
+	 * ensures that the pack written to `--expire-to` excludes any
+	 * objects contained in the cruft pack.
 	 */
 	in = xfdopen(cmd.in, "w");
 	for_each_string_list_item(item, names)
@@ -755,6 +759,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	int geometric_factor = 0;
 	int write_midx = 0;
 	const char *cruft_expiration = NULL;
+	const char *expire_to = NULL;
 
 	struct option builtin_repack_options[] = {
 		OPT_BIT('a', NULL, &pack_everything,
@@ -804,6 +809,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 			    N_("find a geometric progression with factor ")),
 		OPT_BOOL('m', "write-midx", &write_midx,
 			 N_("write a multi-pack index of the resulting packs")),
+		OPT_STRING(0, "expire-to", &expire_to, N_("dir"),
+			   N_("pack prefix to store a pack containing pruned objects")),
 		OPT_END()
 	};
 
@@ -1000,6 +1007,39 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				       &existing_kept_packs);
 		if (ret)
 			return ret;
+
+		if (delete_redundant && expire_to) {
+			/*
+			 * If `--expire-to` is given with `-d`, it's possible
+			 * that we're about to prune some objects. With cruft
+			 * packs, pruning is implicit: any objects from existing
+			 * packs that weren't picked up by new packs are removed
+			 * when their packs are deleted.
+			 *
+			 * Generate an additional cruft pack, with one twist:
+			 * `names` now includes the name of the cruft pack
+			 * written in the previous step. So the contents of
+			 * _this_ cruft pack exclude everything contained in the
+			 * existing cruft pack (that is, all of the unreachable
+			 * objects which are no older than
+			 * `--cruft-expiration`).
+			 *
+			 * To make this work, cruft_expiration must become NULL
+			 * so that this cruft pack doesn't actually prune any
+			 * objects. If it were non-NULL, this call would always
+			 * generate an empty pack (since every object not in the
+			 * cruft pack generated above will have an mtime older
+			 * than the expiration).
+			 */
+			ret = write_cruft_pack(&cruft_po_args, expire_to,
+					       pack_prefix,
+					       NULL,
+					       &names,
+					       &existing_nonkept_packs,
+					       &existing_kept_packs);
+			if (ret)
+				return ret;
+		}
 	}
 
 	string_list_sort(&names);
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index ca45c4cd2c..17ee6fc2cc 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -482,4 +482,125 @@ test_expect_success '-n overrides repack.updateServerInfo=true' '
 	test_server_info_missing
 '
 
+test_expect_success '--expire-to stores pruned objects (now)' '
+	git init expire-to-now &&
+	(
+		cd expire-to-now &&
+
+		git branch -M main &&
+
+		test_commit base &&
+
+		git checkout -b cruft &&
+		test_commit --no-tag cruft &&
+
+		git rev-list --objects --no-object-names main..cruft >moved.raw &&
+		sort moved.raw >moved.want &&
+
+		git rev-list --all --objects --no-object-names >expect.raw &&
+		sort expect.raw >expect &&
+
+		git checkout main &&
+		git branch -D cruft &&
+		git reflog expire --all --expire=all &&
+
+		git init --bare expired.git &&
+		git repack -d \
+			--cruft --cruft-expiration="now" \
+			--expire-to="expired.git/objects/pack/pack" &&
+
+		expired="$(ls expired.git/objects/pack/pack-*.idx)" &&
+		test_path_is_file "${expired%.idx}.mtimes" &&
+
+		# Since the `--cruft-expiration` is "now", the effective
+		# behavior is to move _all_ unreachable objects out to
+		# the location in `--expire-to`.
+		git show-index <$expired >expired.raw &&
+		cut -d" " -f2 expired.raw | sort >expired.objects &&
+		git rev-list --all --objects --no-object-names \
+			>remaining.objects &&
+
+		# ...in other words, the combined contents of this
+		# repository and expired.git should be the same as the
+		# set of objects we started with.
+		cat expired.objects remaining.objects | sort >actual &&
+		test_cmp expect actual &&
+
+		# The "moved" objects (i.e., those in expired.git)
+		# should be the same as the cruft objects which were
+		# expired in the previous step.
+		test_cmp moved.want expired.objects
+	)
+'
+
+test_expect_success '--expire-to stores pruned objects (5.minutes.ago)' '
+	git init expire-to-5.minutes.ago &&
+	(
+		cd expire-to-5.minutes.ago &&
+
+		git branch -M main &&
+
+		test_commit base &&
+
+		# Create two classes of unreachable objects, one which
+		# is older than 5 minutes (stale), and another which is
+		# newer (recent).
+		for kind in stale recent
+		do
+			git checkout -b $kind main &&
+			test_commit --no-tag $kind || return 1
+		done &&
+
+		git rev-list --objects --no-object-names main..stale >in &&
+		stale="$(git pack-objects $objdir/pack/pack expect.raw &&
+		sort expect.raw >expect &&
+
+		# moved.want holds the set of objects we expect to find
+		# in expired.git
+		git rev-list --objects --no-object-names main..stale >out &&
+		sort out >moved.want &&
+
+		git checkout main &&
+		git branch -D stale recent &&
+		git reflog expire --all --expire=all &&
+		git prune-packed &&
+
+		git init --bare expired.git &&
+		git repack -d \
+			--cruft --cruft-expiration=5.minutes.ago \
+			--expire-to="expired.git/objects/pack/pack" &&
+
+		# Some of the remaining objects in this repository are
+		# unreachable, so use `cat-file --batch-all-objects`
+		# instead of `rev-list` to get their names
+		git cat-file --batch-all-objects --batch-check="%(objectname)" \
+			>remaining.objects &&
+		sort remaining.objects >actual &&
+		test_cmp expect actual &&
+
+		(
+			cd expired.git &&
+
+			expired="$(ls objects/pack/pack-*.mtimes)" &&
+			test-tool pack-mtimes $(basename $expired) >out &&
+			cut -d" " -f1 out | sort >../moved.got &&
+
+			# Ensure that there are as many objects with the
+			# expected mtime as were moved to expired.git.
+			#
+			# In other words, ensure that the recorded
+			# mtimes of any moved objects was written
+			# correctly.
+			grep " $mtime$" out >matching &&
+			test_line_count = $(wc -l <../moved.want) matching
+		) &&
+		test_cmp moved.want moved.got
+	)
+'
+
 test_done