From patchwork Thu Mar 3 00:20:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12766903 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE508C433EF for ; Thu, 3 Mar 2022 00:20:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230156AbiCCAVg (ORCPT ); Wed, 2 Mar 2022 19:21:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230205AbiCCAVa (ORCPT ); Wed, 2 Mar 2022 19:21:30 -0500 Received: from mail-io1-xd34.google.com (mail-io1-xd34.google.com [IPv6:2607:f8b0:4864:20::d34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 340A113AA25 for ; Wed, 2 Mar 2022 16:20:46 -0800 (PST) Received: by mail-io1-xd34.google.com with SMTP id r7so3980594iot.3 for ; Wed, 02 Mar 2022 16:20:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=U57WZotEOCKfboIeqgCsQcTnM0wsgdqmIpBbD+9WI7g=; b=5KIvbMBJYCyLksEd8N0666T/QZEQDyUaIMNrn7xQ8d+BGxpGfcQo/SBe/7zvvvwZ9d +SiFlhVzDvjBorYSfmDqjyVZRa/5fT3HKyDdPTsHAh2TxOlCXKazp3wD2byJXozyBvXR 3tBuh5HJUgYl9i7GbeabdmokP4WXlEnyWRKp+nfT5XozOotsfhjGIt41iAh8uNhf5Z4J aCpA3z1F8cWKN2Mi+sSoJ162jK6kwb4Hi8kAq6RBY7PNNr6DidHz7hZLlzx9zNMNRWr1 znkCuLpVOv3VstwTq8xdXCcpCoyVn2PZ4zbE1TpFFT/ibVoTWc3T6havC3mEVGjlv8Ug Ha5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=U57WZotEOCKfboIeqgCsQcTnM0wsgdqmIpBbD+9WI7g=; 
b=Ghj+G3PETRIQoafLAoalO5Cn0sU6eTHSXZLVJbstklyv/fOZQYRtOF2YMcc2q1jiEY Zf+MGYYQ712DV1cfxkOFD5nQuX2LmXNQ4ZJ4R812GAewxSnDcbmhAClVRYAmj607dOl4 p0YVMIm2mjjrL6Az7oFbWLCKAjQ/06bml4x2+yAM5CODF1uFjBY6rTPUVhqhjA277Pch X2iDoNi4zEvB9st2FQLxP+YpT6fV8qMpxz5wZJRsqF+u/PI4t4vxKuZXFjh+1UokzlEo wANXyaiUtXIm2yLRkRA6Dn+7T0hKm7vpDTqQtWbu1sBhYT8GlXcZ00Ul0WOtxtbqcmpB +cLg== X-Gm-Message-State: AOAM533yqwXnXnPf2uaUTNbsMTD27cJ42QcHob8WEFdq52BWrSLPRiNN EQK8uzajAR1A9cWJxSdiMDHvhvGeF+IAPc5H X-Google-Smtp-Source: ABdhPJwSSN6TmXvfyU74Jql9Gbg0L/3hNu/4aatqeRjGPh3H7ptExEP9oXL36tdbSnfQf+A2COA9Fg== X-Received: by 2002:a02:cc26:0:b0:30f:ce14:5241 with SMTP id o6-20020a02cc26000000b0030fce145241mr26601412jap.94.1646266845101; Wed, 02 Mar 2022 16:20:45 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id m9-20020a923f09000000b002c2a1a3a888sm339013ila.50.2022.03.02.16.20.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 02 Mar 2022 16:20:44 -0800 (PST) Date: Wed, 2 Mar 2022 19:20:44 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com Subject: [PATCH v3 01/17] Documentation/technical: add cruft-packs.txt Message-ID: <784ee7e0eec9ba520ebaaa27de2de810e2f6798a.1646266835.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Create a technical document to explain cruft packs. It contains a brief overview of the problem, some background, details on the implementation, and a couple of alternative approaches not considered here. 
Signed-off-by: Taylor Blau
---
 Documentation/Makefile                  |  1 +
 Documentation/technical/cruft-packs.txt | 97 +++++++++++++++++++++++++
 2 files changed, 98 insertions(+)
 create mode 100644 Documentation/technical/cruft-packs.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index ed656db2ae..0b01c9408e 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -91,6 +91,7 @@ TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
 TECH_DOCS += technical/bundle-format
+TECH_DOCS += technical/cruft-packs
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/cruft-packs.txt b/Documentation/technical/cruft-packs.txt
new file mode 100644
index 0000000000..2c3c5d93f8
--- /dev/null
+++ b/Documentation/technical/cruft-packs.txt
@@ -0,0 +1,97 @@
+= Cruft packs
+
+The cruft packs feature offers an alternative to Git's traditional mechanism of
+removing unreachable objects. This document provides an overview of Git's
+pruning mechanism, and how a cruft pack can be used instead to accomplish the
+same.
+
+== Background
+
+To remove unreachable objects from your repository, Git offers `git repack -Ad`
+(see linkgit:git-repack[1]). Quoting from the documentation:
+
+[quote]
+[...] unreachable objects in a previous pack become loose, unpacked objects,
+instead of being left in the old pack. [...] loose unreachable objects will be
+pruned according to normal expiry rules with the next 'git gc' invocation.
+
+Unreachable objects aren't removed immediately, since doing so could race with
+an incoming push which may reference an object which is about to be deleted.
+Instead, those unreachable objects are stored as loose objects and stay that way
+until they are older than the expiration window, at which point they are removed
+by linkgit:git-prune[1].
+
+Git must store these unreachable objects loose in order to keep track of their
+per-object mtimes. If these unreachable objects were written into one big pack,
+then either freshening that pack (because an object contained within it was
+re-written) or creating a new pack of unreachable objects would cause the pack's
+mtime to get updated, and the objects within it would never leave the expiration
+window. Instead, objects are stored loose in order to keep track of the
+individual object mtimes and avoid a situation where all cruft objects are
+freshened at once.
+
+This can lead to undesirable situations when a repository contains many
+unreachable objects which have not yet left the grace period. Having large
+directories in the shards of `.git/objects` can lead to decreased performance in
+the repository. But given enough unreachable objects, this can lead to inode
+starvation and degrade the performance of the whole system. Since we
+can never pack those objects, these repositories often take up a large amount of
+disk space, since we can only zlib compress them, but not store them in delta
+chains.
+
+== Cruft packs
+
+A cruft pack eliminates the need for storing unreachable objects in a loose
+state by including the per-object mtimes in a separate file alongside a single
+pack containing all loose objects.
+
+A cruft pack is written by `git repack --cruft` when generating a new pack, using
+linkgit:git-pack-objects[1]'s `--cruft` option. Note that `git repack --cruft`
+is a classic all-into-one repack, meaning that everything in the resulting pack is
+reachable, and everything else is unreachable. Once written, the `--cruft`
+option instructs `git repack` to generate another pack containing only objects
+not packed in the previous step (which equates to packing all unreachable
+objects together). This progresses as follows:
+
+  1. Enumerate every object, marking any object which is (a) not contained in a
+     kept-pack, and (b) whose mtime is within the grace period as a traversal
+     tip.
+
+  2. Perform a reachability traversal based on the tips gathered in the previous
+     step, adding every object along the way to the pack.
+
+  3. Write the pack out, along with a `.mtimes` file that records the per-object
+     timestamps.
+
+This mode is invoked internally by linkgit:git-repack[1] when instructed to
+write a cruft pack. Crucially, the set of in-core kept packs is exactly the set
+of packs which will not be deleted by the repack; in other words, they contain
+all of the repository's reachable objects.
+
+When a repository already has a cruft pack, `git repack --cruft` typically only
+adds objects to it. An exception to this is when `git repack` is given the
+`--cruft-expiration` option, which allows the generated cruft pack to omit
+expired objects instead of waiting for linkgit:git-gc[1] to expire those objects
+later on.
+
+It is linkgit:git-gc[1] that is typically responsible for removing expired
+unreachable objects.
+
+== Alternatives
+
+Notable alternatives to this design include:
+
+  - The location of the per-object mtime data, and
+  - Storing unreachable objects in multiple cruft packs.
+
+On the location of mtime data, a new auxiliary file tied to the pack was chosen
+to avoid complicating the `.idx` format. If the `.idx` format were ever to gain
+support for optional chunks of data, it may make sense to consolidate the
+`.mtimes` format into the `.idx` itself.
+
+Storing unreachable objects among multiple cruft packs (e.g., creating a new
+cruft pack during each repacking operation including only unreachable objects
+which aren't already stored in an earlier cruft pack) is significantly more
+complicated to construct, and so isn't pursued here. The obvious drawback to
+the current implementation is that the entire cruft pack must be re-written from
+scratch.
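
As a reading aid for step 1 of the cruft-packs document above, the tip-selection rule can be sketched in C. This is a simplified model and not code from the series: `struct cruft_candidate`, `is_traversal_tip()` and `count_traversal_tips()` are hypothetical names, the grace period is represented by a single `cutoff` timestamp, and treating an mtime equal to the cutoff as still within the grace period is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one object considered for the cruft pack. */
struct cruft_candidate {
	bool in_kept_pack; /* already stored in a pack that survives the repack */
	uint32_t mtime;    /* last modification time, in epoch seconds */
};

/*
 * Step 1: an object becomes a traversal tip when it is (a) not in a
 * kept pack and (b) its mtime is within the grace period. The boundary
 * case (mtime == cutoff) is assumed to count as "within".
 */
static bool is_traversal_tip(const struct cruft_candidate *c, uint32_t cutoff)
{
	return !c->in_kept_pack && c->mtime >= cutoff;
}

/* Count how many of the nr candidates would be used as traversal tips. */
static size_t count_traversal_tips(const struct cruft_candidate *objs,
				   size_t nr, uint32_t cutoff)
{
	size_t i, tips = 0;

	assert(objs || !nr);
	for (i = 0; i < nr; i++)
		if (is_traversal_tip(&objs[i], cutoff))
			tips++;
	return tips;
}
```

The sketch only covers tip selection. Step 2 (adding everything reachable from those tips) and step 3 (writing the pack and its `.mtimes` file) are out of scope here.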
From patchwork Thu Mar 3 00:20:46 2022
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12766905
Date: Wed, 2 Mar 2022 19:20:46 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com
Subject: [PATCH v3 02/17] pack-mtimes: support reading .mtimes files
Message-ID: <1ec754ad1b5c1051b52acef6ec72c0464c0eabf0.1646266835.git.me@ttaylorr.com>

To store the individual mtimes of objects in a cruft pack, introduce a
new `.mtimes` format that can optionally accompany a single pack in the
repository.

The format is defined in Documentation/technical/pack-format.txt, and
stores a 4-byte network order timestamp for each object in name (index)
order.

This patch prepares for cruft packs by defining the `.mtimes` format,
and introducing a basic API that callers can use to read out individual
mtimes.

Signed-off-by: Taylor Blau
---
 Documentation/technical/pack-format.txt |  19 ++++
 Makefile                                |   1 +
 builtin/repack.c                        |   1 +
 object-store.h                          |   5 +-
 pack-mtimes.c                           | 126 ++++++++++++++++++++++++
 pack-mtimes.h                           |  15 +++
 packfile.c                              |  19 +++-
 7 files changed, 183 insertions(+), 3 deletions(-)
 create mode 100644 pack-mtimes.c
 create mode 100644 pack-mtimes.h

diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index 6d3efb7d16..c443dbb526 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -294,6 +294,25 @@ Pack file entry: <+
 
 All 4-byte numbers are in network order.
 
+== pack-*.mtimes files have the format:
+
+  - A 4-byte magic number '0x4d544d45' ('MTME').
+
+  - A 4-byte version identifier (= 1).
+
+  - A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256).
+
+  - A table of 4-byte unsigned integers in network order. The ith
+    value is the modification time (mtime) of the ith object in the
+    corresponding pack by lexicographic (index) order. The mtimes
+    count standard epoch seconds.
+
+  - A trailer, containing a checksum of the corresponding packfile,
+    and a checksum of all of the above (each having length according
+    to the specified hash function).
+
+All 4-byte numbers are in network order.
+
 == multi-pack-index (MIDX) files have the following format:
 
 The multi-pack-index files refer to multiple pack-files and loose objects.
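
To make the `.mtimes` layout described in the hunk above concrete, here is a minimal standalone sketch that validates an in-memory image and reads one entry. It is not the implementation from this series: `read_be32()`, `mtimes_check()` and `mtimes_nth()` are hypothetical helpers, the trailer sizes assume the raw hash lengths of SHA-1 (20 bytes) and SHA-256 (32 bytes), and the two trailer checksums are only accounted for in the size check, not verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MTIMES_MAGIC 0x4d544d45u /* "MTME" */
#define MTIMES_HDR_LEN 12        /* magic + version + hash id */

/* Read a 4-byte big-endian (network order) integer. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Check the header and overall size of a .mtimes image describing a
 * pack with nr objects. Returns the per-checksum trailer length
 * (20 or 32) on success, or -1 if the image is malformed.
 */
static int mtimes_check(const unsigned char *buf, size_t len, uint32_t nr)
{
	uint64_t hashsz, expect;

	if (len < MTIMES_HDR_LEN)
		return -1;
	if (read_be32(buf) != MTIMES_MAGIC)
		return -1;
	if (read_be32(buf + 4) != 1) /* version */
		return -1;
	switch (read_be32(buf + 8)) { /* hash function identifier */
	case 1: hashsz = 20; break; /* SHA-1 */
	case 2: hashsz = 32; break; /* SHA-256 */
	default: return -1;
	}
	/* header, one 4-byte mtime per object, then two checksums */
	expect = MTIMES_HDR_LEN + (uint64_t)nr * 4 + 2 * hashsz;
	if ((uint64_t)len != expect)
		return -1;
	return (int)hashsz;
}

/* Return the mtime of the object at index-order position pos. */
static uint32_t mtimes_nth(const unsigned char *buf, uint32_t nr, uint32_t pos)
{
	assert(pos < nr);
	return read_be32(buf + MTIMES_HDR_LEN + (size_t)pos * 4);
}
```

Skipping the 12-byte header before indexing the table corresponds to the `+ 3` word offset that `nth_packed_mtime()` applies in pack-mtimes.c.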
diff --git a/Makefile b/Makefile
index 6f0b4b775f..1b186f4fd7 100644
--- a/Makefile
+++ b/Makefile
@@ -959,6 +959,7 @@ LIB_OBJS += oidtree.o
 LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-check.o
+LIB_OBJS += pack-mtimes.o
 LIB_OBJS += pack-objects.o
 LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
diff --git a/builtin/repack.c b/builtin/repack.c
index da1e364a75..f908f7d5dd 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -212,6 +212,7 @@ static struct {
 } exts[] = {
 	{".pack"},
 	{".rev", 1},
+	{".mtimes", 1},
 	{".bitmap", 1},
 	{".promisor", 1},
 	{".idx"},
diff --git a/object-store.h b/object-store.h
index 6f89482df0..9b227661f2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -115,12 +115,15 @@ struct packed_git {
 		 freshened:1,
 		 do_not_close:1,
 		 pack_promisor:1,
-		 multi_pack_index:1;
+		 multi_pack_index:1,
+		 is_cruft:1;
 	unsigned char hash[GIT_MAX_RAWSZ];
 	struct revindex_entry *revindex;
 	const uint32_t *revindex_data;
 	const uint32_t *revindex_map;
 	size_t revindex_size;
+	const uint32_t *mtimes_map;
+	size_t mtimes_size;
 	/* something like ".git/objects/pack/xxxxx.pack" */
 	char pack_name[FLEX_ARRAY]; /* more */
 };
diff --git a/pack-mtimes.c b/pack-mtimes.c
new file mode 100644
index 0000000000..46ad584af1
--- /dev/null
+++ b/pack-mtimes.c
@@ -0,0 +1,126 @@
+#include "pack-mtimes.h"
+#include "object-store.h"
+#include "packfile.h"
+
+static char *pack_mtimes_filename(struct packed_git *p)
+{
+	size_t len;
+	if (!strip_suffix(p->pack_name, ".pack", &len))
+		BUG("pack_name does not end in .pack");
+	/* NEEDSWORK: this could reuse code from pack-revindex.c. */
+	return xstrfmt("%.*s.mtimes", (int)len, p->pack_name);
+}
+
+#define MTIMES_HEADER_SIZE (12)
+#define MTIMES_MIN_SIZE (MTIMES_HEADER_SIZE + (2 * the_hash_algo->rawsz))
+
+struct mtimes_header {
+	uint32_t signature;
+	uint32_t version;
+	uint32_t hash_id;
+};
+
+static int load_pack_mtimes_file(char *mtimes_file,
+				 uint32_t num_objects,
+				 const uint32_t **data_p, size_t *len_p)
+{
+	int fd, ret = 0;
+	struct stat st;
+	void *data = NULL;
+	size_t mtimes_size;
+	struct mtimes_header header;
+	uint32_t *hdr;
+
+	fd = git_open(mtimes_file);
+
+	if (fd < 0) {
+		ret = -1;
+		goto cleanup;
+	}
+	if (fstat(fd, &st)) {
+		ret = error_errno(_("failed to read %s"), mtimes_file);
+		goto cleanup;
+	}
+
+	mtimes_size = xsize_t(st.st_size);
+
+	if (mtimes_size < MTIMES_MIN_SIZE) {
+		ret = error(_("mtimes file %s is too small"), mtimes_file);
+		goto cleanup;
+	}
+
+	if (mtimes_size - MTIMES_MIN_SIZE != st_mult(sizeof(uint32_t), num_objects)) {
+		ret = error(_("mtimes file %s is corrupt"), mtimes_file);
+		goto cleanup;
+	}
+
+	data = hdr = xmmap(NULL, mtimes_size, PROT_READ, MAP_PRIVATE, fd, 0);
+
+	header.signature = ntohl(hdr[0]);
+	header.version = ntohl(hdr[1]);
+	header.hash_id = ntohl(hdr[2]);
+
+	if (header.signature != MTIMES_SIGNATURE) {
+		ret = error(_("mtimes file %s has unknown signature"), mtimes_file);
+		goto cleanup;
+	}
+
+	if (header.version != 1) {
+		ret = error(_("mtimes file %s has unsupported version %"PRIu32),
+			    mtimes_file, header.version);
+		goto cleanup;
+	}
+
+	if (!(header.hash_id == 1 || header.hash_id == 2)) {
+		ret = error(_("mtimes file %s has unsupported hash id %"PRIu32),
+			    mtimes_file, header.hash_id);
+		goto cleanup;
+	}
+
+cleanup:
+	if (ret) {
+		if (data)
+			munmap(data, mtimes_size);
+	} else {
+		*len_p = mtimes_size;
+		*data_p = (const uint32_t *)data;
+	}
+
+	close(fd);
+	return ret;
+}
+
+int load_pack_mtimes(struct packed_git *p)
+{
+	char *mtimes_name = NULL;
+	int ret = 0;
+
+	if (!p->is_cruft)
+		return ret; /* not a cruft pack */
+	if (p->mtimes_map)
+		return ret; /* already loaded */
+
+	ret = open_pack_index(p);
+	if (ret < 0)
+		goto cleanup;
+
+	mtimes_name = pack_mtimes_filename(p);
+	ret = load_pack_mtimes_file(mtimes_name,
+				    p->num_objects,
+				    &p->mtimes_map,
+				    &p->mtimes_size);
+cleanup:
+	free(mtimes_name);
+	return ret;
+}
+
+uint32_t nth_packed_mtime(struct packed_git *p, uint32_t pos)
+{
+	if (!p->mtimes_map)
+		BUG("pack .mtimes file not loaded for %s", p->pack_name);
+	if (p->num_objects <= pos)
+		BUG("pack .mtimes out-of-bounds (%"PRIu32" vs %"PRIu32")",
+		    pos, p->num_objects);
+
+	return get_be32(p->mtimes_map + pos + 3);
+}
diff --git a/pack-mtimes.h b/pack-mtimes.h
new file mode 100644
index 0000000000..38ddb9f893
--- /dev/null
+++ b/pack-mtimes.h
@@ -0,0 +1,15 @@
+#ifndef PACK_MTIMES_H
+#define PACK_MTIMES_H
+
+#include "git-compat-util.h"
+
+#define MTIMES_SIGNATURE 0x4d544d45 /* "MTME" */
+#define MTIMES_VERSION 1
+
+struct packed_git;
+
+int load_pack_mtimes(struct packed_git *p);
+
+uint32_t nth_packed_mtime(struct packed_git *p, uint32_t pos);
+
+#endif
diff --git a/packfile.c b/packfile.c
index 835b2d2716..fc0245fbab 100644
--- a/packfile.c
+++ b/packfile.c
@@ -334,12 +334,22 @@ static void close_pack_revindex(struct packed_git *p)
 	p->revindex_data = NULL;
 }
 
+static void close_pack_mtimes(struct packed_git *p)
+{
+	if (!p->mtimes_map)
+		return;
+
+	munmap((void *)p->mtimes_map, p->mtimes_size);
+	p->mtimes_map = NULL;
+}
+
 void close_pack(struct packed_git *p)
 {
 	close_pack_windows(p);
 	close_pack_fd(p);
 	close_pack_index(p);
 	close_pack_revindex(p);
+	close_pack_mtimes(p);
 	oidset_clear(&p->bad_objects);
 }
 
@@ -363,7 +373,7 @@ void close_object_store(struct raw_object_store *o)
 
 void unlink_pack_path(const char *pack_name, int force_delete)
 {
-	static const char *exts[] = {".pack", ".idx", ".rev", ".keep", ".bitmap", ".promisor"};
+	static const char *exts[] = {".pack", ".idx", ".rev", ".keep", ".bitmap", ".promisor", ".mtimes"};
 	int i;
 	struct strbuf buf = STRBUF_INIT;
 	size_t plen;
@@ -718,6 +728,10 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 	if (!access(p->pack_name, F_OK))
 		p->pack_promisor = 1;
 
+	xsnprintf(p->pack_name + path_len, alloc - path_len, ".mtimes");
+	if (!access(p->pack_name, F_OK))
+		p->is_cruft = 1;
+
 	xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
 	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
 		free(p);
@@ -869,7 +883,8 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
 	    ends_with(file_name, ".pack") ||
 	    ends_with(file_name, ".bitmap") ||
 	    ends_with(file_name, ".keep") ||
-	    ends_with(file_name, ".promisor"))
+	    ends_with(file_name, ".promisor") ||
+	    ends_with(file_name, ".mtimes"))
 		string_list_append(data->garbage, full_name);
 	else
 		report_garbage(PACKDIR_FILE_GARBAGE, full_name);

From patchwork Thu Mar 3 00:20:49 2022
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12766904
Date: Wed, 2 Mar 2022 19:20:49 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com
Subject: [PATCH v3 03/17] pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles'
Message-ID: <0f5d6d64924bcc1c81853ae246327338f7679a5e.1646266835.git.me@ttaylorr.com>

This structure will be used to communicate the per-object mtimes when
writing a cruft pack. Here, we need the full packing_data structure
because the mtime information is stored in an array there, not on the
individual object_entry's themselves (to avoid paying the overhead in
structure width for operations which do not generate a cruft pack).

We haven't passed this information down before because one of the two
callers (in bulk-checkin.c) does not have a packing_data structure at
all. In that case (where no cruft pack will be generated), NULL is
passed instead.

Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c | 3 ++-
 bulk-checkin.c         | 2 +-
 pack-write.c           | 1 +
 pack.h                 | 3 +++
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 178e611f09..385970cb7b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1254,7 +1254,8 @@ static void write_pack_file(void)
 
 			stage_tmp_packfiles(&tmpname, pack_tmp_name,
 					    written_list, nr_written,
-					    &pack_idx_opts, hash, &idx_tmp_name);
+					    &to_pack, &pack_idx_opts, hash,
+					    &idx_tmp_name);
 
 			if (write_bitmap_index) {
 				size_t tmpname_len = tmpname.len;
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 8785b2ac80..99f7596c4e 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -33,7 +33,7 @@ static void finish_tmp_packfile(struct strbuf *basename,
 	char *idx_tmp_name = NULL;
 
 	stage_tmp_packfiles(basename, pack_tmp_name, written_list, nr_written,
-			    pack_idx_opts, hash, &idx_tmp_name);
+			    NULL, pack_idx_opts, hash, &idx_tmp_name);
 	rename_tmp_packfile_idx(basename, &idx_tmp_name);
 
 	free(idx_tmp_name);
diff --git a/pack-write.c b/pack-write.c
index a5846f3a34..d594e3008e 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -483,6 +483,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 const char *pack_tmp_name,
 			 struct pack_idx_entry **written_list,
 			 uint32_t nr_written,
+			 struct packing_data *to_pack,
 			 struct pack_idx_option *pack_idx_opts,
 			 unsigned char hash[],
 			 char **idx_tmp_name)
diff --git a/pack.h b/pack.h
index b22bfc4a18..fd27cfdfd7 100644
--- a/pack.h
+++ b/pack.h
@@ -109,11 +109,14 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len,
 #define PH_ERROR_PROTOCOL	(-3)
 int read_pack_header(int fd, struct pack_header *);
 
+struct packing_data;
+
 struct hashfile *create_tmp_packfile(char **pack_tmp_name);
 void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 const char *pack_tmp_name,
 			 struct pack_idx_entry **written_list,
 			 uint32_t nr_written,
+			 struct packing_data *to_pack,
 			 struct pack_idx_option *pack_idx_opts,
 			 unsigned char hash[],
 			 char **idx_tmp_name);

From patchwork Thu Mar 3 00:20:52 2022
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12766906
Date: Wed, 2 Mar 2022 19:20:52 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com
Subject: [PATCH v3 04/17] chunk-format.h: extract oid_version()
Message-ID: <135a07276b0a40b04f2c28d4f48c26b1af76c12c.1646266835.git.me@ttaylorr.com>

There are three definitions of an identical function which converts
`the_hash_algo` into either 1 (for SHA-1) or 2 (for SHA-256). There is a
copy of this function for writing both the commit-graph and
multi-pack-index file, and another inline definition used to write the
.rev header.

Consolidate these into a single definition in chunk-format.h. It's not
clear that this is the best header to define this function in, but it
should do for now.

(Worth noting, the .rev caller expects a 4-byte unsigned, but the other
two callers work with a single unsigned byte. The consolidated version
uses the latter type, and lets the compiler widen it when required).

Another caller will be added in a subsequent patch.

Signed-off-by: Taylor Blau
---
 chunk-format.c | 12 ++++++++++++
 chunk-format.h |  3 +++
 commit-graph.c | 18 +++---------------
 midx.c         | 18 +++---------------
 pack-write.c   | 15 ++-------------
 5 files changed, 23 insertions(+), 43 deletions(-)

diff --git a/chunk-format.c b/chunk-format.c
index 1c3dca62e2..0275b74a89 100644
--- a/chunk-format.c
+++ b/chunk-format.c
@@ -181,3 +181,15 @@ int read_chunk(struct chunkfile *cf,
 
 	return CHUNK_NOT_FOUND;
 }
+
+uint8_t oid_version(const struct git_hash_algo *algop)
+{
+	switch (hash_algo_by_ptr(algop)) {
+	case GIT_HASH_SHA1:
+		return 1;
+	case GIT_HASH_SHA256:
+		return 2;
+	default:
+		die(_("invalid hash version"));
+	}
+}
diff --git a/chunk-format.h b/chunk-format.h
index 9ccbe00377..7885aa0848 100644
--- a/chunk-format.h
+++ b/chunk-format.h
@@ -2,6 +2,7 @@
 #define CHUNK_FORMAT_H
 
 #include "git-compat-util.h"
+#include "hash.h"
 
 struct hashfile;
 struct chunkfile;
@@ -65,4 +66,6 @@ int read_chunk(struct chunkfile *cf,
 	       chunk_read_fn fn,
 	       void *data);
 
+uint8_t oid_version(const struct git_hash_algo *algop);
+
 #endif
diff --git a/commit-graph.c b/commit-graph.c
index 265c010122..f678d2c4a1 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -193,18 +193,6 @@ char *get_commit_graph_chain_filename(struct object_directory *odb)
 	return xstrfmt("%s/info/commit-graphs/commit-graph-chain", odb->path);
 }
 
-static uint8_t oid_version(void)
-{
-	switch (hash_algo_by_ptr(the_hash_algo)) {
-	case GIT_HASH_SHA1:
-		return 1;
-	case GIT_HASH_SHA256:
-		return 2;
-	default:
-		die(_("invalid hash version"));
-	}
-}
-
 static struct commit_graph *alloc_commit_graph(void)
 {
 	struct commit_graph *g = xcalloc(1, sizeof(*g));
@@ -365,9 +353,9 @@ struct commit_graph *parse_commit_graph(struct repository *r,
 	}
 
 	hash_version = *(unsigned char*)(data + 5);
-	if (hash_version != oid_version()) {
+	if (hash_version != oid_version(the_hash_algo)) {
 		error(_("commit-graph hash version %X does not match version %X"),
-		      hash_version, oid_version());
+		      hash_version, oid_version(the_hash_algo));
 		return NULL;
 	}
 
@@ -1911,7 +1899,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
 	hashwrite_be32(f, GRAPH_SIGNATURE);
 
 	hashwrite_u8(f, GRAPH_VERSION);
-	hashwrite_u8(f, oid_version());
+	hashwrite_u8(f, oid_version(the_hash_algo));
 	hashwrite_u8(f, get_num_chunks(cf));
 	hashwrite_u8(f, ctx->num_commit_graphs_after - 1);
 
diff --git a/midx.c b/midx.c
index 865170bad0..65e670c5e2 100644
--- a/midx.c
+++ b/midx.c
@@ -41,18 +41,6 @@
 
 #define PACK_EXPIRED UINT_MAX
 
-static uint8_t oid_version(void)
-{
-	switch (hash_algo_by_ptr(the_hash_algo)) {
-	case GIT_HASH_SHA1:
-		return 1;
-	case GIT_HASH_SHA256:
-		return 2;
-	default:
-		die(_("invalid hash version"));
-	}
-}
-
 const unsigned char *get_midx_checksum(struct multi_pack_index *m)
 {
 	return m->data + m->data_len - the_hash_algo->rawsz;
@@ -134,9 +122,9 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
 		      m->version);
 
 	hash_version = m->data[MIDX_BYTE_HASH_VERSION];
-	if (hash_version != oid_version()) {
+	if (hash_version != oid_version(the_hash_algo)) {
 		error(_("multi-pack-index hash version %u does not match version %u"),
-		      hash_version, oid_version());
+		      hash_version, oid_version(the_hash_algo));
 		goto cleanup_fail;
 	}
 	m->hash_len = the_hash_algo->rawsz;
@@ -420,7 +408,7 @@ static size_t write_midx_header(struct hashfile *f,
 {
 	hashwrite_be32(f, MIDX_SIGNATURE);
 	hashwrite_u8(f, MIDX_VERSION);
-	hashwrite_u8(f, oid_version());
+	hashwrite_u8(f, oid_version(the_hash_algo));
 	hashwrite_u8(f, num_chunks);
 	hashwrite_u8(f, 0); /* unused */
 	hashwrite_be32(f, num_packs);
diff --git a/pack-write.c b/pack-write.c
index d594e3008e..ff305b404c 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -2,6 +2,7 @@ #include "pack.h" #include "csum-file.h" #include "remote.h" +#include "chunk-format.h" void reset_pack_idx_option(struct pack_idx_option *opts) { @@ -181,21 +182,9 @@ static int pack_order_cmp(const void *va, const void *vb, void *ctx) static void write_rev_header(struct hashfile *f) { - uint32_t oid_version; - switch (hash_algo_by_ptr(the_hash_algo)) { - case GIT_HASH_SHA1: - oid_version = 1; - break; - case GIT_HASH_SHA256: - oid_version = 2; - break; - default: - die("write_rev_header: unknown hash version"); - } - hashwrite_be32(f, RIDX_SIGNATURE); hashwrite_be32(f, RIDX_VERSION); - hashwrite_be32(f, oid_version); + hashwrite_be32(f, oid_version(the_hash_algo)); } static void write_rev_index_positions(struct hashfile *f, From patchwork Thu Mar 3 00:20:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12766907 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 14953C433F5 for ; Thu, 3 Mar 2022 00:21:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230243AbiCCAVr (ORCPT ); Wed, 2 Mar 2022 19:21:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37692 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230238AbiCCAVm (ORCPT ); Wed, 2 Mar 2022 19:21:42 -0500 Received: from mail-qk1-x734.google.com (mail-qk1-x734.google.com [IPv6:2607:f8b0:4864:20::734]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 29012F8B9F for ; Wed, 2 Mar 2022 16:20:57 -0800 (PST) Received: by mail-qk1-x734.google.com with SMTP id t21so2744161qkg.6 for ; Wed, 02 Mar 2022 16:20:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; 
h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=P5GgWkN9cl8Xn1q1KDYUKB5PiHhKBc7Ki0Cn957yFXs=; b=oJ4tVASl754dK/CLSKcNRVFAWhc9vCvmLzEoSbEA6PCX4PS0ii59ZFLnK+ynHOUniT /2I2eh/8SaRTF/03xtQIa4sk2GYa5DWHUDC1tVYkFSlXRY+Yu3zf55SRiHRzaYTAI4ul mLvqCkys63jjuaBBz8YSuAbUrR+nX3rTUdw7Pzn1GJotGm2deuKik3OOZPTCxjW0ui5n kuCbx4HRkpAB28PEYY8iNj7q1DFLP7JOPTMnAzj1Cj7uvRjlDY1vxopDDjz/o7lnrzC/ Z1aoqIHEeMgjFXc66W2jws492XoOZFddNUeQoqUhyZhpIeSBeeXF+g49tQFtn7/Q1vA4 N/dQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=P5GgWkN9cl8Xn1q1KDYUKB5PiHhKBc7Ki0Cn957yFXs=; b=lPVVt1Qke9bW0gC0x/J52B5Rf5T391eSqYxZwvjY3OL9W+XFLe9kt/zFQybcjCxnvg pJUHi8IP+uNBGh4toio+v0ifeXF3h78whV0dl/mXfjsPAqu1Y/El+ZLNJ0PUVw41kqUD cjhz+QaBDmcGvo7X3ZUR5c0d5kJz8miH91gunipiQJEFZi1BQRhfegPNsy2P9Q9DI+av rRz0Mi5GnJVLGfdNX2FnitUVze+OoNWKaZDa0eBCbmSpinepStKmBm5WavB0c67ph/iw FOS1U+n3lsM8kDz+izP0xbxbbtPxZkYYBrnFR8f5NWpUWHoFOJ3ohHD67hs2hNj49XL/ pVSA== X-Gm-Message-State: AOAM533tFH//NtGYy8EWzxUoVarK7lzVkxAB/TDmEuaDZvue8swIUEA6 0jMdkOWsuWM65aEeT3WRwX72eR+CnBUtvNGz X-Google-Smtp-Source: ABdhPJxxojgWeDd65u5CQX3vhm2HxqUoGHdSWQarKm7zfMPGdBl4N08SF3hjNpB0B6Y+yCEJP2vu1g== X-Received: by 2002:a05:620a:2a05:b0:649:1a85:9212 with SMTP id o5-20020a05620a2a0500b006491a859212mr17655407qkp.586.1646266856074; Wed, 02 Mar 2022 16:20:56 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id f13-20020ac87f0d000000b002dce143f369sm384826qtk.53.2022.03.02.16.20.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 02 Mar 2022 16:20:55 -0800 (PST) Date: Wed, 2 Mar 2022 19:20:55 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com Subject: [PATCH v3 05/17] pack-mtimes: support writing pack .mtimes files Message-ID: <0600503856dbccb135aaead27693b6815a774b4f.1646266835.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Now that the `.mtimes` format is defined, supplement the pack-write API to be able to conditionally write an `.mtimes` file along with a pack by setting an additional flag and passing an oidmap that contains the timestamps corresponding to each object in the pack. Signed-off-by: Taylor Blau --- pack-objects.c | 6 ++++ pack-objects.h | 25 ++++++++++++++++ pack-write.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++ pack.h | 1 + 4 files changed, 109 insertions(+) diff --git a/pack-objects.c b/pack-objects.c index fe2a4eace9..272e8d4517 100644 --- a/pack-objects.c +++ b/pack-objects.c @@ -170,6 +170,9 @@ struct object_entry *packlist_alloc(struct packing_data *pdata, if (pdata->layer) REALLOC_ARRAY(pdata->layer, pdata->nr_alloc); + + if (pdata->cruft_mtime) + REALLOC_ARRAY(pdata->cruft_mtime, pdata->nr_alloc); } new_entry = pdata->objects + pdata->nr_objects++; @@ -198,6 +201,9 @@ struct object_entry *packlist_alloc(struct packing_data *pdata, if (pdata->layer) pdata->layer[pdata->nr_objects - 1] = 0; + if (pdata->cruft_mtime) + pdata->cruft_mtime[pdata->nr_objects - 1] = 0; + return new_entry; } diff --git a/pack-objects.h b/pack-objects.h index dca2351ef9..393b9db546 100644 --- a/pack-objects.h +++ b/pack-objects.h @@ -168,6 +168,14 @@ struct packing_data { /* delta islands */ unsigned 
int *tree_depth; unsigned char *layer; + + /* + * Used when writing cruft packs. + * + * Object mtimes are stored in pack order when writing, but + * written out in lexicographic (index) order. + */ + uint32_t *cruft_mtime; }; void prepare_packing_data(struct repository *r, struct packing_data *pdata); @@ -289,4 +297,21 @@ static inline void oe_set_layer(struct packing_data *pack, pack->layer[e - pack->objects] = layer; } +static inline uint32_t oe_cruft_mtime(struct packing_data *pack, + struct object_entry *e) +{ + if (!pack->cruft_mtime) + return 0; + return pack->cruft_mtime[e - pack->objects]; +} + +static inline void oe_set_cruft_mtime(struct packing_data *pack, + struct object_entry *e, + uint32_t mtime) +{ + if (!pack->cruft_mtime) + CALLOC_ARRAY(pack->cruft_mtime, pack->nr_alloc); + pack->cruft_mtime[e - pack->objects] = mtime; +} + #endif diff --git a/pack-write.c b/pack-write.c index ff305b404c..270280c4df 100644 --- a/pack-write.c +++ b/pack-write.c @@ -3,6 +3,10 @@ #include "csum-file.h" #include "remote.h" #include "chunk-format.h" +#include "pack-mtimes.h" +#include "oidmap.h" +#include "chunk-format.h" +#include "pack-objects.h" void reset_pack_idx_option(struct pack_idx_option *opts) { @@ -276,6 +280,70 @@ const char *write_rev_file_order(const char *rev_name, return rev_name; } +static void write_mtimes_header(struct hashfile *f) +{ + hashwrite_be32(f, MTIMES_SIGNATURE); + hashwrite_be32(f, MTIMES_VERSION); + hashwrite_be32(f, oid_version(the_hash_algo)); +} + +/* + * Writes the object mtimes of "objects" for use in a .mtimes file. + * Note that objects must be in lexicographic (index) order, which is + * the expected ordering of these values in the .mtimes file. 
+ */ +static void write_mtimes_objects(struct hashfile *f, + struct packing_data *to_pack, + struct pack_idx_entry **objects, + uint32_t nr_objects) +{ + uint32_t i; + for (i = 0; i < nr_objects; i++) { + struct object_entry *e = (struct object_entry*)objects[i]; + hashwrite_be32(f, oe_cruft_mtime(to_pack, e)); + } +} + +static void write_mtimes_trailer(struct hashfile *f, const unsigned char *hash) +{ + hashwrite(f, hash, the_hash_algo->rawsz); +} + +static const char *write_mtimes_file(const char *mtimes_name, + struct packing_data *to_pack, + struct pack_idx_entry **objects, + uint32_t nr_objects, + const unsigned char *hash) +{ + struct hashfile *f; + int fd; + + if (!to_pack) + BUG("cannot call write_mtimes_file with NULL packing_data"); + + if (!mtimes_name) { + struct strbuf tmp_file = STRBUF_INIT; + fd = odb_mkstemp(&tmp_file, "pack/tmp_mtimes_XXXXXX"); + mtimes_name = strbuf_detach(&tmp_file, NULL); + } else { + unlink(mtimes_name); + fd = xopen(mtimes_name, O_CREAT|O_EXCL|O_WRONLY, 0600); + } + f = hashfd(fd, mtimes_name); + + write_mtimes_header(f); + write_mtimes_objects(f, to_pack, objects, nr_objects); + write_mtimes_trailer(f, hash); + + if (adjust_shared_perm(mtimes_name) < 0) + die(_("failed to make %s readable"), mtimes_name); + + finalize_hashfile(f, NULL, + CSUM_HASH_IN_STREAM | CSUM_CLOSE | CSUM_FSYNC); + + return mtimes_name; +} + off_t write_pack_header(struct hashfile *f, uint32_t nr_entries) { struct pack_header hdr; @@ -478,6 +546,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer, char **idx_tmp_name) { const char *rev_tmp_name = NULL; + const char *mtimes_tmp_name = NULL; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); @@ -490,9 +559,17 @@ void stage_tmp_packfiles(struct strbuf *name_buffer, rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, pack_idx_opts->flags); + if (pack_idx_opts->flags & WRITE_MTIMES) { + mtimes_tmp_name = write_mtimes_file(NULL, to_pack, 
written_list, + nr_written, + hash); + } + rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); if (rev_tmp_name) rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); + if (mtimes_tmp_name) + rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes"); } void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought) diff --git a/pack.h b/pack.h index fd27cfdfd7..01d385903a 100644 --- a/pack.h +++ b/pack.h @@ -44,6 +44,7 @@ struct pack_idx_option { #define WRITE_IDX_STRICT 02 #define WRITE_REV 04 #define WRITE_REV_VERIFY 010 +#define WRITE_MTIMES 020 uint32_t version; uint32_t off32_limit; From patchwork Thu Mar 3 00:20:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12766908 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0115C433EF for ; Thu, 3 Mar 2022 00:21:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230253AbiCCAVs (ORCPT ); Wed, 2 Mar 2022 19:21:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37830 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230225AbiCCAVo (ORCPT ); Wed, 2 Mar 2022 19:21:44 -0500 Received: from mail-il1-x12f.google.com (mail-il1-x12f.google.com [IPv6:2607:f8b0:4864:20::12f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4EEC45FF35 for ; Wed, 2 Mar 2022 16:20:59 -0800 (PST) Received: by mail-il1-x12f.google.com with SMTP id w4so2774254ilj.5 for ; Wed, 02 Mar 2022 16:20:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; 
bh=rzAK6KV5383vhhOfHxdpZCKzzhHayX+JyBviHEbbJ8c=; b=Gfth5qPvMQ8o3/aEZaXGc+IsbFXjzxePEXAHP1UZYgQNU4+0cX3Ubjs1g/xLAA5JLI d0iDLVbDBO5+IJ+9kgvMnEJuVfZ93WCyUMeJiOvqlqvh2EgnyY2CoGIooEf2GhgVI2Vj lVbBKRhfNkgcMmk4EUiVkVCS6ANHSp/8hpiYQy7k+I7REshQOLTXbFMXuFihy1qigzuI V+IbzwkgC6hmmGHszBMT8mWN26jDjd4FjoRUVjN/hhi1qRvhK44gJSiEduv7A1nrxhwX CBcsrJzyorVYNOxqRRSdHi/pUgCybvywFWhm06PxZM0uQW/IgJh05vl69GkQadlyHXIc su4Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=rzAK6KV5383vhhOfHxdpZCKzzhHayX+JyBviHEbbJ8c=; b=2TYZQtfUmIh7pUjHO14av7UlT0rdfPqH/3Wk7C6QOBNhnOxpBquqLXutYpR1cTY08I KiQjNkTLaY8JL+uBQCNsaseA5y3hHpsTgXSz6Tg9ZkqLwrTyfHwdSBq6XNj8MAnJ65F1 Uh2H+PTvkob1fCkQoX9BKmbsxFp/O7eN7BXDuO93UR06QqTEaHbOEgsIxduJlPNLzy31 YvAGjBJxcVR+ze0+FUukAjVtLO9LEqf5/PIdPj+phVXY2j5LONn5Y+x4o7PpJkquT/rS COLc92fxF2vI9LT2iTU+LxwwlAOrv3y0tyNt0Po1WUIK0Rp3cM3cIl9psiz/uBFGzvSW xnPw== X-Gm-Message-State: AOAM5329pCCwpbfo7DaHTEvtKaqoQzpsHwzuP3YTT1v2B9E3BqfBjWtu uAAzjiGWRBUNH7cnGzjBz9udYx877X9UrD9L X-Google-Smtp-Source: ABdhPJxbRzczP9BmycJLe+33+d6jEGwjkqg1UUi1Egld1ijs5KJ6nQ903B4TSgnr1PdUAszHG9NZ3w== X-Received: by 2002:a05:6e02:b2d:b0:2be:46c3:aa74 with SMTP id e13-20020a056e020b2d00b002be46c3aa74mr29974549ilu.217.1646266858488; Wed, 02 Mar 2022 16:20:58 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id d15-20020a056e02214f00b002bc80da7cc6sm367771ilv.72.2022.03.02.16.20.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 02 Mar 2022 16:20:58 -0800 (PST) Date: Wed, 2 Mar 2022 19:20:57 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com Subject: [PATCH v3 06/17] t/helper: add 'pack-mtimes' test-tool Message-ID: <4780c8437bd2dcdf2c038d84160a4c575e92e58d.1646266835.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In the next patch, we will implement and test support for writing a cruft pack via a special mode of `git pack-objects`. To make sure that objects are written with the correct timestamps, add a new test-tool that can dump the object names and corresponding timestamps from a given `.mtimes` file. Signed-off-by: Taylor Blau --- Makefile | 1 + t/helper/test-pack-mtimes.c | 56 +++++++++++++++++++++++++++++++++++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + 4 files changed, 59 insertions(+) create mode 100644 t/helper/test-pack-mtimes.c diff --git a/Makefile b/Makefile index 1b186f4fd7..5c0ed1ade7 100644 --- a/Makefile +++ b/Makefile @@ -727,6 +727,7 @@ TEST_BUILTINS_OBJS += test-oid-array.o TEST_BUILTINS_OBJS += test-oidmap.o TEST_BUILTINS_OBJS += test-oidtree.o TEST_BUILTINS_OBJS += test-online-cpus.o +TEST_BUILTINS_OBJS += test-pack-mtimes.o TEST_BUILTINS_OBJS += test-parse-options.o TEST_BUILTINS_OBJS += test-parse-pathspec-file.o TEST_BUILTINS_OBJS += test-partial-clone.o diff --git a/t/helper/test-pack-mtimes.c b/t/helper/test-pack-mtimes.c new file mode 100644 index 0000000000..f7b79daf4c --- /dev/null +++ b/t/helper/test-pack-mtimes.c @@ -0,0 +1,56 @@ +#include "git-compat-util.h" +#include "test-tool.h" +#include "strbuf.h" +#include "object-store.h" +#include "packfile.h" +#include
"pack-mtimes.h" + +static void dump_mtimes(struct packed_git *p) +{ + uint32_t i; + if (load_pack_mtimes(p) < 0) + die("could not load pack .mtimes"); + + for (i = 0; i < p->num_objects; i++) { + struct object_id oid; + if (nth_packed_object_id(&oid, p, i) < 0) + die("could not load object id at position %"PRIu32, i); + + printf("%s %"PRIu32"\n", + oid_to_hex(&oid), nth_packed_mtime(p, i)); + } +} + +static const char *pack_mtimes_usage = "\n" +" test-tool pack-mtimes "; + +int cmd__pack_mtimes(int argc, const char **argv) +{ + struct strbuf buf = STRBUF_INIT; + struct packed_git *p; + + setup_git_directory(); + + if (argc != 2) + usage(pack_mtimes_usage); + + for (p = get_all_packs(the_repository); p; p = p->next) { + strbuf_addstr(&buf, basename(p->pack_name)); + strbuf_strip_suffix(&buf, ".pack"); + strbuf_addstr(&buf, ".mtimes"); + + if (!strcmp(buf.buf, argv[1])) + break; + + strbuf_reset(&buf); + } + + strbuf_release(&buf); + + if (!p) + die("could not find pack '%s'", argv[1]); + + dump_mtimes(p); + + return 0; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index e6ec69cf32..7d472b31fd 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -47,6 +47,7 @@ static struct test_cmd cmds[] = { { "oidmap", cmd__oidmap }, { "oidtree", cmd__oidtree }, { "online-cpus", cmd__online_cpus }, + { "pack-mtimes", cmd__pack_mtimes }, { "parse-options", cmd__parse_options }, { "parse-pathspec-file", cmd__parse_pathspec_file }, { "partial-clone", cmd__partial_clone }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index 20756eefdd..0ac4f32955 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -37,6 +37,7 @@ int cmd__mktemp(int argc, const char **argv); int cmd__oidmap(int argc, const char **argv); int cmd__oidtree(int argc, const char **argv); int cmd__online_cpus(int argc, const char **argv); +int cmd__pack_mtimes(int argc, const char **argv); int cmd__parse_options(int argc, const char **argv); int cmd__parse_pathspec_file(int 
argc, const char** argv); int cmd__partial_clone(int argc, const char **argv); From patchwork Thu Mar 3 00:21:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12766909 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0F70FC433F5 for ; Thu, 3 Mar 2022 00:21:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230280AbiCCAV6 (ORCPT ); Wed, 2 Mar 2022 19:21:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38096 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230250AbiCCAVr (ORCPT ); Wed, 2 Mar 2022 19:21:47 -0500 Received: from mail-qt1-x829.google.com (mail-qt1-x829.google.com [IPv6:2607:f8b0:4864:20::829]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D99C108749 for ; Wed, 2 Mar 2022 16:21:02 -0800 (PST) Received: by mail-qt1-x829.google.com with SMTP id b23so3272499qtt.6 for ; Wed, 02 Mar 2022 16:21:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=CIKN/CB83u6sQ1/KE0vEIeyTEwD5VnDa2rxArL8gDJ0=; b=ibV0CIchzVnBUieKpmpQcvk16m4H+GsHXI3qGRkXjCq3FM5ix0LnMZKUxmmWADhHjH IbGmWRc85aTwGMzzzUsn4zly8UUekdARDjUo3Wp+CanRI++5KaOVGTpo8K/woXzTgRbx 9HzHk7C7cLhC2wjVH3b4AhD5Wdwh2ZE4TETMs3q1rGoSurOhkkTMuvPzeNkRhhXiJbsX 3OvP184qISTu8XCLUazv5l3LZdYRl032S8CPh6kYaSTmhl65NedCZu4yqKxr3O0zwDs8 6dITAew9jv+ziM/dBNzDoletSnGWqNry6FgNSqaxZtF0mGwP+VZRrEUC+P3TzN9jhPCi Sdpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references 
:mime-version:content-disposition:in-reply-to; bh=CIKN/CB83u6sQ1/KE0vEIeyTEwD5VnDa2rxArL8gDJ0=; b=NihEO8+je1Ar90MAFZr0Dba5JOHqAvdesQmQDUjpj/omiZUeDCYlwgdWh6j4L+HfAO 8F0JJIyCeOXM2T+4c9zp+aSGiRQZHnIGWkAjqfsy0ECq/E6NbApsUMCYsXKEu6VlZKJN REJaBOSM7JnVouFkvqGYlVWX7S3CSbDD2TdhWv+2QtDCi+QJIZMJTnKGyvFWfCZ8qFhN 8OZD1Lr58dk6ptOyo3/+IwAiL3E9h/w6sIBjRtAxy7/wU/ZD/ZY3x/Z9GcgVzAPn++Js W8nkqoWGwBto/RMBClIifKMVEj2AqXIpavO1BmXjQMFzox9SlTlfqcop1cs1yFo+fgf+ 3fng== X-Gm-Message-State: AOAM530FXHg9vw1TK+G2fOGMh7U1IiMAehyktdFkTcsuMXFqTmi7y/us v/XXs9tDlcJ7ZxE3aWBcvRrUYXgNpmTCGITg X-Google-Smtp-Source: ABdhPJyNZyW3J5IvvPcbKxFoBq/PO+XDqCCP0JD7Y3BOcXh6pez+es6jj8zNtnci2POJ/bfQza1H2w== X-Received: by 2002:a05:622a:1394:b0:2de:6af2:406e with SMTP id o20-20020a05622a139400b002de6af2406emr26081959qtk.527.1646266861036; Wed, 02 Mar 2022 16:21:01 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id t9-20020a05622a148900b002de2dfd0ee2sm359139qtx.70.2022.03.02.16.21.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 02 Mar 2022 16:21:00 -0800 (PST) Date: Wed, 2 Mar 2022 19:21:00 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com Subject: [PATCH v3 07/17] builtin/pack-objects.c: return from create_object_entry() Message-ID: <33862a07c927184a40ccbfe5182404923a392c4a.1646266835.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A new caller in the next commit will want to immediately modify the object_entry structure created by create_object_entry(). Instead of forcing that caller to wastefully look up the entry we just created, return it from create_object_entry().
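The pattern described here, returning the freshly created entry instead of making the caller search for it, might look like the following outside of Git. This is only an illustrative sketch: `struct entry`, `struct entry_list` and `entry_list_add()` are hypothetical names and are much simpler than `struct object_entry` and `create_object_entry()`.

```c
#include <stdlib.h>

/* Hypothetical, simplified entry; not Git's struct object_entry. */
struct entry {
	unsigned id;
	unsigned mtime;
};

/* Growable array of entries (assumed layout, for illustration only). */
struct entry_list {
	struct entry *items;
	size_t nr, alloc;
};

/*
 * Append an entry and hand it back, so the caller can modify it
 * right away without a second lookup.
 */
struct entry *entry_list_add(struct entry_list *list, unsigned id)
{
	struct entry *e;

	if (list->nr == list->alloc) {
		size_t new_alloc = list->alloc ? 2 * list->alloc : 8;
		struct entry *grown = realloc(list->items,
					      new_alloc * sizeof(*grown));
		if (!grown)
			abort();
		list->items = grown;
		list->alloc = new_alloc;
	}

	e = &list->items[list->nr++];
	e->id = id;
	e->mtime = 0;
	return e;
}
```

In this sketch the returned pointer is only valid until the next append that grows the array, so a caller should finish modifying the entry before adding another one.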
Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 385970cb7b..3f08a3c63a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1508,13 +1508,13 @@ static int want_object_in_pack(const struct object_id *oid, return 1; } -static void create_object_entry(const struct object_id *oid, - enum object_type type, - uint32_t hash, - int exclude, - int no_try_delta, - struct packed_git *found_pack, - off_t found_offset) +static struct object_entry *create_object_entry(const struct object_id *oid, + enum object_type type, + uint32_t hash, + int exclude, + int no_try_delta, + struct packed_git *found_pack, + off_t found_offset) { struct object_entry *entry; @@ -1531,6 +1531,8 @@ static void create_object_entry(const struct object_id *oid, } entry->no_try_delta = no_try_delta; + + return entry; } static const char no_closure_warning[] = N_( From patchwork Thu Mar 3 00:21:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12766910 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2CB0C433FE for ; Thu, 3 Mar 2022 00:21:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230304AbiCCAV7 (ORCPT ); Wed, 2 Mar 2022 19:21:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230283AbiCCAV5 (ORCPT ); Wed, 2 Mar 2022 19:21:57 -0500 Received: from mail-io1-xd2d.google.com (mail-io1-xd2d.google.com [IPv6:2607:f8b0:4864:20::d2d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7BF945FF35 for ; Wed, 2 Mar 2022 
16:21:04 -0800 (PST) Received: by mail-io1-xd2d.google.com with SMTP id r7so3981281iot.3 for ; Wed, 02 Mar 2022 16:21:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=BAmuCwBfieKO0j3OoOkc1z2O5C5diC1FfxP5KsP0W38=; b=YCE1aC64cSoswN/5v/HQsOhrpH/U7+n4ZiniiwB3S53oSZpDQqUm3BjG733W57ktTv rOTl2b35zS/2UFcDNHjUKMcYQSbxSwSkrhkv996jFl3wlH54BZo+Rm0RCX5xXvRFJAQo bYIDfR/xjzfkBQNk829x6e14ehQW4zBaTW3p2mhS5gHQV63qg7cOc1sOVNlhyRMl2hlb V/vvCTOEr8O7Yb/tfwzuTU4WaQIKPWOK9+YT16TZPb59EZ0K7VXOhL9Ij8Y/UTeMsmNZ HTDdC+X16auD62xa9IWK1Q+whGyYDJJeVEZIACJjQmWTqCBWmJcsr6TeFn1J3N2DUobC dxHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=BAmuCwBfieKO0j3OoOkc1z2O5C5diC1FfxP5KsP0W38=; b=t8x/dlb+3DOIDDWQU+EvBxWUoOBVTdEf9R7kORVT5wAw/jmL0Vy8tDjvQj46TI8EiB rLEvqlQ6Mo1cn5kd42h7/a6bv3tyO+y9WUCtTjEHscWNp9Lic9dFTW7MW8rib+VNOjAO sh2B9ZA1l+MDRwUhUxlcI3TXqGeE4jF7+Y6td6KJtK5ajEv1b45cF3flsAGtFrlmxlAM MXDcm1jDFj9TbGsIcYXhbvcndgeYFZvZA+s5S4wimH8bTlXQjOyt3uJtRlS5HED6tnX2 ufF2cP/rnX3KI4ph9TI1cXcS6OD4dde/U7fkkjD3/TcMAFigMZBNqoppSbAJxFDoqaCw 19aw== X-Gm-Message-State: AOAM532I/PaDMfvartV2xuFlsFSNcxyoOoLGZ7V1xKbJ6bWF5JYerA3x KTb+qwhyPcxeD1psoui5uyzRkmiCNMK2Mmxr X-Google-Smtp-Source: ABdhPJzXOi9DkwOkfmda26oxA9FsAhRmHdOSOB6DpJzloXRf+KsO9VCT5Fs0UmyuP9yRc5qJEOYofw== X-Received: by 2002:a05:6638:2688:b0:314:e214:d996 with SMTP id o8-20020a056638268800b00314e214d996mr26550091jat.167.1646266863470; Wed, 02 Mar 2022 16:21:03 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id b18-20020a92c852000000b002bf7b6b3041sm308951ilq.75.2022.03.02.16.21.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 02 Mar 2022 16:21:03 -0800 (PST) Date: Wed, 2 Mar 2022 19:21:02 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com Subject: [PATCH v3 08/17] builtin/pack-objects.c: --cruft without expiration Message-ID: <22705e4887b5c9e3d7ef9ff1eadaabeeac0d57da.1646266835.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Teach `pack-objects` how to generate a cruft pack when no objects are dropped (i.e., `--cruft-expiration=never`). Later patches will teach `pack-objects` how to generate a cruft pack that prunes objects. When generating a cruft pack which does not prune objects, we want to collect all unreachable objects into a single pack (noting and updating their mtimes as we accumulate them). Ordinary use will pass the result of a `git repack -A` as a kept pack, so when this patch says "kept pack", readers should think "reachable objects". Generating a non-expiring cruft pack works as follows: - Callers provide a list of every pack they know about, and indicate which packs are about to be removed. - All packs which are going to be removed (we'll call these the redundant ones) are marked as kept in-core. Any packs the caller did not mention (but are known to the `pack-objects` process) are also marked as kept in-core. Packs not mentioned by the caller are assumed to be unknown to them, i.e., they entered the repository after the caller decided which packs should be kept and which should be discarded. Since we do not want to include objects in these "unknown" packs (because we don't know which of their objects are or aren't reachable), these are also marked as kept in-core.
- Then, we enumerate all objects in the repository, and add them to our packing list if they do not appear in an in-core kept pack. This results in a new cruft pack which contains all known objects that aren't included in the kept packs. When the kept pack is the result of `git repack -A`, the resulting pack contains all unreachable objects. Signed-off-by: Taylor Blau --- Documentation/git-pack-objects.txt | 30 ++++ builtin/pack-objects.c | 201 +++++++++++++++++++++++++- object-file.c | 2 +- object-store.h | 2 + t/t5329-pack-objects-cruft.sh | 218 +++++++++++++++++++++++++++++ 5 files changed, 448 insertions(+), 5 deletions(-) create mode 100755 t/t5329-pack-objects-cruft.sh diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index f8344e1e5b..a9995a932c 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -13,6 +13,7 @@ SYNOPSIS [--no-reuse-delta] [--delta-base-offset] [--non-empty] [--local] [--incremental] [--window=] [--depth=] [--revs [--unpacked | --all]] [--keep-pack=] + [--cruft] [--cruft-expiration=