From patchwork Wed Jan 27 15:01:39 2021
X-Patchwork-Submitter: Philippe Blain via GitGitGadget
X-Patchwork-Id: 12050321
From: "Derrick Stolee via GitGitGadget"
Date: Wed, 27 Jan 2021 15:01:39 +0000
Subject: [PATCH v2 00/17] Refactor chunk-format into an API
To: git@vger.kernel.org
Cc: me@ttaylorr.com, gitster@pobox.com, l.s.r@web.de, szeder.dev@gmail.com, Chris Torek, Derrick Stolee, Derrick Stolee
X-Mailing-List: git@vger.kernel.org

This is a restart on the topic previously submitted [1] but dropped because ak/corrected-commit-date was still in progress.
This version is based on that branch.

[1] https://lore.kernel.org/git/pull.804.git.1607012215.gitgitgadget@gmail.com/

This version also changes the approach to use a more dynamic interaction
with a struct chunkfile pointer. This idea is credited to Taylor Blau [2],
but I started again from scratch. I also go further to make struct
chunkfile anonymous to API consumers. It is defined only in
chunk-format.c, which should hopefully deter future users from
interacting with that data directly.

[2] https://lore.kernel.org/git/X8%2FI%2FRzXZksio+ri@nand.local/

This combined API reduces duplicated logic. More importantly, it ensures
that similar file formats have similar protections against bad data. The
multi-pack-index code did not have as many guards as the commit-graph
code did, but now they both share a common base that checks for things
like duplicate chunks or offsets outside the size of the file.

Here are some stats for the end-to-end change:

 * 570 insertions(+), 456 deletions(-).
 * commit-graph.c: 107 insertions(+), 192 deletions(-)
 * midx.c: 164 insertions(+), 260 deletions(-)

While there is an overall increase to the code size, the consumers do get
smaller. Boilerplate, such as adapting methods to match chunk_write_fn and
chunk_read_fn, makes up a lot of these insertions. The "interesting" code
gets a lot smaller and cleaner.

Updates in V2
=============

 * The method pair_chunk() now automatically sets a pointer while
   read_chunk() uses the callback. This greatly reduces the code size.

 * Pointer casts are now implicit instead of explicit.

 * Extra care is taken to not overflow when verifying chunk sizes on
   write.
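To make the difference between the two read paths concrete, here is a minimal standalone sketch. It is not the code from chunk-format.c: the toy_* names, the struct layouts, and the in-memory chunk table are assumptions made for illustration. It only mirrors the shape described above, where one lookup assigns a pointer and the other runs a callback on the chunk start and size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real (opaque) struct chunkfile. */
#define TOY_CHUNK_NOT_FOUND (-2)

struct toy_chunk {
	uint32_t id;
	const unsigned char *start;
	size_t size;
};

struct toy_chunkfile {
	const struct toy_chunk *chunks;
	size_t nr;
};

typedef int (*toy_read_fn)(const unsigned char *start, size_t size,
			   void *data);

/* Pointer path: only the chunk position is needed. */
static int toy_pair_chunk(const struct toy_chunkfile *cf, uint32_t id,
			  const unsigned char **p)
{
	size_t i;

	for (i = 0; i < cf->nr; i++) {
		if (cf->chunks[i].id == id) {
			*p = cf->chunks[i].start;
			return 0;
		}
	}
	return TOY_CHUNK_NOT_FOUND;
}

/* Callback path: the consumer also wants to inspect the chunk size. */
static int toy_read_chunk(const struct toy_chunkfile *cf, uint32_t id,
			  toy_read_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < cf->nr; i++) {
		if (cf->chunks[i].id == id)
			return fn(cf->chunks[i].start,
				  cf->chunks[i].size, data);
	}
	return TOY_CHUNK_NOT_FOUND;
}

/* Hypothetical consumer state, loosely modeled on an OID lookup chunk. */
struct toy_index {
	const unsigned char *oids;
	size_t nr;
	size_t hash_len;
};

static int toy_read_oids(const unsigned char *start, size_t size, void *data)
{
	struct toy_index *idx = data; /* implicit conversion from void * */

	idx->oids = start;
	idx->nr = size / idx->hash_len;
	return 0;
}
```

A consumer that only stores the chunk position would call toy_pair_chunk(), while one that derives a count from the chunk size would pass a callback such as toy_read_oids() to toy_read_chunk(). Both return TOY_CHUNK_NOT_FOUND for a missing chunk, so required chunks can be checked at the call site.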
Thanks,
-Stolee

Derrick Stolee (17):
  commit-graph: anonymize data in chunk_write_fn
  chunk-format: create chunk format write API
  commit-graph: use chunk-format write API
  midx: rename pack_info to write_midx_context
  midx: use context in write_midx_pack_names()
  midx: add entries to write_midx_context
  midx: add pack_perm to write_midx_context
  midx: add num_large_offsets to write_midx_context
  midx: return success/failure in chunk write methods
  midx: drop chunk progress during write
  midx: use chunk-format API in write_midx_internal()
  chunk-format: create read chunk API
  commit-graph: use chunk-format read API
  midx: use chunk-format read API
  midx: use 64-bit multiplication for chunk sizes
  chunk-format: restore duplicate chunk checks
  chunk-format: add technical docs

 Documentation/technical/chunk-format.txt      |  54 +++
 .../technical/commit-graph-format.txt         |   3 +
 Documentation/technical/pack-format.txt       |   3 +
 Makefile                                      |   1 +
 chunk-format.c                                | 181 ++++++++
 chunk-format.h                                |  53 +++
 commit-graph.c                                | 299 +++++-------
 midx.c                                        | 424 +++++++-----------
 t/t5318-commit-graph.sh                       |   2 +-
 t/t5319-multi-pack-index.sh                   |   6 +-
 10 files changed, 570 insertions(+), 456 deletions(-)
 create mode 100644 Documentation/technical/chunk-format.txt
 create mode 100644 chunk-format.c
 create mode 100644 chunk-format.h


base-commit: 5a3b130cad0d5c770f766e3af6d32b41766374c0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-848%2Fderrickstolee%2Fchunk-format%2Frefactor-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-848/derrickstolee/chunk-format/refactor-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/848

Range-diff vs v1:

  1:  09b32829e4f !  1:  243dcec9436 commit-graph: anonymize data in chunk_write_fn
    @@ commit-graph.c: struct write_commit_graph_context {
     -                                  struct write_commit_graph_context *ctx)
     +                                  void *data)
      {
    -+        struct write_commit_graph_context *ctx =
    -+                (struct write_commit_graph_context *)data;
    ++        struct write_commit_graph_context *ctx = data;
              int i, count = 0;
              struct commit **list = ctx->commits.list;

    @@ commit-graph.c: static int write_graph_chunk_fanout(struct hashfile *f,
     -                                  struct write_commit_graph_context *ctx)
     +                                  void *data)
      {
    -+        struct write_commit_graph_context *ctx =
    -+                (struct write_commit_graph_context *)data;
    ++        struct write_commit_graph_context *ctx = data;
              struct commit **list = ctx->commits.list;
              int count;
              for (count = 0; count < ctx->commits.nr; count++, list++) {
    @@ commit-graph.c: static const unsigned char *commit_to_sha1(size_t index, void *t
     -                                  struct write_commit_graph_context *ctx)
     +                                  void *data)
      {
    -+        struct write_commit_graph_context *ctx =
    -+                (struct write_commit_graph_context *)data;
    ++        struct write_commit_graph_context *ctx = data;
              struct commit **list = ctx->commits.list;
              struct commit **last = ctx->commits.list + ctx->commits.nr;
              uint32_t num_extra_edges = 0;
    @@ commit-graph.c: static int write_graph_chunk_data(struct hashfile *f,
     -                                  struct write_commit_graph_context *ctx)
     +                                  void *data)
      {
    -+        struct write_commit_graph_context *ctx =
    -+                (struct write_commit_graph_context *)data;
    ++        struct write_commit_graph_context *ctx = data;
              int i, num_generation_data_overflows = 0;

              for (i = 0; i < ctx->commits.nr; i++) {
    @@ commit-graph.c: static int write_graph_chunk_generation_data(struct hashfile *f,
     -                                  struct write_commit_graph_context *ctx)
     +                                  void *data)
      {
    -+        struct write_commit_graph_context *ctx =
    -+                (struct write_commit_graph_context *)data;
    ++        struct write_commit_graph_context *ctx = data;
              int i;
              for (i = 0; i < ctx->commits.nr; i++) {
                      struct commit *c = ctx->commits.list[i];
    @@ commit-graph.c: static int write_graph_chunk_generation_data_overflow(struct has
     -                                  struct write_commit_graph_context *ctx)
     +                                  void *data)
      {
    -+        struct write_commit_graph_context *ctx =
    -+                (struct write_commit_graph_context *)data;
    ++        struct write_commit_graph_context *ctx = data;
              struct commit **list = ctx->commits.list;
              struct commit **last = ctx->commits.list + ctx->commits.nr;
              struct commit_list *parent;
    @@ commit-graph.c: static int write_graph_chunk_extra_edges(struct hashfile *f,
     -                                  struct write_commit_graph_context *ctx)
     +                                  void *data)
      {
    -+        struct write_commit_graph_context *ctx =
    -+                (struct write_commit_graph_context *)data;
    ++        struct write_commit_graph_context *ctx = data;
              struct commit **list = ctx->commits.list;
              struct commit **last = ctx->commits.list + ctx->commits.nr;
              uint32_t cur_pos = 0;
    @@ commit-graph.c: static void trace2_bloom_filter_settings(struct write_commit_gra
     -                                  struct write_commit_graph_context *ctx)
     +                                  void *data)
      {
    -+        struct write_commit_graph_context *ctx =
    -+                (struct write_commit_graph_context *)data;
    ++        struct write_commit_graph_context *ctx = data;
              struct commit **list = ctx->commits.list;
              struct commit **last = ctx->commits.list + ctx->commits.nr;

    @@ commit-graph.c: static int write_graph_chunk_base_1(struct hashfile *f,
     -                                  struct write_commit_graph_context *ctx)
     +                                  void *data)
      {
    -+        struct write_commit_graph_context *ctx =
    -+                (struct write_commit_graph_context *)data;
    ++        struct write_commit_graph_context *ctx = data;
              int num = write_graph_chunk_base_1(f, ctx->new_base_graph);

              if (num != ctx->num_commit_graphs_after - 1) {
  2:  9bd273f8c94 !  2:  814512f2167 chunk-format: create chunk format write API
    @@ chunk-format.c (new)
     +                if (result)
     +                        return result;
     +
    -+                if (cf->f->total + cf->f->offset != start_offset + cf->chunks[i].size)
    ++                if (cf->f->total + cf->f->offset - start_offset != cf->chunks[i].size)
     +                        BUG("expected to write %"PRId64" bytes to chunk %"PRIx32", but wrote %"PRId64" instead",
     +                            cf->chunks[i].size, cf->chunks[i].id,
     +                            cf->f->total + cf->f->offset - start_offset);
  3:  a3d6177a352 =  3:  70af6e3083f commit-graph: use chunk-format write API
  4:  9fe5ee8611c !  4:  0cac7890bed midx: rename pack_info to write_midx_context
    @@ midx.c: struct pack_list {
                                   const char *file_name, void *data)
      {
     -        struct pack_list *packs = (struct pack_list *)data;
    -+        struct write_midx_context *ctx = (struct write_midx_context *)data;
    ++        struct write_midx_context *ctx = data;

              if (ends_with(file_name, ".idx")) {
     -                display_progress(packs->progress, ++packs->pack_paths_checked);
  5:  14a0246b982 !  5:  4a4e90b129a midx: use context in write_midx_pack_names()
    @@ midx.c: static struct pack_midx_entry *get_sorted_entries(struct multi_pack_inde
     -                                    uint32_t num_packs)
     +static size_t write_midx_pack_names(struct hashfile *f, void *data)
      {
    -+        struct write_midx_context *ctx = (struct write_midx_context *)data;
    ++        struct write_midx_context *ctx = data;
              uint32_t i;
              unsigned char padding[MIDX_CHUNK_ALIGNMENT];
              size_t written = 0;
  6:  79f479ef7d1 !  6:  30ad423997b midx: add entries to write_midx_context
    @@ midx.c: static size_t write_midx_pack_names(struct hashfile *f, void *data)
      {
     -        struct pack_midx_entry *list = objects;
     -        struct pack_midx_entry *last = objects + nr_objects;
    -+        struct write_midx_context *ctx = (struct write_midx_context *)data;
    ++        struct write_midx_context *ctx = data;
     +        struct pack_midx_entry *list = ctx->entries;
     +        struct pack_midx_entry *last = ctx->entries + ctx->entries_nr;
              uint32_t count = 0;
    @@ midx.c: static size_t write_midx_oid_fanout(struct hashfile *f,
     +                                    void *data)
      {
     -        struct pack_midx_entry *list = objects;
    -+        struct write_midx_context *ctx = (struct write_midx_context *)data;
    ++        struct write_midx_context *ctx = data;
     +        unsigned char hash_len = the_hash_algo->rawsz;
     +        struct pack_midx_entry *list = ctx->entries;
              uint32_t i;
  7:  0b4ce3f1732 !  7:  2f1c496f3ab midx: add pack_perm to write_midx_context
    @@ midx.c: static size_t write_midx_oid_lookup(struct hashfile *f,
     +                                        void *data)
      {
     -        struct pack_midx_entry *list = objects;
    -+        struct write_midx_context *ctx = (struct write_midx_context *)data;
    ++        struct write_midx_context *ctx = data;
     +        struct pack_midx_entry *list = ctx->entries;
              uint32_t i, nr_large_offset = 0;
              size_t written = 0;
  8:  eabc7b73647 !  8:  c4939548e51 midx: add num_large_offsets to write_midx_context
    @@ midx.c: static size_t write_midx_object_offsets(struct hashfile *f,
     +                                       void *data)
      {
     -        struct pack_midx_entry *list = objects, *end = objects + nr_objects;
    -+        struct write_midx_context *ctx = (struct write_midx_context *)data;
    ++        struct write_midx_context *ctx = data;
     +        struct pack_midx_entry *list = ctx->entries;
     +        struct pack_midx_entry *end = ctx->entries + ctx->entries_nr;
              size_t written = 0;
  9:  909ca28e0ba !  9:  b3cc73c2256 midx: return success/failure in chunk write methods
    @@ midx.c: static struct pack_midx_entry *get_sorted_entries(struct multi_pack_inde
     -static size_t write_midx_pack_names(struct hashfile *f, void *data)
     +static int write_midx_pack_names(struct hashfile *f, void *data)
      {
    -         struct write_midx_context *ctx = (struct write_midx_context *)data;
    +         struct write_midx_context *ctx = data;
              uint32_t i;
    @@ midx.c: static size_t write_midx_pack_names(struct hashfile *f, void *data)
              if (i < MIDX_CHUNK_ALIGNMENT) {
    @@ midx.c: static size_t write_midx_pack_names(struct hashfile *f, void *data)
     +static int write_midx_oid_fanout(struct hashfile *f,
     +                                 void *data)
      {
    -         struct write_midx_context *ctx = (struct write_midx_context *)data;
    +         struct write_midx_context *ctx = data;
              struct pack_midx_entry *list = ctx->entries;
    @@ midx.c: static size_t write_midx_oid_fanout(struct hashfile *f,
                      list = next;
    @@ midx.c: static size_t write_midx_oid_fanout(struct hashfile *f,
     +static int write_midx_oid_lookup(struct hashfile *f,
     +                                 void *data)
      {
    -         struct write_midx_context *ctx = (struct write_midx_context *)data;
    +         struct write_midx_context *ctx = data;
              unsigned char hash_len = the_hash_algo->rawsz;
              struct pack_midx_entry *list = ctx->entries;
              uint32_t i;
    @@ midx.c: static size_t write_midx_oid_lookup(struct hashfile *f,
     +static int write_midx_object_offsets(struct hashfile *f,
     +                                     void *data)
      {
    -         struct write_midx_context *ctx = (struct write_midx_context *)data;
    +         struct write_midx_context *ctx = data;
              struct pack_midx_entry *list = ctx->entries;
              uint32_t i, nr_large_offset = 0;
     -        size_t written = 0;
    @@ midx.c: static size_t write_midx_object_offsets(struct hashfile *f,
     +static int write_midx_large_offsets(struct hashfile *f,
     +                                    void *data)
      {
    -         struct write_midx_context *ctx = (struct write_midx_context *)data;
    +         struct write_midx_context *ctx = data;
              struct pack_midx_entry *list = ctx->entries;
              struct pack_midx_entry *end = ctx->entries + ctx->entries_nr;
     -        size_t written = 0;
 10:  e613ffa9ac6 = 10:  78744d3b701 midx: drop chunk progress during write
 11:  49cfb4f63e2 = 11:  07dc0cf8c68 midx: use chunk-format API in write_midx_internal()
 12:  e3475633e1d ! 12:  d8d8e9e2aa3 chunk-format: create read chunk API
    @@ Commit message
           1. initialize a 'struct chunkfile' with init_chunkfile(NULL).
           2. call read_table_of_contents().
    -      3. for each chunk to parse, call pair_chunk() with appropriate pointers.
    +      3. for each chunk to parse,
    +         a. call pair_chunk() to assign a pointer with the chunk position, or
    +         b. call read_chunk() to run a callback on the chunk start and size.
           4. call free_chunkfile() to clear the 'struct chunkfile' data.

         We are re-using the anonymous 'struct chunkfile' data, as it is internal
    @@ chunk-format.c: int write_chunkfile(struct chunkfile *cf, void *data)
     +
     +int pair_chunk(struct chunkfile *cf,
     +               uint32_t chunk_id,
    ++               const unsigned char **p)
    ++{
    ++        int i;
    ++
    ++        for (i = 0; i < cf->chunks_nr; i++) {
    ++                if (cf->chunks[i].id == chunk_id) {
    ++                        *p = cf->chunks[i].start;
    ++                        return 0;
    ++                }
    ++        }
    ++
    ++        return CHUNK_NOT_FOUND;
    ++}
    ++
    ++int read_chunk(struct chunkfile *cf,
    ++               uint32_t chunk_id,
     +               chunk_read_fn fn,
     +               void *data)
     +{
    @@ chunk-format.h: void add_chunk(struct chunkfile *cf,
     +                            uint64_t toc_offset,
     +                            int toc_length);
     +
    ++#define CHUNK_NOT_FOUND (-2)
    ++
     +/*
    -+ * When reading a table of contents, we find the chunk with matching 'id'
    -+ * then call its read_fn to populate the necessary 'data' based on the
    -+ * chunk start and size.
    ++ * Find 'chunk_id' in the given chunkfile and assign the
    ++ * given pointer to the position in the mmap'd file where
    ++ * that chunk begins.
    ++ *
    ++ * Returns CHUNK_NOT_FOUND if the chunk does not exist.
     + */
    ++int pair_chunk(struct chunkfile *cf,
    ++               uint32_t chunk_id,
    ++               const unsigned char **p);
    ++
     +typedef int (*chunk_read_fn)(const unsigned char *chunk_start,
     +                             size_t chunk_size, void *data);
    -+
    -+
    -+#define CHUNK_NOT_FOUND (-2)
    -+int pair_chunk(struct chunkfile *cf,
    ++/*
    ++ * Find 'chunk_id' in the given chunkfile and call the
    ++ * given chunk_read_fn method with the information for
    ++ * that chunk.
    ++ *
    ++ * Returns CHUNK_NOT_FOUND if the chunk does not exist.
    ++ */
    ++int read_chunk(struct chunkfile *cf,
     +               uint32_t chunk_id,
     +               chunk_read_fn fn,
     +               void *data);
 13:  7339990f07d ! 13:  8744d278596 commit-graph: use chunk-format read API
    @@ commit-graph.c: static int verify_commit_graph_lite(struct commit_graph *g)
              return 0;
      }

    -+static int graph_read_oid_fanout(const unsigned char *chunk_start,
    -+                                 size_t chunk_size, void *data)
    -+{
    -+        struct commit_graph *g = (struct commit_graph *)data;
    -+        g->chunk_oid_fanout = (uint32_t*)chunk_start;
    -+        return 0;
    -+}
    -+
     +static int graph_read_oid_lookup(const unsigned char *chunk_start,
     +                                 size_t chunk_size, void *data)
     +{
    -+        struct commit_graph *g = (struct commit_graph *)data;
    ++        struct commit_graph *g = data;
     +        g->chunk_oid_lookup = chunk_start;
     +        g->num_commits = chunk_size / g->hash_len;
     +        return 0;
     +}
     +
    -+static int graph_read_data(const unsigned char *chunk_start,
    -+                           size_t chunk_size, void *data)
    -+{
    -+        struct commit_graph *g = (struct commit_graph *)data;
    -+        g->chunk_commit_data = chunk_start;
    -+        return 0;
    -+}
    -+
    -+static int graph_read_extra_edges(const unsigned char *chunk_start,
    -+                                  size_t chunk_size, void *data)
    -+{
    -+        struct commit_graph *g = (struct commit_graph *)data;
    -+        g->chunk_extra_edges = chunk_start;
    -+        return 0;
    -+}
    -+
    -+static int graph_read_base_graphs(const unsigned char *chunk_start,
    -+                                  size_t chunk_size, void *data)
    -+{
    -+        struct commit_graph *g = (struct commit_graph *)data;
    -+        g->chunk_base_graphs = chunk_start;
    -+        return 0;
    -+}
    -+
    -+static int graph_read_generation_data(const unsigned char *chunk_start,
    -+                                      size_t chunk_size, void *data)
    -+{
    -+        struct commit_graph *g = (struct commit_graph *)data;
    -+        g->chunk_generation_data = chunk_start;
    -+        return 0;
    -+}
    -+
    -+static int graph_read_generation_overflow(const unsigned char *chunk_start,
    -+                                          size_t chunk_size, void *data)
    -+{
    -+        struct commit_graph *g = (struct commit_graph *)data;
    -+        g->chunk_generation_data_overflow = chunk_start;
    -+        return 0;
    -+}
    -+
    -+static int graph_read_bloom_indices(const unsigned char *chunk_start,
    -+                                    size_t chunk_size, void *data)
    -+{
    -+        struct commit_graph *g = (struct commit_graph *)data;
    -+        g->chunk_bloom_indexes = chunk_start;
    -+        return 0;
    -+}
    -+
     +static int graph_read_bloom_data(const unsigned char *chunk_start,
     +                                 size_t chunk_size, void *data)
     +{
    -+        struct commit_graph *g = (struct commit_graph *)data;
    ++        struct commit_graph *g = data;
     +        uint32_t hash_version;
     +        g->chunk_bloom_data = chunk_start;
     +        hash_version = get_be32(chunk_start);
    @@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
     -                                / graph->hash_len;
     -                        }
     -                        break;
    --
    ++        cf = init_chunkfile(NULL);
    +
     -                case GRAPH_CHUNKID_DATA:
     -                        if (graph->chunk_commit_data)
     -                                chunk_repeated = 1;
    @@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
     -                        else
     -                                graph->chunk_generation_data = data + chunk_offset;
     -                        break;
    -+        cf = init_chunkfile(NULL);
    -
    +-
     -                case GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW:
     -                        if (graph->chunk_generation_data_overflow)
     -                                chunk_repeated = 1;
    @@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
     -                        }
     -                        break;
     -                }
    -+        pair_chunk(cf, GRAPH_CHUNKID_OIDFANOUT, graph_read_oid_fanout, graph);
    -+        pair_chunk(cf, GRAPH_CHUNKID_OIDLOOKUP, graph_read_oid_lookup, graph);
    -+        pair_chunk(cf, GRAPH_CHUNKID_DATA, graph_read_data, graph);
    -+        pair_chunk(cf, GRAPH_CHUNKID_EXTRAEDGES, graph_read_extra_edges, graph);
    -+        pair_chunk(cf, GRAPH_CHUNKID_BASE, graph_read_base_graphs, graph);
    ++        pair_chunk(cf, GRAPH_CHUNKID_OIDFANOUT,
    ++                   (const unsigned char **)&graph->chunk_oid_fanout);
    ++        read_chunk(cf, GRAPH_CHUNKID_OIDLOOKUP, graph_read_oid_lookup, graph);
    ++        pair_chunk(cf, GRAPH_CHUNKID_DATA, &graph->chunk_commit_data);
    ++        pair_chunk(cf, GRAPH_CHUNKID_EXTRAEDGES, &graph->chunk_extra_edges);
    ++        pair_chunk(cf, GRAPH_CHUNKID_BASE, &graph->chunk_base_graphs);
     +        pair_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA,
    -+                   graph_read_generation_data, graph);
    ++                   &graph->chunk_generation_data);
     +        pair_chunk(cf, GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW,
    -+                   graph_read_generation_overflow, graph);
    ++                   &graph->chunk_generation_data_overflow);

     -                if (chunk_repeated) {
     -                        error(_("commit-graph chunk id %08x appears multiple times"), chunk_id);
    @@ commit-graph.c: struct commit_graph *parse_commit_graph(struct repository *r,
     -        }
     +        if (r->settings.commit_graph_read_changed_paths) {
     +                pair_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
    -+                           graph_read_bloom_indices, graph);
    -+                pair_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
    ++                           &graph->chunk_bloom_indexes);
    ++                read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
     +                           graph_read_bloom_data, graph);
              }
 14:  cb145e0e32a ! 14:  750c03253c9 midx: use chunk-format read API
    @@ midx.c: static char *get_midx_filename(const char *object_dir)
              return xstrfmt("%s/pack/multi-pack-index", object_dir);
      }

    -+static int midx_read_pack_names(const unsigned char *chunk_start,
    -+                                size_t chunk_size, void *data)
    -+{
    -+        struct multi_pack_index *m = (struct multi_pack_index *)data;
    -+        m->chunk_pack_names = chunk_start;
    -+        return 0;
    -+}
    -+
     +static int midx_read_oid_fanout(const unsigned char *chunk_start,
     +                                size_t chunk_size, void *data)
     +{
    -+        struct multi_pack_index *m = (struct multi_pack_index *)data;
    ++        struct multi_pack_index *m = data;
     +        m->chunk_oid_fanout = (uint32_t *)chunk_start;
     +
     +        if (chunk_size != 4 * 256) {
    @@ midx.c: static char *get_midx_filename(const char *object_dir)
     +        }
     +        return 0;
     +}
    -+
    -+static int midx_read_oid_lookup(const unsigned char *chunk_start,
    -+                                size_t chunk_size, void *data)
    -+{
    -+        struct multi_pack_index *m = (struct multi_pack_index *)data;
    -+        m->chunk_oid_lookup = chunk_start;
    -+        return 0;
    -+}
    -+
    -+static int midx_read_offsets(const unsigned char *chunk_start,
    -+                             size_t chunk_size, void *data)
    -+{
    -+        struct multi_pack_index *m = (struct multi_pack_index *)data;
    -+        m->chunk_object_offsets = chunk_start;
    -+        return 0;
    -+}
    -+
    -+static int midx_read_large_offsets(const unsigned char *chunk_start,
    -+                                   size_t chunk_size, void *data)
    -+{
    -+        struct multi_pack_index *m = (struct multi_pack_index *)data;
    -+        m->chunk_large_offsets = chunk_start;
    -+        return 0;
    -+}
     +
      struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local)
      {
    @@ midx.c: struct multi_pack_index *load_multi_pack_index(const char *object_dir, i
     +                                   MIDX_HEADER_SIZE, m->num_chunks))
     +                goto cleanup_fail;
     +
    -+        if (pair_chunk(cf, MIDX_CHUNKID_PACKNAMES, midx_read_pack_names, m) == CHUNK_NOT_FOUND)
    ++        if (pair_chunk(cf, MIDX_CHUNKID_PACKNAMES, &m->chunk_pack_names) == CHUNK_NOT_FOUND)
                      die(_("multi-pack-index missing required pack-name chunk"));
     -        if (!m->chunk_oid_fanout)
    -+        if (pair_chunk(cf, MIDX_CHUNKID_OIDFANOUT, midx_read_oid_fanout, m) == CHUNK_NOT_FOUND)
    ++        if (read_chunk(cf, MIDX_CHUNKID_OIDFANOUT, midx_read_oid_fanout, m) == CHUNK_NOT_FOUND)
                      die(_("multi-pack-index missing required OID fanout chunk"));
     -        if (!m->chunk_oid_lookup)
    -+        if (pair_chunk(cf, MIDX_CHUNKID_OIDLOOKUP, midx_read_oid_lookup, m) == CHUNK_NOT_FOUND)
    ++        if (pair_chunk(cf, MIDX_CHUNKID_OIDLOOKUP, &m->chunk_oid_lookup) == CHUNK_NOT_FOUND)
                      die(_("multi-pack-index missing required OID lookup chunk"));
     -        if (!m->chunk_object_offsets)
    -+        if (pair_chunk(cf, MIDX_CHUNKID_OBJECTOFFSETS, midx_read_offsets, m) == CHUNK_NOT_FOUND)
    ++        if (pair_chunk(cf, MIDX_CHUNKID_OBJECTOFFSETS, &m->chunk_object_offsets) == CHUNK_NOT_FOUND)
                      die(_("multi-pack-index missing required object offsets chunk"));

    -+        pair_chunk(cf, MIDX_CHUNKID_LARGEOFFSETS, midx_read_large_offsets, m);
    ++        pair_chunk(cf, MIDX_CHUNKID_LARGEOFFSETS, &m->chunk_large_offsets);
     +
              m->num_objects = ntohl(m->chunk_oid_fanout[255]);
 15:  f6c58ff72d2 = 15:  83d292532a0 midx: use 64-bit multiplication for chunk sizes
 16:  62a23842aa6 = 16:  669eeec707a chunk-format: restore duplicate chunk checks
 17:  05cbd0a8d93 = 17:  8f3985ab5df chunk-format: add technical docs
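
Patch 15 is unchanged in this range-diff, so its motivation is not visible here. As a rough illustration of why chunk sizes call for 64-bit multiplication, the sketch below computes an entry count times an entry width in 32-bit and in 64-bit arithmetic. The function names and the example values are assumptions for illustration and are not taken from midx.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the product is truncated to 32 bits, so a
 * large enough entry count silently wraps around.
 */
static uint32_t chunk_size_32(uint32_t nr_entries, uint32_t entry_width)
{
	return (uint32_t)(nr_entries * entry_width);
}

/*
 * Hypothetical helper: widen one operand first so the multiplication
 * itself is carried out in 64 bits.
 */
static uint64_t chunk_size_64(uint32_t nr_entries, uint32_t entry_width)
{
	return (uint64_t)nr_entries * entry_width;
}
```

With 0x20000000 entries of 8 bytes each, the 32-bit helper yields 0 while the 64-bit helper yields 4294967296, which is the kind of mismatch a size check on write needs to avoid.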