From patchwork Tue Dec 3 21:52:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 13893002 Received: from mail-pf1-f201.google.com (mail-pf1-f201.google.com [209.85.210.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 905BA1FBEA7 for ; Tue, 3 Dec 2024 21:53:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262785; cv=none; b=DiQPPZvWh55L2oJbHr802gOEIZEhP6LWGmwv12iXq3AapJwEoeQuljkNM1yjZVUOxBtRkoTUu9XPESZP9d5l0a1m8gYIpUvaQo4K0MRB9OTRLyyD7hB+pjkHkLyykpuzvq6BjxkQ6l9mA8vHcObRTqCkzqRxV7f/cZSOQHHcMLI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262785; c=relaxed/simple; bh=CUEIWvURSubmvumRn/4Dk/TphslCRo+Q/kDcOlNpgw4=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=OFuyrp/yVW1g44BobPftgy0JnBRPbpcXOrpJFNymt7cxg7AhN02EkrVfUYfvYw4ENRe1Lb34KmvqANB4xSzui/Ah9Fm10sQ8P3enB6AUt+YdGtMI1wIeLypWrOgjajIV1zBon8vVGj1QNqX9y/opwuP+GiG2uAyy+Ek3UQ8q1ho= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=0YRRoQUu; arc=none smtp.client-ip=209.85.210.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="0YRRoQUu" Received: by mail-pf1-f201.google.com with SMTP id d2e1a72fcca58-72467c300a6so185637b3a.0 for ; Tue, 03 Dec 2024 13:53:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1733262783; x=1733867583; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=o2G4SSla57aF4GIAeWoBSIxKru0BAHIYgXUN7YmxNNU=; b=0YRRoQUu3ScsLdniA4/eBusF0bTOZIhPdBzH7vD7mOVhooLTYsotSun+3EKf/rxJFx aQj5SL+dgPh4ggXGXkKn0xA90No+eZijhfvo/oyjsxY8pp/zja14mDHC5L3oGLrJclPQ IE2LGJnI7HZbvnSd1n55f2KyRHn8+LknEJAknyFV1rAvT32pV5eVlWQ18oF2IbB9bxpt EV9UjmsK377vgzI/h74hunMSf0+fFmV2PrMHK9rpMN3ZvzUtCYqMybDmskfDR3GZnmWb 7s7hOAkvGiXyH44lEyD25YGeBrpuy1IuPh9Pl1PRzRJ/wl7ZtduYOYoufVRut6qeBQ63 DPHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733262783; x=1733867583; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=o2G4SSla57aF4GIAeWoBSIxKru0BAHIYgXUN7YmxNNU=; b=ryb6R0RYDAG3+f/u34a26ao9VuBR+TS0DKJa36Sv6EDLma57UzuK/YojnMZhnVdbOV 0WsTN8KkYeLIFplJ8en6ifMgAf7SKQ2hQATxKPLUVWxr/2zCNFuAkuG4PwU3DbTos0oC xdCbJtf42iOIlOVNv1q9op1LKtcOBXLSdym63+pTdE9YkS9XuOFH4mQwq53bXxeAZaib wSTGHW7n9SWg3N8zxf8C9ipNG+AZCgqoy8VoO2Gf6nnMBRfxeVqOrPNlzMhYU2y1jWQV nXjiSRrJQ6sXSO8FKhUUnC1pL2oLTjlZHS/UEMeo9oneJgqsjC0VVsIXeE7PHVlFHbOa WFPA== X-Gm-Message-State: AOJu0YyaWzu7Ehd510fMU2GjUEEbN0vplSKElNxeKFwfcair0KN8cWxG RUjryN60gKzstyTJi+TdOG1tjWFGakKgbez3RX6bf/gzIQBu8RVzGAyuh389MLdPkTWdozUOOL7 a5J97KoJM/jwVe03hsTfYNu/g0divnR4Yd5PAoK8EXkKZ14kMnUyKyMeEedKOJYT7SbBRDf/NjM nCzdIj249E6AkeOXt9vz6tSxeCdeJc0CV8AyaC2v9qM8K0PBrRq7LXJiP5G8inQOMqOQ== X-Google-Smtp-Source: AGHT+IGXg+FqAS/hzQWA7mckdYFpk3j9kg1Kad2jFUQJnuKMCf378SEUjpFNJc8zF5Tw6lpwuDnxACD8cxR9GBjOD8Tr X-Received: from pjbsv12.prod.google.com ([2002:a17:90b:538c:b0:2ea:61ba:b8f7]) (user=jonathantanmy job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:80c:b0:2ee:e2f6:8abc with SMTP id 98e67ed59e1d1-2eee2f68f8dmr11132619a91.10.1733262782782; Tue, 03 Dec 2024 13:53:02 -0800 (PST) Date: Tue, 3 Dec 2024 13:52:54 -0800 In-Reply-To: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog Message-ID: <7ae21c921fe367d4b15cd4a299196009c15205d9.1733262662.git.jonathantanmy@google.com> Subject: [PATCH v3 1/3] index-pack --promisor: dedup before checking links From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , ps@pks.im, gitster@pobox.com Commit c08589efdc (index-pack: repack local links into promisor packs, 2024-11-01) fixed a bug with what was believed to be a negligible decrease in performance [1] [2]. But at $DAYJOB, with at least one repo, it was found that the decrease in performance was very significant. Looking at the patch, whenever we parse an object in the packfile to be indexed, we check the targets of all its outgoing links for its existence. However, this could be optimized by first collecting all such targets into an oidset (thus deduplicating them) before checking. Teach Git to do that. On a certain fetch from the aforementioned repo, this improved performance from approximately 7 hours to 24m47.815s. This number will be further reduced in a subsequent patch. [1] https://lore.kernel.org/git/CAG1j3zGiNMbri8rZNaF0w+yP+6OdMz0T8+8_Wgd1R_p1HzVasg@mail.gmail.com/ [2] https://lore.kernel.org/git/20241105212849.3759572-1-jonathantanmy@google.com/ Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 73 +++++++++++++++++++++++++------------------- 1 file changed, 41 insertions(+), 32 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 95babdc5ea..d1c777a6af 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -155,11 +155,11 @@ static int input_fd, output_fd; static const char *curr_pack; /* - * local_links is guarded by read_mutex, and record_local_links is read-only in - * a thread. + * outgoing_links is guarded by read_mutex, and record_outgoing_links is + * read-only in a thread. */ -static struct oidset local_links = OIDSET_INIT; -static int record_local_links; +static struct oidset outgoing_links = OIDSET_INIT; +static int record_outgoing_links; static struct thread_local_data *thread_data; static int nr_dispatched; @@ -812,18 +812,12 @@ static int check_collison(struct object_entry *entry) return 0; } -static void record_if_local_object(const struct object_id *oid) +static void record_outgoing_link(const struct object_id *oid) { - struct object_info info = OBJECT_INFO_INIT; - if (oid_object_info_extended(the_repository, oid, &info, 0)) - /* Missing; assume it is a promisor object */ - return; - if (info.whence == OI_PACKED && info.u.packed.pack->pack_promisor) - return; - oidset_insert(&local_links, oid); + oidset_insert(&outgoing_links, oid); } -static void do_record_local_links(struct object *obj) +static void do_record_outgoing_links(struct object *obj) { if (obj->type == OBJ_TREE) { struct tree *tree = (struct tree *)obj; @@ -837,16 +831,16 @@ static void do_record_local_links(struct object *obj) */ return; while (tree_entry_gently(&desc, &entry)) - record_if_local_object(&entry.oid); + record_outgoing_link(&entry.oid); } else if (obj->type == OBJ_COMMIT) { struct commit *commit = (struct commit *) obj; struct commit_list *parents = commit->parents; for (; parents; parents = parents->next) - record_if_local_object(&parents->item->object.oid); + record_outgoing_link(&parents->item->object.oid); } else if (obj->type == OBJ_TAG) { struct tag *tag = (struct tag *) obj; - record_if_local_object(get_tagged_oid(tag)); + record_outgoing_link(get_tagged_oid(tag)); } } @@ -896,7 +890,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, free(has_data); } - if (strict || do_fsck_object || record_local_links) { + if (strict || do_fsck_object || record_outgoing_links) { read_lock(); if (type == OBJ_BLOB) { struct blob *blob = lookup_blob(the_repository, oid); @@ -928,8 +922,8 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, die(_("fsck error in packed object")); if (strict && fsck_walk(obj, NULL, &fsck_options)) die(_("Not all child objects of %s are reachable"), oid_to_hex(&obj->oid)); - if (record_local_links) - do_record_local_links(obj); + if (record_outgoing_links) + do_record_outgoing_links(obj); if (obj->type == OBJ_TREE) { struct tree *item = (struct tree *) obj; @@ -1781,26 +1775,41 @@ static void repack_local_links(void) struct object_id *oid; char *base_name; - if (!oidset_size(&local_links)) + if (!oidset_size(&outgoing_links)) return; - base_name = mkpathdup("%s/pack/pack", repo_get_object_directory(the_repository)); + oidset_iter_init(&outgoing_links, &iter); + while ((oid = oidset_iter_next(&iter))) { + struct object_info info = OBJECT_INFO_INIT; + if (oid_object_info_extended(the_repository, oid, &info, 0)) + /* Missing; assume it is a promisor object */ + continue; + if (info.whence == OI_PACKED && info.u.packed.pack->pack_promisor) + continue; - strvec_push(&cmd.args, "pack-objects"); - strvec_push(&cmd.args, "--exclude-promisor-objects-best-effort"); - strvec_push(&cmd.args, base_name); - cmd.git_cmd = 1; - cmd.in = -1; - cmd.out = -1; - if (start_command(&cmd)) - die(_("could not start pack-objects to repack local links")); + if (!cmd.args.nr) { + base_name = mkpathdup( + "%s/pack/pack", + repo_get_object_directory(the_repository)); + strvec_push(&cmd.args, "pack-objects"); + strvec_push(&cmd.args, + "--exclude-promisor-objects-best-effort"); + strvec_push(&cmd.args, base_name); + cmd.git_cmd = 1; + cmd.in = -1; + cmd.out = -1; + if (start_command(&cmd)) + die(_("could not start pack-objects to repack local links")); + } - oidset_iter_init(&local_links, &iter); - while ((oid = oidset_iter_next(&iter))) { if (write_in_full(cmd.in, oid_to_hex(oid), the_hash_algo->hexsz) < 0 || write_in_full(cmd.in, "\n", 1) < 0) die(_("failed to feed local object to pack-objects")); } + + if (!cmd.args.nr) + return; + close(cmd.in); out = xfdopen(cmd.out, "r"); @@ -1899,7 +1908,7 @@ int cmd_index_pack(int argc, } else if (skip_to_optional_arg(arg, "--keep", &keep_msg)) { ; /* nothing to do */ } else if (skip_to_optional_arg(arg, "--promisor", &promisor_msg)) { - record_local_links = 1; + record_outgoing_links = 1; } else if (starts_with(arg, "--threads=")) { char *end; nr_threads = strtoul(arg+10, &end, 0); From patchwork Tue Dec 3 21:52:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 13893003 Received: from mail-oa1-f74.google.com (mail-oa1-f74.google.com [209.85.160.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9F74D209F49 for ; Tue, 3 Dec 2024 21:53:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262787; cv=none; b=bNZk+0s4rn9PjyovtL6K7xlXw/yvXUOzGe2K54oqjjeHghILl0x5q9mqGHOuAD6YJBD3x+cJ3h88t+Sjst2n13+u6WYJrdU5ulGCiBg4Zq/kiVm4c6gFakMvqiOv8iALq0vijMoFA4c+sRDie2QwJOAj9Q27eWslvr2u70RFU58= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262787; c=relaxed/simple; bh=RB858o40qlolunBXqIuHz9lOnO6CmSzXU4OGgb5LioE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=bdQRvlHAWORI9hnK/KQCE9vbzKCd0EI+xkJJt6+WkVceRqJyaA9gC7GLZ2NU6e3W0H3I+gl7+96O3E1tRi21uoVQmmUUhjuFr+RaIPwoY9Bv+nv4aCdkn0CHk9VKAAUrdZFpyaDyvZ/1DbHct2oxsFTsVaEtCoPGrp7Mm7DnKvg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=WyjRwj5j; arc=none smtp.client-ip=209.85.160.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="WyjRwj5j" Received: by mail-oa1-f74.google.com with SMTP id 586e51a60fabf-29e525890e4so3045109fac.3 for ; Tue, 03 Dec 2024 13:53:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1733262784; x=1733867584; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=I4c9dVPxmreSId8/F9kkGAoIDK59JPJqOkX4hBvaaRs=; b=WyjRwj5jzzoCokqGbkRsEUQIcTyryFedhW31UV9UK105ock9H/3XNuGsv5epDD7nJh Zp3kfWFqOiqj/NXtUqLoMdspn23HR+mGWlq9pQ9OopACWgwBa1nN53t4aIS6v4lJpAZh NC/qjqAEmU3otxfF35UpbyjK+yb8CK5P3XBvTcw3lZD5Q+U36v8ZDWC/dDWelHTKpmWD gEe5Ra56rGOuqJhRjU0TacrCg4ksxTV/VpYxBzaLdwK1WFtCAkcDaB2ZyBbfJYtFEWY4 Ofjnuwq9sYK6bUjEzyiUX5cZD1szCFHYXmMGob2qt5snVBqIzEpXVTtYQ4wmvOd1aRdL gUuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733262784; x=1733867584; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=I4c9dVPxmreSId8/F9kkGAoIDK59JPJqOkX4hBvaaRs=; b=gEq32p6a0IqTlWuYQqGcqhjIDQcEhGoMz0J0UZ3tU0c4DyiFuRFVJ5pkK3pdyvaa+F y2Q+JY0qcNafI7GqjJGP6ESApVXQfif1VT1gxortyp86Ifnho0UKHQR0k7q0+WEllR5B rEXKzYnKZ1NOTaHMBxUdEk5GDj9C1GJrGR34+0AXt1rqNKaF+Ot7WgiHWKjZJgqdHLDm EDwIdVaYiTN5X0pZkzjkUrI9xHc5b9YhDTgvlRKlkaBWoE3354JRpR5XAAthLNU2eJ3+ rZdpRF6LLqMKK2cGIIFKACkSQqsMRjJPYSmDjwz1dyds3x04gTHE9IEzkcGjA7VgPiVi rTXQ== X-Gm-Message-State: AOJu0YxqLk6+lJHqocZCD5Y1HNpprw5j9CErLpZbM/kQ6aXYpDV6ytGS 10v26q55/F1CBH9gMhteHpku4PwjFwV6wko2f+Wn0wVjvm0e6uQJsYJQEqgidsQ8r7yavpDzRdF VW5ZKM11+Ai6ozFbz6WepowGtkdw52rN0YyeWa/qg8/KC628ojYJAkBOJJ9+QL2dfU/p9xqg36/ n/bff/M2+ahU8h6YLHnJlliHsLWPm7yee8/Ynq2gI64IFTRgrs18Yrpsc+kxrWVAp3+A== X-Google-Smtp-Source: AGHT+IEdgO2HwB27JpP8QpK7YP8g1+L5VsCdOYr6UVHeHODo2nEcR3Iwx8WvGZJiCr0kfZEQMwFsvLWdn6ZgIETdyEO6 X-Received: from oacpy6.prod.google.com ([2002:a05:6871:e406:b0:29d:c4b7:250d]) (user=jonathantanmy job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6870:e386:b0:29e:7f8c:8f57 with SMTP id 586e51a60fabf-29e8884d9c9mr5298023fac.27.1733262784639; Tue, 03 Dec 2024 13:53:04 -0800 (PST) Date: Tue, 3 Dec 2024 13:52:55 -0800 In-Reply-To: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog Message-ID: Subject: [PATCH v3 2/3] index-pack --promisor: don't check blobs From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , ps@pks.im, gitster@pobox.com As a follow-up to the parent of this commit, it was found that not checking for the existence of blobs linked from trees sped up the fetch from 24m47.815s to 2m2.127s. Teach Git to do that. The tradeoff of not checking blobs is documented in a code comment. (Blobs may also be linked from tag objects, but it is impossible to know the type of an object linked from a tag object without looking it up in the object database, so the code for that is untouched.) Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 31 ++++++++++++++++++++++++++++++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index d1c777a6af..2e90fe186e 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -817,6 +817,35 @@ static void record_outgoing_link(const struct object_id *oid) oidset_insert(&outgoing_links, oid); } +static void maybe_record_name_entry(const struct name_entry *entry) +{ + /* + * Checking only trees here results in a significantly faster packfile + * indexing, but the drawback is that if the packfile to be indexed + * references a local blob only directly (that is, never through a + * local tree), that local blob is in danger of being garbage + * collected. Such a situation may arise if we push local commits, + * including one with a change to a blob in the root tree, and then the + * server incorporates them into its main branch through a "rebase" or + * "squash" merge strategy, and then we fetch the new main branch from + * the server. + * + * This situation has not been observed yet - we have only noticed + * missing commits, not missing trees or blobs. (In fact, if it were + * believed that only missing commits are problematic, one could argue + * that we should also exclude trees during the outgoing link check; + * but it is safer to include them.) + * + * Due to the rarity of the situation (it has not been observed to + * happen in real life), and because the "penalty" in such a situation + * is merely to refetch the missing blob when it's needed (and this + * happens only once - when refetched, the blob goes into a promisor + * pack, so it won't be GC-ed, the tradeoff seems worth it. + */ + if (S_ISDIR(entry->mode)) + record_outgoing_link(&entry->oid); +} + static void do_record_outgoing_links(struct object *obj) { if (obj->type == OBJ_TREE) { @@ -831,7 +860,7 @@ static void do_record_outgoing_links(struct object *obj) */ return; while (tree_entry_gently(&desc, &entry)) - record_outgoing_link(&entry.oid); + maybe_record_name_entry(&entry); } else if (obj->type == OBJ_COMMIT) { struct commit *commit = (struct commit *) obj; struct commit_list *parents = commit->parents; From patchwork Tue Dec 3 21:52:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 13893004 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 356DB1FBEA7 for ; Tue, 3 Dec 2024 21:53:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262788; cv=none; b=H1klCVif9JoCC1QxJOaGd+MZ3h7A8s52YRtrV6nGQyVi4Wu7++lBFArF1jixZ2eoolbBsYT4UGxg9VfCSC4I0NJhq9pIuafg50/n6Kjvc4Pt/s7hJ71y7L4uJUAPdB6SnMqEPacua3tQ7PAfNWKErtIUh0L2mBeT1Y8IuTTMvkw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262788; c=relaxed/simple; bh=qRXNHc3e9qElHF+XE3Rmr8l4+eLxZnEpJc36Ach+4S4=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Mg/oHbeIfD/D8LAwCS27lHYBiMCVobiD00fZfyHp9ev84GEkCqjDXM/co4p7RZoLqRLvCFMWETX0HhoYX3ENgGHgo5V/EVKX+nBTZoVdLl8yb7GNMQOyQgT781DV/nCg7H3knTapAtDbWmbj1Ic4O15ztOkgX2vNqJ/QTbVq39c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=jKXm4czn; arc=none smtp.client-ip=209.85.215.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="jKXm4czn" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-7ee3c2a2188so4488967a12.3 for ; Tue, 03 Dec 2024 13:53:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1733262786; x=1733867586; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Ci6rzfBnd5EQ5ISlpqG5ZO5OAcQKCvNSZGqaRLwh/HQ=; b=jKXm4cznPGSNRug0Kjlq1T2HXWUQH8vBmzM6TGTgZ9tVSXfUseF6QYr8Rro4KRbKJr YUXpofdK82gxlFvTdw2Z3MXmHxzJWt7/qM2j4JfngixVkgMv76lw+1xCvz6dmRtw8g9r flW7kVhdzSUtfEfshxGqS6kEmQtGWp3efRJ+Wrtm7Xj60HXn78c/1bvGbHhiobTywPeU ycY9ildyBqN5lvESMkoUn2a3LE3iz50TQ9CBEHop5+xiDb8c3Ool0ruzsc/qEQrZrXJL vPaAi34n9pBdOY3sakHDNMRjBggkzXkrVe4KB7jWUFjYo5+tu23YL/BjwFFY3VkprcX3 XZoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733262786; x=1733867586; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Ci6rzfBnd5EQ5ISlpqG5ZO5OAcQKCvNSZGqaRLwh/HQ=; b=YRxXMwFf8/Yfecve6HsNN2LMIABi0FW6jgSDO10lKYEzFoZFKdEnFmbPv5nT3ttfSR 7kh8QcTkeXclbcw7wlTwNWfeEVa0ZUrzPKuMzGdsCjsK/3c8YakD9KJgFWYAYM5WjUn2 tLe3MOgXWaaXpFlggLKe7WZcD+/hKLJIr4btvl/P1WH0Z7OKJ9ZK3ewxLaZUxLeFPS0Q bdfLFqwFoNpfNFxo9CjdsfDKlnsM8It6LPOcxntqoGcmWjfC0FmiSwPeJHbbxoWwAzqI ysbYUnTBw+AuE4nCeUItWFix4Qcxg4vQpHB96j/PnQ/pdiZxbnXLwrc+ikwDBcif6Rw1 I2mA== X-Gm-Message-State: AOJu0YyVH3WbgKV9+hbL5YSQNf7rFxPKWmcnb/7rWJYmcOQ4xgehSK3X xTW4np14fwxlHd4DPf0X6awzfdyMhhKeHAltg2Hntlg28LVozsXkOIPHTfjE5btDy6EYpUizM5i QqTG+ElSPDUAoupMA0Be4uwCUwuZuKAykbS5IKAOV7EkRAAy6Klpcz7D73UXqhfGp3RbhbyvH6f ihrNjNHQDWWqJ/GoD1QMK+XJsK+eGuOzbCxyNqoqPhlUhKj5Bk9WpPJCnKGNM4DQ0VTg== X-Google-Smtp-Source: AGHT+IHSi+ScM7xkX+aZUq2QmZJBHF3UV9Ra71YJ1jmhYGcTBKa5w89HyJBfRF3eSmqS+dqfJq6+TPi32QRwSxJIJRPH X-Received: from pfbbe16.prod.google.com ([2002:a05:6a00:1f10:b0:724:eb96:cf5c]) (user=jonathantanmy job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:c947:b0:215:94c0:f6ea with SMTP id d9443c01a7336-215bd1878c7mr38472465ad.56.1733262786352; Tue, 03 Dec 2024 13:53:06 -0800 (PST) Date: Tue, 3 Dec 2024 13:52:56 -0800 In-Reply-To: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog Message-ID: Subject: [PATCH v3 3/3] index-pack --promisor: also check commits' trees From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , ps@pks.im, gitster@pobox.com Commit c08589efdc (index-pack: repack local links into promisor packs, 2024-11-01) seems to contain an oversight in that the tree of a commit is not checked. Teach git to check these trees. The fix slows down a fetch from a certain repo at $DAYJOB from 2m2.127s to 2m45.052s, but in order to make the fetch correct, it seems worth it. In order to test this, we could create server and client repos as follows... C S \ / O (O and C are commits both on the client and server. S is a commit only on the server. C and S have the same tree but different commit messages. The diff between O and C is non-zero.) ...and then, from the client, fetch S from the server. In theory, the client declares "have C" and the server can use this information to exclude S's tree (since it knows that the client has C's tree, which is the same as S's tree). However, it is also possible for the server to compute that it needs to send S and not O, and proceed from there; therefore the objects of C are not considered at all when determining what to send in the packfile. In order to prevent a test of client functionality from having such a dependence on server behavior, I have not included such a test. Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 1 + 1 file changed, 1 insertion(+) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 2e90fe186e..1594f2b81d 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -865,6 +865,7 @@ static void do_record_outgoing_links(struct object *obj) struct commit *commit = (struct commit *) obj; struct commit_list *parents = commit->parents; + record_outgoing_link(get_commit_tree_oid(commit)); for (; parents; parents = parents->next) record_outgoing_link(&parents->item->object.oid); } else if (obj->type == OBJ_TAG) { From patchwork Wed Dec 4 04:46:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junio C Hamano X-Patchwork-Id: 13893210 Received: from fhigh-a4-smtp.messagingengine.com (fhigh-a4-smtp.messagingengine.com [103.168.172.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E98522500D6 for ; Wed, 4 Dec 2024 04:46:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733287615; cv=none; b=uxIerhsM3RCbrngVhJ5h8OdknEd0BsqpS0gC07TZeaS2JYaAp240yESe2JFZFtiPd1hbLjw/CkCE7Ye245jbPNHDZhMluFLgR7rcabjFaYd667d0fIVQsNRDKuHhhFM/L6tuheIGRu6v/bnD9CBUUzDGl68E0WCul6PwOvcHbKA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733287615; c=relaxed/simple; bh=NUmua8ORmsEIUAb6xAMjdsLvwl1P0MYAH53IhyQTVKo=; h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID: MIME-Version:Content-Type; b=R+IyjKuTPHo02sCd6ial0gIGdi46tcsbd34egZrT3zebTONB45pGMwuO2qhBKDBoSyn5yKZu823fNcTc+SkncsFz/lV2W7Kuar+UvtrL8UZh7ikJqKGW4KFSfN+HJT6a+J45k2wP4nb/TNDlhQ6Iaj1s2fGMoS+i0Gda2c2I/U0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=pobox.com; spf=pass smtp.mailfrom=pobox.com; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=UXzcFbQ9; arc=none smtp.client-ip=103.168.172.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=pobox.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pobox.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="UXzcFbQ9" Received: from phl-compute-03.internal (phl-compute-03.phl.internal [10.202.2.43]) by mailfhigh.phl.internal (Postfix) with ESMTP id EF717114020A; Tue, 3 Dec 2024 23:46:51 -0500 (EST) Received: from phl-frontend-02 ([10.202.2.161]) by phl-compute-03.internal (MEProxy); Tue, 03 Dec 2024 23:46:51 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t= 1733287611; x=1733374011; bh=WvqBrAaxEovw1SnkAjYqOqNmr3/3NpscuYy T62JsszU=; b=UXzcFbQ9N2QUb7pZRcnwuJhKWWbMZTSwsGrgtc2aXOxJZWsytrx UW9yqfZ0r17cg/7xKidU6goEAScCg57NoZJXhUYfXuNLW36+u7r5+l0pa+q1SREJ 7/L5KEyabfmUXq/oye77cyD4rt/9FulpD+YPbLIkA1HkWc7ePx0qFQ10Nkysr7rk ceQqhUDGbCp6DScs0/RLEGvCwwrlp6ExkChapFC8c4PPfapKvHId2Jp8gWuOw2XH mHD/FbIeYO2BBCeL1zGicIz5gql6MWKy8cJRxDy6zUnIIVS0g9Xu3BMDOcv00z7R 1Rl5C5Ot63HvsAWqS4Y0YLgbuOqmZsVVrYQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefuddrieeggdejvdcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpggftfghnshhusghstghrihgsvgdpuffr tefokffrpgfnqfghnecuuegrihhlohhuthemuceftddtnecunecujfgurhephffvvefujg hffffkfgggtgesthdtredttdertdenucfhrhhomheplfhunhhiohcuvecujfgrmhgrnhho uceoghhithhsthgvrhesphhosghogidrtghomheqnecuggftrfgrthhtvghrnhepffeite eujeevfeehuddvjeduffeijeegfefhtddvkeefjeejhedtgeefgfeijedtnecuffhomhgr ihhnpehgihhthhhusgdrtghomhenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmh epmhgrihhlfhhrohhmpehgihhtshhtvghrsehpohgsohigrdgtohhmpdhnsggprhgtphht thhopeegpdhmohguvgepshhmthhpohhuthdprhgtphhtthhopehjohhnrghthhgrnhhtrg hnmhihsehgohhoghhlvgdrtghomhdprhgtphhtthhopehgihhtsehvghgvrhdrkhgvrhhn vghlrdhorhhgpdhrtghpthhtohepphhssehpkhhsrdhimhdprhgtphhtthhopehgihhtsh htvghrsehpohgsohigrdgtohhm X-ME-Proxy: Feedback-ID: if26b431b:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 3 Dec 2024 23:46:51 -0500 (EST) From: Junio C Hamano To: Jonathan Tan Cc: git@vger.kernel.org, ps@pks.im Subject: [PATCH 4/3] index-pack: work around false positive use of uninitialized variable In-Reply-To: (Jonathan Tan's message of "Tue, 3 Dec 2024 13:52:53 -0800") References: Date: Wed, 04 Dec 2024 13:46:50 +0900 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The base_name variable in this function is given a value if cmd.args array is not empty (i.e., if we run the pack-objects command), and the function returns when cmd.args is empty before hitting a call to free(base_name) near the end of the function, so to a human reader, it can be seen that the variable is not used uninitialized, but to a semi-intelligent compiler it is not so clear. Squelch a false positive by a meaningless NULL initialization. Signed-off-by: Junio C Hamano --- * Tentatively queued to unblock CI. There may be breakages due to other topics in flight, but at least this one is easy to resolve (hopefully---I haven't pushed it out). https://github.com/git/git/actions/runs/12152173257 builtin/index-pack.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 1594f2b81d..8e600a58bf 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1803,7 +1803,7 @@ static void repack_local_links(void) struct strbuf line = STRBUF_INIT; struct oidset_iter iter; struct object_id *oid; - char *base_name; + char *base_name = NULL; if (!oidset_size(&outgoing_links)) return;