From patchwork Mon Dec 2 20:18:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13891282
Date: Mon, 2 Dec 2024 12:18:38 -0800
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog
Message-ID: <5f0f114dbdf00fe246308490f09b649bd8de242c.1733170252.git.jonathantanmy@google.com>
Subject: [PATCH 1/3] index-pack: dedup first during outgoing link check
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, hanyang.tony@bytedance.com

Commit c08589efdc (index-pack: repack local links into promisor packs,
2024-11-01) fixed a bug with what was believed to be a negligible
decrease in performance [1] [2]. But at $DAYJOB, with at least one repo,
it was found that the decrease in performance was very significant.

Looking at the patch, whenever we parse an object in the packfile to be
indexed, we check the targets of all its outgoing links for its
existence. However, this could be optimized by first collecting all such
targets into an oidset (thus deduplicating them) before checking. Teach
Git to do that.

On a certain fetch from the aforementioned repo, this improved
performance from approximately 7 hours to 24m47.815s. This number will
be further reduced in a subsequent patch.

[1] https://lore.kernel.org/git/CAG1j3zGiNMbri8rZNaF0w+yP+6OdMz0T8+8_Wgd1R_p1HzVasg@mail.gmail.com/
[2] https://lore.kernel.org/git/20241105212849.3759572-1-jonathantanmy@google.com/

Signed-off-by: Jonathan Tan
---
 builtin/index-pack.c | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 95babdc5ea..8e7d14c17e 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -155,11 +155,11 @@ static int input_fd, output_fd;
 static const char *curr_pack;
 
 /*
- * local_links is guarded by read_mutex, and record_local_links is read-only in
- * a thread.
+ * outgoing_links is guarded by read_mutex, and record_outgoing_links is
+ * read-only in a thread.
  */
-static struct oidset local_links = OIDSET_INIT;
-static int record_local_links;
+static struct oidset outgoing_links = OIDSET_INIT;
+static int record_outgoing_links;
 
 static struct thread_local_data *thread_data;
 static int nr_dispatched;
@@ -812,18 +812,12 @@ static int check_collison(struct object_entry *entry)
 	return 0;
 }
 
-static void record_if_local_object(const struct object_id *oid)
+static void record_outgoing_link(const struct object_id *oid)
 {
-	struct object_info info = OBJECT_INFO_INIT;
-	if (oid_object_info_extended(the_repository, oid, &info, 0))
-		/* Missing; assume it is a promisor object */
-		return;
-	if (info.whence == OI_PACKED && info.u.packed.pack->pack_promisor)
-		return;
-	oidset_insert(&local_links, oid);
+	oidset_insert(&outgoing_links, oid);
 }
 
-static void do_record_local_links(struct object *obj)
+static void do_record_outgoing_links(struct object *obj)
 {
 	if (obj->type == OBJ_TREE) {
 		struct tree *tree = (struct tree *)obj;
@@ -837,16 +831,16 @@ static void do_record_local_links(struct object *obj)
 			 */
 			return;
 		while (tree_entry_gently(&desc, &entry))
-			record_if_local_object(&entry.oid);
+			record_outgoing_link(&entry.oid);
 	} else if (obj->type == OBJ_COMMIT) {
 		struct commit *commit = (struct commit *) obj;
 		struct commit_list *parents = commit->parents;
 
 		for (; parents; parents = parents->next)
-			record_if_local_object(&parents->item->object.oid);
+			record_outgoing_link(&parents->item->object.oid);
 	} else if (obj->type == OBJ_TAG) {
 		struct tag *tag = (struct tag *) obj;
-		record_if_local_object(get_tagged_oid(tag));
+		record_outgoing_link(get_tagged_oid(tag));
 	}
 }
 
@@ -896,7 +890,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 		free(has_data);
 	}
 
-	if (strict || do_fsck_object || record_local_links) {
+	if (strict || do_fsck_object || record_outgoing_links) {
 		read_lock();
 		if (type == OBJ_BLOB) {
 			struct blob *blob = lookup_blob(the_repository, oid);
@@ -928,8 +922,8 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 				die(_("fsck error in packed object"));
 			if (strict && fsck_walk(obj, NULL, &fsck_options))
 				die(_("Not all child objects of %s are reachable"), oid_to_hex(&obj->oid));
-			if (record_local_links)
-				do_record_local_links(obj);
+			if (record_outgoing_links)
+				do_record_outgoing_links(obj);
 
 			if (obj->type == OBJ_TREE) {
 				struct tree *item = (struct tree *) obj;
@@ -1781,7 +1775,7 @@ static void repack_local_links(void)
 	struct object_id *oid;
 	char *base_name;
 
-	if (!oidset_size(&local_links))
+	if (!oidset_size(&outgoing_links))
 		return;
 
 	base_name = mkpathdup("%s/pack/pack", repo_get_object_directory(the_repository));
@@ -1795,8 +1789,14 @@ static void repack_local_links(void)
 	if (start_command(&cmd))
 		die(_("could not start pack-objects to repack local links"));
 
-	oidset_iter_init(&local_links, &iter);
+	oidset_iter_init(&outgoing_links, &iter);
 	while ((oid = oidset_iter_next(&iter))) {
+		struct object_info info = OBJECT_INFO_INIT;
+		if (oid_object_info_extended(the_repository, oid, &info, 0))
+			/* Missing; assume it is a promisor object */
+			continue;
+		if (info.whence == OI_PACKED && info.u.packed.pack->pack_promisor)
+			continue;
 		if (write_in_full(cmd.in, oid_to_hex(oid), the_hash_algo->hexsz) < 0 ||
 		    write_in_full(cmd.in, "\n", 1) < 0)
 			die(_("failed to feed local object to pack-objects"));
@@ -1899,7 +1899,7 @@ int cmd_index_pack(int argc,
 			} else if (skip_to_optional_arg(arg, "--keep", &keep_msg)) {
 				; /* nothing to do */
 			} else if (skip_to_optional_arg(arg, "--promisor", &promisor_msg)) {
-				record_local_links = 1;
+				record_outgoing_links = 1;
 			} else if (starts_with(arg, "--threads=")) {
 				char *end;
 				nr_threads = strtoul(arg+10, &end, 0);

From patchwork Mon Dec 2 20:18:39 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13891283
Date: Mon, 2 Dec 2024 12:18:39 -0800
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog
Message-ID: <300f53b8e39fa1dd55f65924d20f8abd22cbbfc9.1733170252.git.jonathantanmy@google.com>
Subject: [PATCH 2/3] index-pack: no blobs during outgoing link check
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, hanyang.tony@bytedance.com

As a follow-up to the parent of this commit, it was found that not
checking for the existence of blobs linked from trees sped up the fetch
from 24m47.815s to 2m2.127s. Teach Git to do that.

The benefit of doing this is as above (fetch speedup), but the drawback
is that if the packfile to be indexed references a local blob directly
(that is, not through a local tree), that local blob is in danger of
being garbage collected. Such a situation may arise if we push local
commits, including one with a change to a blob in the root tree, and
then the server incorporates them into its main branch through a
"rebase" or "squash" merge strategy, and then we fetch the new main
branch from the server.

This situation has not been observed yet - we have only noticed missing
commits, not missing trees or blobs. (In fact, if it were believed that
only missing commits are problematic, one could argue that we should
also exclude trees during the outgoing link check; but it is safer to
include them.)

Due to the rarity of the situation (it has not been observed to happen
in real life), and because the "penalty" in such a situation is merely
to refetch the missing blob when it's needed, the tradeoff seems worth
it.

(Blobs may also be linked from tag objects, but it is impossible to know
the type of an object linked from a tag object without looking it up in
the object database, so the code for that is untouched.)

Signed-off-by: Jonathan Tan
---
 builtin/index-pack.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 8e7d14c17e..58d24540dc 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -830,8 +830,10 @@ static void do_record_outgoing_links(struct object *obj)
 			 * verified, so do not print any here.
 			 */
 			return;
-		while (tree_entry_gently(&desc, &entry))
-			record_outgoing_link(&entry.oid);
+		while (tree_entry_gently(&desc, &entry)) {
+			if (S_ISDIR(entry.mode))
+				record_outgoing_link(&entry.oid);
+		}
 	} else if (obj->type == OBJ_COMMIT) {
 		struct commit *commit = (struct commit *) obj;
 		struct commit_list *parents = commit->parents;

From patchwork Mon Dec 2 20:18:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13891284
Date: Mon, 2 Dec 2024 12:18:40 -0800
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog
Message-ID: <2f2f0db78bf85c14ef132e1924ab5021298aace3.1733170252.git.jonathantanmy@google.com>
Subject: [PATCH 3/3] index-pack: commit tree during outgoing link check
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, hanyang.tony@bytedance.com

Commit c08589efdc (index-pack: repack local links into promisor packs,
2024-11-01) seems to contain an oversight in that the tree of a commit
is not checked. The fix slows down a fetch from a certain repo at
$DAYJOB from 2m2.127s to 2m45.052s, but in order to make the fetch
correct, it seems worth it.

In order to test this, we could create server and client repos as
follows...

 C   S
  \ /
   O

(O and C are commits both on the client and server. S is a commit
only on the server. C and S have the same tree but different commit
messages.)

...and then, from the client, fetch S from the server.

In theory, the client declares "have C" and the server can use this
information to exclude S's tree (since it knows that the client has C's
tree, which is the same as S's tree). However, it is also possible for
the server to compute that it needs to send S and not O, and proceed
from there; therefore the objects of C are not considered at all when
determining what to send in the packfile. In order to prevent a test of
client functionality from having such a dependence on server behavior, I
have not included such a test.

Signed-off-by: Jonathan Tan
---
 builtin/index-pack.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 58d24540dc..338aeeadc8 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -838,6 +838,7 @@ static void do_record_outgoing_links(struct object *obj)
 		struct commit *commit = (struct commit *) obj;
 		struct commit_list *parents = commit->parents;
 
+		record_outgoing_link(get_commit_tree_oid(commit));
 		for (; parents; parents = parents->next)
 			record_outgoing_link(&parents->item->object.oid);
 	} else if (obj->type == OBJ_TAG) {
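
Illustrative note, not part of the series: the dedup-first idea from
patch 1/3 can be sketched in isolation. The code below is a minimal
sketch under stated assumptions, not Git's implementation. It uses
sort-and-unique where the patch uses an oidset, and struct id,
exists_in_odb() and check_links_dedup_first() are hypothetical names
invented for this example. The point it shows is that the expensive
existence check runs once per distinct link target rather than once per
link.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size object ID; this is not Git's struct object_id. */
#define ID_LEN 20
struct id {
	unsigned char bytes[ID_LEN];
};

static int id_cmp(const void *a, const void *b)
{
	return memcmp(a, b, ID_LEN);
}

/*
 * Hypothetical stand-in for an expensive object database lookup.
 * It counts its calls so the saving from deduplication is visible.
 * Arbitrary rule for the example: an odd first byte means "present".
 */
static size_t lookups;

static int exists_in_odb(const struct id *id)
{
	lookups++;
	return id->bytes[0] & 1;
}

/*
 * Sort the recorded link targets, drop duplicates in place, and only
 * then probe each distinct target once. Returns how many distinct
 * targets were found present.
 */
static size_t check_links_dedup_first(struct id *links, size_t n)
{
	size_t i, uniq = 0, present = 0;

	assert(links || !n);
	if (!n)
		return 0;

	qsort(links, n, sizeof(*links), id_cmp);
	for (i = 0; i < n; i++)
		if (!uniq || id_cmp(&links[uniq - 1], &links[i]))
			links[uniq++] = links[i];

	for (i = 0; i < uniq; i++)
		present += exists_in_odb(&links[i]);
	return present;
}
```

With five recorded links that point at three distinct targets, the
lookup runs three times instead of five. The saving grows with how
often the same tree or parent commit is referenced within one packfile.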