From patchwork Tue Aug 24 10:36:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 12454595 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15226C4338F for ; Tue, 24 Aug 2021 10:37:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E8A1761008 for ; Tue, 24 Aug 2021 10:37:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236188AbhHXKhx (ORCPT ); Tue, 24 Aug 2021 06:37:53 -0400 Received: from out4-smtp.messagingengine.com ([66.111.4.28]:56303 "EHLO out4-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236175AbhHXKhm (ORCPT ); Tue, 24 Aug 2021 06:37:42 -0400 Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by mailout.nyi.internal (Postfix) with ESMTP id 8970B5C014D; Tue, 24 Aug 2021 06:36:58 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Tue, 24 Aug 2021 06:36:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; s=fm1; bh=MVoPtfxc8OADHhfUfTIB5cKU4jF DTPrEfIRAf/D+U5c=; b=atJlw8Kd5PeodWHqki3FqCJwTJSQ0U0t9lH9qu4LUNF evS1P6SlhsZsbi2sPRm3UbNxbZOqYs/JdzobhAhREbBuL28mEF9pdBRq6AouZ4Kj N6gGKY7BUZPjUDYCjFGfOraJCV4RGGcDXn4DACNMkuNxyvJViaF0vuSq5ba93vIX BxqhTnOXqgHJ9LOBqM8gzeCQeOA5pZoKRactjxOs98JM+hOGscgNLp/o1N3MZY8m jTZ7wi3vGh24ibsxO5llms8O7Dw35OYWGD5FgIBtobvnUi/AE0lhybDRDDOiW7iq 4p92EJCYbLE1HDS0QfauKZpkCmgqrcKQPEp5C8LW3qg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm3; bh=MVoPtf xc8OADHhfUfTIB5cKU4jFDTPrEfIRAf/D+U5c=; b=iyswFxxJZ4g8+FyhSJylBe nRmbrWzNXfLeK2p3wGJ/kOuEumCVbdj5QDSPeG+CArmtbYj0O03Oj8TJmzDKkKj0 2axuAsxQkQM9CB6Wgx5KXHcdbwgYpxnP3YPXJg+80ylJWY3SPAMWABTJghf47u6W L0b8NKUOdvzvNNudU+bfIlWUk42AFx+o8lWzd9ZMD2VYMO2x/bKFwmr/+RS9UPqS qBW5t+nLAK8Bhc+tzyFv4xK0XAHwa+NxGxnRSxjLLCBXJZ43bTJXX/3sj9ZNstf7 Mny+zH4nsl6Lwp8srZY8uxepKv/X2TTEU0uIsk/Od67G2JrUtm1LDwYGdSNoxWSQ == X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvtddruddtjedgfedtucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvffukfhfgggtuggjsehgtderredttdejnecuhfhrohhmpefrrghtrhhi tghkucfuthgvihhnhhgrrhguthcuoehpshesphhkshdrihhmqeenucggtffrrghtthgvrh hnpeehgfejueevjeetudehgffffeffvdejfeejiedvkeffgfekuefgheevteeufeelkeen ucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehpshesph hkshdrihhm X-ME-Proxy: Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 24 Aug 2021 06:36:57 -0400 (EDT) Received: from localhost (ncase [10.192.0.11]) by vm-mail.pks.im (OpenSMTPD) with ESMTPSA id 13460ce9 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Tue, 24 Aug 2021 10:36:56 +0000 (UTC) Date: Tue, 24 Aug 2021 12:36:55 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Jeff King , =?iso-8859-1?q?=C6var_Arnfj=F6r=F0?= Bjarmason , Junio C Hamano , Derrick Stolee , =?iso-8859-1?q?Ren=E9?= Scharfe Subject: [PATCH v2 1/7] fetch: speed up lookup of want refs via commit-graph Message-ID: <4a819a68309bf03db2d9a5e5be070e52c3542af8.1629800774.git.ps@pks.im> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When updating our local refs based on the refs fetched from the remote, we need to iterate through all requested refs and load their respective commits such that we can determine whether they need to be appended to FETCH_HEAD or not. In cases where we're fetching from a remote with exceedingly many refs, resolving these refs can be quite expensive given that we repeatedly need to unpack object headers for each of the referenced objects. Speed this up by opportunistcally trying to resolve object IDs via the commit graph. We only do so for any refs which are not in "refs/tags": more likely than not, these are going to be a commit anyway, and this lets us avoid having to unpack object headers completely in case the object is a commit that is part of the commit-graph. This significantly speeds up mirror-fetches in a real-world repository with 2.3M refs: Benchmark #1: HEAD~: git-fetch Time (mean ± σ): 56.482 s ± 0.384 s [User: 53.340 s, System: 5.365 s] Range (min … max): 56.050 s … 57.045 s 5 runs Benchmark #2: HEAD: git-fetch Time (mean ± σ): 33.727 s ± 0.170 s [User: 30.252 s, System: 5.194 s] Range (min … max): 33.452 s … 33.871 s 5 runs Summary 'HEAD: git-fetch' ran 1.67 ± 0.01 times faster than 'HEAD~: git-fetch' Signed-off-by: Patrick Steinhardt --- builtin/fetch.c | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index e064687dbd..91d1301613 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1074,7 +1074,6 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, int connectivity_checked, struct ref *ref_map) { struct fetch_head fetch_head; - struct commit *commit; int url_len, i, rc = 0; struct strbuf note = STRBUF_INIT, err = STRBUF_INIT; struct ref_transaction *transaction = NULL; @@ -1122,6 +1121,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, want_status <= FETCH_HEAD_IGNORE; want_status++) { for (rm = ref_map; rm; rm = rm->next) { + struct commit *commit = NULL; struct ref *ref = NULL; if (rm->status == REF_STATUS_REJECT_SHALLOW) { @@ -1131,11 +1131,23 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, continue; } - commit = lookup_commit_reference_gently(the_repository, - &rm->old_oid, - 1); - if (!commit) - rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE; + /* + * References in "refs/tags/" are often going to point + * to annotated tags, which are not part of the + * commit-graph. We thus only try to look up refs in + * the graph which are not in that namespace to not + * regress performance in repositories with many + * annotated tags. + */ + if (!starts_with(rm->name, "refs/tags/")) + commit = lookup_commit_in_graph(the_repository, &rm->old_oid); + if (!commit) { + commit = lookup_commit_reference_gently(the_repository, + &rm->old_oid, + 1); + if (!commit) + rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE; + } if (rm->fetch_head_status != want_status) continue;