[v3,2/4] revision: stop retrieving reference twice

Message ID	db854806498b316b4f59c33d70a1ea9c1401fdb7.1627896460.git.ps@pks.im (mailing list archive)
State	Superseded
Headers	show Return-Path: <git-owner@kernel.org> Date: Mon, 2 Aug 2021 11:38:06 +0200 From: Patrick Steinhardt <ps@pks.im> To: git@vger.kernel.org Cc: Jeff King <peff@peff.net>, Felipe Contreras <felipe.contreras@gmail.com>, SZEDER =?iso-8859-1?q?G=E1bor?= <szeder.dev@gmail.com>, Chris Torek <chris.torek@gmail.com>, =?iso-8859-1?q?=C6var_Arnfj=F6r=F0?= Bjarmason <avarab@gmail.com>, Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com> Subject: [PATCH v3 2/4] revision: stop retrieving reference twice Message-ID: <db854806498b316b4f59c33d70a1ea9c1401fdb7.1627896460.git.ps@pks.im> References: <cover.1624858240.git.ps@pks.im> <cover.1627896460.git.ps@pks.im> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="uAzoqpYH79JdQWvO" Content-Disposition: inline In-Reply-To: <cover.1627896460.git.ps@pks.im> Precedence: bulk
Series	Speed up connectivity checks \| expand [v3,0/4] Speed up connectivity checks [v3,1/4] connected: do not sort input revisions [v3,2/4] revision: stop retrieving reference twice [v3,3/4] revision: avoid loading object headers multiple times [v3,4/4] revision: avoid hitting packfiles when commits are in commit-graph

Message ID

db854806498b316b4f59c33d70a1ea9c1401fdb7.1627896460.git.ps@pks.im (mailing list archive)

State

Superseded

Headers

Date: Mon, 2 Aug 2021 11:38:06 +0200
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Felipe Contreras <felipe.contreras@gmail.com>,
 SZEDER =?iso-8859-1?q?G=E1bor?= <szeder.dev@gmail.com>,
 Chris Torek <chris.torek@gmail.com>,
 =?iso-8859-1?q?=C6var_Arnfj=F6r=F0?= Bjarmason <avarab@gmail.com>,
 Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>
Subject: [PATCH v3 2/4] revision: stop retrieving reference twice
Message-ID: 
 <db854806498b316b4f59c33d70a1ea9c1401fdb7.1627896460.git.ps@pks.im>
References: <cover.1624858240.git.ps@pks.im>
 <cover.1627896460.git.ps@pks.im>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512;
        protocol="application/pgp-signature"; boundary="uAzoqpYH79JdQWvO"
Content-Disposition: inline
In-Reply-To: <cover.1627896460.git.ps@pks.im>
Precedence: bulk

Series

Speed up connectivity checks | expand

Commit Message

Patrick Steinhardt Aug. 2, 2021, 9:38 a.m. UTC

When queueing up references for the revision walk, `handle_one_ref()`
will resolve the reference's object ID and then queue the ID as pending
object via `add_pending_oid()`. But `add_pending_oid()` will again try
to resolve the object ID to an object, effectively duplicating the work
its caller already did before.

Fix the issue by instead calling `add_pending_object()`, which takes the
already-resolved object as input. In a repository with lots of refs,
this translates in a nearly 10% speedup:

    Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      5.015 s ±  0.038 s    [User: 4.698 s, System: 0.316 s]
      Range (min … max):    4.970 s …  5.089 s    10 runs

    Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      4.606 s ±  0.029 s    [User: 4.260 s, System: 0.345 s]
      Range (min … max):    4.565 s …  4.657 s    10 runs

    Summary
      'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran
        1.09 ± 0.01 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 revision.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ævar Arnfjörð Bjarmason Aug. 2, 2021, 12:53 p.m. UTC | #1

On Mon, Aug 02 2021, Patrick Steinhardt wrote:

> [[PGP Signed Part:Undecided]]
> When queueing up references for the revision walk, `handle_one_ref()`
> will resolve the reference's object ID and then queue the ID as pending
> object via `add_pending_oid()`. But `add_pending_oid()` will again try
> to resolve the object ID to an object, effectively duplicating the work
> its caller already did before.
>
> Fix the issue by instead calling `add_pending_object()`, which takes the
> already-resolved object as input. In a repository with lots of refs,
> this translates in a nearly 10% speedup:
>
>     Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
>       Time (mean ± σ):      5.015 s ±  0.038 s    [User: 4.698 s, System: 0.316 s]
>       Range (min … max):    4.970 s …  5.089 s    10 runs
>
>     Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
>       Time (mean ± σ):      4.606 s ±  0.029 s    [User: 4.260 s, System: 0.345 s]
>       Range (min … max):    4.565 s …  4.657 s    10 runs
>
>     Summary
>       'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran
>         1.09 ± 0.01 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev'

It might be worth calling out explicitly that it's not just
"effectively", but that add_pending_oid() is just a thin wrapper for
get_reference() followed by add_pending_object(), so we're guaranteed to
get the exact same result here as before, just without the duplicate
work.

I.e. we're not going down some other lookup path that uses object
lookups with different flags or whatever as a result of this change.

diff --git a/revision.c b/revision.c
index 7eb09009ba..f06a5d63a3 100644
--- a/revision.c
+++ b/revision.c
@@ -1534,7 +1534,7 @@  static int handle_one_ref(const char *path, const struct object_id *oid,
 
 	object = get_reference(cb->all_revs, path, oid, cb->all_flags);
 	add_rev_cmdline(cb->all_revs, object, path, REV_CMD_REF, cb->all_flags);
-	add_pending_oid(cb->all_revs, path, oid, cb->all_flags);
+	add_pending_object(cb->all_revs, object, path);
 	return 0;
 }

[v3,2/4] revision: stop retrieving reference twice

Commit Message

Comments

Patch