Message ID | db854806498b316b4f59c33d70a1ea9c1401fdb7.1627896460.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Speed up connectivity checks | expand |
On Mon, Aug 02 2021, Patrick Steinhardt wrote: > [[PGP Signed Part:Undecided]] > When queueing up references for the revision walk, `handle_one_ref()` > will resolve the reference's object ID and then queue the ID as pending > object via `add_pending_oid()`. But `add_pending_oid()` will again try > to resolve the object ID to an object, effectively duplicating the work > its caller already did before. > > Fix the issue by instead calling `add_pending_object()`, which takes the > already-resolved object as input. In a repository with lots of refs, > this translates in a nearly 10% speedup: > > Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev > Time (mean ± σ): 5.015 s ± 0.038 s [User: 4.698 s, System: 0.316 s] > Range (min … max): 4.970 s … 5.089 s 10 runs > > Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev > Time (mean ± σ): 4.606 s ± 0.029 s [User: 4.260 s, System: 0.345 s] > Range (min … max): 4.565 s … 4.657 s 10 runs > > Summary > 'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran > 1.09 ± 0.01 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' It might be worth calling out explicitly that it's not just "effectively", but that add_pending_oid() is just a thin wrapper for get_reference() followed by add_pending_object(), so we're guaranteed to get the exact same result here as before, just without the duplicate work. I.e. we're not going down some other lookup path that uses object lookups with different flags or whatever as a result of this change.
diff --git a/revision.c b/revision.c index 7eb09009ba..f06a5d63a3 100644 --- a/revision.c +++ b/revision.c @@ -1534,7 +1534,7 @@ static int handle_one_ref(const char *path, const struct object_id *oid, object = get_reference(cb->all_revs, path, oid, cb->all_flags); add_rev_cmdline(cb->all_revs, object, path, REV_CMD_REF, cb->all_flags); - add_pending_oid(cb->all_revs, path, oid, cb->all_flags); + add_pending_object(cb->all_revs, object, path); return 0; }
When queueing up references for the revision walk, `handle_one_ref()` will resolve the reference's object ID and then queue the ID as pending object via `add_pending_oid()`. But `add_pending_oid()` will again try to resolve the object ID to an object, effectively duplicating the work its caller already did before. Fix the issue by instead calling `add_pending_object()`, which takes the already-resolved object as input. In a repository with lots of refs, this translates in a nearly 10% speedup: Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev Time (mean ± σ): 5.015 s ± 0.038 s [User: 4.698 s, System: 0.316 s] Range (min … max): 4.970 s … 5.089 s 10 runs Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev Time (mean ± σ): 4.606 s ± 0.029 s [User: 4.260 s, System: 0.345 s] Range (min … max): 4.565 s … 4.657 s 10 runs Summary 'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran 1.09 ± 0.01 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' Signed-off-by: Patrick Steinhardt <ps@pks.im> --- revision.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)