[v3,0/2] fetch: speed up mirror-fetches with many refs

Message ID cover.1644495978.git.ps@pks.im (mailing list archive)

Message

Patrick Steinhardt Feb. 10, 2022, 12:28 p.m. UTC
Hi,

this is the third version of my patch series which aims to speed up
mirror-fetches in repos with huge amounts of refs. Again, the only
change compared to v2 is a change in commit messages: Chris has rightly
pointed out that the benchmarks were a bit confusing, so I've updated
them to hopefully be less so.

Thanks for your feedback!

Patrick

Patrick Steinhardt (2):
  fetch-pack: use commit-graph when computing cutoff
  fetch: skip computing output width when not printing anything

 builtin/fetch.c |  8 ++++++--
 fetch-pack.c    | 28 ++++++++++++++++------------
 2 files changed, 22 insertions(+), 14 deletions(-)

Range-diff against v2:
1:  6fac914f0f ! 1:  077d06764c fetch-pack: use commit-graph when computing cutoff
    @@ Commit message
         the commit-graph first, which is a lot more efficient.
     
         Benchmarks in a repository with about 2.1 million refs and an up-to-date
    -    commit-graph show a 20% speedup when mirror-fetching:
    +    commit-graph show an almost 20% speedup when mirror-fetching:
     
    -        Benchmark 1: git fetch --atomic +refs/*:refs/* (v2.35.0)
    -          Time (mean ± σ):     75.264 s ±  1.115 s    [User: 68.199 s, System: 10.094 s]
    -          Range (min … max):   74.145 s … 76.862 s    5 runs
    +        Benchmark 1: git fetch +refs/*:refs/* (v2.35.0)
    +          Time (mean ± σ):     115.587 s ±  2.009 s    [User: 109.874 s, System: 11.305 s]
    +          Range (min … max):   113.584 s … 118.820 s    5 runs
     
    -        Benchmark 2: git fetch --atomic +refs/*:refs/* (HEAD)
    -          Time (mean ± σ):     62.350 s ±  0.854 s    [User: 55.412 s, System: 9.976 s]
    -          Range (min … max):   61.224 s … 63.216 s    5 runs
    +        Benchmark 2: git fetch +refs/*:refs/* (HEAD)
    +          Time (mean ± σ):     96.859 s ±  0.624 s    [User: 91.948 s, System: 10.980 s]
    +          Range (min … max):   96.180 s … 97.875 s    5 runs
     
             Summary
    -          'git fetch --atomic +refs/*:refs/* (HEAD)' ran
    -            1.21 ± 0.02 times faster than 'git fetch --atomic +refs/*:refs/* (v2.35.0)'
    +          'git fetch +refs/*:refs/* (HEAD)' ran
    +            1.19 ± 0.02 times faster than 'git fetch +refs/*:refs/* (v2.35.0)'
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
2:  4b9bbcf795 ! 2:  ef1fd07be5 fetch: skip computing output width when not printing anything
    @@ Commit message
         don't print the summary, but still compute the length.
     
         Skip computing the summary width when the user asked for us to be quiet.
    -    This gives us a small speedup of nearly 10% when doing a dry-run
    -    mirror-fetch in a repository with thousands of references being updated:
    +    This gives us a speedup of nearly 10% when doing a mirror-fetch in a
    +    repository with thousands of references being updated:
     
    -        Benchmark 1: git fetch --prune --dry-run +refs/*:refs/* (HEAD~)
    -          Time (mean ± σ):     34.048 s ±  0.233 s    [User: 30.739 s, System: 4.640 s]
    -          Range (min … max):   33.785 s … 34.296 s    5 runs
    +        Benchmark 1: git fetch --quiet +refs/*:refs/* (HEAD~)
    +          Time (mean ± σ):     96.078 s ±  0.508 s    [User: 91.378 s, System: 10.870 s]
    +          Range (min … max):   95.449 s … 96.760 s    5 runs
     
    -        Benchmark 2: git fetch --prune --dry-run +refs/*:refs/* (HEAD)
    -          Time (mean ± σ):     30.768 s ±  0.287 s    [User: 27.534 s, System: 4.565 s]
    -          Range (min … max):   30.432 s … 31.181 s    5 runs
    +        Benchmark 2: git fetch --quiet +refs/*:refs/* (HEAD)
    +          Time (mean ± σ):     88.214 s ±  0.192 s    [User: 83.274 s, System: 10.978 s]
    +          Range (min … max):   87.998 s … 88.446 s    5 runs
     
             Summary
    -          'git fetch --prune --dry-run +refs/*:refs/* (HEAD)' ran
    -            1.11 ± 0.01 times faster than 'git fetch --prune --dry-run +refs/*:refs/* (HEAD~)'
    +          'git fetch --quiet +refs/*:refs/* (HEAD)' ran
    +            1.09 ± 0.01 times faster than 'git fetch --quiet +refs/*:refs/* (HEAD~)'
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
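
As a quick cross-check of the figures in the updated commit messages, the
sketch below recomputes the speedup factors from the mean wall-clock times
quoted above. The dictionary keys and helper names are made up for
illustration, and only the ratio of means is computed (not the uncertainty
reported in the Summary lines).

```python
# Mean wall-clock times in seconds (before, after), copied from the
# v3 benchmark output quoted in the range-diff above.
benchmarks = {
    "fetch-pack: use commit-graph when computing cutoff": (115.587, 96.859),
    "fetch: skip computing output width": (96.078, 88.214),
}


def speedup(before, after):
    """Return how many times faster the 'after' run is (ratio of means)."""
    return before / after


def time_saved_percent(before, after):
    """Return the reduction in wall-clock time as a percentage of 'before'."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    for name, (before, after) in benchmarks.items():
        print(
            f"{name}: {speedup(before, after):.2f}x faster, "
            f"{time_saved_percent(before, after):.1f}% less wall-clock time"
        )
```

Note that "times faster" and "less wall-clock time" are different measures:
a ratio of about 1.19 corresponds to roughly 16% less time spent, and a
ratio of about 1.09 to roughly 8% less.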

Comments

Junio C Hamano Feb. 10, 2022, 6:04 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> this is the third version of my patch series which aims to speed up
> mirror-fetches in repos with huge amounts of refs. Again, the only
> change compared to v2 is a change in commit messages: Chris has rightly
> pointed out that the benchmarks were a bit confusing, so I've updated
> them to hopefully be less so.
>
> Thanks for your feedback!
>
> Patrick
>
> Patrick Steinhardt (2):
>   fetch-pack: use commit-graph when computing cutoff
>   fetch: skip computing output width when not printing anything

Both changes are based on quite sensible ideas.  If we have a
precomputed date for each commit, it makes sense to look it up
before parsing the commit.  If we are not preparing output, there is
no point in computing the output width.

Very simple and potentially effective.

Will queue.  Thanks.
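
The two ideas discussed in this thread can be illustrated with a small toy
model. This is only a sketch and not Git's actual C code: the `Repo` class,
`commit_date`, `compute_cutoff`, `summary_width` and `report` are all
hypothetical names, the commit-graph is modelled as a plain dictionary of
precomputed dates, and the cutoff is assumed to be the newest commit date
among the given ref tips.

```python
class Repo:
    """Hypothetical stand-in for a repository; not a Git API."""

    def __init__(self, commit_graph, objects):
        self.commit_graph = commit_graph  # oid -> precomputed commit date
        self.objects = objects            # oid -> raw commit text
        self.parse_count = 0              # how often we took the slow path

    def parse_commit_date(self, oid):
        """Slow path: parse the raw commit to find the committer date."""
        self.parse_count += 1
        for line in self.objects[oid].splitlines():
            if line.startswith("committer "):
                # "committer Name <email> <timestamp> <tz>"
                return int(line.split()[-2])
        raise ValueError(f"no committer line in {oid}")


def commit_date(repo, oid):
    """Idea 1: consult the precomputed dates before parsing the commit."""
    date = repo.commit_graph.get(oid)
    if date is not None:
        return date
    return repo.parse_commit_date(oid)


def compute_cutoff(repo, oids):
    """Assumed definition: newest commit date among the given tips."""
    return max((commit_date(repo, oid) for oid in oids), default=0)


def summary_width(ref_names):
    """Toy width computation: one pass over every ref."""
    return max((len(name) for name in ref_names), default=0)


def report(ref_names, quiet):
    """Idea 2: skip the width pass entirely when nothing is printed."""
    if quiet:
        return []
    width = summary_width(ref_names)
    return [f"{name:<{width}}  updated" for name in ref_names]


if __name__ == "__main__":
    repo = Repo(
        commit_graph={"a1": 100, "b2": 300},
        objects={"c3": "tree 0\ncommitter A U Thor <a@example.com> 200 +0000\n"},
    )
    print(compute_cutoff(repo, ["a1", "b2", "c3"]), repo.parse_count)
    print(report(["refs/heads/main", "refs/tags/v1"], quiet=False))
```

In this model only the commit missing from the precomputed table is parsed,
and a quiet run never iterates over the refs to compute a width, which is
the kind of per-ref work that adds up with millions of refs.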