Message ID | cover.1629800774.git.ps@pks.im (mailing list archive) |
---|---|
Series | Speed up mirror-fetches with many refs |
Patrick Steinhardt <ps@pks.im> writes:

> this is the second version of my patch series to speed up mirror-fetches
> with many refs. This topic applies on top of Junio's 9d5700f60b (Merge
> branch 'ps/connectivity-optim' into jch, 2021-08-23).

It is a horrible commit to base anything on. You are taking your
patches hostage to all of these other topics.

    9d5700f60b Merge branch 'ps/connectivity-optim' into jch
    7ad315de2f Merge branch 'js/log-protocol-version' into jch
    1726f748f5 Merge branch 'en/ort-becomes-the-default' into jch
    23aeecb099 Merge branch 'en/merge-strategy-docs' into jch
    568277d458 Merge branch 'en/pull-conflicting-options' into jch
    2b316bb006 ### match next
    4efa9ea0b6 Merge branch 'ps/fetch-pack-load-refs-optim' into jch
    b305842ee8 Merge branch 'jt/push-negotiation-fixes' into jch
    83b45616f1 Merge branch 'es/trace2-log-parent-process-name' into jch
    be89aa8c38 Merge branch 'hn/refs-test-cleanup' into jch
    256d56ed32 Merge branch 'en/ort-perf-batch-15' into jch
    7477fbf53a Merge branch 'js/expand-runtime-prefix' into jch
    b1453dfd30 Merge branch 'ab/bundle-doc' into jch
    1b66e8e89d Merge branch 'zh/ref-filter-raw-data' into jch
    1fbf27ddcd Merge branch 'ab/pack-stdin-packs-fix' into jch
    dcf57bfebb Merge branch 'ab/http-drop-old-curl' into jch
    93041f7c57 Merge branch 'ds/add-with-sparse-index' into jch
    814a016195 Merge branch 'jc/bisect-sans-show-branch' into jch

A better way to handle a situation like this is to limit your
dependencies more explicitly. If you look at what I did to the last
round of this topic, you'll see that there is a merge of the
'ps/connectivity-optim' topic into v2.33 followed by application of
the patches, like this:

    1d576ca7b2 fetch: avoid second connectivity check if we already have all objects
    6768595f10 fetch: refactor fetch refs to be more extendable
    a615d7cf87 fetch-pack: optimize loading of refs via commit graph
    bfd04fc24c connected: refactor iterator to return next object ID directly
    1a387c9f3a fetch: avoid unpacking headers in object existence check
    f1a4367ec4 fetch: speed up lookup of want refs via commit-graph
    3628199d4d Merge branch 'ps/connectivity-optim' into ps/fetch-optim

What I did to your last round was to merge 'ps/connectivity-optim'
on top of v2.33 and then queue them. You can do the same for this
round (you can tell people "apply these on top of the result of
merging topic X, Y and Z on tag V").

    df52ef2c3a fetch: avoid second connectivity check if we already have all objects
    c1721680e4 fetch: merge fetching and consuming refs
    5470cbe1be fetch: refactor fetch refs to be more extendable
    016a510428 fetch-pack: optimize loading of refs via commit graph
    f6c7e63cc7 connected: refactor iterator to return next object ID directly
    17c8e90df3 fetch: avoid unpacking headers in object existence check
    a54c245004 fetch: speed up lookup of want refs via commit-graph
    3628199d4d Merge branch 'ps/connectivity-optim' into ps/fetch-optim

I had to adjust [4/7] while applying them on top of the same
3628199d4d I created for queuing the previous round, and it would be
appreciated if you can double-check the result.

Thanks.
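
For readers following along, the "merge the topic you depend on into a tag, then apply the patches" approach can be sketched in a throwaway repository. This is only an illustration under stated assumptions: `v-base`, `topic-dep` and `my-series` are hypothetical stand-ins for the v2.33 tag, 'ps/connectivity-optim' and 'ps/fetch-optim', and the commits are dummies rather than anything from git.git.

```shell
#!/bin/sh
set -eu

# Throwaway repository. All ref names below (v-base, topic-dep, my-series)
# are hypothetical stand-ins, not refs that exist in git.git.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Author"
git config user.email "author@example.com"

# A tagged release that serves as the stable base (stand-in for v2.33).
echo base >file.txt
git add file.txt
git commit -q -m "base release"
git tag v-base

# The single topic the series depends on (stand-in for ps/connectivity-optim).
git checkout -q -b topic-dep v-base
echo dep >dep.txt
git add dep.txt
git commit -q -m "dependency topic"

# Start the series from the tag, merge only the needed topic, then add the
# patches on top, instead of forking from an integration branch.
git checkout -q -b my-series v-base
git merge -q --no-ff -m "Merge branch 'topic-dep' into my-series" topic-dep
echo change >>file.txt
git commit -q -a -m "first patch of the series"

# List what the series carries beyond the tag: the dependency, the merge,
# and the patch itself.
git log --oneline v-base..my-series
```

A series built this way depends only on the tag and the named topic, so the cover letter can describe its base in one sentence and the maintainer can recreate it without pulling in unrelated topics.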

On Tue, Aug 24, 2021 at 03:48:19PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
[snip]
> A better way to handle a situation like this is to limit your
> dependencies more explicitly. If you look at what I did to the last
> round of this topic, you'll see that there is a merge of the
> 'ps/connectivity-optim' topic into v2.33 followed by application of
> the patches, like this:

I wasn't quite sure how to best handle this, but I'll keep this in
mind for future iterations/patch series. Thanks for the explanation.

[snip]
> I had to adjust [4/7] while applying them on top of the same
> 3628199d4d I created for queuing the previous round, and it would be
> appreciated if you can double-check the result.

The result looks good to me, thanks.

Patrick

On 8/24/2021 6:36 AM, Patrick Steinhardt wrote:
> Changes compared to v1:
>
>   - Patch 1/7: I've applied Stolee's proposal to only
>     opportunistically load objects via the commit-graph in case the
>     reference is not in refs/tags/ such that we don't regress repos
>     with many annotated tags.
>
>   - Patch 3/7: The return parameter of the iterator is now const to
>     allow further optimizations by the compiler, as suggested by
>     René. I've also re-benchmarked this, and one can now see a very
>     slight performance improvement of ~1%.
>
>   - Patch 4/7: Added my missing DCO, as pointed out by Junio.
>
>   - Patch 5, 6, 7: I've redone these to make it clearer that the
>     refactoring I'm doing doesn't cause us to miss any object
>     connectivity checks. Most importantly, I've merged `fetch_refs()`
>     and `consume_refs()` into `fetch_and_consume_refs()` in 6/7, which
>     makes the optimization where we elide the second connectivity
>     check in 7/7 trivial.

These changes are positive. My read through this set of patches had
only a few nit-picks.

Thanks,
-Stolee