Message ID | 5e0dee7beba083159f4277ddcd9e931859239bde.1619104091.git.matheus.bernardino@usp.br (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Parallel Checkout (part 3) | expand |
On 4/22/2021 11:17 AM, Matheus Tavares wrote: > There is one code path in builtin/checkout.c which still doesn't benefit > from parallel checkout because it calls checkout_entry() directly,> instead of unpack_trees(). Let's add parallel checkout support for this > missing spot as well. I couldn't tell immediately from the patch what would trigger this code path. I had to trace the method calls to discover that it is for the case of a pathspec-limited checkout: git checkout <ref> -- <pathspec> I confirmed that this does work with this change, but it might be nice to have a test that verifies that parallelism is triggering for this case. Looking ahead to patches 4-6, which add tests, I do not see one for this code path. Yes, patch 7 will implicitly test it through optional settings, but it would be nice to verify that the code is actually using parallel workers. The test_checkout_workers helper in patch 4 should be helpful for this effort. Please point out the test that covers this case, in case I'm just not seeing it. The good news is that I can see a difference. By alternating checkouts of the Git repository's "t" directory between v2.20 and v2.31.1, I can see these results for varying numbers of workers: Benchmark #1: 16 workers Time (mean ± σ): 108.6 ms ± 5.2 ms [User: 146.1 ms, System: 146.1 ms] Range (min … max): 95.5 ms … 124.9 ms 100 runs Benchmark #2: 8 workers Time (mean ± σ): 104.8 ms ± 4.8 ms [User: 128.3 ms, System: 131.7 ms] Range (min … max): 94.2 ms … 119.0 ms 100 runs Benchmark #3: 4 workers Time (mean ± σ): 112.3 ms ± 6.2 ms [User: 114.6 ms, System: 112.1 ms] Range (min … max): 100.0 ms … 127.4 ms 100 runs Benchmark #4: 2 workers Time (mean ± σ): 124.2 ms ± 4.2 ms [User: 106.5 ms, System: 102.0 ms] Range (min … max): 114.8 ms … 136.3 ms 100 runs Benchmark #5: sequential Time (mean ± σ): 154.6 ms ± 6.7 ms [User: 83.5 ms, System: 79.4 ms] Range (min … max): 142.1 ms … 176.0 ms 100 runs Summary '8 workers' ran 1.04 ± 0.07 times faster than '16 workers' 1.07 ± 0.08 times faster than '4 workers' 1.19 ± 0.07 times faster than '2 workers' 1.48 ± 0.09 times faster than 'sequential' (Note: these time measurements are for the round-trip of two checkout commands.) > @@ -359,16 +360,22 @@ static int checkout_worktree(const struct checkout_opts *opts, > int nr_checkouts = 0, nr_unmerged = 0; > int errs = 0; > int pos; > + int pc_workers, pc_threshold; > + struct mem_pool ce_mem_pool; > > state.force = 1; > state.refresh_cache = 1; > state.istate = &the_index; > > + mem_pool_init(&ce_mem_pool, 0); > + get_parallel_checkout_configs(&pc_workers, &pc_threshold); > init_checkout_metadata(&state.meta, info->refname, > info->commit ? &info->commit->object.oid : &info->oid, > NULL); > > enable_delayed_checkout(&state); > + if (pc_workers > 1) > + init_parallel_checkout(); I'm late to looking at your parallel checkout work, but I find this to be a really nice API to get things initialized. > else if (opts->merge) > errs |= checkout_merged(pos, &state, > - &nr_unmerged); > + &nr_unmerged, > + &ce_mem_pool); I see the need to populate these merged entries in the pool for now, before... > pos = skip_same_name(ce, pos) - 1; > } > } > + if (pc_workers > 1) > + errs |= run_parallel_checkout(&state, pc_workers, pc_threshold, > + NULL, NULL); > + mem_pool_discard(&ce_mem_pool, should_validate_cache_entries()); ...now running the parallel checkout and discarding the entries. Thanks, -Stolee
On Fri, Apr 23, 2021 at 1:19 PM Derrick Stolee <stolee@gmail.com> wrote: > > On 4/22/2021 11:17 AM, Matheus Tavares wrote: > > There is one code path in builtin/checkout.c which still doesn't benefit > > from parallel checkout because it calls checkout_entry() directly,> instead of unpack_trees(). Let's add parallel checkout support for this > > missing spot as well. > > I couldn't tell immediately from the patch what would trigger this > code path. I had to trace the method calls to discover that it is > for the case of a pathspec-limited checkout: > > git checkout <ref> -- <pathspec> Oops, I should have mentioned that in the commit message. Thanks for pointing it out. > I confirmed that this does work with this change, but it might be > nice to have a test that verifies that parallelism is triggering for > this case. > > Looking ahead to patches 4-6, which add tests, I do not see one for this > code path. Yes, patch 7 will implicitly test it through optional > settings, but it would be nice to verify that the code is actually using > parallel workers. The test_checkout_workers helper in patch 4 should be > helpful for this effort. > > Please point out the test that covers this case, in case I'm just not > seeing it. Hmm, there are some tests at t2081 and t2082 that check the pathspec-limited case with parallel workers. For example the collision tests run `test_checkout_workers 2 git checkout .`. We also test direct pathnames in t2082, using `test_checkout_workers 2 git checkout A B`. > The good news is that I can see a difference. By alternating checkouts > of the Git repository's "t" directory between v2.20 and v2.31.1, I can > see these results for varying numbers of workers: > > Benchmark #1: 16 workers > Time (mean ± σ): 108.6 ms ± 5.2 ms [User: 146.1 ms, System: 146.1 ms] > Range (min … max): 95.5 ms … 124.9 ms 100 runs > > Benchmark #2: 8 workers > Time (mean ± σ): 104.8 ms ± 4.8 ms [User: 128.3 ms, System: 131.7 ms] > Range (min … max): 94.2 ms … 119.0 ms 100 runs > > Benchmark #3: 4 workers > Time (mean ± σ): 112.3 ms ± 6.2 ms [User: 114.6 ms, System: 112.1 ms] > Range (min … max): 100.0 ms … 127.4 ms 100 runs > > Benchmark #4: 2 workers > Time (mean ± σ): 124.2 ms ± 4.2 ms [User: 106.5 ms, System: 102.0 ms] > Range (min … max): 114.8 ms … 136.3 ms 100 runs > > Benchmark #5: sequential > Time (mean ± σ): 154.6 ms ± 6.7 ms [User: 83.5 ms, System: 79.4 ms] > Range (min … max): 142.1 ms … 176.0 ms 100 runs > > Summary > '8 workers' ran > 1.04 ± 0.07 times faster than '16 workers' > 1.07 ± 0.08 times faster than '4 workers' > 1.19 ± 0.07 times faster than '2 workers' > 1.48 ± 0.09 times faster than 'sequential' Nice! Thanks for the benchmark! > (Note: these time measurements are for the round-trip of two checkout > commands.) > > @@ -359,16 +360,22 @@ static int checkout_worktree(const struct checkout_opts *opts, > > int nr_checkouts = 0, nr_unmerged = 0; > > int errs = 0; > > int pos; > > + int pc_workers, pc_threshold; > > + struct mem_pool ce_mem_pool; > > > > state.force = 1; > > state.refresh_cache = 1; > > state.istate = &the_index; > > > > + mem_pool_init(&ce_mem_pool, 0); > > + get_parallel_checkout_configs(&pc_workers, &pc_threshold); > > init_checkout_metadata(&state.meta, info->refname, > > info->commit ? &info->commit->object.oid : &info->oid, > > NULL); > > > > enable_delayed_checkout(&state); > > + if (pc_workers > 1) > > + init_parallel_checkout(); > > I'm late to looking at your parallel checkout work, but I find this > to be a really nice API to get things initialized. Thanks :)
diff --git a/builtin/checkout.c b/builtin/checkout.c index db667d0267..b71dc08430 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -27,6 +27,7 @@ #include "wt-status.h" #include "xdiff-interface.h" #include "entry.h" +#include "parallel-checkout.h" static const char * const checkout_usage[] = { N_("git checkout [<options>] <branch>"), @@ -230,7 +231,8 @@ static int checkout_stage(int stage, const struct cache_entry *ce, int pos, return error(_("path '%s' does not have their version"), ce->name); } -static int checkout_merged(int pos, const struct checkout *state, int *nr_checkouts) +static int checkout_merged(int pos, const struct checkout *state, + int *nr_checkouts, struct mem_pool *ce_mem_pool) { struct cache_entry *ce = active_cache[pos]; const char *path = ce->name; @@ -291,11 +293,10 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid)) die(_("Unable to add merge result for '%s'"), path); free(result_buf.ptr); - ce = make_transient_cache_entry(mode, &oid, path, 2, NULL); + ce = make_transient_cache_entry(mode, &oid, path, 2, ce_mem_pool); if (!ce) die(_("make_cache_entry failed for path '%s'"), path); status = checkout_entry(ce, state, NULL, nr_checkouts); - discard_cache_entry(ce); return status; } @@ -359,16 +360,22 @@ static int checkout_worktree(const struct checkout_opts *opts, int nr_checkouts = 0, nr_unmerged = 0; int errs = 0; int pos; + int pc_workers, pc_threshold; + struct mem_pool ce_mem_pool; state.force = 1; state.refresh_cache = 1; state.istate = &the_index; + mem_pool_init(&ce_mem_pool, 0); + get_parallel_checkout_configs(&pc_workers, &pc_threshold); init_checkout_metadata(&state.meta, info->refname, info->commit ? &info->commit->object.oid : &info->oid, NULL); enable_delayed_checkout(&state); + if (pc_workers > 1) + init_parallel_checkout(); for (pos = 0; pos < active_nr; pos++) { struct cache_entry *ce = active_cache[pos]; if (ce->ce_flags & CE_MATCHED) { @@ -384,10 +391,15 @@ static int checkout_worktree(const struct checkout_opts *opts, &nr_checkouts, opts->overlay_mode); else if (opts->merge) errs |= checkout_merged(pos, &state, - &nr_unmerged); + &nr_unmerged, + &ce_mem_pool); pos = skip_same_name(ce, pos) - 1; } } + if (pc_workers > 1) + errs |= run_parallel_checkout(&state, pc_workers, pc_threshold, + NULL, NULL); + mem_pool_discard(&ce_mem_pool, should_validate_cache_entries()); remove_marked_cache_entries(&the_index, 1); remove_scheduled_dirs(); errs |= finish_delayed_checkout(&state, &nr_checkouts);
There is one code path in builtin/checkout.c which still doesn't benefit from parallel checkout because it calls checkout_entry() directly, instead of unpack_trees(). Let's add parallel checkout support for this missing spot as well. Note: the transient cache entries allocated in checkout_merged() are now allocated in a mem_pool which is only discarded after parallel checkout finishes. This is done because the entries need to be valid when run_parallel_checkout() is called. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> --- builtin/checkout.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)