Message ID | b8a349c6deeb4b970075629d0c292b2ae9f7d0d3.1652724693.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Sparse index: integrate with sparse-checkout | expand |
Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee <dstolee@microsoft.com> > > When modifying the sparse-checkout definition, the sparse-checkout > builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all > cache entries in the index. Before, we needed the index to be fully > expanded in order to ensure we had the full list of files necessary that > match the new patterns. > > Insert a call to reset_sparse_directories() that expands sparse > directories that are within the new pattern list, but only far enough > that every necessary file path now exists as a cache entry. The > remaining logic within update_sparsity() will modify the SKIP_WORKTREE > bits appropriately. > > This allows us to disable command_requires_full_index within the > sparse-checkout builtin. Add tests that demonstrate that we are not > expanding to a full index unnecessarily. > > We can see the improved performance in the p2000 test script: > > Test HEAD~1 HEAD > ------------------------------------------------------------------------ > 2000.24: git ... (sparse-v3) 2.14(1.55+0.58) 1.57(1.03+0.53) -26.6% > 2000.25: git ... (sparse-v4) 2.20(1.62+0.57) 1.58(0.98+0.59) -28.2% > > These reductions of 26-28% are small compared to most examples, but the > time is dominated by writing a new copy of the base repository to the > worktree and then deleting it again. The fact that the previous index > expansion was such a large portion of the time is telling how important > it is to complete this sparse index integration. > > Signed-off-by: Derrick Stolee <derrickstolee@github.com> > --- > builtin/sparse-checkout.c | 3 +++ > t/t1092-sparse-checkout-compatibility.sh | 25 ++++++++++++++++++++++++ > unpack-trees.c | 4 ++++ > 3 files changed, 32 insertions(+) > > diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c > index cbff6ad00b0..0157b292b36 100644 > --- a/builtin/sparse-checkout.c > +++ b/builtin/sparse-checkout.c > @@ -937,6 +937,9 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix) > > git_config(git_default_config, NULL); > > + prepare_repo_settings(the_repository); > + the_repository->settings.command_requires_full_index = 0; > + > if (argc > 0) { > if (!strcmp(argv[0], "list")) > return sparse_checkout_list(argc, argv); > diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh > index 93bcfd20bbc..614357fc48c 100755 > --- a/t/t1092-sparse-checkout-compatibility.sh > +++ b/t/t1092-sparse-checkout-compatibility.sh > @@ -1552,6 +1552,31 @@ test_expect_success 'ls-files' ' > ensure_not_expanded ls-files --sparse > ' > > +test_expect_success 'sparse index is not expanded: sparse-checkout' ' > + init_repos && > + > + ensure_not_expanded sparse-checkout set deep/deeper2 && > + ensure_not_expanded sparse-checkout set deep/deeper1 && > + ensure_not_expanded sparse-checkout set deep && > + ensure_not_expanded sparse-checkout add folder1 && > + ensure_not_expanded sparse-checkout set deep/deeper1 && > + ensure_not_expanded sparse-checkout set folder2 && > + > + # Demonstrate that the checks that "folder1/a" is a file > + # do not cause a sparse-index expansion (since it is in the > + # sparse-checkout cone). > + echo >>sparse-index/folder2/a && > + git -C sparse-index add folder2/a && > + > + ensure_not_expanded sparse-checkout add folder1 && > + > + # Skip checks here, since deep/deeper1 is inside a sparse directory > + # that must be expanded to check whether `deep/deeper1` is a file > + # or not. > + ensure_not_expanded sparse-checkout set --skip-checks deep/deeper1 && > + ensure_not_expanded sparse-checkout set > +' > + These tests look good for ensuring sparsity is preserved, but it'd be nice to also have some "stress tests" of 'sparse-checkout (add|set)'. The purpose would be to make sure the index has the right contents for various types of pattern changes, e.g. running 'sparse-checkout (add|set) <path>', then verifying index contents with 'ls-files --sparse'. Paths might be: - in vs. out of (current) cone - match an existing vs. nonexistent directory etc. > # NEEDSWORK: a sparse-checkout behaves differently from a full checkout > # in this scenario, but it shouldn't. > test_expect_success 'reset mixed and checkout orphan' ' > diff --git a/unpack-trees.c b/unpack-trees.c > index 7f528d35cc2..9745e0dfc34 100644 > --- a/unpack-trees.c > +++ b/unpack-trees.c > @@ -18,6 +18,7 @@ > #include "promisor-remote.h" > #include "entry.h" > #include "parallel-checkout.h" > +#include "sparse-index.h" > > /* > * Error messages expected by scripts out of plumbing commands such as > @@ -2018,6 +2019,9 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o) > goto skip_sparse_checkout; > } > > + /* Expand sparse directories as needed */ > + expand_to_pattern_list(o->src_index, o->pl); > + > /* Set NEW_SKIP_WORKTREE on existing entries. */ > mark_all_ce_unused(o->src_index); > mark_new_skip_worktree(o->pl, o->src_index, 0,
On 5/16/2022 4:38 PM, Victoria Dye wrote: > Derrick Stolee via GitGitGadget wrote: >> From: Derrick Stolee <dstolee@microsoft.com> >> +test_expect_success 'sparse index is not expanded: sparse-checkout' ' >> + init_repos && >> + >> + ensure_not_expanded sparse-checkout set deep/deeper2 && >> + ensure_not_expanded sparse-checkout set deep/deeper1 && >> + ensure_not_expanded sparse-checkout set deep && >> + ensure_not_expanded sparse-checkout add folder1 && >> + ensure_not_expanded sparse-checkout set deep/deeper1 && >> + ensure_not_expanded sparse-checkout set folder2 && >> + >> + # Demonstrate that the checks that "folder1/a" is a file >> + # do not cause a sparse-index expansion (since it is in the >> + # sparse-checkout cone). >> + echo >>sparse-index/folder2/a && >> + git -C sparse-index add folder2/a && >> + >> + ensure_not_expanded sparse-checkout add folder1 && >> + >> + # Skip checks here, since deep/deeper1 is inside a sparse directory >> + # that must be expanded to check whether `deep/deeper1` is a file >> + # or not. >> + ensure_not_expanded sparse-checkout set --skip-checks deep/deeper1 && >> + ensure_not_expanded sparse-checkout set >> +' >> + > > These tests look good for ensuring sparsity is preserved, but it'd be nice > to also have some "stress tests" of 'sparse-checkout (add|set)'. The purpose > would be to make sure the index has the right contents for various types of > pattern changes, e.g. running 'sparse-checkout (add|set) <path>', then > verifying index contents with 'ls-files --sparse'. Paths might be: > > - in vs. out of (current) cone > - match an existing vs. nonexistent directory > > etc. I guess I was relying on tests added previously for the sparse index, such as this one: test_expect_success 'sparse-index contents' ' init_repos && git -C sparse-index ls-files --sparse --stage >cache && for dir in folder1 folder2 x do TREE=$(git -C sparse-index rev-parse HEAD:$dir) && grep "040000 $TREE 0 $dir/" cache \ || return 1 done && git -C sparse-index sparse-checkout set folder1 && git -C sparse-index ls-files --sparse --stage >cache && for dir in deep folder2 x do TREE=$(git -C sparse-index rev-parse HEAD:$dir) && grep "040000 $TREE 0 $dir/" cache \ || return 1 done && git -C sparse-index sparse-checkout set deep/deeper1 && git -C sparse-index ls-files --sparse --stage >cache && for dir in deep/deeper2 folder1 folder2 x do TREE=$(git -C sparse-index rev-parse HEAD:$dir) && grep "040000 $TREE 0 $dir/" cache \ || return 1 done && # Disabling the sparse-index replaces tree entries with full ones git -C sparse-index sparse-checkout init --no-sparse-index && test_sparse_match git ls-files --stage --sparse ' But this test isn't covering enough interesting cases that might cause issues with the changes in this series. I'll add a patch that increases coverage in this area. Thanks, -Stolee
diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index cbff6ad00b0..0157b292b36 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -937,6 +937,9 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix) git_config(git_default_config, NULL); + prepare_repo_settings(the_repository); + the_repository->settings.command_requires_full_index = 0; + if (argc > 0) { if (!strcmp(argv[0], "list")) return sparse_checkout_list(argc, argv); diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 93bcfd20bbc..614357fc48c 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -1552,6 +1552,31 @@ test_expect_success 'ls-files' ' ensure_not_expanded ls-files --sparse ' +test_expect_success 'sparse index is not expanded: sparse-checkout' ' + init_repos && + + ensure_not_expanded sparse-checkout set deep/deeper2 && + ensure_not_expanded sparse-checkout set deep/deeper1 && + ensure_not_expanded sparse-checkout set deep && + ensure_not_expanded sparse-checkout add folder1 && + ensure_not_expanded sparse-checkout set deep/deeper1 && + ensure_not_expanded sparse-checkout set folder2 && + + # Demonstrate that the checks that "folder1/a" is a file + # do not cause a sparse-index expansion (since it is in the + # sparse-checkout cone). + echo >>sparse-index/folder2/a && + git -C sparse-index add folder2/a && + + ensure_not_expanded sparse-checkout add folder1 && + + # Skip checks here, since deep/deeper1 is inside a sparse directory + # that must be expanded to check whether `deep/deeper1` is a file + # or not. + ensure_not_expanded sparse-checkout set --skip-checks deep/deeper1 && + ensure_not_expanded sparse-checkout set +' + # NEEDSWORK: a sparse-checkout behaves differently from a full checkout # in this scenario, but it shouldn't. test_expect_success 'reset mixed and checkout orphan' ' diff --git a/unpack-trees.c b/unpack-trees.c index 7f528d35cc2..9745e0dfc34 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -18,6 +18,7 @@ #include "promisor-remote.h" #include "entry.h" #include "parallel-checkout.h" +#include "sparse-index.h" /* * Error messages expected by scripts out of plumbing commands such as @@ -2018,6 +2019,9 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o) goto skip_sparse_checkout; } + /* Expand sparse directories as needed */ + expand_to_pattern_list(o->src_index, o->pl); + /* Set NEW_SKIP_WORKTREE on existing entries. */ mark_all_ce_unused(o->src_index); mark_new_skip_worktree(o->pl, o->src_index, 0,