Message ID | 8d55a6ba2fdf64cee4eb51f3cb6f9808bd0b7505.1629841904.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Sparse index: delete ignored files outside sparse cone | expand |
On Tue, Aug 24, 2021 at 2:51 PM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > > From: Derrick Stolee <dstolee@microsoft.com> > > The clean_tracked_sparse_directories() method deletes the tracked > directories that go out of scope when the sparse-checkout cone changes, > at least in cone mode. This is new behavior, but is recommended based on > our understanding of how users are interacting with the feature in most > cases. > > It is possible that some users will object to the new behavior, so > create a new configuration option 'index.deleteSparseDirectories' that > can be set to 'false' to make clean_tracked_sparse_directories() do > nothing. This will keep all untracked files in the working tree and > cause performance problems with the sparse index, but those trade-offs > are for the user to decide. I'm not sold that we need anything here, and it sounds like this is all being added based on a theoretical concern. I might not object too much normally to trying to address theoretical conerns with new features, but: * I'm a little concerned that we're adding a configuration option (which live forever and are harder to work with backward-compatibility-wise) rather than a command line flag. * It's not clear to me that the option name is what we want. For example, git checkout has a --[no-]overwrite-ignored for overwriting ignored files (or not) when switching to a different branch. git merge has the same flags (though only the fast-forwarding backend heeds it currently). This behavior is quite similar to that flag, and has the same default of treating ignored files as expendable. Should the naming be more similar? * It's not clear to me that the option has the right level of implementation granularity. If someone wants ignored files to be treated as important, should they need to set a merge.no_overwrite_ignored AND a checkout.no_overwrite_ignored AND a index.deleteSparseDirectories AND other names, or have one high-level option that sets all of these behaviors? > > Signed-off-by: Derrick Stolee <dstolee@microsoft.com> > --- > Documentation/config/index.txt | 6 ++++++ > builtin/sparse-checkout.c | 9 ++++++++- > t/t1091-sparse-checkout-builtin.sh | 4 ++++ > 3 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt > index 75f3a2d1054..c65da20a931 100644 > --- a/Documentation/config/index.txt > +++ b/Documentation/config/index.txt > @@ -1,3 +1,9 @@ > +index.deleteSparseDirectories:: > + When enabled, the cone mode sparse-checkout feature will delete > + directories that are outside of the sparse-checkout cone, unless > + such a directory contains an untracked, non-ignored file. Defaults > + to true. > + > index.recordEndOfIndexEntries:: > Specifies whether the index file should include an "End Of Index > Entry" section. This reduces index load time on multiprocessor > diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c > index d0f5c4702be..33ec729d9de 100644 > --- a/builtin/sparse-checkout.c > +++ b/builtin/sparse-checkout.c > @@ -102,7 +102,7 @@ static int sparse_checkout_list(int argc, const char **argv) > > static void clean_tracked_sparse_directories(struct repository *r) > { > - int i, was_full = 0; > + int i, value, was_full = 0; > struct strbuf path = STRBUF_INIT; > size_t pathlen; > struct string_list_item *item; > @@ -118,6 +118,13 @@ static void clean_tracked_sparse_directories(struct repository *r) > !r->index->sparse_checkout_patterns->use_cone_patterns) > return; > > + /* > + * Users can disable this behavior. > + */ > + if (!repo_config_get_bool(r, "index.deletesparsedirectories", &value) && > + !value) > + return; > + > /* > * Use the sparse index as a data structure to assist finding > * directories that are safe to delete. This conversion to a > diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh > index 71236981e64..e0f31186d89 100755 > --- a/t/t1091-sparse-checkout-builtin.sh > +++ b/t/t1091-sparse-checkout-builtin.sh > @@ -666,6 +666,10 @@ test_expect_success 'cone mode clears ignored subdirectories' ' > git -C repo status --porcelain=v2 >out && > test_must_be_empty out && > > + git -C repo -c index.deleteSparseDirectories=false sparse-checkout reapply && > + test_path_is_dir repo/folder1 && > + test_path_is_dir repo/deep/deeper2 && > + > git -C repo sparse-checkout reapply && > test_path_is_missing repo/folder1 && > test_path_is_missing repo/deep/deeper2 && > -- > gitgitgadget
On 8/27/2021 4:58 PM, Elijah Newren wrote: > On Tue, Aug 24, 2021 at 2:51 PM Derrick Stolee via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> >> From: Derrick Stolee <dstolee@microsoft.com> >> >> The clean_tracked_sparse_directories() method deletes the tracked >> directories that go out of scope when the sparse-checkout cone changes, >> at least in cone mode. This is new behavior, but is recommended based on >> our understanding of how users are interacting with the feature in most >> cases. >> >> It is possible that some users will object to the new behavior, so >> create a new configuration option 'index.deleteSparseDirectories' that >> can be set to 'false' to make clean_tracked_sparse_directories() do >> nothing. This will keep all untracked files in the working tree and >> cause performance problems with the sparse index, but those trade-offs >> are for the user to decide. > > I'm not sold that we need anything here, and it sounds like this is > all being added based on a theoretical concern. I might not object > too much normally to trying to address theoretical conerns with new > features, but: > > * I'm a little concerned that we're adding a configuration option > (which live forever and are harder to work with > backward-compatibility-wise) rather than a command line flag. The issue with a command-line flag is that it would need to be added to all the commands that reapply the sparse-checkout definition. Maybe that's just the 'git sparse-checkout (init|set|add|reapply)' and 'git read-tree' commands, but are there other places where this logic might be applied in the future? Also, as more third-party tools integrate with sparse-checkout, I don't expect users to be directly involved in changing their cone, so any use of a command-line flag would need to be integrated into those tools. A config option applies this logic universally, when needed. > * It's not clear to me that the option name is what we want. For > example, git checkout has a --[no-]overwrite-ignored for overwriting > ignored files (or not) when switching to a different branch. git > merge has the same flags (though only the fast-forwarding backend > heeds it currently). This behavior is quite similar to that flag, and > has the same default of treating ignored files as expendable. Should > the naming be more similar? I'm open to changing the name to more closely match existing naming conventions, once we've decided if this should be included at all. > * It's not clear to me that the option has the right level of > implementation granularity. If someone wants ignored files to be > treated as important, should they need to set a > merge.no_overwrite_ignored AND a checkout.no_overwrite_ignored AND a > index.deleteSparseDirectories AND other names, or have one high-level > option that sets all of these behaviors? These cases have different meanings for why an ignored file is important. For merge and checkout, the argument is saying "don't overwrite ignored files if they are to be replaced by a tracked file". For sparse-checkout, the config is saying "don't delete ignored files that are leaving the sparse-checkout scope". I think it is meaningful to say that files that match the .gitignore patterns but are still tracked are deleted, because sparse-checkout removes all tracked files. I'm less convinced that there needs to be a meta-setting that covers all of these cases simultaneously. --- As for moving forward, I'm fine skipping this patch if there is no support for it. I just wanted to demonstrate that it could be done. Perhaps we wait for the theoretical concern to be a real one, as requested by a user with a legitimate reason to care? Thanks, -Stolee
On Mon, Aug 30, 2021 at 6:30 AM Derrick Stolee <stolee@gmail.com> wrote: > > On 8/27/2021 4:58 PM, Elijah Newren wrote: > > On Tue, Aug 24, 2021 at 2:51 PM Derrick Stolee via GitGitGadget > > <gitgitgadget@gmail.com> wrote: > >> > >> From: Derrick Stolee <dstolee@microsoft.com> > >> > >> The clean_tracked_sparse_directories() method deletes the tracked > >> directories that go out of scope when the sparse-checkout cone changes, > >> at least in cone mode. This is new behavior, but is recommended based on > >> our understanding of how users are interacting with the feature in most > >> cases. > >> > >> It is possible that some users will object to the new behavior, so > >> create a new configuration option 'index.deleteSparseDirectories' that > >> can be set to 'false' to make clean_tracked_sparse_directories() do > >> nothing. This will keep all untracked files in the working tree and > >> cause performance problems with the sparse index, but those trade-offs > >> are for the user to decide. > > > > I'm not sold that we need anything here, and it sounds like this is > > all being added based on a theoretical concern. I might not object > > too much normally to trying to address theoretical conerns with new > > features, but: > > > > * I'm a little concerned that we're adding a configuration option > > (which live forever and are harder to work with > > backward-compatibility-wise) rather than a command line flag. > > The issue with a command-line flag is that it would need to be added > to all the commands that reapply the sparse-checkout definition. > Maybe that's just the 'git sparse-checkout (init|set|add|reapply)' > and 'git read-tree' commands, but are there other places where this > logic might be applied in the future? > > Also, as more third-party tools integrate with sparse-checkout, I > don't expect users to be directly involved in changing their cone, > so any use of a command-line flag would need to be integrated into > those tools. A config option applies this logic universally, when > needed. > > > * It's not clear to me that the option name is what we want. For > > example, git checkout has a --[no-]overwrite-ignored for overwriting > > ignored files (or not) when switching to a different branch. git > > merge has the same flags (though only the fast-forwarding backend > > heeds it currently). This behavior is quite similar to that flag, and > > has the same default of treating ignored files as expendable. Should > > the naming be more similar? > > I'm open to changing the name to more closely match existing naming > conventions, once we've decided if this should be included at all. > > > * It's not clear to me that the option has the right level of > > implementation granularity. If someone wants ignored files to be > > treated as important, should they need to set a > > merge.no_overwrite_ignored AND a checkout.no_overwrite_ignored AND a > > index.deleteSparseDirectories AND other names, or have one high-level > > option that sets all of these behaviors? > > These cases have different meanings for why an ignored file is > important. For merge and checkout, the argument is saying "don't > overwrite ignored files if they are to be replaced by a tracked > file". For sparse-checkout, the config is saying "don't delete > ignored files that are leaving the sparse-checkout scope". I think > it is meaningful to say that files that match the .gitignore > patterns but are still tracked are deleted, because sparse-checkout > removes all tracked files. > > I'm less convinced that there needs to be a meta-setting that covers > all of these cases simultaneously. > > --- > > As for moving forward, I'm fine skipping this patch if there is no > support for it. I just wanted to demonstrate that it could be done. That's fair, and you certainly achieved that. > Perhaps we wait for the theoretical concern to be a real one, as > requested by a user with a legitimate reason to care? Sounds good to me. Then we'll have a better frame of reference for judging how the feature should be tailored as well.
diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt index 75f3a2d1054..c65da20a931 100644 --- a/Documentation/config/index.txt +++ b/Documentation/config/index.txt @@ -1,3 +1,9 @@ +index.deleteSparseDirectories:: + When enabled, the cone mode sparse-checkout feature will delete + directories that are outside of the sparse-checkout cone, unless + such a directory contains an untracked, non-ignored file. Defaults + to true. + index.recordEndOfIndexEntries:: Specifies whether the index file should include an "End Of Index Entry" section. This reduces index load time on multiprocessor diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index d0f5c4702be..33ec729d9de 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -102,7 +102,7 @@ static int sparse_checkout_list(int argc, const char **argv) static void clean_tracked_sparse_directories(struct repository *r) { - int i, was_full = 0; + int i, value, was_full = 0; struct strbuf path = STRBUF_INIT; size_t pathlen; struct string_list_item *item; @@ -118,6 +118,13 @@ static void clean_tracked_sparse_directories(struct repository *r) !r->index->sparse_checkout_patterns->use_cone_patterns) return; + /* + * Users can disable this behavior. + */ + if (!repo_config_get_bool(r, "index.deletesparsedirectories", &value) && + !value) + return; + /* * Use the sparse index as a data structure to assist finding * directories that are safe to delete. This conversion to a diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index 71236981e64..e0f31186d89 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -666,6 +666,10 @@ test_expect_success 'cone mode clears ignored subdirectories' ' git -C repo status --porcelain=v2 >out && test_must_be_empty out && + git -C repo -c index.deleteSparseDirectories=false sparse-checkout reapply && + test_path_is_dir repo/folder1 && + test_path_is_dir repo/deep/deeper2 && + git -C repo sparse-checkout reapply && test_path_is_missing repo/folder1 && test_path_is_missing repo/deep/deeper2 &&