Message ID | pull.1398.git.1667189512579.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] diff: introduce scope option | expand |
On Mon, Oct 31, 2022 at 04:11:52AM +0000, ZheNing Hu via GitGitGadget wrote:
> t/t4070-diff-sparse-checkout-scope.sh | 469 ++++++++++++++++++++++++++
It looks like this test is non-executable, leading to the following
error:
~/s/git [nand] (zh/diff--scope) $ make test
SUBDIR templates
make -C t/ all
make[1]: Entering directory '/home/ttaylorr/src/git/t'
rm -f -r 'test-results'
non-executable tests: t4070-diff-sparse-checkout-scope.sh
make[1]: *** [Makefile:120: test-lint-executable] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory '/home/ttaylorr/src/git/t'
make: *** [Makefile:3083: test] Error 2
Thanks,
Taylor
Taylor Blau <me@ttaylorr.com> 于2022年11月1日周二 09:34写道: > > On Mon, Oct 31, 2022 at 04:11:52AM +0000, ZheNing Hu via GitGitGadget wrote: > > t/t4070-diff-sparse-checkout-scope.sh | 469 ++++++++++++++++++++++++++ > > It looks like this test is non-executable, leading to the following > error: > > ~/s/git [nand] (zh/diff--scope) $ make test > SUBDIR templates > make -C t/ all > make[1]: Entering directory '/home/ttaylorr/src/git/t' > rm -f -r 'test-results' > non-executable tests: t4070-diff-sparse-checkout-scope.sh > make[1]: *** [Makefile:120: test-lint-executable] Error 1 > make[1]: *** Waiting for unfinished jobs.... > make[1]: Leaving directory '/home/ttaylorr/src/git/t' > make: *** [Makefile:3083: test] Error 2 > Thanks for pointing out that I had overlooked this problem. > Thanks, > Taylor ZheNing Hu
Hi, On Sun, Oct 30, 2022 at 9:11 PM ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com> wrote: > > From: ZheNing Hu <adlternative@gmail.com> > > When we use sparse-checkout, we often want the set of files > that some commands operate on to be restricted to the > sparse-checkout specification. It seems a bit premature to send this, when the guideline document[*] detailing how these options should work is still in the "Needs Review" state. I know, it's been waiting for a while, but it's a _long_ document. [*] https://lore.kernel.org/git/pull.1367.v3.git.1665269538608.gitgitgadget@gmail.com/ > So introduce the `--scope` option to git diff, which have two > value: "sparse" and "all". "sparse" mean that diff is performed > restrict to paths which matching sparse-checkout specification, > "all" mean that diff is performed regardless of whether the path > meets the sparse-checkout specification. The wording probably needs some care to reflect the fact that it only affects cases where either --cached or revisions are passed. In particular, your wording for "all" suggests behavior very different from today, whereas "all" is probably best thought of as today's current behavior. For example, a plain `git diff` without --cached or revisions, should be unaffected by either of these flags. > `--no-scope` is the default > option for now. What does --no-scope even mean? You didn't explain it, and I don't see how it could make sense. We explicitly avoided a `--no-` prefix by allowing the --scope option to take a value. I don't think there should be a --no-scope option. > Add `diff.scope={sparse, all}` config, which can also have the same > capabilities as `--scope`, and it will be covered by `--scope` option. This is not what we want. The high level usecases should not need to be configured per-command. There should be a config option which reflects the high level use cases (e.g. sparse.scope) and then all relevant commands (diff, log, grep, etc.) can key off it. > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > --- > [RFC] diff: introduce scope option > > In [1], we discovered that users working on different sparse-checkout > specification may download unnecessary blobs from each other's > specification in collaboration. In [2] Junio suggested that maybe we can > restrict some git command's filespec in sparse-checkout specification to > elegantly solve this problem above. In [3]: Newren and Derrick Stolee > prefer to name the option --scope={sparse, all}. > > So this patch is attempt to do this thing on git diff: > > v1: > > 1. add --restrict option to git diff, which restrict diff filespec in > sparse-checkout specification. [4] v2. > 2. rename --restrict to --scope={sparse, all}, support --no-scope. > 3. add config: diff.scope={sparse,all}. > > Unresolved work: > > 1. how to properly pass this --scope={sparse, all} to other commands > like git log, git format-patch, etc. log & grep should accept a similar flag. format-patch should not, and should ignore any config in this area. > 2. how to set the default value of scope for different diff commands. I don't understand this. > [1]: > https://lore.kernel.org/git/CAOLTT8SHo66kGbvWr=+LQ9UVd1NHgqGGEYK2qq6==QgRCgLZqQ@mail.gmail.com/ > [2]: https://lore.kernel.org/git/xmqqzgeqw0sy.fsf@gitster.g/ [3]: > https://lore.kernel.org/git/07a25d48-e364-0d9b-6ffa-41a5984eb5db@github.com/ > [4]: > https://lore.kernel.org/git/pull.1368.git.1664036052741.gitgitgadget@gmail.com/ > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1398%2Fadlternative%2Fzh%2Fdiff-scope-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1398/adlternative/zh/diff-scope-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/1398 > > Documentation/config/diff.txt | 12 + > Documentation/diff-options.txt | 18 + > builtin/diff.c | 4 + > diff-lib.c | 36 +- > diff-no-index.c | 4 + > diff.c | 39 +++ > diff.h | 11 + > t/t4070-diff-sparse-checkout-scope.sh | 469 ++++++++++++++++++++++++++ > tree-diff.c | 5 + > 9 files changed, 597 insertions(+), 1 deletion(-) > create mode 100644 t/t4070-diff-sparse-checkout-scope.sh > > diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt > index 35a7bf86d77..52707e1b2d6 100644 > --- a/Documentation/config/diff.txt > +++ b/Documentation/config/diff.txt > @@ -201,6 +201,18 @@ diff.algorithm:: > -- > + > > +diff.scope:: > + Choose diff scope. The variants are as follows: > ++ > +-- > +`sparse`;; > + Restrict diff paths to those matching sparse-checkout specification. > +`all`;; > + Without restriction, diff is performed regardless of whether the path > + meets the sparse-checkout specification. As noted above, this is the wrong level to specify things. The description for "all" is misleading as well and suggests something other than "behavior B" from the direction document. > diff.wsErrorHighlight:: > Highlight whitespace errors in the `context`, `old` or `new` > lines of the diff. Multiple values are separated by comma, > diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt > index 3674ac48e92..04bf83e8be1 100644 > --- a/Documentation/diff-options.txt > +++ b/Documentation/diff-options.txt > @@ -195,6 +195,24 @@ For instance, if you configured the `diff.algorithm` variable to a > non-default value and want to use the default one, then you > have to use `--diff-algorithm=default` option. > > +ifndef::git-format-patch[] > +ifndef::git-log[] > + > +--scope={sparse|all}:: > + Choose diff scope. The variants are as follows: > ++ > +-- > +`--sparse`;; > + Restrict diff paths to those matching sparse-checkout specification. > +`--all`;; > + Without restriction, diff is performed regardless of whether the path > + meets the sparse-checkout specification. > +-- > ++ > + > +endif::git-log[] > +endif::git-format-patch[] What about diff-files, diff-index, diff-tree, and show? > + > --stat[=<width>[,<name-width>[,<count>]]]:: > Generate a diffstat. By default, as much space as necessary > will be used for the filename part, and the rest for the graph > diff --git a/builtin/diff.c b/builtin/diff.c > index 854d2c5a5c4..6b450f7184c 100644 > --- a/builtin/diff.c > +++ b/builtin/diff.c > @@ -54,6 +54,10 @@ static void stuff_change(struct diff_options *opt, > oideq(old_oid, new_oid) && (old_mode == new_mode)) > return; > > + if (opt->scope == DIFF_SCOPE_SPARSE && > + !diff_paths_in_sparse_checkout(old_path, new_path)) > + return; This can't be right. git diff c231e0f26fe9b2ea9ec46aa68ff95ba984ce592e 72d42bd856228c15f702fa3c353432f4f1defe03 (to directly diff two known blobs) will go through this function, with old_path == c231e0f26fe9b2ea9ec46aa68ff95ba984ce592e and new_path == 72d42bd856228c15f702fa3c353432f4f1defe03. But those aren't real paths, and sparse-checkout should not restrict what is shown in those cases. > + > if (opt->flags.reverse_diff) { > SWAP(old_mode, new_mode); > SWAP(old_oid, new_oid); > diff --git a/diff-lib.c b/diff-lib.c > index 2edea41a234..a3381f2e0ff 100644 > --- a/diff-lib.c > +++ b/diff-lib.c > @@ -88,6 +88,22 @@ static int match_stat_with_submodule(struct diff_options *diffopt, > return changed; > } > > +int diff_path_in_sparse_checkout(const char *path) { > + if (core_sparse_checkout_cone) > + return path_in_cone_mode_sparse_checkout(path, the_repository->index); > + else > + return path_in_sparse_checkout(path, the_repository->index); > +} This says we are including the path if it matches the sparsity patterns. Thus, we have to be careful when we use this function, because the relevant paths are ones that match the sparsity specification. The sparsity specification will always match the sparsity patterns when diffing two commits, but when either the index or the working tree is part of the diff, the sparsity specification *might* be temporarily expanded. > +int diff_paths_in_sparse_checkout(const char *one, const char*two) { > + if (one == two || !strcmp(one, two)) > + return diff_path_in_sparse_checkout(one); > + else > + return diff_path_in_sparse_checkout(one) && > + diff_path_in_sparse_checkout(two); Why && rather than || ? > +} > + > + > int run_diff_files(struct rev_info *revs, unsigned int option) > { > int entries, i; > @@ -113,6 +129,9 @@ int run_diff_files(struct rev_info *revs, unsigned int option) > > if (diff_can_quit_early(&revs->diffopt)) > break; > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE && > + !diff_path_in_sparse_checkout(ce->name)) > + continue; Here you've cut off the possibility of showing diffs for anything outside the sparsity patterns, which is a mistake. We need to handle a temporarily expanded sparse specification too. > if (!ce_path_match(istate, ce, &revs->prune_data, NULL)) > continue; > @@ -202,7 +221,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option) > continue; > } > > - if (ce_uptodate(ce) || ce_skip_worktree(ce)) > + if (ce_uptodate(ce) || > + (revs->diffopt.scope != DIFF_SCOPE_ALL && ce_skip_worktree(ce))) > continue; Here you make --scope=all show files even if they are skip-worktree, making them appear to have been deleted. I called out your description earlier as potentially misleading, because it could imply this behavior. It looks like you were consistent with the description and implementation, it just doesn't match what we want. > /* > @@ -439,6 +459,20 @@ static void do_oneway_diff(struct unpack_trees_options *o, do_oneway_diff is for cases where we are diffing against the index... > return; /* nothing to diff.. */ > } > > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE) { > + if (idx && tree) { > + if (!diff_paths_in_sparse_checkout(idx->name, tree->name)) > + return; > + } else if (idx) { > + if (!diff_path_in_sparse_checkout(idx->name)) > + return; > + } else if (tree) { > + if (!diff_path_in_sparse_checkout(tree->name)) > + return; > + } else > + return; > + } ...and you again mistakenly only compare to the sparsity patterns instead of the sparse specification. > + > /* if the entry is not checked out, don't examine work tree */ > cached = o->index_only || > (idx && ((idx->ce_flags & CE_VALID) || ce_skip_worktree(idx))); > diff --git a/diff-no-index.c b/diff-no-index.c > index 18edbdf4b59..ea94a104ea4 100644 > --- a/diff-no-index.c > +++ b/diff-no-index.c > @@ -281,6 +281,10 @@ int diff_no_index(struct rev_info *revs, > > fixup_paths(paths, &replacement); > > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE && > + !diff_paths_in_sparse_checkout(paths[0], paths[1])) > + goto out; --no-index means we're diffing two files that are not tracked, or at least treating them as not tracked. sparse-checkout should not affect such files. > + > revs->diffopt.skip_stat_unmatch = 1; > if (!revs->diffopt.output_format) > revs->diffopt.output_format = DIFF_FORMAT_PATCH; > diff --git a/diff.c b/diff.c > index 285d6e2d575..9de4044ae05 100644 > --- a/diff.c > +++ b/diff.c > @@ -48,6 +48,7 @@ static int diff_interhunk_context_default; > static const char *diff_word_regex_cfg; > static const char *external_diff_cmd_cfg; > static const char *diff_order_file_cfg; > +static const char *external_diff_scope_cfg; > int diff_auto_refresh_index = 1; > static int diff_mnemonic_prefix; > static int diff_no_prefix; > @@ -57,6 +58,7 @@ static int diff_dirstat_permille_default = 30; > static struct diff_options default_diff_options; > static long diff_algorithm; > static unsigned ws_error_highlight_default = WSEH_NEW; > +static enum diff_scope external_diff_scope; Why is this called "external"? > static char diff_colors[][COLOR_MAXLEN] = { > GIT_COLOR_RESET, > @@ -423,6 +425,16 @@ int git_diff_ui_config(const char *var, const char *value, void *cb) > return 0; > } > > + if (!strcmp(var, "diff.scope")) { > + git_config_string(&external_diff_scope_cfg, var, value); > + if (!strcmp(value, "all")) > + external_diff_scope = DIFF_SCOPE_ALL; > + else if (!strcmp(value, "sparse")) > + external_diff_scope = DIFF_SCOPE_SPARSE; > + else > + return -1; > + } > + > if (git_color_config(var, value, cb) < 0) > return -1; > > @@ -4663,6 +4675,7 @@ void repo_diff_setup(struct repository *r, struct diff_options *options) > > options->color_moved = diff_color_moved_default; > options->color_moved_ws_handling = diff_color_moved_ws_default; > + options->scope = external_diff_scope; > > prep_parse_options(options); > } > @@ -4914,6 +4927,29 @@ static int parse_dirstat_opt(struct diff_options *options, const char *params) > return 1; > } > > +static int diff_opt_diff_scope(const struct option *option, > + const char *optarg, int unset) > +{ > + struct diff_options *opt = option->value; > + > + if (unset) { > + opt->scope = DIFF_SCOPE_NONE; I think we should instead have a BUG_ON_OPT_NEG(unset) or, even better, a BUG_ON_OPT_NEG_NOARG(unset, optarg) at the beginning of this function... > + } else if (optarg) { ...which would also allow you to drop this if and dedent the rest of the function. > + if (!strcmp(optarg, "all")) { > + if (core_apply_sparse_checkout) { > + opt->scope = DIFF_SCOPE_ALL; > + } > + } else if (!strcmp(optarg, "sparse")) { > + if (core_apply_sparse_checkout) { > + opt->scope = DIFF_SCOPE_SPARSE; > + } If core_apply_sparse_checkout is false, should we perhaps throw an error instead of just silently ignoring the option the user passed? > + } else > + return error(_("invalid --scope value: %s"), optarg); > + } As written with no follow-on else clause here, wouldn't this silently accept "--scope" without an "=<something>" argument without an error and without properly initializing opt->scope? > + > + return 0; > +} > + > static int diff_opt_diff_filter(const struct option *option, > const char *optarg, int unset) > { > @@ -5683,6 +5719,9 @@ static void prep_parse_options(struct diff_options *options) > OPT_CALLBACK_F(0, "diff-filter", options, N_("[(A|C|D|M|R|T|U|X|B)...[*]]"), > N_("select files by diff type"), > PARSE_OPT_NONEG, diff_opt_diff_filter), > + OPT_CALLBACK_F(0, "scope", options, N_("[sparse|all]"), > + N_("choose diff scope"), maybe "choose diff scope in sparse checkouts"? > + PARSE_OPT_OPTARG, diff_opt_diff_scope), > { OPTION_CALLBACK, 0, "output", options, N_("<file>"), > N_("output to a specific file"), > PARSE_OPT_NONEG, NULL, 0, diff_opt_output }, > diff --git a/diff.h b/diff.h > index 8ae18e5ab1e..90f7512034c 100644 > --- a/diff.h > +++ b/diff.h > @@ -230,6 +230,12 @@ enum diff_submodule_format { > DIFF_SUBMODULE_INLINE_DIFF > }; > > +enum diff_scope { > + DIFF_SCOPE_NONE = 0, > + DIFF_SCOPE_ALL, > + DIFF_SCOPE_SPARSE, > +}; > + > /** > * the set of options the calling program wants to affect the operation of > * diffcore library with. > @@ -285,6 +291,9 @@ struct diff_options { > /* diff-filter bits */ > unsigned int filter, filter_not; > > + /* diff sparse-checkout scope */ > + enum diff_scope scope; > + > int use_color; > > /* Number of context lines to generate in patch output. */ > @@ -696,4 +705,6 @@ void print_stat_summary(FILE *fp, int files, > int insertions, int deletions); > void setup_diff_pager(struct diff_options *); > > +int diff_path_in_sparse_checkout(const char *path); > +int diff_paths_in_sparse_checkout(const char *one, const char *two); > #endif /* DIFF_H */ > diff --git a/t/t4070-diff-sparse-checkout-scope.sh b/t/t4070-diff-sparse-checkout-scope.sh > new file mode 100644 This needs to be fixed. > index 00000000000..dca75a3308b > --- /dev/null > +++ b/t/t4070-diff-sparse-checkout-scope.sh > @@ -0,0 +1,469 @@ > +#!/bin/sh > + > +test_description='diff sparse-checkout scope' > + > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > + > +. ./test-lib.sh > + > + > +test_expect_success 'setup' ' > + git init temp && > + ( > + cd temp && > + mkdir sub1 && > + mkdir sub2 && > + echo sub1/file1 >sub1/file1 && > + echo sub2/file2 >sub2/file2 && > + echo file1 >file1 && > + echo file2 >file2 && > + git add --all && > + git commit -m init && > + echo sub1/file1 >>sub1/file1 && > + echo sub1/file2 >>sub1/file2 && > + echo sub2/file1 >>sub2/file1 && > + echo sub2/file2 >>sub2/file2 && > + echo file1 >>file1 && > + echo file2 >>file2 && > + git add --all && > + git commit -m change1 && > + echo sub1/file1 >>sub1/file1 && > + echo sub1/file2 >>sub1/file2 && > + echo sub2/file1 >>sub2/file1 && > + echo sub2/file2 >>sub2/file2 && > + echo file1 >>file1 && > + echo file2 >>file2 && > + git add --all && > + git commit -m change2 > + ) > +' > + > +reset_repo () { > + rm -rf repo && > + git clone --no-checkout temp repo Why --no-checkout rather than say --sparse? > +} > + > +reset_with_sparse_checkout() { > + reset_repo && > + git -C repo sparse-checkout set $1 sub1 && > + git -C repo checkout Fixing the above would let us get rid of this really weird extra checkout command too. > +} > + > +change_worktree_and_index() { > + ( > + cd repo && > + mkdir sub2 sub3 && > + echo sub1/file3 >sub1/file3 && > + echo sub2/file3 >sub2/file3 && > + echo sub3/file3 >sub3/file3 && > + echo file3 >file3 && > + git add --all --sparse && > + echo sub1/file3 >>sub1/file3 && > + echo sub2/file3 >>sub2/file3 && > + echo sub3/file3 >>sub3/file3 && > + echo file3 >>file3 > + ) > +} It would be nice to modify different paths in the working tree and index, to see if we can handle cases where the sparse specification does not match the sparsity patterns better. (So, modify files inside and outside the sparsity patterns, stage those changes, and then do a `git sparse-checkout reapply` to make the files outside the sparsity patterns disappear from the working tree...then modify different files in the working tree both inside and outside the sparsity patterns. And also remove some file that matches the sparsity patterns and manually mark it as SKIP_WORKTREE.) That'd give us much better coverage. > + > +diff_scope() { > + title=$1 > + need_change_worktree_and_index=$2 > + sparse_checkout_option=$3 > + scope_option=$4 > + expect=$5 > + shift 5 > + args=("$@") > + > + test_expect_success "$title $sparse_checkout_option $scope_option" " > + reset_with_sparse_checkout $sparse_checkout_option && > + if test \"$need_change_worktree_and_index\" = \"true\" ; then > + change_worktree_and_index > + fi && > + git -C repo diff $scope_option ${args[*]} >actual && > + if test -z \"$expect\" ; then > + >expect > + else > + cat > expect <<-EOF > +$expect > + EOF > + fi && > + test_cmp expect actual > + " > +} > + > +args=("--name-only" "HEAD" "HEAD~") > +diff_scope builtin_diff_tree false "--no-cone" "--scope=sparse" \ > +"sub1/file1 > +sub1/file2" "${args[@]}" > + > +diff_scope builtin_diff_tree false "--no-cone" "--scope=all" \ > +"file1 > +file2 > +sub1/file1 > +sub1/file2 > +sub2/file1 > +sub2/file2" "${args[@]}" > + > +diff_scope builtin_diff_tree false "--no-cone" "--no-scope" \ > +"file1 > +file2 > +sub1/file1 > +sub1/file2 > +sub2/file1 > +sub2/file2" "${args[@]}" > + > +diff_scope builtin_diff_tree false "--cone" "--scope=sparse" \ > +"file1 > +file2 > +sub1/file1 > +sub1/file2" "${args[@]}" > + > +diff_scope builtin_diff_tree false "--cone" "--scope=all" \ > +"file1 > +file2 > +sub1/file1 > +sub1/file2 > +sub2/file1 > +sub2/file2" "${args[@]}" > + > +diff_scope builtin_diff_tree false "--cone" "--no-scope" \ > +"file1 > +file2 > +sub1/file1 > +sub1/file2 > +sub2/file1 > +sub2/file2" "${args[@]}" > + > +args=("--name-only" "HEAD~") > +diff_scope builtin_diff_index true "--no-cone" "--scope=sparse" \ > +"sub1/file1 > +sub1/file2 > +sub1/file3" "${args[@]}" Here's a good example where the testcase is doing the wrong thing. The expected answer here would also include file3, sub2/file3, and sub3/file3. > + > +diff_scope builtin_diff_index true "--no-cone" "--scope=all" \ > +"file1 > +file2 > +file3 > +sub1/file1 > +sub1/file2 > +sub1/file3 > +sub2/file1 > +sub2/file2 > +sub2/file3 > +sub3/file3" "${args[@]}" > + > +diff_scope builtin_diff_index true "--no-cone" "--no-scope" \ > +"file1 > +file2 > +file3 > +sub1/file1 > +sub1/file2 > +sub1/file3 > +sub2/file1 > +sub2/file2 > +sub2/file3 > +sub3/file3" "${args[@]}" > + > +diff_scope builtin_diff_index true "--cone" "--scope=sparse" \ > +"file1 > +file2 > +file3 > +sub1/file1 > +sub1/file2 > +sub1/file3" "${args[@]}" This is also wrong; it's missing sub2/file3 and sub3/file3. > + > +diff_scope builtin_diff_index true "--cone" "--scope=all" \ > +"file1 > +file2 > +file3 > +sub1/file1 > +sub1/file2 > +sub1/file3 > +sub2/file1 > +sub2/file2 > +sub2/file3 > +sub3/file3" "${args[@]}" > + > +diff_scope builtin_diff_index true "--cone" "--no-scope" \ > +"file1 > +file2 > +file3 > +sub1/file1 > +sub1/file2 > +sub1/file3 > +sub2/file1 > +sub2/file2 > +sub2/file3 > +sub3/file3" "${args[@]}" > + > +args=("--name-only" "file3" "sub1/" "sub2/") > + > +diff_scope builtin_diff_files true "--no-cone" "--scope=sparse" \ > +"sub1/file3" "${args[@]}" This should also include file3, sub2/file3, and sub3/file3. `--scope=` should not affect diff output at all if neither --cached nor revision arguments are supplied. > + > +diff_scope builtin_diff_files true "--no-cone" "--scope=all" \ > +"file3 > +sub1/file3 > +sub2/file1 > +sub2/file2 > +sub2/file3" "${args[@]}" This is wrong due to including too much; it should not include sub2/file1 or sub2/file2 (it is only including those because it is showing them as deleted, when they are not deleted but are SKIP_WORKTREE). I think I'm going to stop reviewing here. I'm probably just going to keep repeating the same issues I identified earlier if I continue.
inHi, Elijah Newren <newren@gmail.com> 于2022年11月1日周二 13:18写道: > > Hi, > > On Sun, Oct 30, 2022 at 9:11 PM ZheNing Hu via GitGitGadget > <gitgitgadget@gmail.com> wrote: > > > > From: ZheNing Hu <adlternative@gmail.com> > > > > When we use sparse-checkout, we often want the set of files > > that some commands operate on to be restricted to the > > sparse-checkout specification. > > It seems a bit premature to send this, when the guideline document[*] > detailing how these options should work is still in the "Needs Review" > state. I know, it's been waiting for a while, but it's a _long_ > document. > > [*] https://lore.kernel.org/git/pull.1367.v3.git.1665269538608.gitgitgadget@gmail.com/ > Fine, I just want to start trying to experiment with this feature in git-diff earlier, and I can wait for the sparse-checkout.txt documentation to finish first if needed :) > > So introduce the `--scope` option to git diff, which have two > > value: "sparse" and "all". "sparse" mean that diff is performed > > restrict to paths which matching sparse-checkout specification, > > "all" mean that diff is performed regardless of whether the path > > meets the sparse-checkout specification. > > The wording probably needs some care to reflect the fact that it only > affects cases where either --cached or revisions are passed. In > particular, your wording for "all" suggests behavior very different > from today, whereas "all" is probably best thought of as today's > current behavior. For example, a plain `git diff` without --cached or > revisions, should be unaffected by either of these flags. > Yes, after re-reading your sparse-checkout.txt patch, I realized that I misinterpreted "--scope=sparse" as sparse patterns instead of sparse specification, and misinterpreted "-scope=all" as diff on all files. > > `--no-scope` is the default > > option for now. > > What does --no-scope even mean? You didn't explain it, and I don't > see how it could make sense. We explicitly avoided a `--no-` prefix > by allowing the --scope option to take a value. I don't think there > should be a --no-scope option. > I think the “--no-scope” here does nothing, as if it were unaffected by scope (just like correctly "--scope=all", right?) > > Add `diff.scope={sparse, all}` config, which can also have the same > > capabilities as `--scope`, and it will be covered by `--scope` option. > > This is not what we want. The high level usecases should not need to > be configured per-command. There should be a config option which > reflects the high level use cases (e.g. sparse.scope) and then all > relevant commands (diff, log, grep, etc.) can key off it. > Ok, using a global config should indeed be more useful. > > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > > --- > > [RFC] diff: introduce scope option > > > > In [1], we discovered that users working on different sparse-checkout > > specification may download unnecessary blobs from each other's > > specification in collaboration. In [2] Junio suggested that maybe we can > > restrict some git command's filespec in sparse-checkout specification to > > elegantly solve this problem above. In [3]: Newren and Derrick Stolee > > prefer to name the option --scope={sparse, all}. > > > > So this patch is attempt to do this thing on git diff: > > > > v1: > > > > 1. add --restrict option to git diff, which restrict diff filespec in > > sparse-checkout specification. [4] v2. > > 2. rename --restrict to --scope={sparse, all}, support --no-scope. > > 3. add config: diff.scope={sparse,all}. > > > > Unresolved work: > > > > 1. how to properly pass this --scope={sparse, all} to other commands > > like git log, git format-patch, etc. > > log & grep should accept a similar flag. format-patch should not, and > should ignore any config in this area. > > > 2. how to set the default value of scope for different diff commands. > > I don't understand this. > Oh, I was just curious if the config defaults for scope needed to be configured separately for the different diff commands git diff-files, git diff-index, git diff-no-index, git diff-tree, since sparse-checkout.txt mentions the different behavior of scope for them. Now I think this just needs to be handled in command code logic and no additional command level configuration is needed. > > [1]: > > https://lore.kernel.org/git/CAOLTT8SHo66kGbvWr=+LQ9UVd1NHgqGGEYK2qq6==QgRCgLZqQ@mail.gmail.com/ > > [2]: https://lore.kernel.org/git/xmqqzgeqw0sy.fsf@gitster.g/ [3]: > > https://lore.kernel.org/git/07a25d48-e364-0d9b-6ffa-41a5984eb5db@github.com/ > > [4]: > > https://lore.kernel.org/git/pull.1368.git.1664036052741.gitgitgadget@gmail.com/ > > > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1398%2Fadlternative%2Fzh%2Fdiff-scope-v1 > > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1398/adlternative/zh/diff-scope-v1 > > Pull-Request: https://github.com/gitgitgadget/git/pull/1398 > > > > Documentation/config/diff.txt | 12 + > > Documentation/diff-options.txt | 18 + > > builtin/diff.c | 4 + > > diff-lib.c | 36 +- > > diff-no-index.c | 4 + > > diff.c | 39 +++ > > diff.h | 11 + > > t/t4070-diff-sparse-checkout-scope.sh | 469 ++++++++++++++++++++++++++ > > tree-diff.c | 5 + > > 9 files changed, 597 insertions(+), 1 deletion(-) > > create mode 100644 t/t4070-diff-sparse-checkout-scope.sh > > > > diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt > > index 35a7bf86d77..52707e1b2d6 100644 > > --- a/Documentation/config/diff.txt > > +++ b/Documentation/config/diff.txt > > @@ -201,6 +201,18 @@ diff.algorithm:: > > -- > > + > > > > +diff.scope:: > > + Choose diff scope. The variants are as follows: > > ++ > > +-- > > +`sparse`;; > > + Restrict diff paths to those matching sparse-checkout specification. > > +`all`;; > > + Without restriction, diff is performed regardless of whether the path > > + meets the sparse-checkout specification. > > As noted above, this is the wrong level to specify things. The > description for "all" is misleading as well and suggests something > other than "behavior B" from the direction document. > So do we need to make "--scope=all" correspond to the "behavior B", The correct description of it should be: "worktree-sparse-history-dense"... > > diff.wsErrorHighlight:: > > Highlight whitespace errors in the `context`, `old` or `new` > > lines of the diff. Multiple values are separated by comma, > > diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt > > index 3674ac48e92..04bf83e8be1 100644 > > --- a/Documentation/diff-options.txt > > +++ b/Documentation/diff-options.txt > > @@ -195,6 +195,24 @@ For instance, if you configured the `diff.algorithm` variable to a > > non-default value and want to use the default one, then you > > have to use `--diff-algorithm=default` option. > > > > +ifndef::git-format-patch[] > > +ifndef::git-log[] > > + > > +--scope={sparse|all}:: > > + Choose diff scope. The variants are as follows: > > ++ > > +-- > > +`--sparse`;; > > + Restrict diff paths to those matching sparse-checkout specification. > > +`--all`;; > > + Without restriction, diff is performed regardless of whether the path > > + meets the sparse-checkout specification. > > +-- > > ++ > > + > > +endif::git-log[] > > +endif::git-format-patch[] > > What about diff-files, diff-index, diff-tree, and show? > diff-options.txt included in their documents, and git-format-patch.txt, git-log.txt, but should not in git-format-patch.txt. I don't know if it should be included in git-diff-files.txt, because git diff-files compare the files in the working tree and the index (so it's the same as "simple" git-diff, which should not be affected by scope?) > > + > > --stat[=<width>[,<name-width>[,<count>]]]:: > > Generate a diffstat. By default, as much space as necessary > > will be used for the filename part, and the rest for the graph > > diff --git a/builtin/diff.c b/builtin/diff.c > > index 854d2c5a5c4..6b450f7184c 100644 > > --- a/builtin/diff.c > > +++ b/builtin/diff.c > > @@ -54,6 +54,10 @@ static void stuff_change(struct diff_options *opt, > > oideq(old_oid, new_oid) && (old_mode == new_mode)) > > return; > > > > + if (opt->scope == DIFF_SCOPE_SPARSE && > > + !diff_paths_in_sparse_checkout(old_path, new_path)) > > + return; > > This can't be right. > git diff c231e0f26fe9b2ea9ec46aa68ff95ba984ce592e > 72d42bd856228c15f702fa3c353432f4f1defe03 > (to directly diff two known blobs) will go through this function, with > old_path == c231e0f26fe9b2ea9ec46aa68ff95ba984ce592e and new_path == > 72d42bd856228c15f702fa3c353432f4f1defe03. But those aren't real > paths, and sparse-checkout should not restrict what is shown in those > cases. > Yeah, it's buggy that I forget to check two blobs without paths. > > + > > if (opt->flags.reverse_diff) { > > SWAP(old_mode, new_mode); > > SWAP(old_oid, new_oid); > > diff --git a/diff-lib.c b/diff-lib.c > > index 2edea41a234..a3381f2e0ff 100644 > > --- a/diff-lib.c > > +++ b/diff-lib.c > > @@ -88,6 +88,22 @@ static int match_stat_with_submodule(struct diff_options *diffopt, > > return changed; > > } > > > > +int diff_path_in_sparse_checkout(const char *path) { > > + if (core_sparse_checkout_cone) > > + return path_in_cone_mode_sparse_checkout(path, the_repository->index); > > + else > > + return path_in_sparse_checkout(path, the_repository->index); > > +} > > This says we are including the path if it matches the sparsity > patterns. Thus, we have to be careful when we use this function, > because the relevant paths are ones that match the sparsity > specification. The sparsity specification will always match the > sparsity patterns when diffing two commits, but when either the index > or the working tree is part of the diff, the sparsity specification > *might* be temporarily expanded. > Yes, I may have to look at more code to better understand how and when the "sparsity specification" is extended. Any recommendations for places to read? > > +int diff_paths_in_sparse_checkout(const char *one, const char*two) { > > + if (one == two || !strcmp(one, two)) > > + return diff_path_in_sparse_checkout(one); > > + else > > + return diff_path_in_sparse_checkout(one) && > > + diff_path_in_sparse_checkout(two); > > Why && rather than || ? > Agree, we do need to use || here, because the one diff side in the sparse specification, we should diff the two files. > > +} > > + > > + > > int run_diff_files(struct rev_info *revs, unsigned int option) > > { > > int entries, i; > > @@ -113,6 +129,9 @@ int run_diff_files(struct rev_info *revs, unsigned int option) > > > > if (diff_can_quit_early(&revs->diffopt)) > > break; > > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE && > > + !diff_path_in_sparse_checkout(ce->name)) > > + continue; > > Here you've cut off the possibility of showing diffs for anything > outside the sparsity patterns, which is a mistake. We need to handle > a temporarily expanded sparse specification too. > Agree. > > if (!ce_path_match(istate, ce, &revs->prune_data, NULL)) > > continue; > > @@ -202,7 +221,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option) > > continue; > > } > > > > - if (ce_uptodate(ce) || ce_skip_worktree(ce)) > > + if (ce_uptodate(ce) || > > + (revs->diffopt.scope != DIFF_SCOPE_ALL && ce_skip_worktree(ce))) > > continue; > > Here you make --scope=all show files even if they are skip-worktree, > making them appear to have been deleted. I called out your > description earlier as potentially misleading, because it could imply > this behavior. It looks like you were consistent with the description > and implementation, it just doesn't match what we want. > Agree too. So we should not do anything in run_diff_files. > > /* > > @@ -439,6 +459,20 @@ static void do_oneway_diff(struct unpack_trees_options *o, > > do_oneway_diff is for cases where we are diffing against the index... > > > return; /* nothing to diff.. */ > > } > > > > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE) { > > + if (idx && tree) { > > + if (!diff_paths_in_sparse_checkout(idx->name, tree->name)) > > + return; > > + } else if (idx) { > > + if (!diff_path_in_sparse_checkout(idx->name)) > > + return; > > + } else if (tree) { > > + if (!diff_path_in_sparse_checkout(tree->name)) > > + return; > > + } else > > + return; > > + } > > ...and you again mistakenly only compare to the sparsity patterns > instead of the sparse specification. > So here we should use ce_skip_worktree(idx) to check if this index entry matches sparse specification. > > + > > /* if the entry is not checked out, don't examine work tree */ > > cached = o->index_only || > > (idx && ((idx->ce_flags & CE_VALID) || ce_skip_worktree(idx))); > > > > > diff --git a/diff-no-index.c b/diff-no-index.c > > index 18edbdf4b59..ea94a104ea4 100644 > > --- a/diff-no-index.c > > +++ b/diff-no-index.c > > @@ -281,6 +281,10 @@ int diff_no_index(struct rev_info *revs, > > > > fixup_paths(paths, &replacement); > > > > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE && > > + !diff_paths_in_sparse_checkout(paths[0], paths[1])) > > + goto out; > > --no-index means we're diffing two files that are not tracked, or at > least treating them as not tracked. sparse-checkout should not affect > such files. > Yeah, we should care about untracked files sparse-checkout only when we are using git add/rm/update-index... > > + > > revs->diffopt.skip_stat_unmatch = 1; > > if (!revs->diffopt.output_format) > > revs->diffopt.output_format = DIFF_FORMAT_PATCH; > > diff --git a/diff.c b/diff.c > > index 285d6e2d575..9de4044ae05 100644 > > --- a/diff.c > > +++ b/diff.c > > @@ -48,6 +48,7 @@ static int diff_interhunk_context_default; > > static const char *diff_word_regex_cfg; > > static const char *external_diff_cmd_cfg; > > static const char *diff_order_file_cfg; > > +static const char *external_diff_scope_cfg; > > int diff_auto_refresh_index = 1; > > static int diff_mnemonic_prefix; > > static int diff_no_prefix; > > @@ -57,6 +58,7 @@ static int diff_dirstat_permille_default = 30; > > static struct diff_options default_diff_options; > > static long diff_algorithm; > > static unsigned ws_error_highlight_default = WSEH_NEW; > > +static enum diff_scope external_diff_scope; > > Why is this called "external"? > Learn from external_diff_cmd_cfg, I should remove it. > > static char diff_colors[][COLOR_MAXLEN] = { > > GIT_COLOR_RESET, > > @@ -423,6 +425,16 @@ int git_diff_ui_config(const char *var, const char *value, void *cb) > > return 0; > > } > > > > + if (!strcmp(var, "diff.scope")) { > > + git_config_string(&external_diff_scope_cfg, var, value); > > + if (!strcmp(value, "all")) > > + external_diff_scope = DIFF_SCOPE_ALL; > > + else if (!strcmp(value, "sparse")) > > + external_diff_scope = DIFF_SCOPE_SPARSE; > > + else > > + return -1; > > + } > > + > > if (git_color_config(var, value, cb) < 0) > > return -1; > > > > @@ -4663,6 +4675,7 @@ void repo_diff_setup(struct repository *r, struct diff_options *options) > > > > options->color_moved = diff_color_moved_default; > > options->color_moved_ws_handling = diff_color_moved_ws_default; > > + options->scope = external_diff_scope; > > > > prep_parse_options(options); > > } > > @@ -4914,6 +4927,29 @@ static int parse_dirstat_opt(struct diff_options *options, const char *params) > > return 1; > > } > > > > +static int diff_opt_diff_scope(const struct option *option, > > + const char *optarg, int unset) > > +{ > > + struct diff_options *opt = option->value; > > + > > + if (unset) { > > + opt->scope = DIFF_SCOPE_NONE; > > I think we should instead have a > BUG_ON_OPT_NEG(unset) > or, even better, a > BUG_ON_OPT_NEG_NOARG(unset, optarg) > at the beginning of this function... > > > + } else if (optarg) { > > ...which would also allow you to drop this if and dedent the rest of > the function. > Agree. > > + if (!strcmp(optarg, "all")) { > > + if (core_apply_sparse_checkout) { > > + opt->scope = DIFF_SCOPE_ALL; > > + } > > + } else if (!strcmp(optarg, "sparse")) { > > + if (core_apply_sparse_checkout) { > > + opt->scope = DIFF_SCOPE_SPARSE; > > + } > > If core_apply_sparse_checkout is false, should we perhaps throw an > error instead of just silently ignoring the option the user passed? > Agree. > > + } else > > + return error(_("invalid --scope value: %s"), optarg); > > + } > > As written with no follow-on else clause here, wouldn't this silently > accept "--scope" without an "=<something>" argument without an error > and without properly initializing opt->scope? > Because opt will be initializing with default_diff_options in repo_diff_setup(), and opt->scope should respect config value first. So I don't think there's a mistake here. > > + > > + return 0; > > +} > > + > > static int diff_opt_diff_filter(const struct option *option, > > const char *optarg, int unset) > > { > > @@ -5683,6 +5719,9 @@ static void prep_parse_options(struct diff_options *options) > > OPT_CALLBACK_F(0, "diff-filter", options, N_("[(A|C|D|M|R|T|U|X|B)...[*]]"), > > N_("select files by diff type"), > > PARSE_OPT_NONEG, diff_opt_diff_filter), > > + OPT_CALLBACK_F(0, "scope", options, N_("[sparse|all]"), > > + N_("choose diff scope"), > > maybe "choose diff scope in sparse checkouts"? > OK. > > + PARSE_OPT_OPTARG, diff_opt_diff_scope), > > { OPTION_CALLBACK, 0, "output", options, N_("<file>"), > > N_("output to a specific file"), > > PARSE_OPT_NONEG, NULL, 0, diff_opt_output }, > > diff --git a/diff.h b/diff.h > > index 8ae18e5ab1e..90f7512034c 100644 > > --- a/diff.h > > +++ b/diff.h > > @@ -230,6 +230,12 @@ enum diff_submodule_format { > > DIFF_SUBMODULE_INLINE_DIFF > > }; > > > > +enum diff_scope { > > + DIFF_SCOPE_NONE = 0, > > + DIFF_SCOPE_ALL, > > + DIFF_SCOPE_SPARSE, > > +}; > > + > > /** > > * the set of options the calling program wants to affect the operation of > > * diffcore library with. > > @@ -285,6 +291,9 @@ struct diff_options { > > /* diff-filter bits */ > > unsigned int filter, filter_not; > > > > + /* diff sparse-checkout scope */ > > + enum diff_scope scope; > > + > > int use_color; > > > > /* Number of context lines to generate in patch output. */ > > @@ -696,4 +705,6 @@ void print_stat_summary(FILE *fp, int files, > > int insertions, int deletions); > > void setup_diff_pager(struct diff_options *); > > > > +int diff_path_in_sparse_checkout(const char *path); > > +int diff_paths_in_sparse_checkout(const char *one, const char *two); > > #endif /* DIFF_H */ > > diff --git a/t/t4070-diff-sparse-checkout-scope.sh b/t/t4070-diff-sparse-checkout-scope.sh > > new file mode 100644 > > This needs to be fixed. > > > index 00000000000..dca75a3308b > > --- /dev/null > > +++ b/t/t4070-diff-sparse-checkout-scope.sh > > @@ -0,0 +1,469 @@ > > +#!/bin/sh > > + > > +test_description='diff sparse-checkout scope' > > + > > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > > + > > +. ./test-lib.sh > > + > > + > > +test_expect_success 'setup' ' > > + git init temp && > > + ( > > + cd temp && > > + mkdir sub1 && > > + mkdir sub2 && > > + echo sub1/file1 >sub1/file1 && > > + echo sub2/file2 >sub2/file2 && > > + echo file1 >file1 && > > + echo file2 >file2 && > > + git add --all && > > + git commit -m init && > > + echo sub1/file1 >>sub1/file1 && > > + echo sub1/file2 >>sub1/file2 && > > + echo sub2/file1 >>sub2/file1 && > > + echo sub2/file2 >>sub2/file2 && > > + echo file1 >>file1 && > > + echo file2 >>file2 && > > + git add --all && > > + git commit -m change1 && > > + echo sub1/file1 >>sub1/file1 && > > + echo sub1/file2 >>sub1/file2 && > > + echo sub2/file1 >>sub2/file1 && > > + echo sub2/file2 >>sub2/file2 && > > + echo file1 >>file1 && > > + echo file2 >>file2 && > > + git add --all && > > + git commit -m change2 > > + ) > > +' > > + > > +reset_repo () { > > + rm -rf repo && > > + git clone --no-checkout temp repo > > Why --no-checkout rather than say --sparse? > This is because I accidentally associated it with a partial clone. I often use "git clone -filter=blob:none -no-checkout" to do a partial clone, then "git sparse- checkout set <pattern>" to set sparse-checkout patterns, and "git checkout" to download the missing blobs and checkout to a branch. But in this test file, we only need sparse-checkout, so it's true that I should not do such strange no-checkout thing. > > +} > > + > > +reset_with_sparse_checkout() { > > + reset_repo && > > + git -C repo sparse-checkout set $1 sub1 && > > + git -C repo checkout > > Fixing the above would let us get rid of this really weird extra > checkout command too. > > > +} > > + > > +change_worktree_and_index() { > > + ( > > + cd repo && > > + mkdir sub2 sub3 && > > + echo sub1/file3 >sub1/file3 && > > + echo sub2/file3 >sub2/file3 && > > + echo sub3/file3 >sub3/file3 && > > + echo file3 >file3 && > > + git add --all --sparse && > > + echo sub1/file3 >>sub1/file3 && > > + echo sub2/file3 >>sub2/file3 && > > + echo sub3/file3 >>sub3/file3 && > > + echo file3 >>file3 > > + ) > > +} > > It would be nice to modify different paths in the working tree and > index, to see if we can handle cases where the sparse specification > does not match the sparsity patterns better. (So, modify files inside > and outside the sparsity patterns, stage those changes, and then do a > `git sparse-checkout reapply` to make the files outside the sparsity > patterns disappear from the working tree...then modify different files > in the working tree both inside and outside the sparsity patterns. > And also remove some file that matches the sparsity patterns and > manually mark it as SKIP_WORKTREE.) That'd give us much better > coverage. > Nice addition. So I should use git update-index --skip-worktree to set skip_worktree bit for index entries. > > + > > +diff_scope() { > > + title=$1 > > + need_change_worktree_and_index=$2 > > + sparse_checkout_option=$3 > > + scope_option=$4 > > + expect=$5 > > + shift 5 > > + args=("$@") > > + > > + test_expect_success "$title $sparse_checkout_option $scope_option" " > > + reset_with_sparse_checkout $sparse_checkout_option && > > + if test \"$need_change_worktree_and_index\" = \"true\" ; then > > + change_worktree_and_index > > + fi && > > + git -C repo diff $scope_option ${args[*]} >actual && > > + if test -z \"$expect\" ; then > > + >expect > > + else > > + cat > expect <<-EOF > > +$expect > > + EOF > > + fi && > > + test_cmp expect actual > > + " > > +} > > + > > +args=("--name-only" "HEAD" "HEAD~") > > +diff_scope builtin_diff_tree false "--no-cone" "--scope=sparse" \ > > +"sub1/file1 > > +sub1/file2" "${args[@]}" > > + > > +diff_scope builtin_diff_tree false "--no-cone" "--scope=all" \ > > +"file1 > > +file2 > > +sub1/file1 > > +sub1/file2 > > +sub2/file1 > > +sub2/file2" "${args[@]}" > > + > > +diff_scope builtin_diff_tree false "--no-cone" "--no-scope" \ > > +"file1 > > +file2 > > +sub1/file1 > > +sub1/file2 > > +sub2/file1 > > +sub2/file2" "${args[@]}" > > + > > +diff_scope builtin_diff_tree false "--cone" "--scope=sparse" \ > > +"file1 > > +file2 > > +sub1/file1 > > +sub1/file2" "${args[@]}" > > + > > +diff_scope builtin_diff_tree false "--cone" "--scope=all" \ > > +"file1 > > +file2 > > +sub1/file1 > > +sub1/file2 > > +sub2/file1 > > +sub2/file2" "${args[@]}" > > + > > +diff_scope builtin_diff_tree false "--cone" "--no-scope" \ > > +"file1 > > +file2 > > +sub1/file1 > > +sub1/file2 > > +sub2/file1 > > +sub2/file2" "${args[@]}" > > + > > +args=("--name-only" "HEAD~") > > +diff_scope builtin_diff_index true "--no-cone" "--scope=sparse" \ > > +"sub1/file1 > > +sub1/file2 > > +sub1/file3" "${args[@]}" > > Here's a good example where the testcase is doing the wrong thing. > The expected answer here would also include file3, sub2/file3, and > sub3/file3. > Yeah. Files that are not part of the sparse-checkout patterns are temporarily extended into the sparse specification. > > + > > +diff_scope builtin_diff_index true "--no-cone" "--scope=all" \ > > +"file1 > > +file2 > > +file3 > > +sub1/file1 > > +sub1/file2 > > +sub1/file3 > > +sub2/file1 > > +sub2/file2 > > +sub2/file3 > > +sub3/file3" "${args[@]}" > > + > > +diff_scope builtin_diff_index true "--no-cone" "--no-scope" \ > > +"file1 > > +file2 > > +file3 > > +sub1/file1 > > +sub1/file2 > > +sub1/file3 > > +sub2/file1 > > +sub2/file2 > > +sub2/file3 > > +sub3/file3" "${args[@]}" > > + > > +diff_scope builtin_diff_index true "--cone" "--scope=sparse" \ > > +"file1 > > +file2 > > +file3 > > +sub1/file1 > > +sub1/file2 > > +sub1/file3" "${args[@]}" > > This is also wrong; it's missing sub2/file3 and sub3/file3. > > > + > > +diff_scope builtin_diff_index true "--cone" "--scope=all" \ > > +"file1 > > +file2 > > +file3 > > +sub1/file1 > > +sub1/file2 > > +sub1/file3 > > +sub2/file1 > > +sub2/file2 > > +sub2/file3 > > +sub3/file3" "${args[@]}" > > + > > +diff_scope builtin_diff_index true "--cone" "--no-scope" \ > > +"file1 > > +file2 > > +file3 > > +sub1/file1 > > +sub1/file2 > > +sub1/file3 > > +sub2/file1 > > +sub2/file2 > > +sub2/file3 > > +sub3/file3" "${args[@]}" > > + > > +args=("--name-only" "file3" "sub1/" "sub2/") > > + > > +diff_scope builtin_diff_files true "--no-cone" "--scope=sparse" \ > > +"sub1/file3" "${args[@]}" > > This should also include file3, sub2/file3, and sub3/file3. > `--scope=` should not affect diff output at all if neither --cached > nor revision arguments are supplied. > Agree. > > + > > +diff_scope builtin_diff_files true "--no-cone" "--scope=all" \ > > +"file3 > > +sub1/file3 > > +sub2/file1 > > +sub2/file2 > > +sub2/file3" "${args[@]}" > > This is wrong due to including too much; it should not include > sub2/file1 or sub2/file2 (it is only including those because it is > showing them as deleted, when they are not deleted but are > SKIP_WORKTREE). > Agree. > I think I'm going to stop reviewing here. I'm probably just going to > keep repeating the same issues I identified earlier if I continue. Thank you very much for your review, you have pointed out very many problems with this patch :) -- ZheNing Hu
Hi, On Sat, Nov 5, 2022 at 7:11 PM ZheNing Hu <adlternative@gmail.com> wrote: > Elijah Newren <newren@gmail.com> 于2022年11月1日周二 13:18写道: > > On Sun, Oct 30, 2022 at 9:11 PM ZheNing Hu via GitGitGadget > > <gitgitgadget@gmail.com> wrote: > > > > > > From: ZheNing Hu <adlternative@gmail.com> > > > > > > When we use sparse-checkout, we often want the set of files > > > that some commands operate on to be restricted to the > > > sparse-checkout specification. > > > > It seems a bit premature to send this, when the guideline document[*] > > detailing how these options should work is still in the "Needs Review" > > state. I know, it's been waiting for a while, but it's a _long_ > > document. > > > > [*] https://lore.kernel.org/git/pull.1367.v3.git.1665269538608.gitgitgadget@gmail.com/ > > > > Fine, I just want to start trying to experiment with this feature in > git-diff earlier, > and I can wait for the sparse-checkout.txt documentation to finish > first if needed :) Note that you may be able to help reduce the wait by reviewing the document. (You commented on v1 -- thanks! -- but no one has commented on any of the newer versions, despite being out for over a month and still being marked as "Needs Review" in the "What's cooking" reports.) > > > `--no-scope` is the default > > > option for now. > > > > What does --no-scope even mean? You didn't explain it, and I don't > > see how it could make sense. We explicitly avoided a `--no-` prefix > > by allowing the --scope option to take a value. I don't think there > > should be a --no-scope option. > > I think the “--no-scope” here does nothing, as if it were unaffected by scope > (just like correctly "--scope=all", right?) Reading your patch, I was unable to determine where you made --no-scope behave differently than how you implemented --scope=all. Maybe there was a difference and I missed it. Anyway, I don't think there should be a --no-scope option. > > > diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt > > > index 3674ac48e92..04bf83e8be1 100644 > > > --- a/Documentation/diff-options.txt > > > +++ b/Documentation/diff-options.txt > > > @@ -195,6 +195,24 @@ For instance, if you configured the `diff.algorithm` variable to a > > > non-default value and want to use the default one, then you > > > have to use `--diff-algorithm=default` option. > > > > > > +ifndef::git-format-patch[] > > > +ifndef::git-log[] > > > + > > > +--scope={sparse|all}:: > > > + Choose diff scope. The variants are as follows: > > > ++ > > > +-- > > > +`--sparse`;; > > > + Restrict diff paths to those matching sparse-checkout specification. > > > +`--all`;; > > > + Without restriction, diff is performed regardless of whether the path > > > + meets the sparse-checkout specification. > > > +-- > > > ++ > > > + > > > +endif::git-log[] > > > +endif::git-format-patch[] > > > > What about diff-files, diff-index, diff-tree, and show? > > > > diff-options.txt included in their documents, and git-format-patch.txt, > git-log.txt, but should not in git-format-patch.txt. I don't know if it > should be included in git-diff-files.txt, because git diff-files compare > the files in the working tree and the index (so it's the same as > "simple" git-diff, which should not be affected by scope?) Oh, good point. Yeah, git-diff-files.txt should not get this option as it won't have any affect. git-diff-index and git-diff-tree and git-show and git-log should get it, though. > > > +int diff_path_in_sparse_checkout(const char *path) { > > > + if (core_sparse_checkout_cone) > > > + return path_in_cone_mode_sparse_checkout(path, the_repository->index); > > > + else > > > + return path_in_sparse_checkout(path, the_repository->index); > > > +} > > > > This says we are including the path if it matches the sparsity > > patterns. Thus, we have to be careful when we use this function, > > because the relevant paths are ones that match the sparsity > > specification. The sparsity specification will always match the > > sparsity patterns when diffing two commits, but when either the index > > or the working tree is part of the diff, the sparsity specification > > *might* be temporarily expanded. > > Yes, I may have to look at more code to better understand how and when the > "sparsity specification" is extended. Any recommendations for places to read? Yes, please review the newer version of my sparse-checkout.txt directions file; it covers this in more detail. > > > @@ -439,6 +459,20 @@ static void do_oneway_diff(struct unpack_trees_options *o, > > > > do_oneway_diff is for cases where we are diffing against the index... > > > > > return; /* nothing to diff.. */ > > > } > > > > > > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE) { > > > + if (idx && tree) { > > > + if (!diff_paths_in_sparse_checkout(idx->name, tree->name)) > > > + return; > > > + } else if (idx) { > > > + if (!diff_path_in_sparse_checkout(idx->name)) > > > + return; > > > + } else if (tree) { > > > + if (!diff_path_in_sparse_checkout(tree->name)) > > > + return; > > > + } else > > > + return; > > > + } > > > > ...and you again mistakenly only compare to the sparsity patterns > > instead of the sparse specification. > > > > So here we should use ce_skip_worktree(idx) to check if this index entry matches > sparse specification. ce_skip_worktree(idx) only checks for expansions to the sparse specification in the working tree. If dealing with index files, the sparse specification is also expanded to handle all files in the index that differ from HEAD. > > > + } else > > > + return error(_("invalid --scope value: %s"), optarg); > > > + } > > > > As written with no follow-on else clause here, wouldn't this silently > > accept "--scope" without an "=<something>" argument without an error > > and without properly initializing opt->scope? > > > > Because opt will be initializing with default_diff_options in repo_diff_setup(), > and opt->scope should respect config value first. So I don't think there's a > mistake here. Okay, it's good that you've got the variables initialized somehow, but that's only half the point here. The main point is that the user can specify something that makes no sense and if they do, we should throw an error informing them of their mistake. --scope should not be passed without an argument (either "all" or "sparse", currently), but this code allowed it. > > > diff --git a/t/t4070-diff-sparse-checkout-scope.sh b/t/t4070-diff-sparse-checkout-scope.sh > > > new file mode 100644 > > > > This needs to be fixed. Since you didn't comment on this but usually do comment on thing, I'll just re-ping it to make sure you don't miss the comment about the incorrect file mode here. > > > +reset_repo () { > > > + rm -rf repo && > > > + git clone --no-checkout temp repo > > > > Why --no-checkout rather than say --sparse? > > > > This is because I accidentally associated it with a > partial clone. I often use "git clone -filter=blob:none -no-checkout" > to do a partial clone, then "git sparse- checkout set <pattern>" > to set sparse-checkout patterns, and "git checkout" to download > the missing blobs and checkout to a branch. But in this > test file, we only need sparse-checkout, so it's true that I should > not do such strange no-checkout thing. Yeah, no need to involve partial clones here. (As an aside, though, I think even with a partial clone that --sparse makes more sense than --no-checkout.) > > > +change_worktree_and_index() { > > > + ( > > > + cd repo && > > > + mkdir sub2 sub3 && > > > + echo sub1/file3 >sub1/file3 && > > > + echo sub2/file3 >sub2/file3 && > > > + echo sub3/file3 >sub3/file3 && > > > + echo file3 >file3 && > > > + git add --all --sparse && > > > + echo sub1/file3 >>sub1/file3 && > > > + echo sub2/file3 >>sub2/file3 && > > > + echo sub3/file3 >>sub3/file3 && > > > + echo file3 >>file3 > > > + ) > > > +} > > > > It would be nice to modify different paths in the working tree and > > index, to see if we can handle cases where the sparse specification > > does not match the sparsity patterns better. (So, modify files inside > > and outside the sparsity patterns, stage those changes, and then do a > > `git sparse-checkout reapply` to make the files outside the sparsity > > patterns disappear from the working tree...then modify different files > > in the working tree both inside and outside the sparsity patterns. > > And also remove some file that matches the sparsity patterns and > > manually mark it as SKIP_WORKTREE.) That'd give us much better > > coverage. > > > > Nice addition. So I should use git update-index --skip-worktree to set > skip_worktree bit for index entries. Well, I'd say use normal "sparse-checkout" commands to set most of them. However, adding one or two extra SKIP_WORKTREE paths via running `git update-index --skip-worktree $PATH` (and removing the corresponding local file) would make the testcases more interesting. > > I think I'm going to stop reviewing here. I'm probably just going to > > keep repeating the same issues I identified earlier if I continue. > > Thank you very much for your review, you have pointed out very many > problems with this patch :) I'm glad it was helpful. :)
Elijah Newren <newren@gmail.com> 于2022年11月6日周日 14:58写道: > > Hi, > > On Sat, Nov 5, 2022 at 7:11 PM ZheNing Hu <adlternative@gmail.com> wrote: > > Elijah Newren <newren@gmail.com> 于2022年11月1日周二 13:18写道: > > > On Sun, Oct 30, 2022 at 9:11 PM ZheNing Hu via GitGitGadget > > > <gitgitgadget@gmail.com> wrote: > > > > > > > > From: ZheNing Hu <adlternative@gmail.com> > > > > > > > > When we use sparse-checkout, we often want the set of files > > > > that some commands operate on to be restricted to the > > > > sparse-checkout specification. > > > > > > It seems a bit premature to send this, when the guideline document[*] > > > detailing how these options should work is still in the "Needs Review" > > > state. I know, it's been waiting for a while, but it's a _long_ > > > document. > > > > > > [*] https://lore.kernel.org/git/pull.1367.v3.git.1665269538608.gitgitgadget@gmail.com/ > > > > > > > Fine, I just want to start trying to experiment with this feature in > > git-diff earlier, > > and I can wait for the sparse-checkout.txt documentation to finish > > first if needed :) > > Note that you may be able to help reduce the wait by reviewing the > document. (You commented on v1 -- thanks! -- but no one has commented > on any of the newer versions, despite being out for over a month and > still being marked as "Needs Review" in the "What's cooking" reports.) > I've skimmed v3 in general, but haven't looked at v4 yet, I'll take a closer look at it. > > > > `--no-scope` is the default > > > > option for now. > > > > > > What does --no-scope even mean? You didn't explain it, and I don't > > > see how it could make sense. We explicitly avoided a `--no-` prefix > > > by allowing the --scope option to take a value. I don't think there > > > should be a --no-scope option. > > > > I think the “--no-scope” here does nothing, as if it were unaffected by scope > > (just like correctly "--scope=all", right?) > > Reading your patch, I was unable to determine where you made > --no-scope behave differently than how you implemented --scope=all. > Maybe there was a difference and I missed it. > > Anyway, I don't think there should be a --no-scope option. > Fine, I'll get rid of it. > > > > diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt > > > > index 3674ac48e92..04bf83e8be1 100644 > > > > --- a/Documentation/diff-options.txt > > > > +++ b/Documentation/diff-options.txt > > > > @@ -195,6 +195,24 @@ For instance, if you configured the `diff.algorithm` variable to a > > > > non-default value and want to use the default one, then you > > > > have to use `--diff-algorithm=default` option. > > > > > > > > +ifndef::git-format-patch[] > > > > +ifndef::git-log[] > > > > + > > > > +--scope={sparse|all}:: > > > > + Choose diff scope. The variants are as follows: > > > > ++ > > > > +-- > > > > +`--sparse`;; > > > > + Restrict diff paths to those matching sparse-checkout specification. > > > > +`--all`;; > > > > + Without restriction, diff is performed regardless of whether the path > > > > + meets the sparse-checkout specification. > > > > +-- > > > > ++ > > > > + > > > > +endif::git-log[] > > > > +endif::git-format-patch[] > > > > > > What about diff-files, diff-index, diff-tree, and show? > > > > > > > diff-options.txt included in their documents, and git-format-patch.txt, > > git-log.txt, but should not in git-format-patch.txt. I don't know if it > > should be included in git-diff-files.txt, because git diff-files compare > > the files in the working tree and the index (so it's the same as > > "simple" git-diff, which should not be affected by scope?) > > Oh, good point. Yeah, git-diff-files.txt should not get this option > as it won't have any affect. git-diff-index and git-diff-tree and > git-show and git-log should get it, though. > > > > > +int diff_path_in_sparse_checkout(const char *path) { > > > > + if (core_sparse_checkout_cone) > > > > + return path_in_cone_mode_sparse_checkout(path, the_repository->index); > > > > + else > > > > + return path_in_sparse_checkout(path, the_repository->index); > > > > +} > > > > > > This says we are including the path if it matches the sparsity > > > patterns. Thus, we have to be careful when we use this function, > > > because the relevant paths are ones that match the sparsity > > > specification. The sparsity specification will always match the > > > sparsity patterns when diffing two commits, but when either the index > > > or the working tree is part of the diff, the sparsity specification > > > *might* be temporarily expanded. > > > > Yes, I may have to look at more code to better understand how and when the > > "sparsity specification" is extended. Any recommendations for places to read? > > Yes, please review the newer version of my sparse-checkout.txt > directions file; it covers this in more detail. > Ok, thanks. > > > > @@ -439,6 +459,20 @@ static void do_oneway_diff(struct unpack_trees_options *o, > > > > > > do_oneway_diff is for cases where we are diffing against the index... > > > > > > > return; /* nothing to diff.. */ > > > > } > > > > > > > > + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE) { > > > > + if (idx && tree) { > > > > + if (!diff_paths_in_sparse_checkout(idx->name, tree->name)) > > > > + return; > > > > + } else if (idx) { > > > > + if (!diff_path_in_sparse_checkout(idx->name)) > > > > + return; > > > > + } else if (tree) { > > > > + if (!diff_path_in_sparse_checkout(tree->name)) > > > > + return; > > > > + } else > > > > + return; > > > > + } > > > > > > ...and you again mistakenly only compare to the sparsity patterns > > > instead of the sparse specification. > > > > > > > So here we should use ce_skip_worktree(idx) to check if this index entry matches > > sparse specification. > > ce_skip_worktree(idx) only checks for expansions to the sparse > specification in the working tree. If dealing with index files, the > sparse specification is also expanded to handle all files in the index > that differ from HEAD. > I will read more code to understand this detail. > > > > + } else > > > > + return error(_("invalid --scope value: %s"), optarg); > > > > + } > > > > > > As written with no follow-on else clause here, wouldn't this silently > > > accept "--scope" without an "=<something>" argument without an error > > > and without properly initializing opt->scope? > > > > > > > Because opt will be initializing with default_diff_options in repo_diff_setup(), > > and opt->scope should respect config value first. So I don't think there's a > > mistake here. > > Okay, it's good that you've got the variables initialized somehow, but > that's only half the point here. The main point is that the user can > specify something that makes no sense and if they do, we should throw > an error informing them of their mistake. --scope should not be > passed without an argument (either "all" or "sparse", currently), but > this code allowed it. > Ah, it makes sense, I will fix it. > > > > diff --git a/t/t4070-diff-sparse-checkout-scope.sh b/t/t4070-diff-sparse-checkout-scope.sh > > > > new file mode 100644 > > > > > > This needs to be fixed. > > Since you didn't comment on this but usually do comment on thing, I'll > just re-ping it to make sure you don't miss the comment about the > incorrect file mode here. > Yes, I've noticed the file permissions issue here, but there may be some other issues with this test file: it fails when running CI tests, presumably because of some shell syntax incompatibility. > > > > +reset_repo () { > > > > + rm -rf repo && > > > > + git clone --no-checkout temp repo > > > > > > Why --no-checkout rather than say --sparse? > > > > > > > This is because I accidentally associated it with a > > partial clone. I often use "git clone -filter=blob:none -no-checkout" > > to do a partial clone, then "git sparse- checkout set <pattern>" > > to set sparse-checkout patterns, and "git checkout" to download > > the missing blobs and checkout to a branch. But in this > > test file, we only need sparse-checkout, so it's true that I should > > not do such strange no-checkout thing. > > Yeah, no need to involve partial clones here. (As an aside, though, I > think even with a partial clone that --sparse makes more sense than > --no-checkout.) > Using --sparse may indeed be better than -no-checkout, as there is no need to perform an additional checkout after using clone --no-checkout and sparse-checkout. But it uses sparse-checkout cone mode by default. A little curious, why can't we specify --no-cone mode when doing git clone and why can't we specify sparse-checkout patterns here? If such a feature is available, git clone and git sparse-checkout will be combined in one step. > > > > +change_worktree_and_index() { > > > > + ( > > > > + cd repo && > > > > + mkdir sub2 sub3 && > > > > + echo sub1/file3 >sub1/file3 && > > > > + echo sub2/file3 >sub2/file3 && > > > > + echo sub3/file3 >sub3/file3 && > > > > + echo file3 >file3 && > > > > + git add --all --sparse && > > > > + echo sub1/file3 >>sub1/file3 && > > > > + echo sub2/file3 >>sub2/file3 && > > > > + echo sub3/file3 >>sub3/file3 && > > > > + echo file3 >>file3 > > > > + ) > > > > +} > > > > > > It would be nice to modify different paths in the working tree and > > > index, to see if we can handle cases where the sparse specification > > > does not match the sparsity patterns better. (So, modify files inside > > > and outside the sparsity patterns, stage those changes, and then do a > > > `git sparse-checkout reapply` to make the files outside the sparsity > > > patterns disappear from the working tree...then modify different files > > > in the working tree both inside and outside the sparsity patterns. > > > And also remove some file that matches the sparsity patterns and > > > manually mark it as SKIP_WORKTREE.) That'd give us much better > > > coverage. > > > > > > > Nice addition. So I should use git update-index --skip-worktree to set > > skip_worktree bit for index entries. > > Well, I'd say use normal "sparse-checkout" commands to set most of > them. However, adding one or two extra SKIP_WORKTREE paths via > running `git update-index --skip-worktree $PATH` (and removing the > corresponding local file) would make the testcases more interesting. > Get it. > > > I think I'm going to stop reviewing here. I'm probably just going to > > > keep repeating the same issues I identified earlier if I continue. > > > > Thank you very much for your review, you have pointed out very many > > problems with this patch :) > > I'm glad it was helpful. :) Thanks. ZheNing Hu
diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt index 35a7bf86d77..52707e1b2d6 100644 --- a/Documentation/config/diff.txt +++ b/Documentation/config/diff.txt @@ -201,6 +201,18 @@ diff.algorithm:: -- + +diff.scope:: + Choose diff scope. The variants are as follows: ++ +-- +`sparse`;; + Restrict diff paths to those matching sparse-checkout specification. +`all`;; + Without restriction, diff is performed regardless of whether the path + meets the sparse-checkout specification. +-- ++ + diff.wsErrorHighlight:: Highlight whitespace errors in the `context`, `old` or `new` lines of the diff. Multiple values are separated by comma, diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 3674ac48e92..04bf83e8be1 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -195,6 +195,24 @@ For instance, if you configured the `diff.algorithm` variable to a non-default value and want to use the default one, then you have to use `--diff-algorithm=default` option. +ifndef::git-format-patch[] +ifndef::git-log[] + +--scope={sparse|all}:: + Choose diff scope. The variants are as follows: ++ +-- +`--sparse`;; + Restrict diff paths to those matching sparse-checkout specification. +`--all`;; + Without restriction, diff is performed regardless of whether the path + meets the sparse-checkout specification. +-- ++ + +endif::git-log[] +endif::git-format-patch[] + --stat[=<width>[,<name-width>[,<count>]]]:: Generate a diffstat. By default, as much space as necessary will be used for the filename part, and the rest for the graph diff --git a/builtin/diff.c b/builtin/diff.c index 854d2c5a5c4..6b450f7184c 100644 --- a/builtin/diff.c +++ b/builtin/diff.c @@ -54,6 +54,10 @@ static void stuff_change(struct diff_options *opt, oideq(old_oid, new_oid) && (old_mode == new_mode)) return; + if (opt->scope == DIFF_SCOPE_SPARSE && + !diff_paths_in_sparse_checkout(old_path, new_path)) + return; + if (opt->flags.reverse_diff) { SWAP(old_mode, new_mode); SWAP(old_oid, new_oid); diff --git a/diff-lib.c b/diff-lib.c index 2edea41a234..a3381f2e0ff 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -88,6 +88,22 @@ static int match_stat_with_submodule(struct diff_options *diffopt, return changed; } +int diff_path_in_sparse_checkout(const char *path) { + if (core_sparse_checkout_cone) + return path_in_cone_mode_sparse_checkout(path, the_repository->index); + else + return path_in_sparse_checkout(path, the_repository->index); +} + +int diff_paths_in_sparse_checkout(const char *one, const char*two) { + if (one == two || !strcmp(one, two)) + return diff_path_in_sparse_checkout(one); + else + return diff_path_in_sparse_checkout(one) && + diff_path_in_sparse_checkout(two); +} + + int run_diff_files(struct rev_info *revs, unsigned int option) { int entries, i; @@ -113,6 +129,9 @@ int run_diff_files(struct rev_info *revs, unsigned int option) if (diff_can_quit_early(&revs->diffopt)) break; + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE && + !diff_path_in_sparse_checkout(ce->name)) + continue; if (!ce_path_match(istate, ce, &revs->prune_data, NULL)) continue; @@ -202,7 +221,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option) continue; } - if (ce_uptodate(ce) || ce_skip_worktree(ce)) + if (ce_uptodate(ce) || + (revs->diffopt.scope != DIFF_SCOPE_ALL && ce_skip_worktree(ce))) continue; /* @@ -439,6 +459,20 @@ static void do_oneway_diff(struct unpack_trees_options *o, return; /* nothing to diff.. */ } + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE) { + if (idx && tree) { + if (!diff_paths_in_sparse_checkout(idx->name, tree->name)) + return; + } else if (idx) { + if (!diff_path_in_sparse_checkout(idx->name)) + return; + } else if (tree) { + if (!diff_path_in_sparse_checkout(tree->name)) + return; + } else + return; + } + /* if the entry is not checked out, don't examine work tree */ cached = o->index_only || (idx && ((idx->ce_flags & CE_VALID) || ce_skip_worktree(idx))); diff --git a/diff-no-index.c b/diff-no-index.c index 18edbdf4b59..ea94a104ea4 100644 --- a/diff-no-index.c +++ b/diff-no-index.c @@ -281,6 +281,10 @@ int diff_no_index(struct rev_info *revs, fixup_paths(paths, &replacement); + if (revs->diffopt.scope == DIFF_SCOPE_SPARSE && + !diff_paths_in_sparse_checkout(paths[0], paths[1])) + goto out; + revs->diffopt.skip_stat_unmatch = 1; if (!revs->diffopt.output_format) revs->diffopt.output_format = DIFF_FORMAT_PATCH; diff --git a/diff.c b/diff.c index 285d6e2d575..9de4044ae05 100644 --- a/diff.c +++ b/diff.c @@ -48,6 +48,7 @@ static int diff_interhunk_context_default; static const char *diff_word_regex_cfg; static const char *external_diff_cmd_cfg; static const char *diff_order_file_cfg; +static const char *external_diff_scope_cfg; int diff_auto_refresh_index = 1; static int diff_mnemonic_prefix; static int diff_no_prefix; @@ -57,6 +58,7 @@ static int diff_dirstat_permille_default = 30; static struct diff_options default_diff_options; static long diff_algorithm; static unsigned ws_error_highlight_default = WSEH_NEW; +static enum diff_scope external_diff_scope; static char diff_colors[][COLOR_MAXLEN] = { GIT_COLOR_RESET, @@ -423,6 +425,16 @@ int git_diff_ui_config(const char *var, const char *value, void *cb) return 0; } + if (!strcmp(var, "diff.scope")) { + git_config_string(&external_diff_scope_cfg, var, value); + if (!strcmp(value, "all")) + external_diff_scope = DIFF_SCOPE_ALL; + else if (!strcmp(value, "sparse")) + external_diff_scope = DIFF_SCOPE_SPARSE; + else + return -1; + } + if (git_color_config(var, value, cb) < 0) return -1; @@ -4663,6 +4675,7 @@ void repo_diff_setup(struct repository *r, struct diff_options *options) options->color_moved = diff_color_moved_default; options->color_moved_ws_handling = diff_color_moved_ws_default; + options->scope = external_diff_scope; prep_parse_options(options); } @@ -4914,6 +4927,29 @@ static int parse_dirstat_opt(struct diff_options *options, const char *params) return 1; } +static int diff_opt_diff_scope(const struct option *option, + const char *optarg, int unset) +{ + struct diff_options *opt = option->value; + + if (unset) { + opt->scope = DIFF_SCOPE_NONE; + } else if (optarg) { + if (!strcmp(optarg, "all")) { + if (core_apply_sparse_checkout) { + opt->scope = DIFF_SCOPE_ALL; + } + } else if (!strcmp(optarg, "sparse")) { + if (core_apply_sparse_checkout) { + opt->scope = DIFF_SCOPE_SPARSE; + } + } else + return error(_("invalid --scope value: %s"), optarg); + } + + return 0; +} + static int diff_opt_diff_filter(const struct option *option, const char *optarg, int unset) { @@ -5683,6 +5719,9 @@ static void prep_parse_options(struct diff_options *options) OPT_CALLBACK_F(0, "diff-filter", options, N_("[(A|C|D|M|R|T|U|X|B)...[*]]"), N_("select files by diff type"), PARSE_OPT_NONEG, diff_opt_diff_filter), + OPT_CALLBACK_F(0, "scope", options, N_("[sparse|all]"), + N_("choose diff scope"), + PARSE_OPT_OPTARG, diff_opt_diff_scope), { OPTION_CALLBACK, 0, "output", options, N_("<file>"), N_("output to a specific file"), PARSE_OPT_NONEG, NULL, 0, diff_opt_output }, diff --git a/diff.h b/diff.h index 8ae18e5ab1e..90f7512034c 100644 --- a/diff.h +++ b/diff.h @@ -230,6 +230,12 @@ enum diff_submodule_format { DIFF_SUBMODULE_INLINE_DIFF }; +enum diff_scope { + DIFF_SCOPE_NONE = 0, + DIFF_SCOPE_ALL, + DIFF_SCOPE_SPARSE, +}; + /** * the set of options the calling program wants to affect the operation of * diffcore library with. @@ -285,6 +291,9 @@ struct diff_options { /* diff-filter bits */ unsigned int filter, filter_not; + /* diff sparse-checkout scope */ + enum diff_scope scope; + int use_color; /* Number of context lines to generate in patch output. */ @@ -696,4 +705,6 @@ void print_stat_summary(FILE *fp, int files, int insertions, int deletions); void setup_diff_pager(struct diff_options *); +int diff_path_in_sparse_checkout(const char *path); +int diff_paths_in_sparse_checkout(const char *one, const char *two); #endif /* DIFF_H */ diff --git a/t/t4070-diff-sparse-checkout-scope.sh b/t/t4070-diff-sparse-checkout-scope.sh new file mode 100644 index 00000000000..dca75a3308b --- /dev/null +++ b/t/t4070-diff-sparse-checkout-scope.sh @@ -0,0 +1,469 @@ +#!/bin/sh + +test_description='diff sparse-checkout scope' + +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME + +. ./test-lib.sh + + +test_expect_success 'setup' ' + git init temp && + ( + cd temp && + mkdir sub1 && + mkdir sub2 && + echo sub1/file1 >sub1/file1 && + echo sub2/file2 >sub2/file2 && + echo file1 >file1 && + echo file2 >file2 && + git add --all && + git commit -m init && + echo sub1/file1 >>sub1/file1 && + echo sub1/file2 >>sub1/file2 && + echo sub2/file1 >>sub2/file1 && + echo sub2/file2 >>sub2/file2 && + echo file1 >>file1 && + echo file2 >>file2 && + git add --all && + git commit -m change1 && + echo sub1/file1 >>sub1/file1 && + echo sub1/file2 >>sub1/file2 && + echo sub2/file1 >>sub2/file1 && + echo sub2/file2 >>sub2/file2 && + echo file1 >>file1 && + echo file2 >>file2 && + git add --all && + git commit -m change2 + ) +' + +reset_repo () { + rm -rf repo && + git clone --no-checkout temp repo +} + +reset_with_sparse_checkout() { + reset_repo && + git -C repo sparse-checkout set $1 sub1 && + git -C repo checkout +} + +change_worktree_and_index() { + ( + cd repo && + mkdir sub2 sub3 && + echo sub1/file3 >sub1/file3 && + echo sub2/file3 >sub2/file3 && + echo sub3/file3 >sub3/file3 && + echo file3 >file3 && + git add --all --sparse && + echo sub1/file3 >>sub1/file3 && + echo sub2/file3 >>sub2/file3 && + echo sub3/file3 >>sub3/file3 && + echo file3 >>file3 + ) +} + +diff_scope() { + title=$1 + need_change_worktree_and_index=$2 + sparse_checkout_option=$3 + scope_option=$4 + expect=$5 + shift 5 + args=("$@") + + test_expect_success "$title $sparse_checkout_option $scope_option" " + reset_with_sparse_checkout $sparse_checkout_option && + if test \"$need_change_worktree_and_index\" = \"true\" ; then + change_worktree_and_index + fi && + git -C repo diff $scope_option ${args[*]} >actual && + if test -z \"$expect\" ; then + >expect + else + cat > expect <<-EOF +$expect + EOF + fi && + test_cmp expect actual + " +} + +args=("--name-only" "HEAD" "HEAD~") +diff_scope builtin_diff_tree false "--no-cone" "--scope=sparse" \ +"sub1/file1 +sub1/file2" "${args[@]}" + +diff_scope builtin_diff_tree false "--no-cone" "--scope=all" \ +"file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2" "${args[@]}" + +diff_scope builtin_diff_tree false "--no-cone" "--no-scope" \ +"file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2" "${args[@]}" + +diff_scope builtin_diff_tree false "--cone" "--scope=sparse" \ +"file1 +file2 +sub1/file1 +sub1/file2" "${args[@]}" + +diff_scope builtin_diff_tree false "--cone" "--scope=all" \ +"file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2" "${args[@]}" + +diff_scope builtin_diff_tree false "--cone" "--no-scope" \ +"file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2" "${args[@]}" + +args=("--name-only" "HEAD~") +diff_scope builtin_diff_index true "--no-cone" "--scope=sparse" \ +"sub1/file1 +sub1/file2 +sub1/file3" "${args[@]}" + +diff_scope builtin_diff_index true "--no-cone" "--scope=all" \ +"file1 +file2 +file3 +sub1/file1 +sub1/file2 +sub1/file3 +sub2/file1 +sub2/file2 +sub2/file3 +sub3/file3" "${args[@]}" + +diff_scope builtin_diff_index true "--no-cone" "--no-scope" \ +"file1 +file2 +file3 +sub1/file1 +sub1/file2 +sub1/file3 +sub2/file1 +sub2/file2 +sub2/file3 +sub3/file3" "${args[@]}" + +diff_scope builtin_diff_index true "--cone" "--scope=sparse" \ +"file1 +file2 +file3 +sub1/file1 +sub1/file2 +sub1/file3" "${args[@]}" + +diff_scope builtin_diff_index true "--cone" "--scope=all" \ +"file1 +file2 +file3 +sub1/file1 +sub1/file2 +sub1/file3 +sub2/file1 +sub2/file2 +sub2/file3 +sub3/file3" "${args[@]}" + +diff_scope builtin_diff_index true "--cone" "--no-scope" \ +"file1 +file2 +file3 +sub1/file1 +sub1/file2 +sub1/file3 +sub2/file1 +sub2/file2 +sub2/file3 +sub3/file3" "${args[@]}" + +args=("--name-only" "file3" "sub1/" "sub2/") + +diff_scope builtin_diff_files true "--no-cone" "--scope=sparse" \ +"sub1/file3" "${args[@]}" + +diff_scope builtin_diff_files true "--no-cone" "--scope=all" \ +"file3 +sub1/file3 +sub2/file1 +sub2/file2 +sub2/file3" "${args[@]}" + +diff_scope builtin_diff_files true "--no-cone" "--no-scope" \ +"file3 +sub1/file3 +sub2/file3" "${args[@]}" + +diff_scope builtin_diff_files true "--cone" "--scope=sparse" \ +"file3 +sub1/file3" "${args[@]}" + +diff_scope builtin_diff_files true "--cone" "--scope=all" \ +"file3 +sub1/file3 +sub2/file1 +sub2/file2 +sub2/file3" "${args[@]}" + +diff_scope builtin_diff_files true "--cone" "--no-scope" \ +"file3 +sub1/file3 +sub2/file3" "${args[@]}" + + +args=("--name-only" "HEAD~:sub2/file2" "sub1/file2") + +diff_scope builtin_diff_b_f true "--no-cone" "--scope=sparse" \ +"" "${args[@]}" + +diff_scope builtin_diff_b_f true "--no-cone" "--scope=all" \ +"sub1/file2" "${args[@]}" + +diff_scope builtin_diff_b_f true "--no-cone" "--no-scope" \ +"sub1/file2" "${args[@]}" + +args=("--name-only" "HEAD~:sub1/file1" "file3") + +diff_scope builtin_diff_b_f true "--cone" "--scope=sparse" \ +"file3" "${args[@]}" + +diff_scope builtin_diff_b_f true "--cone" "--scope=all" \ +"file3" "${args[@]}" + +diff_scope builtin_diff_b_f true "--cone" "--no-scope" \ +"file3" "${args[@]}" + +args=("--name-only" HEAD~:sub2/file2 HEAD:sub1/file2) + +diff_scope builtin_diff_blobs true "--no-cone" "--scope=sparse" \ +"" "${args[@]}" + +diff_scope builtin_diff_blobs true "--no-cone" "--scope=all" \ +"sub1/file2" "${args[@]}" + +diff_scope builtin_diff_blobs true "--no-cone" "--no-scope" \ +"sub1/file2" "${args[@]}" + +args=("--name-only" HEAD~:sub1/file1 HEAD:file2) + +diff_scope builtin_diff_blobs false "--cone" "--scope=sparse" \ +"file2" "${args[@]}" + +diff_scope builtin_diff_blobs false "--cone" "--scope=all" \ +"file2" "${args[@]}" + +diff_scope builtin_diff_blobs false "--cone" "--no-scope" \ +"file2" "${args[@]}" + +args=("--name-only" HEAD~2 HEAD~ HEAD) + +diff_scope builtin_diff_combined false "--no-cone" "--scope=sparse" \ +"sub1/file1 +sub1/file2" "${args[@]}" + +diff_scope builtin_diff_combined false "--no-cone" "--scope=all" \ +"file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2" "${args[@]}" + +diff_scope builtin_diff_combined false "--no-cone" "--no-scope" \ +"file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2" "${args[@]}" + +diff_scope builtin_diff_combined false "--cone" "--scope=sparse" \ +"file1 +file2 +sub1/file1 +sub1/file2" "${args[@]}" + +diff_scope builtin_diff_combined false "--cone" "--scope=all" \ +"file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2" "${args[@]}" + +diff_scope builtin_diff_combined false "--cone" "--no-scope" \ +"file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2" "${args[@]}" + +test_expect_success 'diff_no_index --no-cone, --scope=sparse' ' + reset_with_sparse_checkout --no-cone && + ( + cd repo && + mkdir sub3 && + echo sub3/file3 >sub3/file3 + ) && + test_expect_code 1 git -C repo diff --no-index --name-only --scope=sparse sub1/file1 sub1/file2 >actual && + cat > expect <<-EOF && +sub1/file2 + EOF + test_expect_code 1 git -C repo diff --no-index --name-only --scope=sparse sub1/file1 sub3/file3 >actual && + >expect && + test_cmp expect actual +' + +test_expect_success 'diff_no_index --no-cone, --scope=all' ' + reset_with_sparse_checkout --no-cone && + ( + cd repo && + mkdir sub3 && + echo sub3/file3 >sub3/file3 + ) && + test_expect_code 1 git -C repo diff --no-index --name-only --scope=all sub1/file1 sub1/file2 >actual && + cat > expect <<-EOF && +sub1/file2 + EOF + test_expect_code 1 git -C repo diff --no-index --name-only --scope=all sub1/file1 sub3/file3 >actual && + cat > expect <<-EOF && +sub3/file3 + EOF + test_cmp expect actual +' + +test_expect_success 'diff_no_index --no-cone, --no-scope' ' + reset_with_sparse_checkout --no-cone && + ( + cd repo && + mkdir sub3 && + echo sub3/file3 >sub3/file3 + ) && + test_expect_code 1 git -C repo diff --no-index --name-only --no-scope sub1/file1 sub1/file2 >actual && + cat > expect <<-EOF && +sub1/file2 + EOF + test_expect_code 1 git -C repo diff --no-index --name-only --no-scope sub1/file1 sub3/file3 >actual && + cat > expect <<-EOF && +sub3/file3 + EOF + test_cmp expect actual +' + +test_expect_success 'diff_no_index --cone, --scope=sparse' ' + reset_with_sparse_checkout --cone && + ( + cd repo && + echo file3 >file3 && + mkdir sub3 && + echo sub3/file3 >sub3/file3 + ) && + test_expect_code 1 git -C repo diff --no-index --name-only --scope=sparse sub1/file1 file3 >actual && + cat > expect <<-EOF && +file3 + EOF + test_expect_code 1 git -C repo diff --no-index --name-only --scope=sparse sub1/file1 sub3/file3 >actual && + >expect && + test_cmp expect actual +' + +test_expect_success 'diff_no_index --cone, --scope=all' ' + reset_with_sparse_checkout --cone && + ( + cd repo && + echo file3 >file3 && + mkdir sub3 && + echo sub3/file3 >sub3/file3 + ) && + test_expect_code 1 git -C repo diff --no-index --name-only --scope=all sub1/file1 file3 >actual && + cat > expect <<-EOF && +file3 + EOF + test_expect_code 1 git -C repo diff --no-index --name-only --scope=all sub1/file1 sub3/file3 >actual && + cat > expect <<-EOF && +sub3/file3 + EOF + test_cmp expect actual +' + +test_expect_success 'diff_no_index --cone, --no-scope' ' + reset_with_sparse_checkout --cone && + ( + cd repo && + echo file3 >file3 && + mkdir sub3 && + echo sub3/file3 >sub3/file3 + ) && + test_expect_code 1 git -C repo diff --no-index --name-only --no-scope sub1/file1 file3 >actual && + cat > expect <<-EOF && +file3 + EOF + test_expect_code 1 git -C repo diff --no-index --name-only --no-scope sub1/file1 sub3/file3 >actual && + cat > expect <<-EOF && +sub3/file3 + EOF + test_cmp expect actual +' + +test_expect_success 'diff scope config sparse' ' + reset_with_sparse_checkout --cone && + git -C repo -c diff.scope=sparse diff --name-only HEAD~ >actual && + cat > expect <<-EOF && +file1 +file2 +sub1/file1 +sub1/file2 + EOF + test_cmp expect actual +' + +test_expect_success 'diff scope config all' ' + reset_with_sparse_checkout --cone && + git -C repo -c diff.scope=all diff --name-only HEAD~ >actual && + cat > expect <<-EOF && +file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2 + EOF + test_cmp expect actual +' + +test_expect_success 'diff scope config override by option' ' + reset_with_sparse_checkout --cone && + git -C repo -c diff.scope=sparse diff --name-only --scope=all HEAD~ >actual && + cat > expect <<-EOF && +file1 +file2 +sub1/file1 +sub1/file2 +sub2/file1 +sub2/file2 + EOF + test_cmp expect actual +' + +test_done diff --git a/tree-diff.c b/tree-diff.c index 69031d7cbae..67f99c8e4df 100644 --- a/tree-diff.c +++ b/tree-diff.c @@ -76,6 +76,11 @@ static int tree_entry_pathcmp(struct tree_desc *t1, struct tree_desc *t2) static int emit_diff_first_parent_only(struct diff_options *opt, struct combine_diff_path *p) { struct combine_diff_parent *p0 = &p->parent[0]; + + if (opt->scope == DIFF_SCOPE_SPARSE && + !diff_path_in_sparse_checkout(p->path)) + return 0; + if (p->mode && p0->mode) { opt->change(opt, p0->mode, p->mode, &p0->oid, &p->oid, 1, 1, p->path, 0, 0);