[v2,1/3] read-cache: optionally collect pathspec matching info

Message ID: 20240329205649.1483032-3-shyamthakkar001@gmail.com
State: New, archived

Commit Message

Ghanshyam Thakkar March 29, 2024, 8:56 p.m. UTC
The add_files_to_cache() adds files to the index. And
add_files_to_cache() in turn calls run_diff_files() to perform this
operation. The run_diff_files() uses ce_path_match() to match the
pathspec against cache entries. However, it is called with NULL value
for the seen parameter, which collects the pathspec matching
information.

Therefore, introduce a new parameter 'char *ps_matched' to 
add_files_to_cache() and in turn to run_diff_files(), to feed it to
ce_path_match() to optionally collect the pathspec matching
information. This will be helpful in reporting an error in case of an
untracked path being passed when the expectation is a known path. Thus,
this will be used in the subsequent commits to fix 'commit -i' and 'add
-u' not erroring out when given untracked paths.

Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
---
 add-interactive.c           | 2 +-
 builtin/add.c               | 6 +++---
 builtin/checkout.c          | 3 ++-
 builtin/commit.c            | 2 +-
 builtin/diff-files.c        | 2 +-
 builtin/diff.c              | 2 +-
 builtin/merge.c             | 2 +-
 builtin/stash.c             | 2 +-
 builtin/submodule--helper.c | 4 ++--
 diff-lib.c                  | 5 +++--
 diff.h                      | 3 ++-
 read-cache-ll.h             | 4 ++--
 read-cache.c                | 6 +++---
 wt-status.c                 | 6 +++---
 14 files changed, 26 insertions(+), 23 deletions(-)

Comments

Junio C Hamano March 29, 2024, 9:35 p.m. UTC | #1
Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:

> The add_files_to_cache() adds files to the index. And
> add_files_to_cache() in turn calls run_diff_files() to perform this
> operation. The run_diff_files() uses ce_path_match() to match the
> pathspec against cache entries. However, it is called with NULL value
> for the seen parameter, which collects the pathspec matching
> information.

", which collects" -> ", which means we lose"

> Therefore, introduce a new parameter 'char *ps_matched' to 

"Therefore, introduce" -> "Introduce"

> add_files_to_cache() and in turn to run_diff_files(), to feed it to
> ce_path_match() to optionally collect the pathspec matching
> information. This will be helpful in reporting an error in case of an
> untracked path being passed when the expectation is a known path. Thus,
> this will be used in the subsequent commits to fix 'commit -i' and 'add
> -u' not erroring out when given untracked paths.

A new parameter to run_diff_files() came as a bit of a surprise to me.

When I responded to the previous round, I somehow thought that we'd
add a new member to the rev structure that points at an optional
.ps_matched member next to the existing .prune_data member.

That way, it would hopefully be easy for future code to see if a
"diff" invocation, not necessarily run_diff_files() that compares
the working tree and the index, consumed all the pathspec elements.
If such a new .ps_matched member is initialized to NULL, all the
patch noise we see in this patch will become unnecessary, no?
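
As a rough illustration of the idea above (a sketch with made-up
names, not git's actual code): if the optional buffer lives in the
structure and defaults to NULL, callers that do not care about match
information need no changes.  Here "struct mock_rev_info",
mock_path_match() and mock_run_diff_files() are hypothetical stand-ins
for "struct rev_info", ce_path_match() and run_diff_files(), and the
matcher only does exact string comparison.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for "struct rev_info"; not git's definition. */
struct mock_rev_info {
	const char **prune_data; /* pathspec elements */
	int nr;                  /* number of pathspec elements */
	char *ps_matched;        /* optional; NULL means "do not record" */
};

/*
 * Hypothetical matcher (exact comparison only).  When a buffer is
 * supplied, mark every pathspec element that matched the name.
 */
static int mock_path_match(const char *name, const struct mock_rev_info *revs)
{
	int i, matched = 0;

	for (i = 0; i < revs->nr; i++) {
		if (strcmp(name, revs->prune_data[i]))
			continue;
		matched = 1;
		if (revs->ps_matched)
			revs->ps_matched[i] = 1;
	}
	return matched;
}

/*
 * The signature carries no extra parameter; the optional buffer
 * travels inside the structure.  Returns the number of entries kept.
 */
static int mock_run_diff_files(struct mock_rev_info *revs,
			       const char **entries, int nr_entries)
{
	int i, kept = 0;

	assert(revs);
	for (i = 0; i < nr_entries; i++) {
		if (!mock_path_match(entries[i], revs))
			continue;
		kept++;
	}
	return kept;
}
```

A caller that leaves .ps_matched as NULL gets the old behaviour, while
one that points it at a zeroed buffer with one byte per pathspec
element can inspect the buffer after the call to find unmatched
elements.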

> diff --git a/diff-lib.c b/diff-lib.c
> index 5e8717c774..2dc3864abd 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -101,7 +101,8 @@ static int match_stat_with_submodule(struct diff_options *diffopt,
>  	return changed;
>  }
>  
> -void run_diff_files(struct rev_info *revs, unsigned int option)
> +void run_diff_files(struct rev_info *revs, char *ps_matched,
> +		    unsigned int option)
>  {
>  	int entries, i;
>  	int diff_unmerged_stage = revs->max_count;
> @@ -127,7 +128,7 @@ void run_diff_files(struct rev_info *revs, unsigned int option)
>  		if (diff_can_quit_early(&revs->diffopt))
>  			break;
>  
> -		if (!ce_path_match(istate, ce, &revs->prune_data, NULL))
> +		if (!ce_path_match(istate, ce, &revs->prune_data, ps_matched))
>  			continue;
>  
>  		if (revs->diffopt.prefix &&

This may be a non-issue, but after this point we see the beginning
of another filter to reject paths outside the hierarchy "--relative"
specifies.  It is possible that a pathspec element matches ce->name
but the matched cache entry is outside the current area.  Shouldn't
we then consider that the pathspec element did not match?  E.g., in
our project, what should happen if we did this?

    $ echo >>diff.h
    $ cd t
    $ git diff --relative \*.h

The command should show nothing.  Did the pathspec '*.h' match?  To
those who know how the machinery works, yes, it did, before the
resulting paths were further filtered out.  But from the end-user's
point of view, because "--relative" limits the diff to the current
directory and below, and because 't' and below did not have any C
header files, wouldn't it be more natural and useful to say the
pathspec wasn't used?

This does not matter right now because we are not planning to add a
new "--error-unmatch" option to "git diff", but when/if we do, it
starts to matter.  The hunk at least needs a NEEDSWORK comment,
summarizing the above.

	/*
	 * NEEDSWORK:
	 * Here we filter with pathspec but the result is further
	 * filtered out when --relative is in effect.  To end-users,
	 * a pathspec element that matched only to paths outside the
	 * current directory is like not matching anything at all;
	 * the handling of ps_matched[] here may become problematic
	 * if/when we add the "--error-unmatch" option to "git diff".
	 */

A solution to that problem might be just a matter of swapping the
order of filtering, but it may have performance implications and I'd
rather not have to worry about it right now in the context of the
current topic.  Hence a NEEDSWORK comment, without attempting to "fix"
it, would be the preferred approach to such a side issue.
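
To make the ordering concern concrete, here is a small sketch under
simplifying assumptions (it is not the code in diff-lib.c).  The
helpers suffix_match() and under_prefix() are hypothetical stand-ins
for the pathspec match and the "--relative" prefix check, and a single
"seen" byte stands in for ps_matched[].

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for matching a "*<suffix>" pathspec element. */
static int suffix_match(const char *name, const char *suffix)
{
	size_t n = strlen(name), s = strlen(suffix);

	return n >= s && !strcmp(name + n - s, suffix);
}

/* Hypothetical stand-in for the "--relative" prefix check. */
static int under_prefix(const char *name, const char *prefix)
{
	return !strncmp(name, prefix, strlen(prefix));
}

/* Order discussed above: pathspec first (recording the match), prefix second. */
static int pathspec_then_prefix(const char **names, int nr, const char *suffix,
				const char *prefix, char *seen)
{
	int i, shown = 0;

	assert(names && suffix && prefix);
	for (i = 0; i < nr; i++) {
		if (!suffix_match(names[i], suffix))
			continue;
		if (seen)
			*seen = 1; /* recorded even if dropped just below */
		if (!under_prefix(names[i], prefix))
			continue;
		shown++;
	}
	return shown;
}

/* Swapped order: only paths that survive the prefix check can mark "seen". */
static int prefix_then_pathspec(const char **names, int nr, const char *suffix,
				const char *prefix, char *seen)
{
	int i, shown = 0;

	assert(names && suffix && prefix);
	for (i = 0; i < nr; i++) {
		if (!under_prefix(names[i], prefix))
			continue;
		if (!suffix_match(names[i], suffix))
			continue;
		if (seen)
			*seen = 1;
		shown++;
	}
	return shown;
}
```

With the names "diff.h" and "t/test-lib.sh", the suffix ".h" and the
prefix "t/", both functions show nothing, but only the first one marks
the pathspec element as used.  Whether the swapped order costs
anything in the real code is not something this sketch can tell.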
Junio C Hamano March 29, 2024, 10:16 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> A new parameter to run_diff_files() came as a bit of a surprise to me.
>
> When I responded to the previous round, I somehow thought that we'd
> add a new member to the rev structure that points at an optional
> .ps_matched member next to the existing .prune_data member.  
>
> That way, it would hopefully be easy for future code to see if a
> "diff" invocation, not necessarily run_diff_files() that compares
> the working tree and the index, consumed all the pathspec elements.
> If such a new .ps_matched member is initialized to NULL, all the
> patch noise we see in this patch will become unnecessary, no?

This is what such a change may look like.  After applying [2/3] and
[3/3] steps from your series on top of this patch, the updated tests
in your series (2200 and 7501) seem to still pass.

------- >8 ------------- >8 ------------- >8 ------------- >8 -------

Subject: [PATCH] revision: optionally record matches with pathspec elements

Unlike "git add" and other end-user facing commands, where it is
diagnosed as an error to give a pathspec with an element that does
not match any path, the diff machinery does not care if some
elements of the pathspec do not match.  Given that the diff
machinery is heavily used in the pathspec-limited "git log" machinery,
and it is common for a path to come and go while traversing the
project history, this is usually a good thing.

However, in some cases we would want to know if all the pathspec
elements matched.  For example, "git add -u <pathspec>" internally
uses the machinery used by "git diff-files" to decide which paths'
contents to add to the index, and as an end-user facing command,
"git add -u" would want to report an unmatched pathspec element.

Add a new .ps_matched member next to the .prune_data member in
"struct rev_info" so that we can optionally keep track of the use of
.prune_data pathspec elements that can be inspected by the caller.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/add.c      |  4 ++--
 builtin/checkout.c |  3 ++-
 builtin/commit.c   |  2 +-
 diff-lib.c         | 11 ++++++++++-
 read-cache-ll.h    |  4 ++--
 read-cache.c       |  8 +++++---
 revision.h         |  1 +
 7 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 393c10cbcf..dc4b42d0ad 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -553,8 +553,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		exit_status |= renormalize_tracked_files(&pathspec, flags);
 	else
 		exit_status |= add_files_to_cache(the_repository, prefix,
-						  &pathspec, include_sparse,
-						  flags);
+						  &pathspec, NULL,
+						  include_sparse, flags);
 
 	if (add_new_files)
 		exit_status |= add_files(&dir, flags);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 2e8b0d18f4..56d1828856 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -878,7 +878,8 @@ static int merge_working_tree(const struct checkout_opts *opts,
 			 * entries in the index.
 			 */
 
-			add_files_to_cache(the_repository, NULL, NULL, 0, 0);
+			add_files_to_cache(the_repository, NULL, NULL, NULL, 0,
+					   0);
 			init_merge_options(&o, the_repository);
 			o.verbosity = 0;
 			work = write_in_core_index_as_tree(the_repository);
diff --git a/builtin/commit.c b/builtin/commit.c
index b27b56c8be..8f31decc6b 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -444,7 +444,7 @@ static const char *prepare_index(const char **argv, const char *prefix,
 		repo_hold_locked_index(the_repository, &index_lock,
 				       LOCK_DIE_ON_ERROR);
 		add_files_to_cache(the_repository, also ? prefix : NULL,
-				   &pathspec, 0, 0);
+				   &pathspec, NULL, 0, 0);
 		refresh_cache_or_die(refresh_flags);
 		cache_tree_update(&the_index, WRITE_TREE_SILENT);
 		if (write_locked_index(&the_index, &index_lock, 0))
diff --git a/diff-lib.c b/diff-lib.c
index 1cd790a4d2..683f11e509 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -127,7 +127,16 @@ void run_diff_files(struct rev_info *revs, unsigned int option)
 		if (diff_can_quit_early(&revs->diffopt))
 			break;
 
-		if (!ce_path_match(istate, ce, &revs->prune_data, NULL))
+		/*
+		 * NEEDSWORK:
+		 * Here we filter with pathspec but the result is further
+		 * filtered out when --relative is in effect.  To end-users,
+		 * a pathspec element that matched only to paths outside the
+		 * current directory is like not matching anything at all;
+		 * the handling of ps_matched[] here may become problematic
+		 * if/when we add the "--error-unmatch" option to "git diff".
+		 */
+		if (!ce_path_match(istate, ce, &revs->prune_data, revs->ps_matched))
 			continue;
 
 		if (revs->diffopt.prefix &&
diff --git a/read-cache-ll.h b/read-cache-ll.h
index 2a50a784f0..09414afd04 100644
--- a/read-cache-ll.h
+++ b/read-cache-ll.h
@@ -480,8 +480,8 @@ extern int verify_ce_order;
 int cmp_cache_name_compare(const void *a_, const void *b_);
 
 int add_files_to_cache(struct repository *repo, const char *prefix,
-		       const struct pathspec *pathspec, int include_sparse,
-		       int flags);
+		       const struct pathspec *pathspec, char *ps_matched,
+		       int include_sparse, int flags);
 
 void overlay_tree_on_index(struct index_state *istate,
 			   const char *tree_name, const char *prefix);
diff --git a/read-cache.c b/read-cache.c
index f546cf7875..e1723ad796 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -3958,8 +3958,8 @@ static void update_callback(struct diff_queue_struct *q,
 }
 
 int add_files_to_cache(struct repository *repo, const char *prefix,
-		       const struct pathspec *pathspec, int include_sparse,
-		       int flags)
+		       const struct pathspec *pathspec, char *ps_matched,
+		       int include_sparse, int flags)
 {
 	struct update_callback_data data;
 	struct rev_info rev;
@@ -3971,8 +3971,10 @@ int add_files_to_cache(struct repository *repo, const char *prefix,
 
 	repo_init_revisions(repo, &rev, prefix);
 	setup_revisions(0, NULL, &rev, NULL);
-	if (pathspec)
+	if (pathspec) {
 		copy_pathspec(&rev.prune_data, pathspec);
+		rev.ps_matched = ps_matched;
+	}
 	rev.diffopt.output_format = DIFF_FORMAT_CALLBACK;
 	rev.diffopt.format_callback = update_callback;
 	rev.diffopt.format_callback_data = &data;
diff --git a/revision.h b/revision.h
index 94c43138bc..0e470d1df1 100644
--- a/revision.h
+++ b/revision.h
@@ -142,6 +142,7 @@ struct rev_info {
 	/* Basic information */
 	const char *prefix;
 	const char *def;
+	char *ps_matched; /* optionally record matches of prune_data */
 	struct pathspec prune_data;
 
 	/*
Ghanshyam Thakkar March 30, 2024, 2:27 p.m. UTC | #3
On Fri, 29 Mar 2024, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > A new parameter to run_diff_files() came as a bit of a surprise to me.
> >
> > When I responded to the previous round, I somehow thought that we'd
> > add a new member to the rev structure that points at an optional
> > .ps_matched member next to the existing .prune_data member.  
> >
> > That way, it would hopefully be easy for future code to see if a
> > "diff" invocation, not necessarily run_diff_files() that compares
> > the working tree and the index, consumed all the pathspec elements.
> > If such a new .ps_matched member is initialized to NULL, all the
> > patch noise we see in this patch will become unnecessary, no?
> 
> This is what such a change may look like.  After applying [2/3] and
> [3/3] steps from your series on top of this patch, the updated tests
> in your series (2200 and 7501) seem to still pass.

This seems perfect. I hope you're OK with me using this patch as a base
for patch [2/3] and [3/3]. :)

Junio C Hamano March 30, 2024, 4:27 p.m. UTC | #4
Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:

>> This is what such a change may look like.  After applying [2/3] and
>> [3/3] steps from your series on top of this patch, the updated tests
>> in your series (2200 and 7501) seem to still pass.
>
> This seems perfect. I hope you're OK with me using this patch as a base
> for patch [2/3] and [3/3]. :)

Yes, as long as you promise to fix typos and grammatical mistakes in
my proposed log messages (there are several I just noticed X-<).

Thanks.
Patch

diff --git a/add-interactive.c b/add-interactive.c
index 6bf87e7ae7..b33260a611 100644
--- a/add-interactive.c
+++ b/add-interactive.c
@@ -572,7 +572,7 @@  static int get_modified_files(struct repository *r,
 			run_diff_index(&rev, DIFF_INDEX_CACHED);
 		else {
 			rev.diffopt.flags.ignore_dirty_submodules = 1;
-			run_diff_files(&rev, 0);
+			run_diff_files(&rev, NULL, 0);
 		}
 
 		release_revisions(&rev);
diff --git a/builtin/add.c b/builtin/add.c
index 393c10cbcf..ffe5fd8d44 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -191,7 +191,7 @@  static int edit_patch(int argc, const char **argv, const char *prefix)
 	out = xopen(file, O_CREAT | O_WRONLY | O_TRUNC, 0666);
 	rev.diffopt.file = xfdopen(out, "w");
 	rev.diffopt.close_file = 1;
-	run_diff_files(&rev, 0);
+	run_diff_files(&rev, NULL, 0);
 
 	if (launch_editor(file, NULL, NULL))
 		die(_("editing patch failed"));
@@ -553,8 +553,8 @@  int cmd_add(int argc, const char **argv, const char *prefix)
 		exit_status |= renormalize_tracked_files(&pathspec, flags);
 	else
 		exit_status |= add_files_to_cache(the_repository, prefix,
-						  &pathspec, include_sparse,
-						  flags);
+						  &pathspec, NULL,
+						  include_sparse, flags);
 
 	if (add_new_files)
 		exit_status |= add_files(&dir, flags);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 902c97ab23..02bd035081 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -876,7 +876,8 @@  static int merge_working_tree(const struct checkout_opts *opts,
 			 * entries in the index.
 			 */
 
-			add_files_to_cache(the_repository, NULL, NULL, 0, 0);
+			add_files_to_cache(the_repository, NULL, NULL, NULL, 0,
+					   0);
 			init_merge_options(&o, the_repository);
 			o.verbosity = 0;
 			work = write_in_core_index_as_tree(the_repository);
diff --git a/builtin/commit.c b/builtin/commit.c
index a91197245f..24efeaca98 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -444,7 +444,7 @@  static const char *prepare_index(const char **argv, const char *prefix,
 		repo_hold_locked_index(the_repository, &index_lock,
 				       LOCK_DIE_ON_ERROR);
 		add_files_to_cache(the_repository, also ? prefix : NULL,
-				   &pathspec, 0, 0);
+				   &pathspec, NULL, 0, 0);
 		refresh_cache_or_die(refresh_flags);
 		cache_tree_update(&the_index, WRITE_TREE_SILENT);
 		if (write_locked_index(&the_index, &index_lock, 0))
diff --git a/builtin/diff-files.c b/builtin/diff-files.c
index 018011f29e..8559aa254c 100644
--- a/builtin/diff-files.c
+++ b/builtin/diff-files.c
@@ -81,7 +81,7 @@  int cmd_diff_files(int argc, const char **argv, const char *prefix)
 
 	if (repo_read_index_preload(the_repository, &rev.diffopt.pathspec, 0) < 0)
 		die_errno("repo_read_index_preload");
-	run_diff_files(&rev, options);
+	run_diff_files(&rev, NULL, options);
 	result = diff_result_code(&rev.diffopt);
 	release_revisions(&rev);
 	return result;
diff --git a/builtin/diff.c b/builtin/diff.c
index 6e196e0c7d..3e9b838bdd 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -283,7 +283,7 @@  static void builtin_diff_files(struct rev_info *revs, int argc, const char **arg
 				    0) < 0) {
 		die_errno("repo_read_index_preload");
 	}
-	run_diff_files(revs, options);
+	run_diff_files(revs, NULL, options);
 }
 
 struct symdiff {
diff --git a/builtin/merge.c b/builtin/merge.c
index a0ba1f9815..4b4c1d6a31 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -979,7 +979,7 @@  static int evaluate_result(void)
 		DIFF_FORMAT_CALLBACK;
 	rev.diffopt.format_callback = count_diff_files;
 	rev.diffopt.format_callback_data = &cnt;
-	run_diff_files(&rev, 0);
+	run_diff_files(&rev, NULL, 0);
 
 	/*
 	 * Check how many unmerged entries are
diff --git a/builtin/stash.c b/builtin/stash.c
index 7fb355bff0..2c00026390 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1121,7 +1121,7 @@  static int check_changes_tracked_files(const struct pathspec *ps)
 		goto done;
 	}
 
-	run_diff_files(&rev, 0);
+	run_diff_files(&rev, NULL, 0);
 	if (diff_result_code(&rev.diffopt)) {
 		ret = 1;
 		goto done;
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index fda50f2af1..e9047021e0 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -667,7 +667,7 @@  static void status_submodule(const char *path, const struct object_id *ce_oid,
 	repo_init_revisions(the_repository, &rev, NULL);
 	rev.abbrev = 0;
 	setup_revisions(diff_files_args.nr, diff_files_args.v, &rev, &opt);
-	run_diff_files(&rev, 0);
+	run_diff_files(&rev, NULL, 0);
 
 	if (!diff_result_code(&rev.diffopt)) {
 		print_status(flags, ' ', path, ce_oid,
@@ -1141,7 +1141,7 @@  static int compute_summary_module_list(struct object_id *head_oid,
 	if (diff_cmd == DIFF_INDEX)
 		run_diff_index(&rev, info->cached ? DIFF_INDEX_CACHED : 0);
 	else
-		run_diff_files(&rev, 0);
+		run_diff_files(&rev, NULL, 0);
 	prepare_submodule_summary(info, &list);
 cleanup:
 	strvec_clear(&diff_args);
diff --git a/diff-lib.c b/diff-lib.c
index 5e8717c774..2dc3864abd 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -101,7 +101,8 @@  static int match_stat_with_submodule(struct diff_options *diffopt,
 	return changed;
 }
 
-void run_diff_files(struct rev_info *revs, unsigned int option)
+void run_diff_files(struct rev_info *revs, char *ps_matched,
+		    unsigned int option)
 {
 	int entries, i;
 	int diff_unmerged_stage = revs->max_count;
@@ -127,7 +128,7 @@  void run_diff_files(struct rev_info *revs, unsigned int option)
 		if (diff_can_quit_early(&revs->diffopt))
 			break;
 
-		if (!ce_path_match(istate, ce, &revs->prune_data, NULL))
+		if (!ce_path_match(istate, ce, &revs->prune_data, ps_matched))
 			continue;
 
 		if (revs->diffopt.prefix &&
diff --git a/diff.h b/diff.h
index 66bd8aeb29..a01feaa586 100644
--- a/diff.h
+++ b/diff.h
@@ -638,7 +638,8 @@  void diff_get_merge_base(const struct rev_info *revs, struct object_id *mb);
 #define DIFF_SILENT_ON_REMOVED 01
 /* report racily-clean paths as modified */
 #define DIFF_RACY_IS_MODIFIED 02
-void run_diff_files(struct rev_info *revs, unsigned int option);
+void run_diff_files(struct rev_info *revs, char *ps_matched,
+		    unsigned int option);
 
 #define DIFF_INDEX_CACHED 01
 #define DIFF_INDEX_MERGE_BASE 02
diff --git a/read-cache-ll.h b/read-cache-ll.h
index 2a50a784f0..09414afd04 100644
--- a/read-cache-ll.h
+++ b/read-cache-ll.h
@@ -480,8 +480,8 @@  extern int verify_ce_order;
 int cmp_cache_name_compare(const void *a_, const void *b_);
 
 int add_files_to_cache(struct repository *repo, const char *prefix,
-		       const struct pathspec *pathspec, int include_sparse,
-		       int flags);
+		       const struct pathspec *pathspec, char *ps_matched,
+		       int include_sparse, int flags);
 
 void overlay_tree_on_index(struct index_state *istate,
 			   const char *tree_name, const char *prefix);
diff --git a/read-cache.c b/read-cache.c
index f546cf7875..e179444445 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -3958,8 +3958,8 @@  static void update_callback(struct diff_queue_struct *q,
 }
 
 int add_files_to_cache(struct repository *repo, const char *prefix,
-		       const struct pathspec *pathspec, int include_sparse,
-		       int flags)
+		       const struct pathspec *pathspec, char *ps_matched,
+		       int include_sparse, int flags)
 {
 	struct update_callback_data data;
 	struct rev_info rev;
@@ -3985,7 +3985,7 @@  int add_files_to_cache(struct repository *repo, const char *prefix,
 	 * may not have their own transaction active.
 	 */
 	begin_odb_transaction();
-	run_diff_files(&rev, DIFF_RACY_IS_MODIFIED);
+	run_diff_files(&rev, ps_matched, DIFF_RACY_IS_MODIFIED);
 	end_odb_transaction();
 
 	release_revisions(&rev);
diff --git a/wt-status.c b/wt-status.c
index 2db4bb3a12..cf6d61e60c 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -629,7 +629,7 @@  static void wt_status_collect_changes_worktree(struct wt_status *s)
 	rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : rev.diffopt.rename_limit;
 	rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : rev.diffopt.rename_score;
 	copy_pathspec(&rev.prune_data, &s->pathspec);
-	run_diff_files(&rev, 0);
+	run_diff_files(&rev, NULL, 0);
 	release_revisions(&rev);
 }
 
@@ -1173,7 +1173,7 @@  static void wt_longstatus_print_verbose(struct wt_status *s)
 		setup_work_tree();
 		rev.diffopt.a_prefix = "i/";
 		rev.diffopt.b_prefix = "w/";
-		run_diff_files(&rev, 0);
+		run_diff_files(&rev, NULL, 0);
 	}
 	release_revisions(&rev);
 }
@@ -2594,7 +2594,7 @@  int has_unstaged_changes(struct repository *r, int ignore_submodules)
 	}
 	rev_info.diffopt.flags.quick = 1;
 	diff_setup_done(&rev_info.diffopt);
-	run_diff_files(&rev_info, 0);
+	run_diff_files(&rev_info, NULL, 0);
 	result = diff_result_code(&rev_info.diffopt);
 	release_revisions(&rev_info);
 	return result;