diff mbox series

diff-lib: ignore all outsider if --relative asked

Message ID bc7eda4ed8d52072b929a4af6e4e4ed7478ef9d6.1629361733.git.congdanhqx@gmail.com (mailing list archive)
State Superseded
Headers show
Series diff-lib: ignore all outsider if --relative asked | expand

Commit Message

Đoàn Trần Công Danh Aug. 19, 2021, 8:29 a.m. UTC
For diff family commands, we can tell them to exclude changes outside
of some directories if --relative is requested.

In diff_unmerge(), NULL will be returned if the requested path is
outside of the interesting directories, thus we'll run into NULL
pointer dereference in run_diff_files when trying to dereference
its return value.

We can simply check for NULL there before dereferencing said
return value.  However, we can do better by not running diff
on those unintesting entries.  Let's do that instead.

Reported-by: Thomas De Zeeuw <thomas@slight.dev>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---

Cc: Junio C Hamano <gitster@pobox.com>

Notes:
    Check for return value of diff_unmerge is not enough.
    
    Yes, it works with --name-only, however, with only --relative,
    git-diff shows unmerged entries outside of subdir, too.
    
    Furthermore, the filename in "diff --cc" ignores the relative prefix.
    Fixing this requires touching all over places, at least from my study.
    Let's fix the crash, first.
    
    We have two choices here:
    
    * Check pair, aka return value of diff_unmerge, like my original
      suggestion, and the unmerged entries from outside will be shown, too.
      Some inconsistent will be observed, --name-only won't list files
      outside of subdir, while the patch shows them.  At least, it doesn't
      create false impression of no change outside of subdir.
    
    * Skip all outsiders, like this patch.
    
    While I prefer this approach, I don't know all ramifications of this change,
    let's say an entry moved to outside of subdir in one side, and modified in
    other side.
    
    Because, I pick the different approach, Junio's ack isn't included here.
    

 diff-lib.c               |  4 +++
 t/t4045-diff-relative.sh | 53 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

Comments

Carlo Marcelo Arenas Belón Aug. 19, 2021, 9:02 a.m. UTC | #1
Awesome work, and this not only fixes the crash but a bug in
--relative that has been going for a while, after all the
documentation clearly says that --relative should filter out entries
outside the tree and it does unless in this crashing scenario, which
as pointed out was also showing the wrong paths for the diff chunks
outside of the directory.

My only concern is that it seems this has been broken for a while
(couldn't bisect, but AFAIK the first implementation did filter like
this one does), so some people might be expecting the broken
behaviour.

got my Tested-by or Reviewed-by and gratitude.

Carlo
Junio C Hamano Aug. 19, 2021, 4:55 p.m. UTC | #2
Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> For diff family commands, we can tell them to exclude changes outside
> of some directories if --relative is requested.
>
> In diff_unmerge(), NULL will be returned if the requested path is
> outside of the interesting directories, thus we'll run into NULL
> pointer dereference in run_diff_files when trying to dereference
> its return value.
>
> We can simply check for NULL there before dereferencing said
> return value.  However, we can do better by not running diff
> on those unintesting entries.  Let's do that instead.
>
> Reported-by: Thomas De Zeeuw <thomas@slight.dev>
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---

Nicely done.

If we look at cd676a51 (diff --relative: output paths as relative to
the current subdirectory, 2008-02-12) where the "--relative" feature
was introduced a bit more carefully, we notice that it wanted to
implement "anything outside the .prefix gets discarded" at
diff_addremove(), diff_change(), and diff_unmerge() level, instead
of the side that enumerates the paths and calls these helpers, and
that way, the "--relative" feature would consistently work across
diff-files, diff-tree, and diff-index, as they all share these three
helpers.

But filtering upfront before the codepath even has to decide if it
needs to call diff_addremove() or diff_change(), like this patch
does, makes sense, especially in the context of diff-files where the
enumeration of paths is just to walk a single flat array that is the
in-core index.

The proposed log message needs a bit more work, though.  It would be
an 80% OK explanation if the "check diff_unmerge()'s return value"
approach was sufficient to correct bugs and we took the approach,
but that is not the case.  As you found out, it is not sufficient,
and it is not the approach you took.  The only part in the proposed
log that explains the approach that was actually taken was "we can
do better by ...".

Until/unless we do similar "filter with diffopt.prefix upfront" in
diff-index and diff-tree codepaths, we unfortunately cannot lose the
filter added to diff_addremove() and diff_change(), but I think this
is a good first step towards such a longer-term clean-up.

Thanks.
Junio C Hamano Aug. 19, 2021, 4:58 p.m. UTC | #3
Carlo Arenas <carenas@gmail.com> writes:

> My only concern is that it seems this has been broken for a while
> (couldn't bisect, but AFAIK the first implementation did filter like
> this one does), so some people might be expecting the broken
> behaviour.

I do not think it is likely.

It only shows changes to unmerged paths outside the current
directory, which does not sound all that useful.
diff mbox series

Patch

diff --git a/diff-lib.c b/diff-lib.c
index f9eadc4fc1..ca085a03ef 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -117,6 +117,10 @@  int run_diff_files(struct rev_info *revs, unsigned int option)
 		if (!ce_path_match(istate, ce, &revs->prune_data, NULL))
 			continue;
 
+		if (revs->diffopt.prefix &&
+		    strncmp(ce->name, revs->diffopt.prefix, revs->diffopt.prefix_length))
+			continue;
+
 		if (ce_stage(ce)) {
 			struct combine_diff_path *dpath;
 			struct diff_filepair *pair;
diff --git a/t/t4045-diff-relative.sh b/t/t4045-diff-relative.sh
index 61ba5f707f..8cbbe53262 100755
--- a/t/t4045-diff-relative.sh
+++ b/t/t4045-diff-relative.sh
@@ -162,4 +162,57 @@  check_diff_relative_option subdir file2 true --no-relative --relative
 check_diff_relative_option . file2 false --no-relative --relative=subdir
 check_diff_relative_option . file2 true --no-relative --relative=subdir
 
+test_expect_success 'setup diff --relative unmerged' '
+	test_commit zero file0 &&
+	test_commit base subdir/file0 &&
+	git switch -c br1 &&
+	test_commit one file0 &&
+	test_commit sub1 subdir/file0 &&
+	git switch -c br2 base &&
+	test_commit two file0 &&
+	git switch -c br3 &&
+	test_commit sub3 subdir/file0
+'
+
+test_expect_success 'diff --relative without change in subdir' '
+	git switch br2 &&
+	test_when_finished "git merge --abort" &&
+	test_must_fail git merge one &&
+	git -C subdir diff --relative >out &&
+	test_must_be_empty out &&
+	git -C subdir diff --relative --name-only >out &&
+	test_must_be_empty out
+'
+
+test_expect_success 'diff --relative --name-only with change in subdir' '
+	git switch br3 &&
+	test_when_finished "git merge --abort" &&
+	test_must_fail git merge sub1 &&
+	test_write_lines file0 file0 >expected &&
+	git -C subdir diff --relative --name-only >out &&
+	test_cmp expected out
+'
+
+test_expect_failure 'diff --relative with change in subdir' '
+	git switch br3 &&
+	br1_blob=$(git rev-parse --short --verify br1:subdir/file0) &&
+	br3_blob=$(git rev-parse --short --verify br3:subdir/file0) &&
+	test_when_finished "git merge --abort" &&
+	test_must_fail git merge br1 &&
+	cat >expected <<-EOF &&
+	diff --cc file0
+	index $br3_blob,$br1_blob..0000000
+	--- a/file0
+	+++ b/file0
+	@@@ -1,1 -1,1 +1,5 @@@
+	++<<<<<<< HEAD
+	 +sub3
+	++=======
+	+ sub1
+	++>>>>>>> sub1
+	EOF
+	git -C subdir diff --relative >out &&
+	test_cmp expected out
+'
+
 test_done