Message ID | bc7eda4ed8d52072b929a4af6e4e4ed7478ef9d6.1629361733.git.congdanhqx@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | diff-lib: ignore all outsider if --relative asked | expand |
Awesome work, and this not only fixes the crash but a bug in --relative that has been going for a while, after all the documentation clearly says that --relative should filter out entries outside the tree and it does unless in this crashing scenario, which as pointed out was also showing the wrong paths for the diff chunks outside of the directory. My only concern is that it seems this has been broken for a while (couldn't bisect, but AFAIK the first implementation did filter like this one does), so some people might be expecting the broken behaviour. got my Tested-by or Reviewed-by and gratitude. Carlo
Đoàn Trần Công Danh <congdanhqx@gmail.com> writes: > For diff family commands, we can tell them to exclude changes outside > of some directories if --relative is requested. > > In diff_unmerge(), NULL will be returned if the requested path is > outside of the interesting directories, thus we'll run into NULL > pointer dereference in run_diff_files when trying to dereference > its return value. > > We can simply check for NULL there before dereferencing said > return value. However, we can do better by not running diff > on those unintesting entries. Let's do that instead. > > Reported-by: Thomas De Zeeuw <thomas@slight.dev> > Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> > --- Nicely done. If we look at cd676a51 (diff --relative: output paths as relative to the current subdirectory, 2008-02-12) where the "--relative" feature was introduced a bit more carefully, we notice that it wanted to implement "anything outside the .prefix gets discarded" at diff_addremove(), diff_change(), and diff_unmerge() level, instead of the side that enumerates the paths and calls these helpers, and that way, the "--relative" feature would consistently work across diff-files, diff-tree, and diff-index, as they all share these three helpers. But filtering upfront before the codepath even has to decide if it needs to call diff_addremove() or diff_change(), like this patch does, makes sense, especially in the context of diff-files where the enumeration of paths is just to walk a single flat array that is the in-core index. The proposed log message needs a bit more work, though. It would be an 80% OK explanation if the "check diff_unmerge()'s return value" approach was sufficient to correct bugs and we took the approach, but that is not the case. As you found out, it is not sufficient, and it is not the approach you took. The only part in the proposed log that explains the approach that was actually taken was "we can do better by ...". Until/unless we do similar "filter with diffopt.prefix upfront" in diff-index and diff-tree codepaths, we unfortunately cannot lose the filter added to diff_addremove() and diff_change(), but I think this is a good first step towards such a longer-term clean-up. Thanks.
Carlo Arenas <carenas@gmail.com> writes: > My only concern is that it seems this has been broken for a while > (couldn't bisect, but AFAIK the first implementation did filter like > this one does), so some people might be expecting the broken > behaviour. I do not think it is likely. It only shows changes to unmerged paths outside the current directory, which does not sound all that useful.
diff --git a/diff-lib.c b/diff-lib.c index f9eadc4fc1..ca085a03ef 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -117,6 +117,10 @@ int run_diff_files(struct rev_info *revs, unsigned int option) if (!ce_path_match(istate, ce, &revs->prune_data, NULL)) continue; + if (revs->diffopt.prefix && + strncmp(ce->name, revs->diffopt.prefix, revs->diffopt.prefix_length)) + continue; + if (ce_stage(ce)) { struct combine_diff_path *dpath; struct diff_filepair *pair; diff --git a/t/t4045-diff-relative.sh b/t/t4045-diff-relative.sh index 61ba5f707f..8cbbe53262 100755 --- a/t/t4045-diff-relative.sh +++ b/t/t4045-diff-relative.sh @@ -162,4 +162,57 @@ check_diff_relative_option subdir file2 true --no-relative --relative check_diff_relative_option . file2 false --no-relative --relative=subdir check_diff_relative_option . file2 true --no-relative --relative=subdir +test_expect_success 'setup diff --relative unmerged' ' + test_commit zero file0 && + test_commit base subdir/file0 && + git switch -c br1 && + test_commit one file0 && + test_commit sub1 subdir/file0 && + git switch -c br2 base && + test_commit two file0 && + git switch -c br3 && + test_commit sub3 subdir/file0 +' + +test_expect_success 'diff --relative without change in subdir' ' + git switch br2 && + test_when_finished "git merge --abort" && + test_must_fail git merge one && + git -C subdir diff --relative >out && + test_must_be_empty out && + git -C subdir diff --relative --name-only >out && + test_must_be_empty out +' + +test_expect_success 'diff --relative --name-only with change in subdir' ' + git switch br3 && + test_when_finished "git merge --abort" && + test_must_fail git merge sub1 && + test_write_lines file0 file0 >expected && + git -C subdir diff --relative --name-only >out && + test_cmp expected out +' + +test_expect_failure 'diff --relative with change in subdir' ' + git switch br3 && + br1_blob=$(git rev-parse --short --verify br1:subdir/file0) && + br3_blob=$(git rev-parse --short --verify br3:subdir/file0) && + test_when_finished "git merge --abort" && + test_must_fail git merge br1 && + cat >expected <<-EOF && + diff --cc file0 + index $br3_blob,$br1_blob..0000000 + --- a/file0 + +++ b/file0 + @@@ -1,1 -1,1 +1,5 @@@ + ++<<<<<<< HEAD + +sub3 + ++======= + + sub1 + ++>>>>>>> sub1 + EOF + git -C subdir diff --relative >out && + test_cmp expected out +' + test_done
For diff family commands, we can tell them to exclude changes outside of some directories if --relative is requested. In diff_unmerge(), NULL will be returned if the requested path is outside of the interesting directories, thus we'll run into NULL pointer dereference in run_diff_files when trying to dereference its return value. We can simply check for NULL there before dereferencing said return value. However, we can do better by not running diff on those unintesting entries. Let's do that instead. Reported-by: Thomas De Zeeuw <thomas@slight.dev> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> --- Cc: Junio C Hamano <gitster@pobox.com> Notes: Check for return value of diff_unmerge is not enough. Yes, it works with --name-only, however, with only --relative, git-diff shows unmerged entries outside of subdir, too. Furthermore, the filename in "diff --cc" ignores the relative prefix. Fixing this requires touching all over places, at least from my study. Let's fix the crash, first. We have two choices here: * Check pair, aka return value of diff_unmerge, like my original suggestion, and the unmerged entries from outside will be shown, too. Some inconsistent will be observed, --name-only won't list files outside of subdir, while the patch shows them. At least, it doesn't create false impression of no change outside of subdir. * Skip all outsiders, like this patch. While I prefer this approach, I don't know all ramifications of this change, let's say an entry moved to outside of subdir in one side, and modified in other side. Because, I pick the different approach, Junio's ack isn't included here. diff-lib.c | 4 +++ t/t4045-diff-relative.sh | 53 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 57 insertions(+)