diff mbox series

[v3,5/5] diff: the -w option breaks --exit-code for --raw and other output modes

Message ID 20230818235932.3253552-6-gitster@pobox.com (mailing list archive)
State Accepted
Commit a64f8b25953b9cf0f6d81ed24e87448ae513b094
Headers show
Series fix interactions with "-w" and "--exit-code" | expand

Commit Message

Junio C Hamano Aug. 18, 2023, 11:59 p.m. UTC
The output from "--raw", "--name-status", and "--name-only" modes in
"git diff" does depend on and does not reflect how certain different
contents are considered equal, unlike "--patch" and "--stat" output
modes do, when used with options like "-w" (another way of thinking
about it is that it is not like we recompute the hash of the blob
after removing all whitespaces to show "git diff --raw -w" output).

But that is not an excuse for "git diff --exit-code --raw" to fail
to report differences with its exit status, when used with options
like "-w".  Make sure the command exits with 1 when these options
report paths that are different.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c                     | 6 ++++++
 t/t4015-diff-whitespace.sh | 8 ++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)

Comments

Jeff King Aug. 21, 2023, 9 p.m. UTC | #1
On Fri, Aug 18, 2023 at 04:59:32PM -0700, Junio C Hamano wrote:

> The output from "--raw", "--name-status", and "--name-only" modes in
> "git diff" does depend on and does not reflect how certain different
> contents are considered equal, unlike "--patch" and "--stat" output
> modes do, when used with options like "-w" (another way of thinking
> about it is that it is not like we recompute the hash of the blob
> after removing all whitespaces to show "git diff --raw -w" output).
> 
> But that is not an excuse for "git diff --exit-code --raw" to fail
> to report differences with its exit status, when used with options
> like "-w".  Make sure the command exits with 1 when these options
> report paths that are different.

I think s/with options like/without options like/ in the final
paragraph?

I like this approach much better than the earlier round. It brings the
output and exit code of "--raw", etc, into harmony. I am open to the
argument that if you ask for "--raw -w", we should start doing
content-level comparisons (since otherwise the "-w" is somewhat
useless). But even if we went that route, it should affect both the
output and the exit code. So this patch would be a prerequisite anyway.

And I say "I am open to" and not "I think it is a good idea" above,
because I suspect there are many lurking corner cases. The hash output
you mentioned is one. But also, what "--raw --patch -w" does right now
is sensible (show the raw output, but also show the whitespace-ignoring
patch), and would be impossible if "-w" affected "--raw".

> diff --git a/diff.c b/diff.c
> index da965ff688..78f4e7518f 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4744,6 +4744,10 @@ void diff_setup_done(struct diff_options *options)
>  	else
>  		options->prefix_length = 0;
>  
> +	/*
> +	 * --name-only, --name-status, --checkdiff, and -s
> +	 * turn other output format off.
> +	 */
>  	if (options->output_format & (DIFF_FORMAT_NAME |
>  				      DIFF_FORMAT_NAME_STATUS |
>  				      DIFF_FORMAT_CHECKDIFF |

OK, this hunk is just documenting the current state, not changing
anything.

> @@ -6072,6 +6076,8 @@ static void flush_one_pair(struct diff_filepair *p, struct diff_options *opt)
>  		fprintf(opt->file, "%s", diff_line_prefix(opt));
>  		write_name_quoted(name_a, opt->file, opt->line_termination);
>  	}
> +
> +	opt->found_changes = 1;
>  }

This seems deceptively simple. :) The key here is that flush_one_pair()
is called only for the formats that are otherwise broken (because they
are not looking at the content). I think this is a good spot to put the
fix, as any similar formats would hopefully get added to the conditional
in diff_flush() that puts us here.

Alternatively, we could put it in the caller, like so:

diff --git a/diff.c b/diff.c
index 78f4e7518f..e7281e75eb 100644
--- a/diff.c
+++ b/diff.c
@@ -6528,6 +6528,7 @@ void diff_flush(struct diff_options *options)
 			if (check_pair_status(p))
 				flush_one_pair(p, options);
 		}
+		options->found_changes = !!q->nr;
 		separator++;
 	}
 

which matches the diffstat technique (I almost thought we could share
the code, but for the diffstat we are counting what ends up in the
diffstat struct; it does not clean out the original diff_queue when it
sees a noop pair).

I don't see a real reason to prefer one over the other.

> -for opts in --patch --quiet -s --stat --shortstat --dirstat=lines
> +for opt_res in --patch --quiet -s --stat --shortstat --dirstat=lines \
> +	       --raw! --name-only! --name-status!
>  do
> +	opts=${opt_res%!} expect_failure=
> +	test "$opts" = "$opt_res" ||
> +		expect_failure="test_expect_code 1"

Cute "!" convention. :)

-Peff
Jeff King Aug. 21, 2023, 9:08 p.m. UTC | #2
On Mon, Aug 21, 2023 at 05:00:58PM -0400, Jeff King wrote:

> Alternatively, we could put it in the caller, like so:
> 
> diff --git a/diff.c b/diff.c
> index 78f4e7518f..e7281e75eb 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -6528,6 +6528,7 @@ void diff_flush(struct diff_options *options)
>  			if (check_pair_status(p))
>  				flush_one_pair(p, options);
>  		}
> +		options->found_changes = !!q->nr;
>  		separator++;
>  	}
>  
> 
> which matches the diffstat technique (I almost thought we could share
> the code, but for the diffstat we are counting what ends up in the
> diffstat struct; it does not clean out the original diff_queue when it
> sees a noop pair).
> 
> I don't see a real reason to prefer one over the other.

Actually, on second look these are not quite the same. Yours only
triggers if check_pair_status() is true. So something like
--diff-filter should affect both output and exit code. Yours gets that
right, and mine does not. Sorry for the noise. :)

-Peff
Junio C Hamano Aug. 21, 2023, 10:16 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> Alternatively, we could put it in the caller, like so:
>
> diff --git a/diff.c b/diff.c
> index 78f4e7518f..e7281e75eb 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -6528,6 +6528,7 @@ void diff_flush(struct diff_options *options)
>  			if (check_pair_status(p))
>  				flush_one_pair(p, options);
>  		}
> +		options->found_changes = !!q->nr;
>  		separator++;
>  	}

Yup, I suspect they amount to the same thing in practice, but I
couldn't come up with a good explanation to give casual readers of
the conditional call to flush_one_pair() a few lines above why
this is correct.
Jeff King Aug. 21, 2023, 10:23 p.m. UTC | #4
On Mon, Aug 21, 2023 at 05:08:05PM -0400, Jeff King wrote:

> On Mon, Aug 21, 2023 at 05:00:58PM -0400, Jeff King wrote:
> 
> > Alternatively, we could put it in the caller, like so:
> > 
> > diff --git a/diff.c b/diff.c
> > index 78f4e7518f..e7281e75eb 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -6528,6 +6528,7 @@ void diff_flush(struct diff_options *options)
> >  			if (check_pair_status(p))
> >  				flush_one_pair(p, options);
> >  		}
> > +		options->found_changes = !!q->nr;
> >  		separator++;
> >  	}
> >  
> > 
> > which matches the diffstat technique (I almost thought we could share
> > the code, but for the diffstat we are counting what ends up in the
> > diffstat struct; it does not clean out the original diff_queue when it
> > sees a noop pair).
> > 
> > I don't see a real reason to prefer one over the other.
> 
> Actually, on second look these are not quite the same. Yours only
> triggers if check_pair_status() is true. So something like
> --diff-filter should affect both output and exit code. Yours gets that
> right, and mine does not. Sorry for the noise. :)

Sorry, I'm dumb again. That is not where diff-filter is handled (it is
handled by diffcore, duh). check_pair_status() is only checking for
DIFF_STATUS_UNKNOWN. I'm not sure when that would ever be set, but it
seems like we should be matching the "if it is output" logic, which is
what you get by calling flush_one_pair(). So yours is definitely
preferable, even if I don't understand the possible differences. ;)

-Peff
Junio C Hamano Aug. 21, 2023, 10:26 p.m. UTC | #5
Jeff King <peff@peff.net> writes:

>> ...
>> about it is that it is not like we recompute the hash of the blob
>> after removing all whitespaces to show "git diff --raw -w" output).
>> 
>> But that is not an excuse for "git diff --exit-code --raw" to fail
>> to report differences with its exit status, when used with options
>> like "-w".  Make sure the command exits with 1 when these options
>> report paths that are different.
>
> I think s/with options like/without options like/ in the final
> paragraph?

Sorry, I am confused.  "diff --raw --exit-code" without "-w" reports
with its exit status that it found differences just fine.  When "-w"
is given, without this patch, it always reports 0.

What I wanted to convey was ...

    "--raw" and friends deliberately ignore "-w" and other "look at
    contents" options.

    The fact they do ignore the "look at contents" options is not a
    good excuse for "diff --raw -w --exit-code" to also ignore the
    request to report the differences with its exit status.  "diff
    --raw --exit-code" does report the differences as requested, and
    we should do the same when given "-w".

I guess the two have no logical connection so the latter sentence is
not making sense, with its "with" kept as-is or changed to "without".
Jeff King Aug. 22, 2023, 1:30 a.m. UTC | #6
On Mon, Aug 21, 2023 at 03:26:37PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> ...
> >> about it is that it is not like we recompute the hash of the blob
> >> after removing all whitespaces to show "git diff --raw -w" output).
> >> 
> >> But that is not an excuse for "git diff --exit-code --raw" to fail
> >> to report differences with its exit status, when used with options
> >> like "-w".  Make sure the command exits with 1 when these options
> >> report paths that are different.
> >
> > I think s/with options like/without options like/ in the final
> > paragraph?
> 
> Sorry, I am confused.  "diff --raw --exit-code" without "-w" reports
> with its exit status that it found differences just fine.  When "-w"
> is given, without this patch, it always reports 0.

I think I just got confused by the multiple negatives "not an excuse to
fail to...".

> What I wanted to convey was ...
> 
>     "--raw" and friends deliberately ignore "-w" and other "look at
>     contents" options.
> 
>     The fact they do ignore the "look at contents" options is not a
>     good excuse for "diff --raw -w --exit-code" to also ignore the
>     request to report the differences with its exit status.  "diff
>     --raw --exit-code" does report the differences as requested, and
>     we should do the same when given "-w".

Yes, that conveys it more clearly (to me, anyway).

-Peff
diff mbox series

Patch

diff --git a/diff.c b/diff.c
index da965ff688..78f4e7518f 100644
--- a/diff.c
+++ b/diff.c
@@ -4744,6 +4744,10 @@  void diff_setup_done(struct diff_options *options)
 	else
 		options->prefix_length = 0;
 
+	/*
+	 * --name-only, --name-status, --checkdiff, and -s
+	 * turn other output format off.
+	 */
 	if (options->output_format & (DIFF_FORMAT_NAME |
 				      DIFF_FORMAT_NAME_STATUS |
 				      DIFF_FORMAT_CHECKDIFF |
@@ -6072,6 +6076,8 @@  static void flush_one_pair(struct diff_filepair *p, struct diff_options *opt)
 		fprintf(opt->file, "%s", diff_line_prefix(opt));
 		write_name_quoted(name_a, opt->file, opt->line_termination);
 	}
+
+	opt->found_changes = 1;
 }
 
 static void show_file_mode_name(struct diff_options *opt, const char *newdelete, struct diff_filespec *fs)
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 230a89b951..7fcffa4b11 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -11,8 +11,12 @@  TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-diff.sh
 
-for opts in --patch --quiet -s --stat --shortstat --dirstat=lines
+for opt_res in --patch --quiet -s --stat --shortstat --dirstat=lines \
+	       --raw! --name-only! --name-status!
 do
+	opts=${opt_res%!} expect_failure=
+	test "$opts" = "$opt_res" ||
+		expect_failure="test_expect_code 1"
 
 	test_expect_success "status with $opts (different)" '
 		echo foo >x &&
@@ -40,7 +44,7 @@  do
 		echo foo >x &&
 		git add x &&
 		echo " foo" >x &&
-		git diff -w $opts --exit-code x
+		$expect_failure git diff -w $opts --exit-code x
 	'
 done