diff mbox series

[v3,1/5] t3701: redefine what is "bogus" output of a diff filter

Message ID a01fa5d25e4a94dd8ece5e328f853c000a2ad0f9.1661785916.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series built-in add -p: support diff-so-fancy better | expand

Commit Message

Johannes Schindelin Aug. 29, 2022, 3:11 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

When parsing the colored version of a diff, the interactive `add`
command really relies on the colored version having the same number of
lines as the non-colored version. That is an invariant.

However, in the 'detect bogus diffFilter output' test case in t3701, we
essentially required a hunk header that contains parseable `@@ ... @@`
hunk headers, and called all colored diffs without such hunks bogus.

The reason for this is that we would like to show the users the adjusted
hunk headers _including_ the extra part after the `@@ ... @@`
information, which usually contains things like the function name or
soms such.

Now, there is a _very_ popular diff colorizer called `diff-so-fancy`
that does not produce such colored diffs as the built-in `add` command
expects. Nevertheless, the Perl variant of the `add` command handles
those nicely, essentially by ignoring the hunk header and saying "there
is nothing else we can show except the original hunk header, even if we
had to adjust the line range and the original hunk header might get that
wrong".

In preparation for teaching the built-in interactive `add` to be a bit
more lenient, let's change the 'detect bogus diffFilter output' test
case so that it verifies that a mismatched number of lines causes the
command to error out, but not an unparseable hunk header.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t3701-add-interactive.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Phillip Wood Aug. 30, 2022, 1:17 p.m. UTC | #1
Hi Dscho

On 29/08/2022 16:11, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> When parsing the colored version of a diff, the interactive `add`
> command really relies on the colored version having the same number of
> lines as the non-colored version. That is an invariant.
> 
> However, in the 'detect bogus diffFilter output' test case in t3701, we
> essentially required a hunk header that contains parseable `@@ ... @@`
> hunk headers, and called all colored diffs without such hunks bogus.
> 
> The reason for this is that we would like to show the users the adjusted
> hunk headers _including_ the extra part after the `@@ ... @@`
> information, which usually contains things like the function name or
> soms such.
> 
> Now, there is a _very_ popular diff colorizer called `diff-so-fancy`
> that does not produce such colored diffs as the built-in `add` command
> expects. Nevertheless, the Perl variant of the `add` command handles
> those nicely, essentially by ignoring the hunk header and saying "there
> is nothing else we can show except the original hunk header, even if we
> had to adjust the line range and the original hunk header might get that
> wrong".
> 
> In preparation for teaching the built-in interactive `add` to be a bit
> more lenient, let's change the 'detect bogus diffFilter output' test
> case so that it verifies that a mismatched number of lines causes the
> command to error out, but not an unparseable hunk header.

The existing test deletes the first line of the diff[1] - so it is 
removing the "diff --git ..." header not the hunk header. This patch 
changes the filter to delete everything except the diff header which 
seems like a less realistic test.

Best Wishes

Phillip

[1] To verify this I changed the filter to  "sed 1d | tee 
filtered-diff". filtered diff-contains
index 0889435..d95f3ad 100644

--- a/test

+++ b/test

@@ -1,6 +1 @@

-10

-20

-30

-40

-50

-60

+content



> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>   t/t3701-add-interactive.sh | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> index 3b7df9bed5a..88d8133f38f 100755
> --- a/t/t3701-add-interactive.sh
> +++ b/t/t3701-add-interactive.sh
> @@ -761,7 +761,7 @@ test_expect_success 'detect bogus diffFilter output' '
>   	git reset --hard &&
>   
>   	echo content >test &&
> -	test_config interactive.diffFilter "sed 1d" &&
> +	test_config interactive.diffFilter "sed q" &&
>   	printf y >y &&
>   	force_color test_must_fail git add -p <y
>   '
Junio C Hamano Aug. 30, 2022, 9:36 p.m. UTC | #2
Phillip Wood <phillip.wood123@gmail.com> writes:

> The existing test deletes the first line of the diff[1] - so it is
> removing the "diff --git ..." header not the hunk header. This patch
> changes the filter to delete everything except the diff header which
> seems like a less realistic test.

Is it that all it cares about is that the output has the same number
of lines as the input?  If so, I agree that it is much less realistic,
but it may not matter in practice.  Even an "exit 0" might do ;-)

I may be quite off the mark on this one, though.
Phillip Wood Aug. 31, 2022, 9:26 a.m. UTC | #3
Hi Junio

On 30/08/2022 22:36, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
>> The existing test deletes the first line of the diff[1] - so it is
>> removing the "diff --git ..." header not the hunk header. This patch
>> changes the filter to delete everything except the diff header which
>> seems like a less realistic test.
> 
> Is it that all it cares about is that the output has the same number
> of lines as the input?  If so, I agree that it is much less realistic,
> but it may not matter in practice.  Even an "exit 0" might do ;-)

Yes I think it only cares that there are the same number of lines which 
begs the question "why are we changing this test in the first place?". 
The commit message talks about the being unable to parse the hunk header 
but that's only true because we would be looking in the wrong place for 
it - the output does in fact contain a valid hunk header. With this 
change there is no hunk header in the filtered output at all. In 
practice if the number of lines differs it is normally because the 
filter messed with the diff header and removed some lines from it which 
is what the existing test simulates.

I'm struggling to understand the need for this change from the 
explanation in the commit message as it seems to me  to assume the line 
being deleted is the hunk header when in fact it is deleting the diff 
header.

Best Wishes

Phillip

> I may be quite off the mark on this one, though.
>
Jeff King Aug. 31, 2022, 3:36 p.m. UTC | #4
On Wed, Aug 31, 2022 at 10:26:17AM +0100, Phillip Wood wrote:

> > Is it that all it cares about is that the output has the same number
> > of lines as the input?  If so, I agree that it is much less realistic,
> > but it may not matter in practice.  Even an "exit 0" might do ;-)
> 
> Yes I think it only cares that there are the same number of lines which begs
> the question "why are we changing this test in the first place?". The commit
> message talks about the being unable to parse the hunk header but that's
> only true because we would be looking in the wrong place for it - the output
> does in fact contain a valid hunk header. With this change there is no hunk
> header in the filtered output at all. In practice if the number of lines
> differs it is normally because the filter messed with the diff header and
> removed some lines from it which is what the existing test simulates.
> 
> I'm struggling to understand the need for this change from the explanation
> in the commit message as it seems to me  to assume the line being deleted is
> the hunk header when in fact it is deleting the diff header.

FWIW, as the author of the original test, I'm also confused about why it
needs to be changed. The filter I wrote in the original test was just
"echo too-short". It was switched to "sed 1d" because the original did
not read the input at all, which racily caused Git to see SIGPIPE:

  https://lore.kernel.org/git/20200113170417.GK32750@szeder.dev/

Switching to "exit 0" would bring that problem back. But I think "sed q"
potentially does, too, because sed will quit without reading all of the
input. We really do want something like "sed 1d" that changes the number
of lines, but ensures that the filter reads to EOF.

-Peff
Jeff King Aug. 31, 2022, 3:47 p.m. UTC | #5
On Wed, Aug 31, 2022 at 11:36:19AM -0400, Jeff King wrote:

> > I'm struggling to understand the need for this change from the explanation
> > in the commit message as it seems to me  to assume the line being deleted is
> > the hunk header when in fact it is deleting the diff header.
> 
> FWIW, as the author of the original test, I'm also confused about why it
> needs to be changed. The filter I wrote in the original test was just
> "echo too-short". It was switched to "sed 1d" because the original did
> not read the input at all, which racily caused Git to see SIGPIPE:
> 
>   https://lore.kernel.org/git/20200113170417.GK32750@szeder.dev/
> 
> Switching to "exit 0" would bring that problem back. But I think "sed q"
> potentially does, too, because sed will quit without reading all of the
> input. We really do want something like "sed 1d" that changes the number
> of lines, but ensures that the filter reads to EOF.

By the way, the test change is needed for it to pass with the change in
patch 2, where we become more lenient about parsing the hunk header.
That implies to me that the builtin version's check for one-to-one line
correspondence is broken, but we didn't notice.

In the perl version, that test fails because it triggers the
mismatched-output check:

  $ GIT_TEST_ADD_I_USE_BUILTIN=false ./t3701-add-interactive.sh --verbose-only=56
  [...]
  fatal: mismatched output from interactive.diffFilter
  hint: Your filter must maintain a one-to-one correspondence
  hint: between its input and output lines.
  ok 56 - detect bogus diffFilter output

but the builtin version complains about the hunk header:

  $ GIT_TEST_ADD_I_USE_BUILTIN=true ./t3701-add-interactive.sh --verbose-only=56
  [...]
  error: could not parse colored hunk header '?[31m-10?[m'
  ok 56 - detect bogus diffFilter output

Once patch 2 is applied, we stop complaining there, and we _should_
complain that the number of lines isn't the same. But we don't:

  $ GIT_TEST_ADD_I_USE_BUILTIN=true ./t3701-add-interactive.sh --verbose-only=56
  test_must_fail: command succeeded: git add -p
  not ok 56 - detect bogus diffFilter output

-Peff
Johannes Schindelin Aug. 31, 2022, 7:57 p.m. UTC | #6
Hi Peff,

On Wed, 31 Aug 2022, Jeff King wrote:

> On Wed, Aug 31, 2022 at 11:36:19AM -0400, Jeff King wrote:
>
> > > I'm struggling to understand the need for this change from the explanation
> > > in the commit message as it seems to me  to assume the line being deleted is
> > > the hunk header when in fact it is deleting the diff header.
> >
> > FWIW, as the author of the original test, I'm also confused about why it
> > needs to be changed. The filter I wrote in the original test was just
> > "echo too-short". It was switched to "sed 1d" because the original did
> > not read the input at all, which racily caused Git to see SIGPIPE:
> >
> >   https://lore.kernel.org/git/20200113170417.GK32750@szeder.dev/
> >
> > Switching to "exit 0" would bring that problem back. But I think "sed q"
> > potentially does, too, because sed will quit without reading all of the
> > input. We really do want something like "sed 1d" that changes the number
> > of lines, but ensures that the filter reads to EOF.
>
> By the way, the test change is needed for it to pass with the change in
> patch 2, where we become more lenient about parsing the hunk header.
> That implies to me that the builtin version's check for one-to-one line
> correspondence is broken, but we didn't notice.

And that's exactly the case. It was an off-by-one bug ;-)

> In the perl version, that test fails because it triggers the
> mismatched-output check:
>
>   $ GIT_TEST_ADD_I_USE_BUILTIN=false ./t3701-add-interactive.sh --verbose-only=56
>   [...]
>   fatal: mismatched output from interactive.diffFilter
>   hint: Your filter must maintain a one-to-one correspondence
>   hint: between its input and output lines.
>   ok 56 - detect bogus diffFilter output
>
> but the builtin version complains about the hunk header:
>
>   $ GIT_TEST_ADD_I_USE_BUILTIN=true ./t3701-add-interactive.sh --verbose-only=56
>   [...]
>   error: could not parse colored hunk header '?[31m-10?[m'
>   ok 56 - detect bogus diffFilter output
>
> Once patch 2 is applied, we stop complaining there, and we _should_
> complain that the number of lines isn't the same. But we don't:
>
>   $ GIT_TEST_ADD_I_USE_BUILTIN=true ./t3701-add-interactive.sh --verbose-only=56
>   test_must_fail: command succeeded: git add -p
>   not ok 56 - detect bogus diffFilter output

Yes, that matches my analysis, and I already pushed a new iteration before
even noticing your mail, only waiting for the CI run to finish before I
submit it.

Ciao,
Dscho
diff mbox series

Patch

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 3b7df9bed5a..88d8133f38f 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -761,7 +761,7 @@  test_expect_success 'detect bogus diffFilter output' '
 	git reset --hard &&
 
 	echo content >test &&
-	test_config interactive.diffFilter "sed 1d" &&
+	test_config interactive.diffFilter "sed q" &&
 	printf y >y &&
 	force_color test_must_fail git add -p <y
 '