mbox series

[v3,0/5] built-in add -p: support diff-so-fancy better

Message ID pull.1336.v3.git.1661785916.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series built-in add -p: support diff-so-fancy better | expand

Message

Philippe Blain via GitGitGadget Aug. 29, 2022, 3:11 p.m. UTC
Philippe Blain reported in
https://lore.kernel.org/git/ecf6f5be-22ca-299f-a8f1-bda38e5ca246@gmail.com
that there is a problem when running the built-in version of git add -p with
diff-so-fancy [https://github.com/so-fancy/diff-so-fancy] as diff colorizer.
The symptom is this:

    error: could not parse colored hunk header '?[36m?[1m?[38;5;13m@ file:1 @?[1m?[0m'


This patch series addresses that and should fix
https://github.com/so-fancy/diff-so-fancy/issues/437

Changes since v2:

 * Added the appropriate "Reported-by" trailer to the commit message.
 * Split out the logic to insert a space between the colored line range and
   the extra information, if needed.
 * That logic was now corrected to see whether that space is really needed.
 * To verify that the logic does what we need it to do, the added regression
   test now specifically tests for that (single) extra space that we want to
   be inserted.
 * Reworded a stale comment that claimed that we might suppress the entire
   colored hunk header (which we no longer do).
 * Rebased to the current tip of the main branch to avoid a merge conflict
   with 716c1f649e3 (pipe_command(): mark stdin descriptor as non-blocking,
   2022-08-17).

Changes since v1:

 * Added a commit to ignore dirty submodules just like the Perl version
   does.

Johannes Schindelin (5):
  t3701: redefine what is "bogus" output of a diff filter
  add -p: gracefully ignore unparseable hunk headers in colored diffs
  add -p: insert space in colored hunk header as needed
  add -p: handle `diff-so-fancy`'s hunk headers better
  add -p: ignore dirty submodules

 add-patch.c                | 49 ++++++++++++++++++++++++++++++--------
 t/t3701-add-interactive.sh | 35 ++++++++++++++++++++++++++-
 2 files changed, 73 insertions(+), 11 deletions(-)


base-commit: 07ee72db0e97b5c233f8ada0abb412248c2f1c6f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1336%2Fdscho%2Fdiff-so-fancy-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1336/dscho/diff-so-fancy-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1336

Range-diff vs v2:

 1:  74ab50eeb1c = 1:  a01fa5d25e4 t3701: redefine what is "bogus" output of a diff filter
 2:  b07f85a0359 ! 2:  cbe833bd141 add -p: gracefully ignore unparseable hunk headers in colored diffs
     @@ Commit message
      
          [diff-so-fancy]: https://github.com/so-fancy/diff-so-fancy
      
     +    Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## add-patch.c ##
     @@ t/t3701-add-interactive.sh: test_expect_success 'detect bogus diffFilter output'
       	force_color test_must_fail git add -p <y
       '
       
     -+test_expect_success 'gracefully fail to parse colored hunk header' '
     ++test_expect_success 'handle iffy colored hunk headers' '
      +	git reset --hard &&
      +
      +	echo content >test &&
     -+	test_config interactive.diffFilter "sed s/@@/XX/g" &&
     -+	printf y >y &&
     -+	force_color git add -p <y
     ++	printf n >n &&
     ++	force_color git -c interactive.diffFilter="sed s/@@/XX/g" \
     ++		add -p <n
      +'
      +
     - test_expect_success 'diff.algorithm is passed to `git diff-files`' '
     + test_expect_success 'handle very large filtered diff' '
       	git reset --hard &&
     - 
     + 	# The specific number here is not important, but it must
 3:  9dac9f74d2e ! 3:  7a9f0b107e6 add -p: handle `diff-so-fancy`'s hunk headers better
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    add -p: handle `diff-so-fancy`'s hunk headers better
     +    add -p: insert space in colored hunk header as needed
      
     -    The `diff-so-fancy` diff colorizer produces hunk headers that look
     -    nothing like the built-in `add -p` expects: there is no `@@ ... @@` line
     -    range, and therefore the parser cannot determine where any extra
     -    information starts, such as the function name that is often added to
     -    those hunk header lines.
     +    We are about to teach `git add -p` to show the entire hunk header if the
     +    `@@ ... @@` line range cannot be parsed. Previously, we showed only the
     +    remainder of that hunk header as an "colored_extra" part.
      
     -    However, we can do better than simply swallowing the unparseable hunk
     -    header. In the `diff-so-fancy` case, it shows something like `@ file:1
     -    @`. Let's just show the complete hunk header because it probably offers
     -    useful information.
     +    To prepare for that, detect if that "colored_extra" part starts with any
     +    non-whitespace character (ignoring ANSI escape sequences) and insert a
     +    space, to make the output much more pleasant.
     +
     +    Note that this has an effect already before we make `git add -p` more
     +    lenient when parsing the hunk headers: diff filters could already remove
     +    the space after the line range, which is precisely what we do in the
     +    regression test introduced by this commit.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
     @@ add-patch.c
       
       enum prompt_mode_type {
       	PROMPT_MODE_CHANGE = 0, PROMPT_DELETION, PROMPT_ADDITION, PROMPT_HUNK,
     -@@ add-patch.c: static int parse_hunk_header(struct add_p_state *s, struct hunk *hunk)
     - 		header->colored_extra_start = p + 3 - s->colored.buf;
     - 	else
     - 		/* could not parse colored hunk header, showing nothing */
     --		header->colored_extra_start = hunk->colored_start;
     -+		header->colored_extra_start = line - s->colored.buf;
     - 	header->colored_extra_end = hunk->colored_start;
     +@@ add-patch.c: static size_t find_next_line(struct strbuf *sb, size_t offset)
     + 	return eol - sb->buf + 1;
     + }
       
     - 	return 0;
     ++static int starts_with_non_ws(const char *p, size_t len)
     ++{
     ++	for (;;) {
     ++		size_t skip;
     ++
     ++		if (!len || isspace(*p))
     ++			return 0;
     ++		skip = display_mode_esc_sequence_len(p);
     ++		if (!skip)
     ++			return 1;
     ++		if (skip > len)
     ++			return 0;
     ++		p += skip;
     ++		len -= skip;
     ++	}
     ++}
     ++
     + static void render_hunk(struct add_p_state *s, struct hunk *hunk,
     + 			ssize_t delta, int colored, struct strbuf *out)
     + {
      @@ add-patch.c: static void render_hunk(struct add_p_state *s, struct hunk *hunk,
       		size_t len;
       		unsigned long old_offset = header->old_offset;
     @@ add-patch.c: static void render_hunk(struct add_p_state *s, struct hunk *hunk,
       			p = s->colored.buf + header->colored_extra_start;
       			len = header->colored_extra_end
       				- header->colored_extra_start;
     -+			if (utf8_strnwidth(p, len, 1 /* skip ANSI */) > 0)
     -+				needs_extra_space = 1;
     ++			needs_extra_space = starts_with_non_ws(p, len);
       		}
       
       		if (s->mode->is_reverse)
     @@ add-patch.c: static void render_hunk(struct add_p_state *s, struct hunk *hunk,
       		else if (colored)
      
       ## t/t3701-add-interactive.sh ##
     -@@ t/t3701-add-interactive.sh: test_expect_success 'gracefully fail to parse colored hunk header' '
     +@@ t/t3701-add-interactive.sh: test_expect_success 'handle iffy colored hunk headers' '
       	echo content >test &&
     - 	test_config interactive.diffFilter "sed s/@@/XX/g" &&
     - 	printf y >y &&
     --	force_color git add -p <y
     -+	force_color git add -p >output 2>&1 <y &&
     -+	grep XX output
     + 	printf n >n &&
     + 	force_color git -c interactive.diffFilter="sed s/@@/XX/g" \
     +-		add -p <n
     ++		add -p <n &&
     ++	force_color git -c interactive.diffFilter="sed \"s/\(.*@@\).*/\1FN/\"" \
     ++		add -p >output 2>&1 <n &&
     ++	if test_have_prereq ADD_I_USE_BUILTIN
     ++	then
     ++		grep "@ FN\$" output
     ++	else
     ++		grep "@FN\$" output
     ++	fi
       '
       
     - test_expect_success 'diff.algorithm is passed to `git diff-files`' '
     + test_expect_success 'handle very large filtered diff' '
 -:  ----------- > 4:  e3e3a178f98 add -p: handle `diff-so-fancy`'s hunk headers better
 4:  540ce27c38a = 5:  cfa6914aee0 add -p: ignore dirty submodules

Comments

Junio C Hamano Aug. 29, 2022, 6:01 p.m. UTC | #1
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

>  * That logic was now corrected to see whether that space is really needed.
>  * To verify that the logic does what we need it to do, the added regression
>    test now specifically tests for that (single) extra space that we want to
>    be inserted.


>  * Rebased to the current tip of the main branch to avoid a merge conflict
>    with 716c1f649e3 (pipe_command(): mark stdin descriptor as non-blocking,
>    2022-08-17).

I'd prefer you (or any other contributor) not to pick a more recent
base only to work around a merge conflict, especially when it is
something simple (like this case, where both sides added a new test
piece each in a way that semantically do not conflict).  I'll ask
the author to rebase if the conflicts becomes unmanageable.

We promoted the built-in one as the primary implementation in
2.37.0; I think we want to keep this fix mergeable down to the
2.37.x maintenance track (either by me, or by distro LTS folks who
are motivated enough).

This time, as you can guess by my above description on the exact way
how conflict happens, I've rebased the series back for 2.37.x and
will resolve the conflict myself (and make sure the result matches
the application of these patches directly on top of 'master').

It appears that this round is good to go down to 'next' soonish, but
let's see what others find.

Thanks.
Phillip Wood Aug. 30, 2022, 1:29 p.m. UTC | #2
Hi Dscho

Thanks for working on this, I've left a few comments on the patches - 
I'm not convinced about patch 1 or about showing the hunk offsets with 
interactive.diffFilter in patch 4 but the series looks basically sound.

Best Wishes

Phillip

On 29/08/2022 16:11, Johannes Schindelin via GitGitGadget wrote:
> Philippe Blain reported in
> https://lore.kernel.org/git/ecf6f5be-22ca-299f-a8f1-bda38e5ca246@gmail.com
> that there is a problem when running the built-in version of git add -p with
> diff-so-fancy [https://github.com/so-fancy/diff-so-fancy] as diff colorizer.
> The symptom is this:
> 
>      error: could not parse colored hunk header '?[36m?[1m?[38;5;13m@ file:1 @?[1m?[0m'
> 
> 
> This patch series addresses that and should fix
> https://github.com/so-fancy/diff-so-fancy/issues/437
> 
> Changes since v2:
> 
>   * Added the appropriate "Reported-by" trailer to the commit message.
>   * Split out the logic to insert a space between the colored line range and
>     the extra information, if needed.
>   * That logic was now corrected to see whether that space is really needed.
>   * To verify that the logic does what we need it to do, the added regression
>     test now specifically tests for that (single) extra space that we want to
>     be inserted.
>   * Reworded a stale comment that claimed that we might suppress the entire
>     colored hunk header (which we no longer do).
>   * Rebased to the current tip of the main branch to avoid a merge conflict
>     with 716c1f649e3 (pipe_command(): mark stdin descriptor as non-blocking,
>     2022-08-17).
> 
> Changes since v1:
> 
>   * Added a commit to ignore dirty submodules just like the Perl version
>     does.
> 
> Johannes Schindelin (5):
>    t3701: redefine what is "bogus" output of a diff filter
>    add -p: gracefully ignore unparseable hunk headers in colored diffs
>    add -p: insert space in colored hunk header as needed
>    add -p: handle `diff-so-fancy`'s hunk headers better
>    add -p: ignore dirty submodules
> 
>   add-patch.c                | 49 ++++++++++++++++++++++++++++++--------
>   t/t3701-add-interactive.sh | 35 ++++++++++++++++++++++++++-
>   2 files changed, 73 insertions(+), 11 deletions(-)
> 
> 
> base-commit: 07ee72db0e97b5c233f8ada0abb412248c2f1c6f
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1336%2Fdscho%2Fdiff-so-fancy-v3
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1336/dscho/diff-so-fancy-v3
> Pull-Request: https://github.com/gitgitgadget/git/pull/1336
> 
> Range-diff vs v2:
> 
>   1:  74ab50eeb1c = 1:  a01fa5d25e4 t3701: redefine what is "bogus" output of a diff filter
>   2:  b07f85a0359 ! 2:  cbe833bd141 add -p: gracefully ignore unparseable hunk headers in colored diffs
>       @@ Commit message
>        
>            [diff-so-fancy]: https://github.com/so-fancy/diff-so-fancy
>        
>       +    Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
>            Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>        
>         ## add-patch.c ##
>       @@ t/t3701-add-interactive.sh: test_expect_success 'detect bogus diffFilter output'
>         	force_color test_must_fail git add -p <y
>         '
>         
>       -+test_expect_success 'gracefully fail to parse colored hunk header' '
>       ++test_expect_success 'handle iffy colored hunk headers' '
>        +	git reset --hard &&
>        +
>        +	echo content >test &&
>       -+	test_config interactive.diffFilter "sed s/@@/XX/g" &&
>       -+	printf y >y &&
>       -+	force_color git add -p <y
>       ++	printf n >n &&
>       ++	force_color git -c interactive.diffFilter="sed s/@@/XX/g" \
>       ++		add -p <n
>        +'
>        +
>       - test_expect_success 'diff.algorithm is passed to `git diff-files`' '
>       + test_expect_success 'handle very large filtered diff' '
>         	git reset --hard &&
>       -
>       + 	# The specific number here is not important, but it must
>   3:  9dac9f74d2e ! 3:  7a9f0b107e6 add -p: handle `diff-so-fancy`'s hunk headers better
>       @@ Metadata
>        Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
>        
>         ## Commit message ##
>       -    add -p: handle `diff-so-fancy`'s hunk headers better
>       +    add -p: insert space in colored hunk header as needed
>        
>       -    The `diff-so-fancy` diff colorizer produces hunk headers that look
>       -    nothing like the built-in `add -p` expects: there is no `@@ ... @@` line
>       -    range, and therefore the parser cannot determine where any extra
>       -    information starts, such as the function name that is often added to
>       -    those hunk header lines.
>       +    We are about to teach `git add -p` to show the entire hunk header if the
>       +    `@@ ... @@` line range cannot be parsed. Previously, we showed only the
>       +    remainder of that hunk header as an "colored_extra" part.
>        
>       -    However, we can do better than simply swallowing the unparseable hunk
>       -    header. In the `diff-so-fancy` case, it shows something like `@ file:1
>       -    @`. Let's just show the complete hunk header because it probably offers
>       -    useful information.
>       +    To prepare for that, detect if that "colored_extra" part starts with any
>       +    non-whitespace character (ignoring ANSI escape sequences) and insert a
>       +    space, to make the output much more pleasant.
>       +
>       +    Note that this has an effect already before we make `git add -p` more
>       +    lenient when parsing the hunk headers: diff filters could already remove
>       +    the space after the line range, which is precisely what we do in the
>       +    regression test introduced by this commit.
>        
>            Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>        
>       @@ add-patch.c
>         
>         enum prompt_mode_type {
>         	PROMPT_MODE_CHANGE = 0, PROMPT_DELETION, PROMPT_ADDITION, PROMPT_HUNK,
>       -@@ add-patch.c: static int parse_hunk_header(struct add_p_state *s, struct hunk *hunk)
>       - 		header->colored_extra_start = p + 3 - s->colored.buf;
>       - 	else
>       - 		/* could not parse colored hunk header, showing nothing */
>       --		header->colored_extra_start = hunk->colored_start;
>       -+		header->colored_extra_start = line - s->colored.buf;
>       - 	header->colored_extra_end = hunk->colored_start;
>       +@@ add-patch.c: static size_t find_next_line(struct strbuf *sb, size_t offset)
>       + 	return eol - sb->buf + 1;
>       + }
>         
>       - 	return 0;
>       ++static int starts_with_non_ws(const char *p, size_t len)
>       ++{
>       ++	for (;;) {
>       ++		size_t skip;
>       ++
>       ++		if (!len || isspace(*p))
>       ++			return 0;
>       ++		skip = display_mode_esc_sequence_len(p);
>       ++		if (!skip)
>       ++			return 1;
>       ++		if (skip > len)
>       ++			return 0;
>       ++		p += skip;
>       ++		len -= skip;
>       ++	}
>       ++}
>       ++
>       + static void render_hunk(struct add_p_state *s, struct hunk *hunk,
>       + 			ssize_t delta, int colored, struct strbuf *out)
>       + {
>        @@ add-patch.c: static void render_hunk(struct add_p_state *s, struct hunk *hunk,
>         		size_t len;
>         		unsigned long old_offset = header->old_offset;
>       @@ add-patch.c: static void render_hunk(struct add_p_state *s, struct hunk *hunk,
>         			p = s->colored.buf + header->colored_extra_start;
>         			len = header->colored_extra_end
>         				- header->colored_extra_start;
>       -+			if (utf8_strnwidth(p, len, 1 /* skip ANSI */) > 0)
>       -+				needs_extra_space = 1;
>       ++			needs_extra_space = starts_with_non_ws(p, len);
>         		}
>         
>         		if (s->mode->is_reverse)
>       @@ add-patch.c: static void render_hunk(struct add_p_state *s, struct hunk *hunk,
>         		else if (colored)
>        
>         ## t/t3701-add-interactive.sh ##
>       -@@ t/t3701-add-interactive.sh: test_expect_success 'gracefully fail to parse colored hunk header' '
>       +@@ t/t3701-add-interactive.sh: test_expect_success 'handle iffy colored hunk headers' '
>         	echo content >test &&
>       - 	test_config interactive.diffFilter "sed s/@@/XX/g" &&
>       - 	printf y >y &&
>       --	force_color git add -p <y
>       -+	force_color git add -p >output 2>&1 <y &&
>       -+	grep XX output
>       + 	printf n >n &&
>       + 	force_color git -c interactive.diffFilter="sed s/@@/XX/g" \
>       +-		add -p <n
>       ++		add -p <n &&
>       ++	force_color git -c interactive.diffFilter="sed \"s/\(.*@@\).*/\1FN/\"" \
>       ++		add -p >output 2>&1 <n &&
>       ++	if test_have_prereq ADD_I_USE_BUILTIN
>       ++	then
>       ++		grep "@ FN\$" output
>       ++	else
>       ++		grep "@FN\$" output
>       ++	fi
>         '
>         
>       - test_expect_success 'diff.algorithm is passed to `git diff-files`' '
>       + test_expect_success 'handle very large filtered diff' '
>   -:  ----------- > 4:  e3e3a178f98 add -p: handle `diff-so-fancy`'s hunk headers better
>   4:  540ce27c38a = 5:  cfa6914aee0 add -p: ignore dirty submodules
>
Johannes Schindelin Aug. 30, 2022, 2:22 p.m. UTC | #3
Hi Junio,

On Mon, 29 Aug 2022, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> >  * That logic was now corrected to see whether that space is really needed.
> >  * To verify that the logic does what we need it to do, the added regression
> >    test now specifically tests for that (single) extra space that we want to
> >    be inserted.
>
>
> >  * Rebased to the current tip of the main branch to avoid a merge conflict
> >    with 716c1f649e3 (pipe_command(): mark stdin descriptor as non-blocking,
> >    2022-08-17).
>
> I'd prefer you (or any other contributor) not to pick a more recent
> base only to work around a merge conflict, [...]

Noted. I just wanted to be nice.

> We promoted the built-in one as the primary implementation in
> 2.37.0; I think we want to keep this fix mergeable down to the
> 2.37.x maintenance track (either by me, or by distro LTS folks who
> are motivated enough).
>
> This time, as you can guess by my above description on the exact way
> how conflict happens, I've rebased the series back for 2.37.x and
> will resolve the conflict myself (and make sure the result matches
> the application of these patches directly on top of 'master').
>
> It appears that this round is good to go down to 'next' soonish, but
> let's see what others find.

Sounds good.

Thanks,
Dscho
Johannes Schindelin Aug. 31, 2022, 8:44 p.m. UTC | #4
Hi Phillip,

On Tue, 30 Aug 2022, Phillip Wood wrote:

> I'm not convinced about patch 1 or about showing the hunk offsets with
> interactive.diffFilter in patch 4 but the series looks basically sound.

The fourth iteration of this patch series that I just sent out should
address both concerns.

Thank you!
Dscho