[v4,1/1] mergetool: add automerge configuration

Message ID 20201218124905.1072514-2-felipe.contreras@gmail.com (mailing list archive)
State New, archived

Commit Message

Felipe Contreras Dec. 18, 2020, 12:49 p.m. UTC
It doesn't make sense to display lines without conflicts in the
different views of all mergetools.

Only the lines that warrant conflict markers should be displayed.

Most people would want this behavior on, but in case some don't; add a
new configuration: mergetool.autoMerge.

See Seth House's blog post [1] for the idea, and the rationale.

[1] https://www.eseth.org/2020/mergetools.html

Original-idea-by: Seth House <seth@eseth.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
---
 Documentation/config/mergetool.txt |  3 +++
 git-mergetool.sh                   | 17 +++++++++++++++++
 t/t7610-mergetool.sh               | 18 ++++++++++++++++++
 3 files changed, 38 insertions(+)

Comments

Phillip Wood Dec. 19, 2020, 11:14 a.m. UTC | #1
Hi Felipe

On 18/12/2020 12:49, Felipe Contreras wrote:
> It doesn't make sense to display lines without conflicts in the
> different views of all mergetools.
> 
> Only the lines that warrant conflict markers should be displayed.
> 
> Most people would want this behavior on, but in case some don't; add a
> new configuration: mergetool.autoMerge.
> 
> See Seth House's blog post [1] for the idea, and the rationale.
> 
> [1] https://www.eseth.org/2020/mergetools.html

It would be good to have a summary of the idea in this commit message so
people do not have to go and find a blog post which may well disappear
in the future.

> Original-idea-by: Seth House <seth@eseth.com>
> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> ---
>   Documentation/config/mergetool.txt |  3 +++
>   git-mergetool.sh                   | 17 +++++++++++++++++
>   t/t7610-mergetool.sh               | 18 ++++++++++++++++++
>   3 files changed, 38 insertions(+)
> 
> diff --git a/Documentation/config/mergetool.txt b/Documentation/config/mergetool.txt
> index 16a27443a3..7ce6d0d3ac 100644
> --- a/Documentation/config/mergetool.txt
> +++ b/Documentation/config/mergetool.txt
> @@ -61,3 +61,6 @@ mergetool.writeToTemp::
>   
>   mergetool.prompt::
>   	Prompt before each invocation of the merge resolution program.
> +
> +mergetool.autoMerge::
> +	Remove lines without conflicts from all the files. Defaults to `true`.
> diff --git a/git-mergetool.sh b/git-mergetool.sh
> index e3f6d543fb..f4db0cac8d 100755
> --- a/git-mergetool.sh
> +++ b/git-mergetool.sh
> @@ -239,6 +239,17 @@ checkout_staged_file () {
>   	fi
>   }
>   
> +auto_merge () {
> +	git merge-file --diff3 --marker-size=7 -q -p "$LOCAL" "$BASE" "$REMOTE" >"$DIFF3"

I've been wondering if we want to recreate the merge or just get the
merged BASE, LOCAL and REMOTE from the merged file in the working tree.
If the user wants to resolve the conflicts in stages, or opens the file
in an editor and fixes some conflicts and then realizes they want to use
a merge tool, that work is thrown away if we recreate the merge. They can
always use `checkout --merge` to throw away their changes and start
again with a mergetool. It would mean checking the size of the conflict
markers and using
'/^<{$conflict_marker_size}/,/^|{$conflict_marker_size}/' for sed.
Getting the merged BASE would be tricky if the user does not have diff3
conflicts enabled; I'm not sure if we can safely get BASE from `git
merge-file ...` and LOCAL and REMOTE from the working tree.
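
A rough shell sketch of the variable-size address idea above. It assumes
diff3-style markers followed by a space and a label, as in the patch;
`local_side` and its arguments are hypothetical names, and the marker size
is passed in rather than looked up. POSIX sed spells the repeat count as
`\{N\}`.

```shell
# Hypothetical helper: print FILE with each conflict reduced to its LOCAL side.
# SIZE is the conflict marker length; diff3-style markers are assumed.
local_side () {
	size=$1
	file=$2
	# Drop the BASE and REMOTE parts (||| ... >>>), then the opening <<< line.
	sed -e "/^|\{$size\} /,/^>\{$size\} /d" -e "/^<\{$size\} /d" "$file"
}

# Hand-written sample with 9-character markers (not real git output).
conflicted=$(mktemp)
cat >"$conflicted" <<'EOF'
top
<<<<<<<<< ours
mine
||||||||| base
orig
=========
theirs
>>>>>>>>> theirs
bottom
EOF

local_side 9 "$conflicted"
```

Called with the wrong size (for example 7 on this file), none of the
addresses match and the file is printed unchanged, which is why the marker
size would have to be checked first.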

> +	if test -s "$DIFF3"
> +	then
> +		sed -e '/^<<<<<<< /,/^||||||| /d' -e '/^=======\r\?$/,/^>>>>>>> /d' "$DIFF3" >"$BASE"
> +		sed -e '/^||||||| /,/^>>>>>>> /d' -e '/^<<<<<<< /d' "$DIFF3" >"$LOCAL"
> +		sed -e '/^<<<<<<< /,/^=======\r\?$/d' -e '/^>>>>>>> /d' "$DIFF3" >"$REMOTE"
> +	fi
> +	rm -- "$DIFF3"
> +}
> +
>   merge_file () {
>   	MERGED="$1"
>   
> @@ -274,6 +285,7 @@ merge_file () {
>   		BASE=${BASE##*/}
>   	fi
>   
> +	DIFF3="$MERGETOOL_TMPDIR/${BASE}_DIFF3_$$$ext"
>   	BACKUP="$MERGETOOL_TMPDIR/${BASE}_BACKUP_$$$ext"
>   	LOCAL="$MERGETOOL_TMPDIR/${BASE}_LOCAL_$$$ext"
>   	REMOTE="$MERGETOOL_TMPDIR/${BASE}_REMOTE_$$$ext"
> @@ -322,6 +334,11 @@ merge_file () {
>   	checkout_staged_file 2 "$MERGED" "$LOCAL"
>   	checkout_staged_file 3 "$MERGED" "$REMOTE"
>   
> +	if test "$(git config --bool mergetool.autoMerge)" != "false"

If I run `git config --bool mergetool.autoMerge` it returns an empty 
string so I think you need to test it is actually equal to "true".

I also share the view that this should be per tool. Your demand that 
someone comes up with an example that breaks assumes that we have access 
to all the tools that users are using. Seth has done a great job of 
surveying the popular tools but given the size of git's user-base and 
the diversity of uses it is very likely that there will be people using 
in-house or proprietary tools that no one on the list has access to. I 
would much prefer to avoid breaking them rather than waiting for a bug 
report before implementing a per-tool setting. It is quite possible 
people are using different tools for different files in the same way as 
they use different merge drivers for different files and want the 
setting disabled for a tool that does semantic merging but enabled 
for textual merges.

Best Wishes

Phillip

> +	then
> +		auto_merge
> +	fi
> +
>   	if test -z "$local_mode" || test -z "$remote_mode"
>   	then
>   		echo "Deleted merge conflict for '$MERGED':"
> diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh
> index 70afdd06fa..ccabd04823 100755
> --- a/t/t7610-mergetool.sh
> +++ b/t/t7610-mergetool.sh
> @@ -828,4 +828,22 @@ test_expect_success 'mergetool -Oorder-file is honored' '
>   	test_cmp expect actual
>   '
>   
> +test_expect_success 'mergetool automerge' '
> +	test_config mergetool.automerge true &&
> +	test_when_finished "git reset --hard" &&
> +	git checkout -b test${test_count}_b master &&
> +	test_write_lines >file1 base "" a &&
> +	git commit -a -m "base" &&
> +	test_write_lines >file1 base "" c &&
> +	git commit -a -m "remote update" &&
> +	git checkout -b test${test_count}_a HEAD~ &&
> +	test_write_lines >file1 local "" b &&
> +	git commit -a -m "local update" &&
> +	test_must_fail git merge test${test_count}_b &&
> +	yes "" | git mergetool file1 &&
> +	test_write_lines >expect local "" c &&
> +	test_cmp expect file1 &&
> +	git commit -m "test resolved with mergetool"
> +'
> +
>   test_done
>
Felipe Contreras Dec. 19, 2020, 12:53 p.m. UTC | #2
Phillip Wood wrote:
> Hi Felipe
> 
> On 18/12/2020 12:49, Felipe Contreras wrote:
> > It doesn't make sense to display lines without conflicts in the
> > different views of all mergetools.
> > 
> > Only the lines that warrant conflict markers should be displayed.
> > 
> > Most people would want this behavior on, but in case some don't; add a
> > new configuration: mergetool.autoMerge.
> > 
> > See Seth House's blog post [1] for the idea, and the rationale.
> > 
> > [1] https://www.eseth.org/2020/mergetools.html
> 
> It would be good to have a summary of the idea in this commit message so
> people do not have to go and find a blog post which may well disappear 
> in the future

I thought I did, in the paragraphs above. How about adding this
explanation:

When merging, not all lines with changes are considered conflicts, for
example:

  cat >BASE <<EOF
  Patagraph 1

  Paragraph 2
  EOF

  cat >LOCAL <<EOF
  Paragraph 1

  Paragraph 2
  EOF

  cat >REMOTE <<EOF
  Patagraph 1.

  Paragraph 2.
  EOF

In this case the first paragraph does have a conflict because there are
two changes (in LOCAL and REMOTE), that the user must resolve.

However, the second paragraph doesn't have a conflict; it's
straightforward to decide that we want the only change present (in
REMOTE).

In fact, if it were not for the first paragraph with a conflict, git
wouldn't have bothered the user since the automatic merge would have
succeeded.

So it doesn't make sense to display these unconflicted lines to the user
inside the mergetool; it only creates noise.

We can fix that by propagating the final version of the file with the
automatic merge to all the panes of the mergetool (BASE, LOCAL, and
REMOTE), and only make them differ on the places where there are actual
conflicts (and they are demarcated with conflict markers).

(this is mostly my explanation though, not Seth's, who used visual
examples)
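
As a sketch of the propagation described above, the three sed passes from
the patch can be tried on a hand-written conflict file. The file below is
only a stand-in shaped like `git merge-file --diff3` output (the labels are
assumptions), `split_diff3` is a hypothetical name, and the patch's `\r\?`
(a GNU sed extension) is left out.

```shell
# Hypothetical helper: split a diff3-style conflict file into three views,
# using sed expressions modelled on the ones in the patch.
split_diff3 () {
	diff3=$1
	base=$2
	local_=$3
	remote=$4
	# BASE: keep only the middle (ancestor) part of each conflict.
	sed -e '/^<<<<<<< /,/^||||||| /d' -e '/^=======$/,/^>>>>>>> /d' "$diff3" >"$base"
	# LOCAL: keep only the first part of each conflict.
	sed -e '/^||||||| /,/^>>>>>>> /d' -e '/^<<<<<<< /d' "$diff3" >"$local_"
	# REMOTE: keep only the last part of each conflict.
	sed -e '/^<<<<<<< /,/^=======$/d' -e '/^>>>>>>> /d' "$diff3" >"$remote"
}

tmp=$(mktemp -d)

# Stand-in for the merged file: paragraph 1 conflicts, paragraph 2 is
# already resolved in favor of the only side that changed it.
cat >"$tmp/DIFF3" <<'EOF'
<<<<<<< LOCAL
Paragraph 1
||||||| BASE
Patagraph 1
=======
Patagraph 1.
>>>>>>> REMOTE

Paragraph 2.
EOF

split_diff3 "$tmp/DIFF3" "$tmp/BASE" "$tmp/LOCAL" "$tmp/REMOTE"
head "$tmp/BASE" "$tmp/LOCAL" "$tmp/REMOTE"
```

With this input all three views agree on "Paragraph 2." and differ only in
the first paragraph.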

> > Original-idea-by: Seth House <seth@eseth.com>
> > Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> > ---
> >   Documentation/config/mergetool.txt |  3 +++
> >   git-mergetool.sh                   | 17 +++++++++++++++++
> >   t/t7610-mergetool.sh               | 18 ++++++++++++++++++
> >   3 files changed, 38 insertions(+)
> > 
> > diff --git a/Documentation/config/mergetool.txt b/Documentation/config/mergetool.txt
> > index 16a27443a3..7ce6d0d3ac 100644
> > --- a/Documentation/config/mergetool.txt
> > +++ b/Documentation/config/mergetool.txt
> > @@ -61,3 +61,6 @@ mergetool.writeToTemp::
> >   
> >   mergetool.prompt::
> >   	Prompt before each invocation of the merge resolution program.
> > +
> > +mergetool.autoMerge::
> > +	Remove lines without conflicts from all the files. Defaults to `true`.
> > diff --git a/git-mergetool.sh b/git-mergetool.sh
> > index e3f6d543fb..f4db0cac8d 100755
> > --- a/git-mergetool.sh
> > +++ b/git-mergetool.sh
> > @@ -239,6 +239,17 @@ checkout_staged_file () {
> >   	fi
> >   }
> >   
> > +auto_merge () {
> > +	git merge-file --diff3 --marker-size=7 -q -p "$LOCAL" "$BASE" "$REMOTE" >"$DIFF3"
> 
> I've been wondering if we want to recreate the merge or just get the 
> merged BASE LOCAL and REMOTE from the merged file in the working tree. 
> If the user wants to resolve the conflicts in stages, or opens the file 
> in an editor and fixes some conflicts and then realizes they want to use
> a merge tool, that work is thrown away if we recreate the merge. They can
> always use `checkout --merge` to throw away their changes and start
> again with a mergetool. It would mean checking the size of the conflict
> markers and using
> '/^<{$conflict_marker_size}/,/^|{$conflict_marker_size}/' for sed.
> Getting the merged BASE would be tricky if the user does not have diff3 
> conflicts enabled, I'm not sure if we can safely get BASE from `git 
> merge-file ...` and LOCAL and REMOTE from the working tree.

That's a good point.

However, their work is not thrown away; MERGED is not touched by this.

It's only for visualization purposes that some already-fixed conflicts
would be shown in the mergetool, which yeah; it's not ideal.

That's an improvement that can be done later, on top of this patch. The
bulk of improvements are already enabled by this, and the marginal
gains can be added later.

> > +	if test -s "$DIFF3"
> > +	then
> > +		sed -e '/^<<<<<<< /,/^||||||| /d' -e '/^=======\r\?$/,/^>>>>>>> /d' "$DIFF3" >"$BASE"
> > +		sed -e '/^||||||| /,/^>>>>>>> /d' -e '/^<<<<<<< /d' "$DIFF3" >"$LOCAL"
> > +		sed -e '/^<<<<<<< /,/^=======\r\?$/d' -e '/^>>>>>>> /d' "$DIFF3" >"$REMOTE"
> > +	fi
> > +	rm -- "$DIFF3"
> > +}
> > +
> >   merge_file () {
> >   	MERGED="$1"
> >   
> > @@ -274,6 +285,7 @@ merge_file () {
> >   		BASE=${BASE##*/}
> >   	fi
> >   
> > +	DIFF3="$MERGETOOL_TMPDIR/${BASE}_DIFF3_$$$ext"
> >   	BACKUP="$MERGETOOL_TMPDIR/${BASE}_BACKUP_$$$ext"
> >   	LOCAL="$MERGETOOL_TMPDIR/${BASE}_LOCAL_$$$ext"
> >   	REMOTE="$MERGETOOL_TMPDIR/${BASE}_REMOTE_$$$ext"
> > @@ -322,6 +334,11 @@ merge_file () {
> >   	checkout_staged_file 2 "$MERGED" "$LOCAL"
> >   	checkout_staged_file 3 "$MERGED" "$REMOTE"
> >   
> > +	if test "$(git config --bool mergetool.autoMerge)" != "false"
> 
> If I run `git config --bool mergetool.autoMerge` it returns an empty 
> string so I think you need to test it is actually equal to "true".

Yeah, this would evaluate to positive:

  test "" != "false"

It's enabled by default since I heard Junio mention it would make sense.
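
One way to keep the default while making it explicit is to substitute
"true" for the empty string before comparing. This is only a sketch:
`automerge_enabled` is a hypothetical name, and the value is passed in as
an argument so the logic can be tried without a repository.

```shell
# Hypothetical helper: succeed when the setting is on.
# An unset key makes `git config --bool` print nothing, so an empty
# argument falls back to the intended default of "true".
automerge_enabled () {
	value=${1:-true}
	test "$value" = "true"
}

# Possible call site inside a repository:
#   automerge_enabled "$(git config --bool mergetool.autoMerge)" && auto_merge

automerge_enabled "" && echo "unset: enabled"
automerge_enabled "true" && echo "true: enabled"
automerge_enabled "false" || echo "false: disabled"
```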

> I also share the view that this should be per tool. Your demand that 
> someone comes up with an example that breaks assumes that we have access 
> to all the tools that users are using.

It's not a demand. It's a fact that unless we have an example (even if
hypothetical), the burden of proof has not been met.

The default position is that we don't know if such configuration would
make sense or not.

> Seth has done a great job of 
> surveying the popular tools but given the size of git's user-base and 
> the diversity of uses it is very likely that there will be people using 
> in-house or proprietary tools that no one on the list has access to.

Yes, they can just turn off the flag.

> I would much prefer to avoid breaking them rather than waiting for a
> bug report before implementing a per-tool setting.

Even with a per-tool configuration they would be broken (until the user
configures otherwise).

> It is quite possible people are using different tools for different
> files in the same way as they use different merge drivers for
> different files and want the setting disabled for a tool that does
> semantic merging but enabled for textual merges.

I think your definition of what's possible and mine are very different.

But this is actually what I was asking: an example. You are bringing a
hypothetical "semantic mergetool" that would somehow benefit from having
unconflicted lines. Can you explain how it would benefit?

Also, neither Seth nor Junio responded to my example, can you?

Do you agree there is no conflict here?

  echo Hello > BASE
  echo Hello > LOCAL
  echo Hello. > REMOTE
  git merge-file -p LOCAL BASE REMOTE

Cheers.
Phillip Wood Dec. 20, 2020, 7:21 p.m. UTC | #3
On 19/12/2020 12:53, Felipe Contreras wrote:
> Phillip Wood wrote:
>> Hi Felipe
>>
>> On 18/12/2020 12:49, Felipe Contreras wrote:
>>> It doesn't make sense to display lines without conflicts in the
>>> different views of all mergetools.
>>>
>>> Only the lines that warrant conflict markers should be displayed.
>>>
>>> Most people would want this behavior on, but in case some don't; add a
>>> new configuration: mergetool.autoMerge.
>>>
>>> See Seth House's blog post [1] for the idea, and the rationale.
>>>
>>> [1] https://www.eseth.org/2020/mergetools.html
>>
>> It would be good to have a summary of the idea in this commit message so
>> people do not have to go and find a blog post which may well disappear
>> in the future
> 
> I thought I did, in the paragraphs above. How about adding this
> explanation:
> 
> When merging, not all lines with changes are considered conflicts, for
> example:
> 
>    cat >BASE <<EOF
>    Patagraph 1
> 
>    Paragraph 2
>    EOF
> 
>    cat >LOCAL <<EOF
>    Paragraph 1
> 
>    Paragraph 2
>    EOF
> 
>    cat >REMOTE <<EOF
>    Patagraph 1.
> 
>    Paragraph 2.
>    EOF
> 
> In this case the first paragraph does have a conflict because there are
> two changes (in LOCAL and REMOTE), that the user must resolve.
> 
> However, the second paragraph doesn't have a conflict; it's
> straightforward to decide that we want the only change present (in
> REMOTE).
> 
> In fact, if it were not for the first paragraph with a conflict, git
> wouldn't have bothered the user since the automatic merge would have
> succeeded.
> 
> So it doesn't make sense to display these unconflicted lines to the user
> inside the mergetool; it only creates noise.
> 
> We can fix that by propagating the final version of the file with the
> automatic merge to all the panes of the mergetool (BASE, LOCAL, and
> REMOTE), and only make them differ on the places where there are actual
> conflicts (and they are demarcated with conflict markers).
> 
> (this is mostly my explanation though, not Seth's, who used visual
> examples)

I'm not sure we need that much detail, it just needs to explain that the 
merge tools display non-conflicting changes. Maybe something along the 
lines of

Most merge tools ask the user to merge all the changes in the merge 
including changes to just one side which do not create conflicts rather 
than just the conflicting changes. This is inconvenient and a waste of 
the user's time. We can avoid this by passing the tool two files which 
resolve the conflicts in favor of the LOCAL and REMOTE side of the merge 
as the LOCAL and REMOTE merge heads respectively rather than the real 
merge heads.

>>> Original-idea-by: Seth House <seth@eseth.com>
>>> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
>>> ---
>>>    Documentation/config/mergetool.txt |  3 +++
>>>    git-mergetool.sh                   | 17 +++++++++++++++++
>>>    t/t7610-mergetool.sh               | 18 ++++++++++++++++++
>>>    3 files changed, 38 insertions(+)
>>>
>>> diff --git a/Documentation/config/mergetool.txt b/Documentation/config/mergetool.txt
>>> index 16a27443a3..7ce6d0d3ac 100644
>>> --- a/Documentation/config/mergetool.txt
>>> +++ b/Documentation/config/mergetool.txt
>>> @@ -61,3 +61,6 @@ mergetool.writeToTemp::
>>>    
>>>    mergetool.prompt::
>>>    	Prompt before each invocation of the merge resolution program.
>>> +
>>> +mergetool.autoMerge::
>>> +	Remove lines without conflicts from all the files. Defaults to `true`.
>>> diff --git a/git-mergetool.sh b/git-mergetool.sh
>>> index e3f6d543fb..f4db0cac8d 100755
>>> --- a/git-mergetool.sh
>>> +++ b/git-mergetool.sh
>>> @@ -239,6 +239,17 @@ checkout_staged_file () {
>>>    	fi
>>>    }
>>>    
>>> +auto_merge () {
>>> +	git merge-file --diff3 --marker-size=7 -q -p "$LOCAL" "$BASE" "$REMOTE" >"$DIFF3"
>>
>> I've been wondering if we want to recreate the merge or just get the
>> merged BASE LOCAL and REMOTE from the merged file in the working tree.
>> If the user wants to resolve the conflicts in stages, or opens the file
>> in an editor and fixes some conflicts and then realizes they want to use
>> a merge tool, that work is thrown away if we recreate the merge. They can
>> always use `checkout --merge` to throw away their changes and start
>> again with a mergetool. It would mean checking the size of the conflict
>> markers and using
>> '/^<{$conflict_marker_size}/,/^|{$conflict_marker_size}/' for sed.
>> Getting the merged BASE would be tricky if the user does not have diff3
>> conflicts enabled, I'm not sure if we can safely get BASE from `git
>> merge-file ...` and LOCAL and REMOTE from the working tree.
> 
> That's a good point.
> 
> However, their work is not thrown away; MERGED is not touched by this.

I wasn't sure whether the tools would overwrite MERGED with a new file 
or if they started with that and just edited it. If it is the latter 
then I agree the user's changes are safe.

> It's only for visualization purposes that some already-fixed conflicts
> would be shown in the mergetool, which yeah; it's not ideal.
> 
> That's an improvement that can be done later, on top of this patch. The
> bulk of improvements are already enabled by this, and the marginal
> gains can be added later.

There's also the issue of what happens when the user has set a merge
driver for a file. If we use the file from the working tree we are using 
the result of that driver, if we re-merge with `git merge-file` then the 
files passed to the mergetool will not match the output of the merge 
driver set for that file.

>>> +	if test -s "$DIFF3"
>>> +	then
>>> +		sed -e '/^<<<<<<< /,/^||||||| /d' -e '/^=======\r\?$/,/^>>>>>>> /d' "$DIFF3" >"$BASE"
>>> +		sed -e '/^||||||| /,/^>>>>>>> /d' -e '/^<<<<<<< /d' "$DIFF3" >"$LOCAL"
>>> +		sed -e '/^<<<<<<< /,/^=======\r\?$/d' -e '/^>>>>>>> /d' "$DIFF3" >"$REMOTE"
>>> +	fi
>>> +	rm -- "$DIFF3"
>>> +}
>>> +
>>>    merge_file () {
>>>    	MERGED="$1"
>>>    
>>> @@ -274,6 +285,7 @@ merge_file () {
>>>    		BASE=${BASE##*/}
>>>    	fi
>>>    
>>> +	DIFF3="$MERGETOOL_TMPDIR/${BASE}_DIFF3_$$$ext"
>>>    	BACKUP="$MERGETOOL_TMPDIR/${BASE}_BACKUP_$$$ext"
>>>    	LOCAL="$MERGETOOL_TMPDIR/${BASE}_LOCAL_$$$ext"
>>>    	REMOTE="$MERGETOOL_TMPDIR/${BASE}_REMOTE_$$$ext"
>>> @@ -322,6 +334,11 @@ merge_file () {
>>>    	checkout_staged_file 2 "$MERGED" "$LOCAL"
>>>    	checkout_staged_file 3 "$MERGED" "$REMOTE"
>>>    
>>> +	if test "$(git config --bool mergetool.autoMerge)" != "false"
>>
>> If I run `git config --bool mergetool.autoMerge` it returns an empty
>> string so I think you need to test it is actually equal to "true".
> 
> Yeah, this would evaluate to positive:
> 
>    test "" != "false"
> 
> It's enabled by default since I heard Junio mention it would make sense.

I think it probably does make sense in which case it would be good to 
make that explicit in the commit message. Maybe

As most people will want the new behavior we enable it by default. 
Users that do not want the new behavior can set mergetool.autoMerge to 
false.

>> I also share the view that this should be per tool. Your demand that
>> someone comes up with an example that breaks assumes that we have access
>> to all the tools that users are using.
> 
> It's not a demand. It's a fact that unless we have an example (even if
> hypothetical), the burden of proof has not been met.
> 
> The default position is that we don't know if such configuration would
> make sense or not.
> 
>> Seth has done a great job of
>> surveying the popular tools but given the size of git's user-base and
>> the diversity of uses it is very likely that there will be people using
>> in-house or proprietary tools that no one on the list has access to.
> 
> Yes, they can just turn off the flag.
> 
>> I would much prefer to avoid breaking them rather than waiting for a
>> bug report before implementing a per-tool setting.
> 
> Even with a per-tool configuration they would be broken (until the user
> configures otherwise).
> 
>> It is quite possible people are using different tools for different
>> files in the same way as they use different merge drivers for
>> different files and want the setting disabled for a tool that does
>> semantic merging but enabled for textual merges.
> 
> I think your definition of what's possible and mine are very different.

All I'm saying is that if a user has different tools for different 
file-types they may want this on for one tool but not another.

> But this is actually what I was asking: an example. You are bringing a
> hypothetical "semantic mergetool" that would somehow benefit from having
> unconflicted lines. Can you explain how it would benefit?

Because the result of the merge depends on the diff and a semantic tool 
(there was a talk about one for C# a few years ago at git merge I think) 
will diff the file based on its semantics rather than matching lines.

> Also, neither Seth nor Junio responded to my example, can you?
> 
> Do you agree there is no conflict here?
> 
>    echo Hello > BASE
>    echo Hello > LOCAL
>    echo Hello. > REMOTE
>    git merge-file -p LOCAL BASE REMOTE

There is no conflict but I don't see what point you're making by that. 
I've been thinking about a different example

BASE    LOCAL   REMOTE
A	A	A
A	A	A
A	A	A
	B	A

Is there a conflict or not? I think it depends on the diff algorithm. 
These are both valid diffs of BASE and LOCAL but only the first one will 
lead to conflicts

  A	+A
  A	 A
  A	 A
+A	 A

If a tool implements a different diff algorithm to git then it may want 
to do the whole merge itself.

I'm going to be off the list for the next couple of weeks

Best Wishes

Phillip

> Cheers.
>
Felipe Contreras Dec. 21, 2020, 3:04 a.m. UTC | #4
Phillip Wood wrote:
> On 19/12/2020 12:53, Felipe Contreras wrote:
> > Phillip Wood wrote:
> >> Hi Felipe
> >>
> >> On 18/12/2020 12:49, Felipe Contreras wrote:
> >>> It doesn't make sense to display lines without conflicts in the
> >>> different views of all mergetools.
> >>>
> >>> Only the lines that warrant conflict markers should be displayed.
> >>>
> >>> Most people would want this behavior on, but in case some don't; add a
> >>> new configuration: mergetool.autoMerge.
> >>>
> >>> See Seth House's blog post [1] for the idea, and the rationale.
> >>>
> >>> [1] https://www.eseth.org/2020/mergetools.html
> >>
> >> It would be good to have a summary of the idea in this commit message so
> >> people do not have to go and find a blog post which may well disappear
> >> in the future
> > 
> > I thought I did, in the paragraphs above. How about adding this
> > explanation:
> > 
> > When merging, not all lines with changes are considered conflicts, for
> > example:
> > 
> >    cat >BASE <<EOF
> >    Patagraph 1
> > 
> >    Paragraph 2
> >    EOF
> > 
> >    cat >LOCAL <<EOF
> >    Paragraph 1
> > 
> >    Paragraph 2
> >    EOF
> > 
> >    cat >REMOTE <<EOF
> >    Patagraph 1.
> > 
> >    Paragraph 2.
> >    EOF
> > 
> > In this case the first paragraph does have a conflict because there are
> > two changes (in LOCAL and REMOTE), that the user must resolve.
> > 
> > However, the second paragraph doesn't have a conflict; it's
> > straightforward to decide that we want the only change present (in
> > REMOTE).
> > 
> > In fact, if it were not for the first paragraph with a conflict, git
> > wouldn't have bothered the user since the automatic merge would have
> > succeeded.
> > 
> > So it doesn't make sense to display these unconflicted lines to the user
> > inside the mergetool; it only creates noise.
> > 
> > We can fix that by propagating the final version of the file with the
> > automatic merge to all the panes of the mergetool (BASE, LOCAL, and
> > REMOTE), and only make them differ on the places where there are actual
> > conflicts (and they are demarcated with conflict markers).
> > 
> > (this is mostly my explanation though, not Seth's, who used visual
> > examples)
> 
> I'm not sure we need that much detail, it just needs to explain that the 
> merge tools display non-conflicting changes. Maybe something along the 
> lines of
> 
> Most merge tools ask the user to merge all the changes in the merge 
> including changes to just one side which do not create conflicts rather 
> than just the conflicting changes. This is inconvenient and a waste of 
> the user's time. We can avoid this by passing the tool two files which 
> resolve the conflicts in favor of the LOCAL and REMOTE side of the merge 
> as the LOCAL and REMOTE merge heads respectively rather than the real 
> merge heads.

It's not just two files. And personally, I find the above a
little confusing. How about:

The purpose of mergetools is to resolve conflicts when git cannot
automatically do so. For that git has added markers in the specific
areas that need resolving, which the user must manually fix. The tool is
supposed to help with that.

However, by passing the original BASE, LOCAL, and REMOTE files, many
changes without conflict are presented to the user when in fact nothing
needs to be done for them.

We can fix that by propagating the final version of the file with the
automatic merge to all the panes of the mergetool (BASE, LOCAL, and
REMOTE), and only make them differ on the places where there are actual
conflicts.

> >>> Original-idea-by: Seth House <seth@eseth.com>
> >>> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> >>> ---
> >>>    Documentation/config/mergetool.txt |  3 +++
> >>>    git-mergetool.sh                   | 17 +++++++++++++++++
> >>>    t/t7610-mergetool.sh               | 18 ++++++++++++++++++
> >>>    3 files changed, 38 insertions(+)
> >>>
> >>> diff --git a/Documentation/config/mergetool.txt b/Documentation/config/mergetool.txt
> >>> index 16a27443a3..7ce6d0d3ac 100644
> >>> --- a/Documentation/config/mergetool.txt
> >>> +++ b/Documentation/config/mergetool.txt
> >>> @@ -61,3 +61,6 @@ mergetool.writeToTemp::
> >>>    
> >>>    mergetool.prompt::
> >>>    	Prompt before each invocation of the merge resolution program.
> >>> +
> >>> +mergetool.autoMerge::
> >>> +	Remove lines without conflicts from all the files. Defaults to `true`.
> >>> diff --git a/git-mergetool.sh b/git-mergetool.sh
> >>> index e3f6d543fb..f4db0cac8d 100755
> >>> --- a/git-mergetool.sh
> >>> +++ b/git-mergetool.sh
> >>> @@ -239,6 +239,17 @@ checkout_staged_file () {
> >>>    	fi
> >>>    }
> >>>    
> >>> +auto_merge () {
> >>> +	git merge-file --diff3 --marker-size=7 -q -p "$LOCAL" "$BASE" "$REMOTE" >"$DIFF3"
> >>
> >> I've been wondering if we want to recreate the merge or just get the
> >> merged BASE LOCAL and REMOTE from the merged file in the working tree.
> >> If the user wants to resolve the conflicts in stages, or opens the file
> >> in an editor and fixes some conflicts and then realizes they want to use
> >> a merge tool, that work is thrown away if we recreate the merge. They can
> >> always use `checkout --merge` to throw away their changes and start
> >> again with a mergetool. It would mean checking the size of the conflict
> >> markers and using
> >> '/^<{$conflict_marker_size}/,/^|{$conflict_marker_size}/' for sed.
> >> Getting the merged BASE would be tricky if the user does not have diff3
> >> conflicts enabled, I'm not sure if we can safely get BASE from `git
> >> merge-file ...` and LOCAL and REMOTE from the working tree.
> > 
> > That's a good point.
> > 
> > However, their work is not thrown away; MERGED is not touched by this.
> 
> I wasn't sure whether the tools would overwrite MERGED with a new file 
> or if they started with that and just edited it. If it is the latter 
> then I agree the user's changes are safe.

All mergetools are passed MERGED and are supposed to edit it.

I suppose some mergetools would want to recreate MERGED, but the user
may have already resolved some of the conflicts.

> > It's only for visualization purposes that some already-fixed conflicts
> > would be shown in the mergetool, which yeah; it's not ideal.
> > 
> > That's an improvement that can be done later, on top of this patch. The
> > bulk of improvements are already enabled by this, and the marginal
> > gains can be added later.
> 
> There's also the issue of what happens when the user has set a merge
> driver for a file. If we use the file from the working tree we are using 
> the result of that driver, if we re-merge with `git merge-file` then the 
> files passed to the mergetool will not match the output of the merge 
> driver set for that file.

I don't know what that situation would look like, but presumably the
conflicts would be around the same areas anyway, no?

> >>> +	if test -s "$DIFF3"
> >>> +	then
> >>> +		sed -e '/^<<<<<<< /,/^||||||| /d' -e '/^=======\r\?$/,/^>>>>>>> /d' "$DIFF3" >"$BASE"
> >>> +		sed -e '/^||||||| /,/^>>>>>>> /d' -e '/^<<<<<<< /d' "$DIFF3" >"$LOCAL"
> >>> +		sed -e '/^<<<<<<< /,/^=======\r\?$/d' -e '/^>>>>>>> /d' "$DIFF3" >"$REMOTE"
> >>> +	fi
> >>> +	rm -- "$DIFF3"
> >>> +}
> >>> +
> >>>    merge_file () {
> >>>    	MERGED="$1"
> >>>    
> >>> @@ -274,6 +285,7 @@ merge_file () {
> >>>    		BASE=${BASE##*/}
> >>>    	fi
> >>>    
> >>> +	DIFF3="$MERGETOOL_TMPDIR/${BASE}_DIFF3_$$$ext"
> >>>    	BACKUP="$MERGETOOL_TMPDIR/${BASE}_BACKUP_$$$ext"
> >>>    	LOCAL="$MERGETOOL_TMPDIR/${BASE}_LOCAL_$$$ext"
> >>>    	REMOTE="$MERGETOOL_TMPDIR/${BASE}_REMOTE_$$$ext"
> >>> @@ -322,6 +334,11 @@ merge_file () {
> >>>    	checkout_staged_file 2 "$MERGED" "$LOCAL"
> >>>    	checkout_staged_file 3 "$MERGED" "$REMOTE"
> >>>    
> >>> +	if test "$(git config --bool mergetool.autoMerge)" != "false"
> >>
> >> If I run `git config --bool mergetool.autoMerge` it returns an empty
> >> string so I think you need to test it is actually equal to "true".
> > 
> > Yeah, this would evaluate to positive:
> > 
> >    test "" != "false"
> > 
> > It's enabled by default since I heard Junio mention it would make sense.
> 
> I think it probably does make sense in which case it would be good to 
> make that explicit in the commit message. Maybe

Right, I thought I did.

> As most people will want the new behavior we enable it by default. 
> Users that do not want the new behavior can set mergetool.autoMerge to 
> false.

Sounds good.

> >> I also share the view that this should be per tool. Your demand that
> >> someone comes up with an example that breaks assumes that we have access
> >> to all the tools that users are using.
> > 
> > It's not a demand. It's a fact that unless we have an example (even if
> > hypothetical), the burden of proof has not been met.
> > 
> > The default position is that we don't know if such configuration would
> > make sense or not.
> > 
> >> Seth has done a great job of
> >> surveying the popular tools but given the size of git's user-base and
> >> the diversity of uses it is very likely that there will be people using
> >> in-house or proprietary tools that no one on the list has access to.
> > 
> > Yes, they can just turn off the flag.
> > 
> >> I would much prefer to avoid breaking them rather than waiting for a
> >> bug report before implementing a per-tool setting.
> > 
> > Even with a per-tool configuration they would be broken (until the user
> > configures otherwise).
> > 
> >> It is quite possible people are using different tools for different
> >> files in the same way as they use different merge drivers for
> >> different files and want the setting disabled for a tool that does
> >> semantic merging but enabled for textual merges.
> > 
> > I think your definition of what's possible and mine are very different.
> 
> All I'm saying is that if a user has different tools for different 
> file-types they may want this on for one tool but not another.

Possible yeah, I just don't find it very likely.

What different mergetools would you use for different file-types?

> > But this is actually what I was asking: an example. You are bringing a
> > hypothetical "semantic mergetool" that would somehow benefit from having
> > unconflicted lines. Can you explain how it would benefit?
> 
> Because the result of the merge depends on the diff and a semantic tool 
> (there was a talk about one for C# a few years ago at git merge I think) 
> will diff the file based on its semantics rather than matching lines.

But you would want this tool to run on every merge, regardless of
whether there are conflicts.

It's this pre-mergetool tool that would determine if there are
conflicts to be resolved by the user or not.

> > Also, neither Seth nor Junio responded to my example, can you?
> > 
> > Do you agree there is no conflict here?
> > 
> >    echo Hello > BASE
> >    echo Hello > LOCAL
> >    echo Hello. > REMOTE
> >    git merge-file -p LOCAL BASE REMOTE
> 
> There is no conflict but I don't see what point you're making by that. 

If there's no conflict there's no opportunity to run "git mergetool".

If there's a conflict some lines below, that doesn't magically make the
above a conflict. So why would the mergetool show it to the user?
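
The three-file example quoted above can be checked in a scratch directory. This is a sketch that assumes `git` is on the PATH; the temporary directory and variable names are only for illustration.

```shell
#!/bin/sh
# Recreate the BASE/LOCAL/REMOTE example in a throwaway directory.
tmpdir=$(mktemp -d) || exit 1

echo Hello >"$tmpdir/BASE"
echo Hello >"$tmpdir/LOCAL"
echo Hello. >"$tmpdir/REMOTE"

# -p prints the merge result to stdout instead of overwriting LOCAL.
# The exit status is the number of conflicts, so 0 means a clean merge.
result=$(git merge-file -p "$tmpdir/LOCAL" "$tmpdir/BASE" "$tmpdir/REMOTE")
status=$?

echo "result: $result"
echo "exit status: $status"

rm -rf -- "$tmpdir"
```

Only REMOTE changed the line, so the merge takes that side and reports no conflict.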

> I've been thinking about a different example
> 
> BASE    LOCAL   REMOTE
> A	A	A
> A	A	A
> A	A	A
> 	B	A
> 
> Is there a conflict or not? I think it depends on the diff algorithm. 
> These are both valid diffs of BASE and LOCAL but only the first one will 
> lead to conflicts
> 
>   A	+A
>   A	 A
>   A	 A
> +A	 A

Isn't there a B in LOCAL?

> If a tool implements a different diff algorithm to git then it may want 
> to do the whole merge itself.

Yes, in which case it would want to take ownership of the whole "are
there conflicts" decision, instead of letting git decide there are no
conflicts.

> I'm going to be off the list for the next couple of weeks

All right. Thanks for the input anyway.

Cheers.
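
For readers following the auto_merge discussion, here is a rough sketch of how the three sed invocations split a diff3-style file back into BASE, LOCAL and REMOTE. The input is hand-written rather than produced by `git merge-file`, and the `\r\?` part of the patch's patterns is dropped here on the assumption that the input has no carriage returns (that syntax is also not supported by every sed).

```shell
#!/bin/sh
# Sketch only: a hand-written stand-in for `git merge-file --diff3 -p` output.
tmpdir=$(mktemp -d) || exit 1
DIFF3="$tmpdir/DIFF3"

# One conflicted hunk (b / a / c) surrounded by two unconflicted lines.
printf '%s\n' base '<<<<<<< LOCAL' b '||||||| BASE' a '=======' c '>>>>>>> REMOTE' end >"$DIFF3"

# BASE: drop the local side and the remote side, keep the middle section.
sed -e '/^<<<<<<< /,/^||||||| /d' -e '/^=======$/,/^>>>>>>> /d' "$DIFF3" >"$tmpdir/BASE"
# LOCAL: drop everything from the base marker to the end marker, plus the start marker.
sed -e '/^||||||| /,/^>>>>>>> /d' -e '/^<<<<<<< /d' "$DIFF3" >"$tmpdir/LOCAL"
# REMOTE: drop everything from the start marker to the separator, plus the end marker.
sed -e '/^<<<<<<< /,/^=======$/d' -e '/^>>>>>>> /d' "$DIFF3" >"$tmpdir/REMOTE"

# Print each reconstructed file on one line for a quick comparison.
for side in BASE LOCAL REMOTE
do
	echo "$side: $(tr '\n' ' ' <"$tmpdir/$side")"
done
```

All three outputs share the unconflicted lines and differ only inside the hunk that carried conflict markers.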

Patch

diff --git a/Documentation/config/mergetool.txt b/Documentation/config/mergetool.txt
index 16a27443a3..7ce6d0d3ac 100644
--- a/Documentation/config/mergetool.txt
+++ b/Documentation/config/mergetool.txt
@@ -61,3 +61,6 @@  mergetool.writeToTemp::
 
 mergetool.prompt::
 	Prompt before each invocation of the merge resolution program.
+
+mergetool.autoMerge::
+	Remove lines without conflicts from all the files. Defaults to `true`.
diff --git a/git-mergetool.sh b/git-mergetool.sh
index e3f6d543fb..f4db0cac8d 100755
--- a/git-mergetool.sh
+++ b/git-mergetool.sh
@@ -239,6 +239,17 @@  checkout_staged_file () {
 	fi
 }
 
+auto_merge () {
+	git merge-file --diff3 --marker-size=7 -q -p "$LOCAL" "$BASE" "$REMOTE" >"$DIFF3"
+	if test -s "$DIFF3"
+	then
+		sed -e '/^<<<<<<< /,/^||||||| /d' -e '/^=======\r\?$/,/^>>>>>>> /d' "$DIFF3" >"$BASE"
+		sed -e '/^||||||| /,/^>>>>>>> /d' -e '/^<<<<<<< /d' "$DIFF3" >"$LOCAL"
+		sed -e '/^<<<<<<< /,/^=======\r\?$/d' -e '/^>>>>>>> /d' "$DIFF3" >"$REMOTE"
+	fi
+	rm -- "$DIFF3"
+}
+
 merge_file () {
 	MERGED="$1"
 
@@ -274,6 +285,7 @@  merge_file () {
 		BASE=${BASE##*/}
 	fi
 
+	DIFF3="$MERGETOOL_TMPDIR/${BASE}_DIFF3_$$$ext"
 	BACKUP="$MERGETOOL_TMPDIR/${BASE}_BACKUP_$$$ext"
 	LOCAL="$MERGETOOL_TMPDIR/${BASE}_LOCAL_$$$ext"
 	REMOTE="$MERGETOOL_TMPDIR/${BASE}_REMOTE_$$$ext"
@@ -322,6 +334,11 @@  merge_file () {
 	checkout_staged_file 2 "$MERGED" "$LOCAL"
 	checkout_staged_file 3 "$MERGED" "$REMOTE"
 
+	if test "$(git config --bool mergetool.autoMerge)" != "false"
+	then
+		auto_merge
+	fi
+
 	if test -z "$local_mode" || test -z "$remote_mode"
 	then
 		echo "Deleted merge conflict for '$MERGED':"
diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh
index 70afdd06fa..ccabd04823 100755
--- a/t/t7610-mergetool.sh
+++ b/t/t7610-mergetool.sh
@@ -828,4 +828,22 @@  test_expect_success 'mergetool -Oorder-file is honored' '
 	test_cmp expect actual
 '
 
+test_expect_success 'mergetool automerge' '
+	test_config mergetool.automerge true &&
+	test_when_finished "git reset --hard" &&
+	git checkout -b test${test_count}_b master &&
+	test_write_lines >file1 base "" a &&
+	git commit -a -m "base" &&
+	test_write_lines >file1 base "" c &&
+	git commit -a -m "remote update" &&
+	git checkout -b test${test_count}_a HEAD~ &&
+	test_write_lines >file1 local "" b &&
+	git commit -a -m "local update" &&
+	test_must_fail git merge test${test_count}_b &&
+	yes "" | git mergetool file1 &&
+	test_write_lines >expect local "" c &&
+	test_cmp expect file1 &&
+	git commit -m "test resolved with mergetool"
+'
+
 test_done