[v4,00/10] Improve path collision conflict resolutions


Message ID

20181102185317.31015-1-newren@gmail.com (mailing list archive)

Headers

From: Elijah Newren <newren@gmail.com>
To: <git@vger.kernel.org>
CC: <gitster@pobox.com>, <stolee@gmail.com>,
        Elijah Newren <newren@gmail.com>
Subject: [PATCH v4 00/10] Improve path collision conflict resolutions
Date: Fri, 2 Nov 2018 11:53:08 -0700
Message-ID: <20181102185317.31015-1-newren@gmail.com>
In-Reply-To: <20181019193111.12051-1-newren@gmail.com>
References: <20181019193111.12051-1-newren@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 7BIT
Content-Type: text/plain; charset=US-ASCII
Sender: git-owner@vger.kernel.org
Precedence: bulk

Series

Improve path collision conflict resolutions

Message

Elijah Newren Nov. 2, 2018, 6:53 p.m. UTC

This series depends on en/merge-cleanup-more and is built on that
series.  (It merges cleanly to master, next, and pu -- well, as long
as v3 of this series is excluded from pu, that is).

This series makes all the "file collision" conflict types be handled
consistently, so that they all behave like add/add (as suggested by
Jonathan[1] and Junio[2]).  These types are:
  * add/add
  * rename/add
  * rename/rename(2to1)
  * each rename/add piece of a rename/rename(1to2)/add[/add] conflict

[1] https://public-inbox.org/git/20180312213521.GB58506@aiede.svl.corp.google.com/
[2] https://public-inbox.org/git/CAPc5daVu8vv9RdGON8JiXEO3ycDVqQ38ySzZc-cpo+AQcAKXjA@mail.gmail.com
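
For readers who want to see one of the collision types listed above in
action, here is a minimal sketch that provokes a rename/add collision in
a scratch repository.  The repository location, the file names ("old",
"new"), and the branch names ("side-a", "side-b") are all made up for
illustration.  The script only checks that the merge stops with a
conflict; what ends up in the working tree depends on the Git version in
use, so no particular file layout is assumed.

```shell
#!/bin/sh
# Sketch only: every path and branch name here is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "example@example.com"

# Base commit: a single file named "old".
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' >old
git add old
git commit -q -m "base"
git tag base

# Side A renames old -> new without changing its content.
git checkout -q -b side-a base
git mv old new
git commit -q -m "rename old to new"

# Side B adds an unrelated file at the same path, "new".
git checkout -q -b side-b base
echo "unrelated content" >new
git add new
git commit -q -m "add new"

# Merge A into B: two different files now want the path "new".
if git merge side-a >/dev/null 2>&1
then
	merge_status=clean
else
	merge_status=conflict
fi
echo "merge status: $merge_status"

# Show how this Git version left the working tree and index.
git status --short
```

Swapping which side does the rename and which does the add gives the
mirror-image case, which is the sort of symmetry the new testcases are
meant to exercise.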

Changes since v3:
  * Fixed test names to be surrounded by single quotes instead of double
    quotes, as suggested by Derrick.
  * Two more (RFC) patches add a couple testcases to cover previously
    uncovered code, as pointed out by Derrick and his test coverage report.
  * Full range-diff below.

Major question:
  * You'll note that I edited the last two patches to mark them as RFC.
    To be honest, I'm not sure what to do with these.  They improve code
    coverage of new code, but the same gaps existed in the old code;
    they only show up in the coverage-diff because I essentially moved
    code around to a new improved function.  Since the new code doesn't
    really add new abilities but rather just shifts the handling of
    these conflicts to a common function, they shouldn't need any more
    testcases than previously and modifying the existing patches thus
    feels slightly misleading.  That line of thought leads me to believe
    that perhaps putting them in a separate combined patch of their own
    with a decent commit message is the right way to go.  On the other
    hand, squashing them to the commits they're marked as fixups for
    shows which logical part of the code the tests are related to, which
    seems like a good thing.  So what's the right way to handle these?
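
For reference, the "squash them into the commits they're marked as
fixups for" option is what git rebase --autosquash does with "fixup!"
subjects.  A minimal sketch, using a throwaway repository with
placeholder file names and contents (only the commit subject is taken
from this series):

```shell
#!/bin/sh
# Sketch only: repository, files, and contents are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "example@example.com"

echo base >file
git add file
git commit -q -m "base"
git tag base

# Stand-in for the real code change.
echo "code change" >code.txt
git add code.txt
git commit -q -m "merge-recursive: fix rename/add conflict handling"

# Stand-in for the extra testcase, marked as a fixup of the above.
echo "extra testcase" >test.txt
git add test.txt
git commit -q -m "fixup! merge-recursive: fix rename/add conflict handling"

# --autosquash reorders the todo list so the fixup! commit is folded
# into its target; GIT_SEQUENCE_EDITOR=true accepts that list as-is.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash base

count=$(git rev-list --count base..HEAD)
subject=$(git log -1 --format=%s)
echo "commits after autosquash: $count"
echo "subject: $subject"
```

The alternative (a separate combined patch) would instead mean
rewording the fixup commits by hand so that they carry their own
commit message.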


 1:  1be9e213db !  1:  0fa67d6109 t6036, t6042: testcases for rename collision of already conflicting files
    @@ -51,7 +51,7 @@
     +#   conflict markers.  This is a pretty weird corner case, but we just want
     +#   to ensure that we handle it as well as practical.
     +
    -+test_expect_success "setup nested conflicts" '
    ++test_expect_success 'setup nested conflicts' '
     +	test_create_repo nested_conflicts &&
     +	(
     +		cd nested_conflicts &&
    @@ -130,7 +130,7 @@
     +	)
     +'
     +
    -+test_expect_failure "check nested conflicts" '
    ++test_expect_failure 'check nested conflicts' '
     +	(
     +		cd nested_conflicts &&
     +
    @@ -241,7 +241,7 @@
     +#
     +#   So, we have four different conflicting files that all end up at path
     +#   'three'.
    -+test_expect_success "setup nested conflicts from rename/rename(2to1)" '
    ++test_expect_success 'setup nested conflicts from rename/rename(2to1)' '
     +	test_create_repo nested_conflicts_from_rename_rename &&
     +	(
     +		cd nested_conflicts_from_rename_rename &&
    @@ -294,7 +294,7 @@
     +	)
     +'
     +
    -+test_expect_failure "check nested conflicts from rename/rename(2to1)" '
    ++test_expect_failure 'check nested conflicts from rename/rename(2to1)' '
     +	(
     +		cd nested_conflicts_from_rename_rename &&
     +
 2:  d3356ff525 !  2:  9f5f0105d0 merge-recursive: increase marker length with depth of recursion
    @@ -179,7 +179,7 @@
     +#   nested conflict markers from X2 in the base version -- that means we
     +#   have three levels of conflict markers.  Can we distinguish all three?
     +
    -+test_expect_success "setup virtual merge base with nested conflicts" '
    ++test_expect_success 'setup virtual merge base with nested conflicts' '
     +	test_create_repo virtual_merge_base_has_nested_conflicts &&
     +	(
     +		cd virtual_merge_base_has_nested_conflicts &&
    @@ -241,7 +241,7 @@
     +	)
     +'
     +
    -+test_expect_success "check virtual merge base with nested conflicts" '
    ++test_expect_success 'check virtual merge base with nested conflicts' '
     +	(
     +		cd virtual_merge_base_has_nested_conflicts &&
     +
 3:  aa68e3d675 =  3:  5922c40fa7 merge-recursive: new function for better colliding conflict resolutions
 4:  f046ba6362 =  4:  dcf88dd363 merge-recursive: fix rename/add conflict handling
 5:  37742bdefd !  5:  1d11288be4 merge-recursive: improve handling for rename/rename(2to1) conflicts
    @@ -209,8 +209,8 @@
      	)
      '
      
    --test_expect_failure "check nested conflicts" '
    -+test_expect_success "check nested conflicts" '
    +-test_expect_failure 'check nested conflicts' '
    ++test_expect_success 'check nested conflicts' '
      	(
      		cd nested_conflicts &&
      
    @@ -290,8 +290,8 @@
      	)
      '
      
    --test_expect_failure "check nested conflicts from rename/rename(2to1)" '
    -+test_expect_success "check nested conflicts from rename/rename(2to1)" '
    +-test_expect_failure 'check nested conflicts from rename/rename(2to1)' '
    ++test_expect_success 'check nested conflicts from rename/rename(2to1)' '
      	(
      		cd nested_conflicts_from_rename_rename &&
      
 6:  776dff8bc4 =  6:  1fad3428a4 merge-recursive: use handle_file_collision for add/add conflicts
 7:  45940724d5 =  7:  e7ac0d894e merge-recursive: improve rename/rename(1to2)/add[/add] handling
 -:  ---------- >  8:  9328f66ed1 fixup! merge-recursive: fix rename/add conflict handling
 -:  ---------- >  9:  d061509573 fixup! merge-recursive: improve rename/rename(1to2)/add[/add] handling
  
Elijah Newren (10):
  Add testcases for consistency in file collision conflict handling
  t6036, t6042: testcases for rename collision of already conflicting
    files
  merge-recursive: increase marker length with depth of recursion
  merge-recursive: new function for better colliding conflict
    resolutions
  merge-recursive: fix rename/add conflict handling
  merge-recursive: improve handling for rename/rename(2to1) conflicts
  merge-recursive: use handle_file_collision for add/add conflicts
  merge-recursive: improve rename/rename(1to2)/add[/add] handling
  fixup! merge-recursive: fix rename/add conflict handling
  fixup! merge-recursive: improve rename/rename(1to2)/add[/add] handling

 ll-merge.c                           |   4 +-
 ll-merge.h                           |   1 +
 merge-recursive.c                    | 528 ++++++++++++++++-----------
 t/t6036-recursive-corner-cases.sh    | 430 +++++++++++++++++++++-
 t/t6042-merge-rename-corner-cases.sh | 333 ++++++++++++++++-
 t/t6043-merge-rename-directories.sh  | 144 +++++---
 6 files changed, 1148 insertions(+), 292 deletions(-)
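
As background for the "increase marker length with depth of recursion"
patch listed above: nested conflicts are only distinguishable if the
inner and outer markers have different lengths.  The sketch below shows
the general idea with git merge-file and its --marker-size option; it
does not exercise merge-recursive itself, the file names are made up,
and the choice of 9 is purely illustrative (this cover letter does not
say how much the series lengthens markers per level).

```shell
#!/bin/sh
# Sketch only: file names and the marker size of 9 are illustrative.
set -e
work=$(mktemp -d)
cd "$work"

printf 'alpha\nbeta\ngamma\n' >base.txt
printf 'alpha\nbeta from ours\ngamma\n' >ours.txt
printf 'alpha\nbeta from theirs\ngamma\n' >theirs.txt

# -p writes the merge result to stdout instead of overwriting ours.txt.
# merge-file exits non-zero when there are conflicts, hence "|| true".
git merge-file -p ours.txt base.txt theirs.txt >default.txt || true
git merge-file -p --marker-size=9 ours.txt base.txt theirs.txt >long.txt || true

# Default markers are 7 characters long; the second run uses 9.
grep '^<<<<<<<' default.txt
grep '^<<<<<<<' long.txt
```

If the content being merged already contained 7-character markers from
an earlier conflict, the 9-character markers of the outer conflict
would remain easy to tell apart from them.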

Comments

Derrick Stolee Nov. 2, 2018, 7:09 p.m. UTC | #1

On 11/2/2018 2:53 PM, Elijah Newren wrote:
> Major question:
>    * You'll note that I edited the last two patches to mark them as RFC.
>      To be honest, I'm not sure what to do with these.  They improve code
>      coverage of new code, but the same gaps existed in the old code;
>      they only show up in the coverage-diff because I essentially moved
>      code around to a new improved function.  Since the new code doesn't
>      really add new abilities but rather just shifts the handling of
>      these conflicts to a common function, they shouldn't need any more
>      testcases than previously and modifying the existing patches thus
>      feels slightly misleading.  That line of thought leads me to believe
>      that perhaps putting them in a separate combined patch of their own
>      with a decent commit message is the right way to go.  On the other
>      hand, squashing them to the commits they're marked as fixups for
>      shows which logical part of the code the tests are related to, which
>      seems like a good thing.  So what's the right way to handle these?

I appreciate the effort you made to improve test coverage! It's 
unfortunate that this portion wasn't covered earlier, because we could 
have broken it and not noticed until a release.

I think making them separate commits is fine, and the comment on the 
test case is helpful. The fact that you only had to change the commit 
timestamps in order to get the coverage makes me reexamine the code and 
realize that maybe the "right" thing to do is to reduce our code clones. 
(This is also how I was looking at the wrong block of the patch when 
talking about it not being covered.) I'll look at the patch and see if I 
can contribute a concrete code suggestion there.

Aside: I hope that I am not being annoying by poking around with the 
test coverage reports. It does give me a way to target my review 
efforts, especially into changes that touch areas outside my expertise 
(like this one).

Thanks,

-Stolee

Elijah Newren Nov. 2, 2018, 8:06 p.m. UTC | #2

On Fri, Nov 2, 2018 at 12:09 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 11/2/2018 2:53 PM, Elijah Newren wrote:
> > Major question:
> >    * You'll note that I edited the last two patches to mark them as RFC.
> >      To be honest, I'm not sure what to do with these.  They improve code
> >      coverage of new code, but the same gaps existed in the old code;
> >      they only show up in the coverage-diff because I essentially moved
> >      code around to a new improved function.  Since the new code doesn't
> >      really add new abilities but rather just shifts the handling of
> >      these conflicts to a common function, they shouldn't need any more
> >      testcases than previously and modifying the existing patches thus
> >      feels slightly misleading.  That line of thought leads me to believe
> >      that perhaps putting them in a separate combined patch of their own
> >      with a decent commit message is the right way to go.  On the other
> >      hand, squashing them to the commits they're marked as fixups for
> >      shows which logical part of the code the tests are related to, which
> >      seems like a good thing.  So what's the right way to handle these?
>
> I appreciate the effort you made to improve test coverage! It's
> unfortunate that this portion wasn't covered earlier, because we could
> have broken it and not noticed until a release.

Yes, I agree...except that I think we might not have noticed until a
couple releases down the road; these things tend to not come up a lot
in practice.  (Which may make it even more important to pay attention
to code coverage.)

> I think making them separate commits is fine, and the comment on the
> test case is helpful. The fact that you only had to change the commit
> timestamps in order to get the coverage makes me reexamine the code and
> realize that maybe the "right" thing to do is to reduce our code clones.
> (This is also how I was looking at the wrong block of the patch when
> talking about it not being covered.) I'll look at the patch and see if I
> can contribute a concrete code suggestion there.

Yeah, I had the same feeling, _again_, while re-looking at the tests
and code as well.  I think the history of how we got here goes
something like this:

  * there is some fairly simple code to handle these rename/rename
(1to2 and 2to1) cases, with logic for handling each side being a
near-copy of the other.
  * someone does some analysis about trying to remove duplication and
notes that there are 3-4 pieces that change; adding logic to change
out those pieces and rewrite it as a loop adds some complexity, which
isn't worth it given the simple code
  * additional issues are discovered, such as D/F conflicts or
inappropriate handling of untracked or dirty files, and due to
merge-recursive's bad design[1], the fixes have to be sprinkled
*everywhere* throughout the whole code base.  Lather, rinse, repeat a
few times.
  * Now, although the original analysis of removing the duplication
was correct given the amount of code that existed at the time, the
weights have changed as new code was added to both codepaths.  But the
original analysis is long since forgotten, the code is more complex,
and we have to think about whether fixing it now is worth it if we're
going to rewrite it all anyway to fix that fundamental design flaw[2].

[1] https://public-inbox.org/git/xmqqd147kpdm.fsf@gitster.mtv.corp.google.com/
[2] https://public-inbox.org/git/xmqqk1ydkbx0.fsf@gitster.mtv.corp.google.com/

> Aside: I hope that I am not being annoying by poking around with the
> test coverage reports. It does give me a way to target my review
> efforts, especially into changes that touch areas outside my expertise
> (like this one).

Not annoying at all; I think it's a very valuable thing you're doing.
And I think these tests make things better (there have definitely been
cases in the past where a merge one way would work and a merge the
other way would have funny bugs).  I was just unsure about what the
best way to present it amongst the other changes was.