diff mbox series

[3/4] t6423: update directory rename detection tests with new rule

Message ID 146d6f4735497f0228d22fc7aae2c419363e063a.1602794791.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 8536821d05697a6da991fbecc628058329f39364
Headers show
Series [1/4] directory-rename-detection.txt: update references to regression tests | expand

Commit Message

Elijah Newren Oct. 15, 2020, 8:46 p.m. UTC
From: Elijah Newren <newren@gmail.com>

While investigating the issues highlighted by the testcase in the
previous patch, I also found a shortcoming in the directory rename
detection rules.  Split testcase 6b into two to explain this issue
and update directory-rename-detection.txt to remove one of the previous
rules that I know believe to be detrimental.  Also, update the wording
around testcase 8e; while we are not modifying the results of that
testcase, we were previously unsure of the appropriate resolution of
that test and the new rule makes the previously chosen resolution for
that testcase a bit more solid.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 .../technical/directory-rename-detection.txt  |   5 +-
 t/t6423-merge-rename-directories.sh           | 144 +++++++++++++++---
 2 files changed, 124 insertions(+), 25 deletions(-)
diff mbox series

Patch

diff --git a/Documentation/technical/directory-rename-detection.txt b/Documentation/technical/directory-rename-detection.txt
index ce042cfcae..5d03539412 100644
--- a/Documentation/technical/directory-rename-detection.txt
+++ b/Documentation/technical/directory-rename-detection.txt
@@ -70,10 +70,7 @@  additional rules:
   a) If renames split a directory into two or more others, the directory
      with the most renames, "wins".
 
-  b) Avoid directory-rename-detection for a path, if that path is the
-     source of a rename on either side of a merge.
-
-  c) Only apply implicit directory renames to directories if the other side
+  b) Only apply implicit directory renames to directories if the other side
      of history is the one doing the renaming.
 
 Limitations -- support in different commands
diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh
index 31aea47522..00eac6e9a2 100755
--- a/t/t6423-merge-rename-directories.sh
+++ b/t/t6423-merge-rename-directories.sh
@@ -1277,20 +1277,114 @@  test_expect_success '6a: Tricky rename/delete' '
 	)
 '
 
-# Testcase 6b, Same rename done on both sides
+# Testcase 6b1, Same rename done on both sides
+#   (Related to testcase 6b2 and 8e)
+#   Commit O: z/{b,c,d,e}
+#   Commit A: y/{b,c,d}, x/e
+#   Commit B: y/{b,c,d}, z/{e,f}
+#   Expected: y/{b,c,d,f}, x/e
+#   Note: Directory rename detection says A renamed z/ -> y/ (3 paths renamed
+#         to y/ and only 1 renamed to x/), therefore the new file 'z/f' in B
+#         should be moved to 'y/f'.
+#
+#         This is a bit of an edge case where any behavior might surprise users,
+#         whether that is treating A as renaming z/ -> y/, treating A as renaming
+#         z/ -> x/, or treating A as not doing any directory rename.  However, I
+#         think this answer is the least confusing and most consistent with the
+#         rules elsewhere.
+#
+#         A note about z/ -> x/, since it may not be clear how that could come
+#         about: If we were to ignore files renamed by both sides
+#         (i.e. z/{b,c,d}), as directory rename detection did in git-2.18 thru
+#         at least git-2.28, then we would note there are no renames from z/ to
+#         y/ and one rename from z/ to x/ and thus come to the conclusion that
+#         A renamed z/ -> x/.  This seems more confusing for end users than a
+#         rename of z/ to y/, it makes directory rename detection behavior
+#         harder for them to predict.  As such, we modified the rule, changed
+#         the behavior on testcases 6b2 and 8e, and introduced this 6b1 testcase.
+
+test_setup_6b1 () {
+	test_create_repo 6b1 &&
+	(
+		cd 6b1 &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		echo e >z/e &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		mkdir x &&
+		git mv y/e x/e &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		mkdir z &&
+		git mv y/e z/e &&
+		echo f >z/f &&
+		git add z/f &&
+		test_tick &&
+		git commit -m "B"
+	)
+}
+
+test_expect_failure '6b1: Same renames done on both sides, plus another rename' '
+	test_setup_6b1 &&
+	(
+		cd 6b1 &&
+
+		git checkout A^0 &&
+
+		git -c merge.directoryRenames=true merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:x/e HEAD:y/f &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:z/d    O:z/e    B:z/f &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6b2, Same rename done on both sides
 #   (Related to testcases 6c and 8e)
 #   Commit O: z/{b,c}
 #   Commit A: y/{b,c}
 #   Commit B: y/{b,c}, z/d
-#   Expected: y/{b,c}, z/d
-#   Note: If we did directory rename detection here, we'd move z/d into y/,
-#         but B did that rename and still decided to put the file into z/,
-#         so we probably shouldn't apply directory rename detection for it.
-
-test_setup_6b () {
-	test_create_repo 6b &&
+#   Expected: y/{b,c,d}
+#   Alternate: y/{b,c}, z/d
+#   Note: Directory rename detection says A renamed z/ -> y/, therefore the new
+#         file 'z/d' in B should be moved to 'y/d'.
+#
+#         We could potentially ignore the renames of z/{b,c} on side A since
+#         those were renamed on both sides.  However, it's a bit of a corner
+#         case because what if there was also a z/e that side A moved to x/e
+#         and side B left alone?  If we used the "ignore renames done on both
+#         sides" logic, then we'd compute that A renamed z/ -> x/, and move
+#         z/d to x/d.  That seems more surprising and uglier than allowing
+#         the z/ -> y/ rename.
+
+test_setup_6b2 () {
+	test_create_repo 6b2 &&
 	(
-		cd 6b &&
+		cd 6b2 &&
 
 		mkdir z &&
 		echo b >z/b &&
@@ -1318,10 +1412,10 @@  test_setup_6b () {
 	)
 }
 
-test_expect_success '6b: Same rename done on both sides' '
-	test_setup_6b &&
+test_expect_failure '6b2: Same rename done on both sides' '
+	test_setup_6b2 &&
 	(
-		cd 6b &&
+		cd 6b2 &&
 
 		git checkout A^0 &&
 
@@ -1335,7 +1429,7 @@  test_expect_success '6b: Same rename done on both sides' '
 		test_line_count = 1 out &&
 
 		git rev-parse >actual \
-			HEAD:y/b HEAD:y/c HEAD:z/d &&
+			HEAD:y/b HEAD:y/c HEAD:y/d &&
 		git rev-parse >expect \
 			O:z/b    O:z/c    B:z/d &&
 		test_cmp expect actual
@@ -1343,7 +1437,7 @@  test_expect_success '6b: Same rename done on both sides' '
 '
 
 # Testcase 6c, Rename only done on same side
-#   (Related to testcases 6b and 8e)
+#   (Related to testcases 6b1, 6b2, and 8e)
 #   Commit O: z/{b,c}
 #   Commit A: z/{b,c} (no change)
 #   Commit B: y/{b,c}, z/d
@@ -2269,14 +2363,22 @@  test_expect_success '8d: rename/delete...or not?' '
 # Notes: In commit A, directory z got renamed to y.  In commit B, directory z
 #        did NOT get renamed; the directory is still present; instead it is
 #        considered to have just renamed a subset of paths in directory z
-#        elsewhere.  However, this is much like testcase 6b (where commit B
-#        moves all the original paths out of z/ but opted to keep d
-#        within z/).  This makes it hard to judge where d should end up.
+#        elsewhere.  This is much like testcase 6b2 (where commit B moves all
+#        the original paths out of z/ but opted to keep d within z/).
+#
+#        It was not clear in the past what should be done with this testcase;
+#        in fact, I noted that I "just picked one" previously.  However,
+#        following the new logic for testcase 6b2, we should take the rename
+#        and move z/d to y/d.
 #
-#        It's possible that users would get confused about this, but what
-#        should we do instead?  It's not at all clear to me whether z/d or
-#        y/d or something else is a better resolution here, and other cases
-#        start getting really tricky, so I just picked one.
+#        6b1, 6b2, and this case are definitely somewhat fuzzy in terms of
+#        whether they are optimal for end users, but (a) the default for
+#        directory rename detection is to mark these all as conflicts
+#        anyway, (b) it feels like this is less prone to higher order corner
+#        case confusion, and (c) the current algorithm requires less global
+#        knowledge (i.e. less coupling in the algorithm between renames done
+#        on both sides) which thus means users are better able to predict
+#        the behavior, and predict it without computing as many details.
 
 test_setup_8e () {
 	test_create_repo 8e &&