Message ID | d4595397052568ea6ea0cf46190374c74140fed7.1607542887.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | merge-ort: add basic rename detection | expand |
On 12/9/2020 2:41 PM, Elijah Newren via GitGitGadget wrote: > From: Elijah Newren <newren@gmail.com> > > Implement rename/rename(1to2) handling, i.e. both sides of history > renaming a file and rename it differently. This code replaces the > following from merge-recurisve.c: > > * all the 1to2 code in process_renames() > * the RENAME_ONE_FILE_TO_TWO case of process_entry() > * handle_rename_rename_1to2() > > Also, there is some shared code from merge-recursive.c for multiple > different rename cases which we will no longer need for this case (or > other rename cases): > > * handle_file_collision() > * setup_rename_conflict_info() > > The consolidation of five separate codepaths into one is made possible > by a change in design: Excellent! > /* This is a rename/rename(1to2) */ > - die("Not yet implemented"); > + clean_merge = handle_content_merge(opt, > + pair->one->path, > + &base->stages[0], > + &side1->stages[1], > + &side2->stages[2], > + pathnames, > + 1 + 2 * opt->priv->call_depth, > + &merged); (this method currently die()s. ok) > + if (!clean_merge && > + merged.mode == side1->stages[1].mode && > + oideq(&merged.oid, &side1->stages[1].oid)) { > + was_binary_blob = 1; > + } nit: Extraneous braces? > + memcpy(&side1->stages[1], &merged, sizeof(merged)); > + if (was_binary_blob) { > + /* > + * Getting here means we were attempting to > + * merge a binary blob. > + * > + * Since we can't merge binaries, > + * handle_content_merge() just takes one > + * side. But we don't want to copy the > + * contents of one side to both paths. We > + * used the contents of side1 above for > + * side1->stages, let's use the contents of > + * side2 for side2->stages below. > + */ > + oidcpy(&merged.oid, &side2->stages[2].oid); > + merged.mode = side2->stages[2].mode; > + } > + memcpy(&side2->stages[2], &merged, sizeof(merged)); > + > + side1->path_conflict = 1; > + side2->path_conflict = 1; > + /* > + * TODO: For renames we normally remove the path at the > + * old name. It would thus seem consistent to do the > + * same for rename/rename(1to2) cases, but we haven't > + * done so traditionally and a number of the regression > + * tests now encode an expectation that the file is > + * left there at stage 1. If we ever decide to change > + * this, add the following two lines here: > + * base->merged.is_null = 1; > + * base->merged.clean = 1; > + * and remove the setting of base->path_conflict to 1. > + */ > + base->path_conflict = 1; I'm getting the point of the review/evening where I'm starting to gloss over these important details. Time to take a break (after this patch). > + path_msg(opt, oldpath, 0, > + _("CONFLICT (rename/rename): %s renamed to " > + "%s in %s and to %s in %s."), > + pathnames[0], > + pathnames[1], opt->branch1, > + pathnames[2], opt->branch2); This output differs a bit from handle_rename_rename_1to2() in merge-recursive.c: output(opt, 1, _("CONFLICT (rename/rename): " "Rename \"%s\"->\"%s\" in branch \"%s\" " "rename \"%s\"->\"%s\" in \"%s\"%s"), o->path, a->path, ci->ren1->branch, o->path, b->path, ci->ren2->branch, opt->priv->call_depth ? _(" (left unresolved)") : ""); How much do we want to have _exact_ output matches between the two strategies, at least in the short term? > @@ -1257,13 +1309,13 @@ static void process_entry(struct merge_options *opt, > int side = (ci->filemask == 4) ? 2 : 1; > ci->merged.result.mode = ci->stages[side].mode; > oidcpy(&ci->merged.result.oid, &ci->stages[side].oid); > - ci->merged.clean = !ci->df_conflict; > + ci->merged.clean = !ci->df_conflict && !ci->path_conflict; > } else if (ci->filemask == 1) { > /* Deleted on both sides */ > ci->merged.is_null = 1; > ci->merged.result.mode = 0; > oidcpy(&ci->merged.result.oid, &null_oid); > - ci->merged.clean = 1; > + ci->merged.clean = !ci->path_conflict; These exist because this is the first time we assign path_conflict. Sure. Thanks, -Stolee
On Thu, Dec 10, 2020 at 7:39 PM Derrick Stolee <stolee@gmail.com> wrote: > > On 12/9/2020 2:41 PM, Elijah Newren via GitGitGadget wrote: > > From: Elijah Newren <newren@gmail.com> > > > > Implement rename/rename(1to2) handling, i.e. both sides of history > > renaming a file and rename it differently. This code replaces the > > following from merge-recurisve.c: > > > > * all the 1to2 code in process_renames() > > * the RENAME_ONE_FILE_TO_TWO case of process_entry() > > * handle_rename_rename_1to2() > > > > Also, there is some shared code from merge-recursive.c for multiple > > different rename cases which we will no longer need for this case (or > > other rename cases): > > > > * handle_file_collision() > > * setup_rename_conflict_info() > > > > The consolidation of five separate codepaths into one is made possible > > by a change in design: > > Excellent! > > > /* This is a rename/rename(1to2) */ > > - die("Not yet implemented"); > > + clean_merge = handle_content_merge(opt, > > + pair->one->path, > > + &base->stages[0], > > + &side1->stages[1], > > + &side2->stages[2], > > + pathnames, > > + 1 + 2 * opt->priv->call_depth, > > + &merged); > > (this method currently die()s. ok) > > > + if (!clean_merge && > > + merged.mode == side1->stages[1].mode && > > + oideq(&merged.oid, &side1->stages[1].oid)) { > > + was_binary_blob = 1; > > + } > > nit: Extraneous braces? Yeah. > > + memcpy(&side1->stages[1], &merged, sizeof(merged)); > > + if (was_binary_blob) { > > + /* > > + * Getting here means we were attempting to > > + * merge a binary blob. > > + * > > + * Since we can't merge binaries, > > + * handle_content_merge() just takes one > > + * side. But we don't want to copy the > > + * contents of one side to both paths. We > > + * used the contents of side1 above for > > + * side1->stages, let's use the contents of > > + * side2 for side2->stages below. > > + */ > > + oidcpy(&merged.oid, &side2->stages[2].oid); > > + merged.mode = side2->stages[2].mode; > > + } > > + memcpy(&side2->stages[2], &merged, sizeof(merged)); > > + > > + side1->path_conflict = 1; > > + side2->path_conflict = 1; > > + /* > > + * TODO: For renames we normally remove the path at the > > + * old name. It would thus seem consistent to do the > > + * same for rename/rename(1to2) cases, but we haven't > > + * done so traditionally and a number of the regression > > + * tests now encode an expectation that the file is > > + * left there at stage 1. If we ever decide to change > > + * this, add the following two lines here: > > + * base->merged.is_null = 1; > > + * base->merged.clean = 1; > > + * and remove the setting of base->path_conflict to 1. > > + */ > > + base->path_conflict = 1; > > I'm getting the point of the review/evening where I'm starting to gloss > over these important details. Time to take a break (after this patch). I'm surprised you didn't take a break after giving a talk at GitHub Universe earlier yesterday. I don't think I got anything done the rest of the day after I gave my talk at GitMerge 2020. Nice job on your talk, by the way; I'm sending it along to some others to watch. > > + path_msg(opt, oldpath, 0, > > + _("CONFLICT (rename/rename): %s renamed to " > > + "%s in %s and to %s in %s."), > > + pathnames[0], > > + pathnames[1], opt->branch1, > > + pathnames[2], opt->branch2); > > This output differs a bit from handle_rename_rename_1to2() in > merge-recursive.c: > > output(opt, 1, _("CONFLICT (rename/rename): " > "Rename \"%s\"->\"%s\" in branch \"%s\" " > "rename \"%s\"->\"%s\" in \"%s\"%s"), > o->path, a->path, ci->ren1->branch, > o->path, b->path, ci->ren2->branch, > opt->priv->call_depth ? _(" (left unresolved)") : ""); > > How much do we want to have _exact_ output matches between the > two strategies, at least in the short term? Good question. I started with such a goal, and discarded it when I discovered it was unrealistic and even actively harmful -- at least in the extreme of exact matching in all cases. I've already updated the regression tests to expect differences in output and behavior in a few different series that merged down months ago[1][2][3]. [1] https://lore.kernel.org/git/pull.827.v3.git.git.1597098559.gitgitgadget@gmail.com/ [2] https://lore.kernel.org/git/pull.769.v2.git.1603731704.gitgitgadget@gmail.com/ [3] https://lore.kernel.org/git/pull.879.git.git.1602794790.gitgitgadget@gmail.com/ Of particular note from those series, at least as far as output messages, are the following commits: 1f3c9ba707 ("t6425: be more flexible with rename/delete conflict messages", 2020-08-10) 2a7c16c980 ("t6422, t6426: be more flexible for add/add conflicts involving renames", 2020-08-10) c8c35f6a02 ("merge tests: expect slight differences in output for recursive vs. ort", 2020-10-26) Summarizing those, I split some conflict messages into multiple messages, changed the name of some conflict types that get reported, avoided having some messages go to stdout while others go to stderr (based on who added them rather than differences in severity of the message), and even just changed some messages (because once you've accepted the other changes, exact matching just doesn't matter). Other commits in the series also note how I changed the merge behavior in addition to the output in various cases. The rename/delete conflict commit message is particularly illustrative. It demonstrates why exact matching of output messages is unachievable short of keeping a completely busted code design. We can certainly tweak individual messages if we feel something will make them clearer, but matching isn't a goal for me. > > @@ -1257,13 +1309,13 @@ static void process_entry(struct merge_options *opt, > > int side = (ci->filemask == 4) ? 2 : 1; > > ci->merged.result.mode = ci->stages[side].mode; > > oidcpy(&ci->merged.result.oid, &ci->stages[side].oid); > > - ci->merged.clean = !ci->df_conflict; > > + ci->merged.clean = !ci->df_conflict && !ci->path_conflict; > > } else if (ci->filemask == 1) { > > /* Deleted on both sides */ > > ci->merged.is_null = 1; > > ci->merged.result.mode = 0; > > oidcpy(&ci->merged.result.oid, &null_oid); > > - ci->merged.clean = 1; > > + ci->merged.clean = !ci->path_conflict; > > These exist because this is the first time we assign path_conflict. > Sure. > > Thanks, > -Stolee Thanks for all your detailed reviews!
diff --git a/merge-ort.c b/merge-ort.c index 085e81196a5..75e638a23eb 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -680,7 +680,59 @@ static int process_renames(struct merge_options *opt, } /* This is a rename/rename(1to2) */ - die("Not yet implemented"); + clean_merge = handle_content_merge(opt, + pair->one->path, + &base->stages[0], + &side1->stages[1], + &side2->stages[2], + pathnames, + 1 + 2 * opt->priv->call_depth, + &merged); + if (!clean_merge && + merged.mode == side1->stages[1].mode && + oideq(&merged.oid, &side1->stages[1].oid)) { + was_binary_blob = 1; + } + memcpy(&side1->stages[1], &merged, sizeof(merged)); + if (was_binary_blob) { + /* + * Getting here means we were attempting to + * merge a binary blob. + * + * Since we can't merge binaries, + * handle_content_merge() just takes one + * side. But we don't want to copy the + * contents of one side to both paths. We + * used the contents of side1 above for + * side1->stages, let's use the contents of + * side2 for side2->stages below. + */ + oidcpy(&merged.oid, &side2->stages[2].oid); + merged.mode = side2->stages[2].mode; + } + memcpy(&side2->stages[2], &merged, sizeof(merged)); + + side1->path_conflict = 1; + side2->path_conflict = 1; + /* + * TODO: For renames we normally remove the path at the + * old name. It would thus seem consistent to do the + * same for rename/rename(1to2) cases, but we haven't + * done so traditionally and a number of the regression + * tests now encode an expectation that the file is + * left there at stage 1. If we ever decide to change + * this, add the following two lines here: + * base->merged.is_null = 1; + * base->merged.clean = 1; + * and remove the setting of base->path_conflict to 1. + */ + base->path_conflict = 1; + path_msg(opt, oldpath, 0, + _("CONFLICT (rename/rename): %s renamed to " + "%s in %s and to %s in %s."), + pathnames[0], + pathnames[1], opt->branch1, + pathnames[2], opt->branch2); i++; /* We handled both renames, i.e. i+1 handled */ continue; @@ -1257,13 +1309,13 @@ static void process_entry(struct merge_options *opt, int side = (ci->filemask == 4) ? 2 : 1; ci->merged.result.mode = ci->stages[side].mode; oidcpy(&ci->merged.result.oid, &ci->stages[side].oid); - ci->merged.clean = !ci->df_conflict; + ci->merged.clean = !ci->df_conflict && !ci->path_conflict; } else if (ci->filemask == 1) { /* Deleted on both sides */ ci->merged.is_null = 1; ci->merged.result.mode = 0; oidcpy(&ci->merged.result.oid, &null_oid); - ci->merged.clean = 1; + ci->merged.clean = !ci->path_conflict; } /*