diff mbox series

merge-tree.c: add --merge-base=<commit> option

Message ID pull.1397.git.1666872761355.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series merge-tree.c: add --merge-base=<commit> option | expand

Commit Message

Kyle Zhao Oct. 27, 2022, 12:12 p.m. UTC
From: Kyle Zhao <kylezhao@tencent.com>

This option allows users to specify a merge-base commit for the merge.

It will give our callers more flexibility to use the `git merge-tree`.
For example:

    git merge-tree --merge-base=<sha1>^1 source-branch <sha1>

This allows us to implement `git cherry-pick` in bare repositories.

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
---
    merge-tree.c: add --merge-base= option
    
    Thanks for Elijah's work. I'm very excited that merge-ort is integrated
    into the git merge-tree, which means that we can use merge-ort in bare
    repositories to optimize merge performance.
    
    In this patch, I introduce a new --merge-base=<commit> option to allow
    callers to specify a merge-base for the merge. This may allow users to
    implement git cherry-pick and git rebase in bare repositories with git
    merge-tree cmd.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1397%2Fkeyu98%2Fkz%2Fmerge-tree-option-merge-base-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1397/keyu98/kz/merge-tree-option-merge-base-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1397

 Documentation/git-merge-tree.txt |  4 +++
 builtin/merge-tree.c             | 23 +++++++++++++----
 t/t4301-merge-tree-write-tree.sh | 44 ++++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+), 5 deletions(-)


base-commit: db29e6bbaed156ae9025ff0474fd788e58230367

Comments

Junio C Hamano Oct. 27, 2022, 6:09 p.m. UTC | #1
"Kyle Zhao via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> index d6c356740ef..e762209b76d 100644
> --- a/Documentation/git-merge-tree.txt
> +++ b/Documentation/git-merge-tree.txt
> @@ -64,6 +64,10 @@ OPTIONS
>  	share no common history.  This flag can be given to override that
>  	check and make the merge proceed anyway.
>  
> +--merge-base=<commit>::
> +	Instead of finding the merge-bases for <branch1> and <branch2>,
> +	specify a merge-base for the merge.

I like adding and exposing this feature to allow the end-user
specify which commit to use as the base (instead of allowing the
tool compute it from the two branches), but I wonder if a new option
is even needed.

In the original "trivial merge" mode, the command takes three trees
without having to have this new option.  In the new "write-tree"
mode, currently it is incapable of taking the base, but it does not
have to stay that way.  Wouldn't it be sufficient to update the UI
to

    git merge-tree [--write-tree] [<options>] [<base-commit>] <branch1> <branch2>
    git merge-tree [--trivial-merge] <base-commit> <branch1> <branch2>

IOW, when you want to supply the base, you'd be explicit and ask for
the new "write-tree" mode, i.e.

    $ git merge-tree --write-tree $(git merge-base branch^ branch) HEAD branch 

would be how you would use merge-tree to cherry-pick the commit at
the tip of the branch on top of the current commit.

> @@ -402,6 +403,7 @@ struct merge_tree_options {
>  	int allow_unrelated_histories;
>  	int show_messages;
>  	int name_only;
> +	char* merge_base;

Style.  We write in C, not in C++, and our asterisks stick to
variables and members of structs, not types.

> -	/*
> -	 * Get the merge bases, in reverse order; see comment above
> -	 * merge_incore_recursive in merge-ort.h
> -	 */
> -	merge_bases = get_merge_bases(parent1, parent2);
> +	if (o->merge_base) {
> +		struct commit *c = lookup_commit_reference_by_name(o->merge_base);
> +		if (!c)
> +			die(_("could not lookup commit %s"), o->merge_base);
> +		commit_list_insert(c, &merge_bases);

Curious.  The original code unconditionally assigned merge_bases, so
there wasn't a good reason to initialize the variable before this point,
but this new code assumes that merge_bases to be initialized to NULL.

Luckily, it is initialized in the original code, even though it
wasn't necessary at all.  So this new code can work correctly.
Good.

> +	} else {
> +		/*
> +		 * Get the merge bases, in reverse order; see comment above
> +		 * merge_incore_recursive in merge-ort.h
> +		 */
> +		merge_bases = get_merge_bases(parent1, parent2);
> +	}

Yes, this feature was very much lacking and is a welcome addition.

I also have to wonder how this should interact with a topic that is
in-flight to feed multiple merge-tree requests from the standard
input to have a single process perform multiple (not necessarily
related) merges.  Elijah knows much better, but my gut feeling is
that it shouldn't be hard to allow feeding an extra commit on the
same line to be used as the base.

Thanks.
Elijah Newren Oct. 28, 2022, 8:13 a.m. UTC | #2
Hi Kyle!

Thanks for the interest in this area of the code.  There's lots of
interesting history you're probably not aware of here...

On Thu, Oct 27, 2022 at 5:12 AM Kyle Zhao via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Kyle Zhao <kylezhao@tencent.com>
>
> This option allows users to specify a merge-base commit for the merge.

Note that this has been implemented previously.  I may have
implemented it previously and ripped it out (about a year ago), or
maybe I just avoided it once I ran across the reasons below.  Dscho
also implemented it over at
https://lore.kernel.org/git/nycvar.QRO.7.76.6.2202210956430.26495@tvgsbejvaqbjf.bet/

You may want to take a look at his code for comparison, especially the
merge_incore_nonrecursive() stuff.

> It will give our callers more flexibility to use the `git merge-tree`.
> For example:
>
>     git merge-tree --merge-base=<sha1>^1 source-branch <sha1>
>
> This allows us to implement `git cherry-pick` in bare repositories.

This capability is something we may ultimately want for completeness,
but it definitely should not be marketed as a building block for
implementing cherry-pick or rebase.  There are several reasons for
that:

Performance reasons:
  * it's a design with a separate fork for every commit to be picked,
which is quite wasteful.  cherry-pick and rebase should do better.
  * it is explicitly single-commit replaying, which discards the
remembering renames optimization I placed into merge-ort.  We should
have something series-aware so it can take advantage of that.

Usability/Design reasons:
  * with cherry-picking or rebasing you often have a sequence of
commits.  Picking them one at a time not only has the performance
issues above, but also introduces usability and design questions
around the creation of commits from the toplevel trees created by
merge-tree.  For normal merges, we have no idea what the commit
message should be without explicit directives and user editing; this
is reflected in the fact that `git merge` has 9 different flags for
controlling how to specify the commit message or components of it or
how to tweak it or whatever.  I really don't want to copy all of that
logic from git-merge into git-merge-tree, and merge-tree is therefore
just a dry-run merge.  However, in the case of cherry-picking or
rebasing, we *do* have a commit message (and author) that we can just
re-use.  But if we make merge-tree handle the commit message and
create commits, you are fundamentally changing the output style of
merge-tree that we have carefully documented (and promised backward
compatibility on).  Figuring out how to extend the output and when
(and do we also allow that for non-specified merge bases) probably
requires some careful thought.
  * a cherry-pick or rebase tool actually might want to do something
clever with sequences of replayed commits.  For example, if later
commits in the series have commit messages that refer to earlier
commits in the series by their (abbreviated) commit hash, it would be
nice to update those (abbreviated) commit hashes to instead refer to
the replayed version of the earlier commit.  But that fundamentally
requires operating on a series rather than an individual commit.

Correctness/capability reason:
  * this patch is fundamentally limited to replaying regular commits.
Replaying merge-commits correctly doesn't fit within this design; it
needs a different framework.

Maintenance reason:
  * suggesting this code as a basis for cherry-pick or rebase is
likely to lead to yet another shell-scripted rebase; we've been trying
to generally nuke shell scripts from Git for a variety of reasons.
It'd be sad to regress here.

> Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
> ---
>     merge-tree.c: add --merge-base= option
>
>     Thanks for Elijah's work. I'm very excited that merge-ort is integrated
>     into the git merge-tree, which means that we can use merge-ort in bare
>     repositories to optimize merge performance.
>
>     In this patch, I introduce a new --merge-base=<commit> option to allow
>     callers to specify a merge-base for the merge. This may allow users to
>     implement git cherry-pick and git rebase in bare repositories with git
>     merge-tree cmd.

My apologies for having very limited git time (which is often entirely
used up just in reviewing/responding to patches and other queries on
the list instead of on new features like this, or maybe on making some
nice slides for a conference); if I had more time, I think git-replay
could have easily been done many months ago (or perhaps even last
year).  Then you'd have the high level tool you need for server side
cherry-picking and rebasing.  But, I haven't had much time.  So, it
makes sense that folks might want to use an interim hack while waiting
for the more robust tool to materialize.

> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1397%2Fkeyu98%2Fkz%2Fmerge-tree-option-merge-base-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1397/keyu98/kz/merge-tree-option-merge-base-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1397
>
>  Documentation/git-merge-tree.txt |  4 +++
>  builtin/merge-tree.c             | 23 +++++++++++++----
>  t/t4301-merge-tree-write-tree.sh | 44 ++++++++++++++++++++++++++++++++
>  3 files changed, 66 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> index d6c356740ef..e762209b76d 100644
> --- a/Documentation/git-merge-tree.txt
> +++ b/Documentation/git-merge-tree.txt
> @@ -64,6 +64,10 @@ OPTIONS
>         share no common history.  This flag can be given to override that
>         check and make the merge proceed anyway.
>
> +--merge-base=<commit>::
> +       Instead of finding the merge-bases for <branch1> and <branch2>,
> +       specify a merge-base for the merge.

We could potentially just handle this as an additional positional
argument, which might be more in-keeping with merge-tree style.
However...

* it would force the --write-tree option to be specified when three
commits are listed...at least until we can declare the deprecation
period for --trivial-merge over and nuke that code and flag.
* if we ever want to allow octopus merges, the additional positional
argument does not fly well; the explicit flag for --merge-base might
be necessary.

So, this kind of raises questions on what the deprecation period for
--trivial-merge should be, and whether we ever want to support octopus
merges within git-merge-tree.

>  [[OUTPUT]]
>  OUTPUT

No output changes...makes sense, but kind of reinforces the fact that
this really isn't a basis for cherry-pick or rebase (as noted above).

>  ------
> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index ae5782917b9..adc461f00f3 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -3,6 +3,7 @@
>  #include "tree-walk.h"
>  #include "xdiff-interface.h"
>  #include "help.h"
> +#include "commit.h"
>  #include "commit-reach.h"
>  #include "merge-ort.h"
>  #include "object-store.h"
> @@ -402,6 +403,7 @@ struct merge_tree_options {
>         int allow_unrelated_histories;
>         int show_messages;
>         int name_only;
> +       char* merge_base;
>  };
>
>  static int real_merge(struct merge_tree_options *o,
> @@ -430,11 +432,18 @@ static int real_merge(struct merge_tree_options *o,
>         opt.branch1 = branch1;
>         opt.branch2 = branch2;
>
> -       /*
> -        * Get the merge bases, in reverse order; see comment above
> -        * merge_incore_recursive in merge-ort.h
> -        */
> -       merge_bases = get_merge_bases(parent1, parent2);
> +       if (o->merge_base) {
> +               struct commit *c = lookup_commit_reference_by_name(o->merge_base);
> +               if (!c)
> +                       die(_("could not lookup commit %s"), o->merge_base);

Would it make sense to have o->merge_base be a struct commit *, and
move this lookup/die logic out to the parsing logic in
cmd_merge_tree()?

Just a thought.

> +               commit_list_insert(c, &merge_bases);
> +       } else {
> +               /*
> +                * Get the merge bases, in reverse order; see comment above
> +                * merge_incore_recursive in merge-ort.h
> +                */
> +               merge_bases = get_merge_bases(parent1, parent2);
> +       }
>         if (!merge_bases && !o->allow_unrelated_histories)
>                 die(_("refusing to merge unrelated histories"));
>         merge_bases = reverse_commit_list(merge_bases);

and the next line of code, not showing in the context here, is a call
to merge_incore_recursive().  While that technically works, it doesn't
make logical sense.  You're not doing a recursive merge when you have
a specified merge base, so I think the code should instead call
merge_incore_nonrecursive() in such a case (see Dscho's code for a
comparison here).

> @@ -505,6 +514,10 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
>                            &o.allow_unrelated_histories,
>                            N_("allow merging unrelated histories"),
>                            PARSE_OPT_NONEG),
> +               OPT_STRING(0, "merge-base",
> +                        &o.merge_base,
> +                        N_("commit"),
> +                        N_("specify a merge-base commit for the merge")),
>                 OPT_END()
>         };
>
> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 013b77144bd..032ff4a1c3d 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -819,4 +819,48 @@ test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository
>         test_must_fail git -C read-only merge-tree side1 side2
>  '
>
> +# specify merge-base as parent of branch2.
> +# git merge-tree --merge-base=A O B
> +#   Commit O: foo, bar
> +#   Commit A: modify foo after Commit O
> +#   Commit B: modify bar after Commit A
> +#   Expected: foo is unchanged, modify bar

So, the O-A-B thing thing was taken from t6423; it appears I didn't
copy the comment at the top of t6423 over to t4301 to explain this.
Sorry about that.  Anyway, for these types of comments in these files,
O is always the merge base of both A and B, and neither A nor B
contain each other in their history.  From that basis, your
description here makes no sense (A would be the tip of some branch,
not the parent of anything).  I'd call your commits something else
(maybe just C1, C2, and C3 or something?).

> +
> +test_expect_success 'specify merge-base as parent of branch2' '
> +       # Setup
> +       git init base-b2-p && (
> +               cd base-b2-p &&
> +               echo foo >foo &&
> +               echo bar >bar &&
> +               git add foo bar &&
> +               git commit -m O &&
> +
> +               git branch O &&
> +               git branch A &&
> +
> +               git checkout A &&
> +               echo "A" >foo &&
> +               git add foo &&
> +               git commit -m A &&
> +
> +               git checkout -b B &&
> +               echo "B" >bar &&
> +               git add bar &&
> +               git commit -m B

Would test_commit be useful here, given that you aren't worried about
the exact contents of files?

> +       ) &&
> +       # Testing
> +       (
> +               cd base-b2-p &&
> +               TREE_OID=$(git merge-tree --merge-base=A O B) &&
> +
> +               q_to_tab <<-EOF >expect &&
> +               100644 blob $(git rev-parse B:bar)Qbar
> +               100644 blob $(git rev-parse O:foo)Qfoo
> +               EOF
> +
> +               git ls-tree $TREE_OID >actual &&
> +               test_cmp expect actual
> +       )
> +'
> +
>  test_done
>
> base-commit: db29e6bbaed156ae9025ff0474fd788e58230367
> --
> gitgitgadget
Elijah Newren Oct. 28, 2022, 8:20 a.m. UTC | #3
On Thu, Oct 27, 2022 at 11:09 AM Junio C Hamano <gitster@pobox.com> wrote:
>
[...]
> Wouldn't it be sufficient to update the UI to
>
>     git merge-tree [--write-tree] [<options>] [<base-commit>] <branch1> <branch2>
>     git merge-tree [--trivial-merge] <base-commit> <branch1> <branch2>
>
> IOW, when you want to supply the base, you'd be explicit and ask for
> the new "write-tree" mode, i.e.
>
>     $ git merge-tree --write-tree $(git merge-base branch^ branch) HEAD branch
>
> would be how you would use merge-tree to cherry-pick the commit at
> the tip of the branch on top of the current commit.

This was my thought too; but would we be painting ourselves into a
corner if we ever want to make merge-tree support octopus merges?

Also, why did you write $(git merge-base branch^ branch) rather than
just branch^ ?

> I also have to wonder how this should interact with a topic that is
> in-flight to feed multiple merge-tree requests from the standard
> input to have a single process perform multiple (not necessarily
> related) merges.  Elijah knows much better, but my gut feeling is
> that it shouldn't be hard to allow feeding an extra commit on the
> same line to be used as the base.

Yeah, I don't think that'd be too hard...if we could rule out ever
supporting octopus merges in merge-tree (which I'm not so sure is a
good assumption).  Otherwise, we might need to figure out the
appropriate backward-compatible input parsing (and output format
changes?)
Kyle Zhao Oct. 28, 2022, 11:43 a.m. UTC | #4
Thanks for your reviews.

> In the original "trivial merge" mode, the command takes three trees
> without having to have this new option.  In the new "write-tree"
> mode, currently it is incapable of taking the base, but it does not
> have to stay that way.  Wouldn't it be sufficient to update the UI
> to
>
>     git merge-tree [--write-tree] [<options>] [<base-commit>] <branch1> <branch2>
>     git merge-tree [--trivial-merge] <base-commit> <branch1> <branch2>
>
> IOW, when you want to supply the base, you'd be explicit and ask for
> the new "write-tree" mode, i.e.
>
>     $ git merge-tree --write-tree $(git merge-base branch^ branch) HEAD branch 
>
> would be how you would use merge-tree to cherry-pick the commit at
> the tip of the branch on top of the current commit.

Referring to Newren's reply, if we need to implement octopus merges for git-merge-tree in the future,  still need a new option, 
so I haven't modified it yet.

>> @@ -402,6 +403,7 @@ struct merge_tree_options {
>>  	int allow_unrelated_histories;
>>  	int show_messages;
>>  	int name_only;
>> +	char* merge_base;
>
> Style.  We write in C, not in C++, and our asterisks stick to variables and members of structs, not types.

Done.
Kyle Zhao Oct. 28, 2022, 11:54 a.m. UTC | #5
> Note that this has been implemented previously.  I may have implemented it previously and ripped it out (about a year ago), or maybe I just avoided it once I ran across the reasons below.  Dscho also implemented it over at https://lore.kernel.org/git/nycvar.QRO.7.76.6.2202210956430.26495@tvgsbejvaqbjf.bet/
>
> You may want to take a look at his code for comparison, especially the
merge_incore_nonrecursive() stuff.

Thanks. I'll take a look later.

> This capability is something we may ultimately want for completeness,
> but it definitely should not be marketed as a building block for
> implementing cherry-pick or rebase.  There are several reasons for
> that:
> 
> Performance reasons:
>  * it's a design with a separate fork for every commit to be picked,
> ....
> Maintenance reason:
>   * suggesting this code as a basis for cherry-pick or rebase is
> likely to lead to yet another shell-scripted rebase; we've been trying
> to generally nuke shell scripts from Git for a variety of reasons.
> It'd be sad to regress here.

Thank you for your very detailed explanation, I feel like I have a deeper understanding of rebase and cherry-pick.

> My apologies for having very limited git time (which is often entirely used up just in reviewing/responding to patches and other queries on the list instead of on new features like this, or maybe on making some nice slides for a conference); if I had more time, I think git-replay could have easily been done many months ago (or perhaps even last year).  Then you'd have the high level tool you need for server side cherry-picking and rebasing.  But, I haven't had much time.  So, it makes sense that folks might want to use an interim hack while waiting for the more robust tool to materialize.

git-replay sounds great, looking forward to it one day.

> Would it make sense to have o->merge_base be a struct commit *, and move this lookup/die logic out to the parsing logic in cmd_merge_tree()?
Done.

> and the next line of code, not showing in the context here, is a call
> to merge_incore_recursive().  While that technically works, it doesn't
> make logical sense.  You're not doing a recursive merge when you have
> a specified merge base, so I think the code should instead call
> merge_incore_nonrecursive() in such a case (see Dscho's code for a
> comparison here).
Done. I've changed it to merge_incore_recursive(). It works,  but I don't know if it's written correctly, especially opt.ancestor.

> So, the O-A-B thing thing was taken from t6423; it appears I didn't copy the comment at the top of t6423 over to t4301 to explain this.
> Sorry about that.  Anyway, for these types of comments in these files, O is always the merge base of both A and B, and neither A nor B contain each other in their history.  From that basis, your description here makes no sense (A would be the tip of some branch, not the parent of anything).  I'd call your commits something else (maybe just C1, C2, and C3 or something?).

Done.

> Would test_commit be useful here, given that you aren't worried about
> the exact contents of files?

It's useful, which makes the test clearer.
Junio C Hamano Oct. 28, 2022, 4:03 p.m. UTC | #6
Elijah Newren <newren@gmail.com> writes:

> On Thu, Oct 27, 2022 at 11:09 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
> [...]
>> Wouldn't it be sufficient to update the UI to
>>
>>     git merge-tree [--write-tree] [<options>] [<base-commit>] <branch1> <branch2>
>>     git merge-tree [--trivial-merge] <base-commit> <branch1> <branch2>
>>
>> IOW, when you want to supply the base, you'd be explicit and ask for
>> the new "write-tree" mode, i.e.
>>
>>     $ git merge-tree --write-tree $(git merge-base branch^ branch) HEAD branch
>>
>> would be how you would use merge-tree to cherry-pick the commit at
>> the tip of the branch on top of the current commit.
>
> This was my thought too; but would we be painting ourselves into a
> corner if we ever want to make merge-tree support octopus merges?

True, the above suggestion was especially bad as it would not be
possible to extend to support multiple bases *and* octopus at the
same time.  Consider it scrapped.  Taking zero or more --base-commit
options explicitly from the command line would be a better
interface.

> Also, why did you write $(git merge-base branch^ branch) rather than
> just branch^ ?

Just to be explicit which one is doing what.

> Yeah, I don't think that'd be too hard...if we could rule out ever
> supporting octopus merges in merge-tree (which I'm not so sure is a
> good assumption).  Otherwise, we might need to figure out the
> appropriate backward-compatible input parsing (and output format
> changes?)

I'd prefer an approach to tackle one thing at a time while making
sure we leave the door open for the future ones.  I think the
backend interface from "merge" to external strategies use a
separator to signal the boundary between the heads being merged
(i.e. branchN above) and the bases being used, which is easy to
parse and support multiple bases and octopus at the same time.

As to making it easy to implement "cherry-pick", I do not think you
should dogmatically forbid it on the basis for merge-tree from the
command line is inherently one mergy operation per invocation.  You
will have the --stdin interface, and a way forward may be a notation
to refer to previous results in the --stdin input stream.  That way,
a single invocation of merge-tree "server" can be fed a sequence of
3-way merges that are actually steps of multi-commit cherry-pick,
that merge the differences of multiple commits on top of the
previous result, one step at a time.

IOW, as long as you make sure --stdin interface is rich enough to
allow you do what people would do a sequence of "git merge-tree"
single shot invocations (perhaps even driven by a script), you would
be OK, I would think.
Elijah Newren Oct. 29, 2022, 1:52 a.m. UTC | #7
On Fri, Oct 28, 2022 at 9:03 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> > Yeah, I don't think that'd be too hard...if we could rule out ever
> > supporting octopus merges in merge-tree (which I'm not so sure is a
> > good assumption).  Otherwise, we might need to figure out the
> > appropriate backward-compatible input parsing (and output format
> > changes?)
>
> I'd prefer an approach to tackle one thing at a time while making
> sure we leave the door open for the future ones.  I think the
> backend interface from "merge" to external strategies use a
> separator to signal the boundary between the heads being merged
> (i.e. branchN above) and the bases being used, which is easy to
> parse and support multiple bases and octopus at the same time.

Ooh, the separator idea sounds like a good solution.

> As to making it easy to implement "cherry-pick", I do not think you
> should dogmatically forbid it on the basis for merge-tree from the
> command line is inherently one mergy operation per invocation.  You
> will have the --stdin interface, and a way forward may be a notation
> to refer to previous results in the --stdin input stream.  That way,
> a single invocation of merge-tree "server" can be fed a sequence of
> 3-way merges that are actually steps of multi-commit cherry-pick,
> that merge the differences of multiple commits on top of the
> previous result, one step at a time.
>
> IOW, as long as you make sure --stdin interface is rich enough to
> allow you do what people would do a sequence of "git merge-tree"
> single shot invocations (perhaps even driven by a script), you would
> be OK, I would think.

I agree that this kind of functionality probably makes sense for
inclusion from a completeness perspective, but I still maintain that
if its sole purpose is implementing a cherry-pick command then it's at
best an ill-advised or interim hack and we should instead do something
better.  While combining this patch with --stdin does solve the
"single operation per invocation" issue, that's only one of six issues
I listed in my email to Kyle about why this is the wrong base for
building a cherry-pick alternative.
diff mbox series

Patch

diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
index d6c356740ef..e762209b76d 100644
--- a/Documentation/git-merge-tree.txt
+++ b/Documentation/git-merge-tree.txt
@@ -64,6 +64,10 @@  OPTIONS
 	share no common history.  This flag can be given to override that
 	check and make the merge proceed anyway.
 
+--merge-base=<commit>::
+	Instead of finding the merge-bases for <branch1> and <branch2>,
+	specify a merge-base for the merge.
+
 [[OUTPUT]]
 OUTPUT
 ------
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index ae5782917b9..adc461f00f3 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -3,6 +3,7 @@ 
 #include "tree-walk.h"
 #include "xdiff-interface.h"
 #include "help.h"
+#include "commit.h"
 #include "commit-reach.h"
 #include "merge-ort.h"
 #include "object-store.h"
@@ -402,6 +403,7 @@  struct merge_tree_options {
 	int allow_unrelated_histories;
 	int show_messages;
 	int name_only;
+	char* merge_base;
 };
 
 static int real_merge(struct merge_tree_options *o,
@@ -430,11 +432,18 @@  static int real_merge(struct merge_tree_options *o,
 	opt.branch1 = branch1;
 	opt.branch2 = branch2;
 
-	/*
-	 * Get the merge bases, in reverse order; see comment above
-	 * merge_incore_recursive in merge-ort.h
-	 */
-	merge_bases = get_merge_bases(parent1, parent2);
+	if (o->merge_base) {
+		struct commit *c = lookup_commit_reference_by_name(o->merge_base);
+		if (!c)
+			die(_("could not lookup commit %s"), o->merge_base);
+		commit_list_insert(c, &merge_bases);
+	} else {
+		/*
+		 * Get the merge bases, in reverse order; see comment above
+		 * merge_incore_recursive in merge-ort.h
+		 */
+		merge_bases = get_merge_bases(parent1, parent2);
+	}
 	if (!merge_bases && !o->allow_unrelated_histories)
 		die(_("refusing to merge unrelated histories"));
 	merge_bases = reverse_commit_list(merge_bases);
@@ -505,6 +514,10 @@  int cmd_merge_tree(int argc, const char **argv, const char *prefix)
 			   &o.allow_unrelated_histories,
 			   N_("allow merging unrelated histories"),
 			   PARSE_OPT_NONEG),
+		OPT_STRING(0, "merge-base",
+			 &o.merge_base,
+			 N_("commit"),
+			 N_("specify a merge-base commit for the merge")),
 		OPT_END()
 	};
 
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 013b77144bd..032ff4a1c3d 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -819,4 +819,48 @@  test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository
 	test_must_fail git -C read-only merge-tree side1 side2
 '
 
+# specify merge-base as parent of branch2.
+# git merge-tree --merge-base=A O B
+#   Commit O: foo, bar
+#   Commit A: modify foo after Commit O
+#   Commit B: modify bar after Commit A
+#   Expected: foo is unchanged, modify bar
+
+test_expect_success 'specify merge-base as parent of branch2' '
+	# Setup
+	git init base-b2-p && (
+		cd base-b2-p &&
+		echo foo >foo &&
+		echo bar >bar &&
+		git add foo bar &&
+		git commit -m O &&
+
+		git branch O &&
+		git branch A &&
+
+		git checkout A &&
+		echo "A" >foo &&
+		git add foo &&
+		git commit -m A &&
+
+		git checkout -b B &&
+		echo "B" >bar &&
+		git add bar &&
+		git commit -m B
+	) &&
+	# Testing
+	(
+		cd base-b2-p &&
+		TREE_OID=$(git merge-tree --merge-base=A O B) &&
+
+		q_to_tab <<-EOF >expect &&
+		100644 blob $(git rev-parse B:bar)Qbar
+		100644 blob $(git rev-parse O:foo)Qfoo
+		EOF
+
+		git ls-tree $TREE_OID >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done