diff mbox series

[GSoC,RFC] git-merge.adoc: detail submodule merge

Message ID 20250217232847.8567-1-lucasseikioshiro@gmail.com (mailing list archive)
State New
Headers show
Series [GSoC,RFC] git-merge.adoc: detail submodule merge | expand

Commit Message

Lucas Oshiro Feb. 17, 2025, 11:28 p.m. UTC
Submodule merges are, in general, similar to other merges based on oid
three-way-merge. When a conflict happens, however, Git has two special
cases on handling the conflict before yielding it to the user. From the
merge-ort and merge-recursive sources:

- "Case #1: a is contained in b or vice versa": both strategies try to
perform a fast-forward in the submodules if the commit referred by the
conflicted submodule is descendant of another;

- "Case #2: There are one or more merges that contain a and b in the
submodule.  If there is only one, then present it as a suggestion to the
user, but leave it marked unmerged so the user needs to confirm the
resolution."

Add a small paragraph on git-merge.adoc describing this behavior.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
---

Hi,

This is a "scratch-my-own-itch" documentation patch. Some years ago I was
questioned why some submodule merges on GitHub lead to conflicts while locally
they didn't. I only could find a answer for that reading the merge-ort source
code, then a wrote a blog post about that, which you check here:
https://lucasoshiro.github.io/posts-en/2022-03-12-merge-submodule/

Thus, this patch adds to the official documentation what I found at the time.
I wasn't certain if this should belong to the submodule or merge documentation,
so, by now, I'm sending it as a merge documentation patch.

 Documentation/git-merge.adoc | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Elijah Newren Feb. 18, 2025, 2:18 a.m. UTC | #1
On Mon, Feb 17, 2025 at 3:29 PM Lucas Seiki Oshiro
<lucasseikioshiro@gmail.com> wrote:
>
> Submodule merges are, in general, similar to other merges based on oid
> three-way-merge. When a conflict happens, however, Git has two special
> cases on handling the conflict before yielding it to the user. From the
> merge-ort and merge-recursive sources:
>
> - "Case #1: a is contained in b or vice versa": both strategies try to
> perform a fast-forward in the submodules if the commit referred by the
> conflicted submodule is descendant of another;
>
> - "Case #2: There are one or more merges that contain a and b in the
> submodule.  If there is only one, then present it as a suggestion to the
> user, but leave it marked unmerged so the user needs to confirm the
> resolution."
>
> Add a small paragraph on git-merge.adoc describing this behavior.

It may be worth referencing the commit(s) that introduced the behavior
for other reviewers: commit 68d03e4a6e44 (Implement automatic
fast-forward merge for submodules, 2010-07-07).

> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
> ---
>
> Hi,
>
> This is a "scratch-my-own-itch" documentation patch. Some years ago I was
> questioned why some submodule merges on GitHub lead to conflicts while locally
> they didn't. I only could find a answer for that reading the merge-ort source
> code, then a wrote a blog post about that, which you check here:
> https://lucasoshiro.github.io/posts-en/2022-03-12-merge-submodule/
>
> Thus, this patch adds to the official documentation what I found at the time.
> I wasn't certain if this should belong to the submodule or merge documentation,
> so, by now, I'm sending it as a merge documentation patch.

The merge_submodule() function was moved years ago from the submodule
code to the merge code -- see 18cfc088661 ("submodule.c: move
submodule merging to merge-recursive.c", 2018-05-15)
-- so to me the documentation makes more sense to be associated with
merging than with submodules.

However, a bigger problem I see is that documenting how the merge
machinery works is not really specific to `git merge` but is general
to any command that uses the merge machinery.  Thus, it also applies
for cherry-pick, merge-tree, rebase, replay, and revert. But I don't
know where that kind of general how-merging-logic-behaves
documentation should go...

>  Documentation/git-merge.adoc | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/Documentation/git-merge.adoc b/Documentation/git-merge.adoc
> index 64281d6d44..7b12c0d648 100644
> --- a/Documentation/git-merge.adoc
> +++ b/Documentation/git-merge.adoc
> @@ -205,6 +205,13 @@ happens:
>     same and the index entries for them stay as they were,
>     i.e. matching `HEAD`.
>
> +In the case where the path is a submodule, if the commit referred by it in HEAD
> +is descendant of the one referred by it in MERGE_HEAD or vice-versa, Git

"referred by it" is hard for me to parse.  Maybe something like

"""
In the case where the path is a submodule, if the HEAD version of the
submodule is a descendant of the MERGE_HEAD version of the submodule,
or vice-versa, Git...
"""
?

Also, the references to HEAD and MERGE_HEAD do tie this documentation
rather directly to `git merge`; the basic idea is applicable to all
callers of the merge machinery, but none of the other callers use
MERGE_HEAD (some use CHERRY_PICK_HEAD or REBASE_HEAD), and some do not
assume HEAD points to one of the parents either (e.g. merge-tree and
replay).  So, if we want to move this somewhere more general, we'd
need to reword it a bit.

> +attempts to fast-forward to the descendant, when using `ort` or `recursive`
> +strategies.

Oh, maybe we could put this information in
Documentation/merge-strategies.txt?  Hmm....

> Otherwise, Git will treat this case as a conflict, suggesting as a
> +resolution a submodule commit that is descendant of the conflicting ones, if one
> +exists.

Thanks for sending this in.  It's always helpful to get researched
documentation improvements, even if I can't help but nitpick and
complicate matters here and there....  ;-)
Lucas Oshiro Feb. 18, 2025, 6:54 p.m. UTC | #2
> It may be worth referencing the commit(s) that introduced the behavior
> for other reviewers: commit 68d03e4a6e44 (Implement automatic
> fast-forward merge for submodules, 2010-07-07).

Ok! I'll inspect the codebase and reference it in a future v2.

> "referred by it" is hard for me to parse.  Maybe something like
> 
> """
> In the case where the path is a submodule, if the HEAD version of the
> submodule is a descendant of the MERGE_HEAD version of the submodule,
> or vice-versa, Git...
> """
> ?

Perfect! Actually, I find submodules a little abstract to be explained
using only words, but your sentence is very clear.

> Also, the references to HEAD and MERGE_HEAD do tie this documentation
> rather directly to `git merge`; the basic idea is applicable to all
> callers of the merge machinery, but none of the other callers use
> MERGE_HEAD (some use CHERRY_PICK_HEAD or REBASE_HEAD), and some do not
> assume HEAD points to one of the parents either (e.g. merge-tree and
> replay).  So, if we want to move this somewhere more general, we'd
> need to reword it a bit.

Given your previous suggestion, what about:

"""
In the case where the path is a submodule, if one of the versions of
submodule is descendant of another, Git...
"""

?

> Oh, maybe we could put this information in
> Documentation/merge-strategies.txt?  Hmm....

Looks like a good place to put this. My only concerns are:

1. It would need to be documented in both `ort` and `recursive`. I don't
think it would be a big deal as most of the first paragraph of both
strategies are equal.

2. Maybe it would detail too much on this specific case, while not
covering others (e.g. changing file permissions, symlinks, etc).

> Thanks for sending this in.  It's always helpful to get researched
> documentation improvements, even if I can't help but nitpick and
> complicate matters here and there....  ;-)

Thank you! Given how deeply you understand the merge machinery any nitpick
is immensely valuable!
Elijah Newren Feb. 18, 2025, 9:58 p.m. UTC | #3
On Tue, Feb 18, 2025 at 10:55 AM Lucas Oshiro
<lucasseikioshiro@gmail.com> wrote:
>
> > It may be worth referencing the commit(s) that introduced the behavior
> > for other reviewers: commit 68d03e4a6e44 (Implement automatic
> > fast-forward merge for submodules, 2010-07-07).
>
> Ok! I'll inspect the codebase and reference it in a future v2.
>
> > "referred by it" is hard for me to parse.  Maybe something like
> >
> > """
> > In the case where the path is a submodule, if the HEAD version of the
> > submodule is a descendant of the MERGE_HEAD version of the submodule,
> > or vice-versa, Git...
> > """
> > ?
>
> Perfect! Actually, I find submodules a little abstract to be explained
> using only words, but your sentence is very clear.
>
> > Also, the references to HEAD and MERGE_HEAD do tie this documentation
> > rather directly to `git merge`; the basic idea is applicable to all
> > callers of the merge machinery, but none of the other callers use
> > MERGE_HEAD (some use CHERRY_PICK_HEAD or REBASE_HEAD), and some do not
> > assume HEAD points to one of the parents either (e.g. merge-tree and
> > replay).  So, if we want to move this somewhere more general, we'd
> > need to reword it a bit.
>
> Given your previous suggestion, what about:
>
> """
> In the case where the path is a submodule, if one of the versions of
> submodule is descendant of another, Git...
> """
>
> ?

That seems like the right direction, but I think "descendant of
another" is vague/confusing.  Perhaps

"""
In the case where the path is a submodule, if the submodule commit used on
one side of the merge is a descendant of the submodule commit used on the
other side of the merge, Git...
"""

?


>
> > Oh, maybe we could put this information in
> > Documentation/merge-strategies.txt?  Hmm....
>
> Looks like a good place to put this. My only concerns are:
>
> 1. It would need to be documented in both `ort` and `recursive`. I don't
> think it would be a big deal as most of the first paragraph of both
> strategies are equal.

Yes, until `recursive` is deleted anyway.  (At which point we'll just
remap `recursive` to mean `ort` and not have to have separate
documentation for the two.)

> 2. Maybe it would detail too much on this specific case, while not
> covering others (e.g. changing file permissions, symlinks, etc).

Yeah, but we don't have a way to resolve differences for those kinds
of changes when neither side matches the base version (unless
something like -Xours or -Xtheirs is passed, but even then that
belongs under the -X documentation); submodules are somewhat special
in that regard.

> > Thanks for sending this in.  It's always helpful to get researched
> > documentation improvements, even if I can't help but nitpick and
> > complicate matters here and there....  ;-)
>
> Thank you! Given how deeply you understand the merge machinery any nitpick
> is immensely valuable!
Lucas Oshiro Feb. 19, 2025, 2:54 p.m. UTC | #4
> """
> In the case where the path is a submodule, if the submodule commit used on
> one side of the merge is a descendant of the submodule commit used on the
> other side of the merge, Git...
> """

Looks good to me!

> Yes, until `recursive` is deleted anyway.  (At which point we'll just
> remap `recursive` to mean `ort` and not have to have separate
> documentation for the two.)

Ok. I'll write for both `ort` and `recursive`.

> Yeah, but we don't have a way to resolve differences for those kinds
> of changes when neither side matches the base version (unless
> something like -Xours or -Xtheirs is passed, but even then that
> belongs under the -X documentation); submodules are somewhat special
> in that regard.

Thanks again! I'll send a v2 soon.
diff mbox series

Patch

diff --git a/Documentation/git-merge.adoc b/Documentation/git-merge.adoc
index 64281d6d44..7b12c0d648 100644
--- a/Documentation/git-merge.adoc
+++ b/Documentation/git-merge.adoc
@@ -205,6 +205,13 @@  happens:
    same and the index entries for them stay as they were,
    i.e. matching `HEAD`.
 
+In the case where the path is a submodule, if the commit referred by it in HEAD
+is descendant of the one referred by it in MERGE_HEAD or vice-versa, Git
+attempts to fast-forward to the descendant, when using `ort` or `recursive`
+strategies. Otherwise, Git will treat this case as a conflict, suggesting as a
+resolution a submodule commit that is descendant of the conflicting ones, if one
+exists.
+
 If you tried a merge which resulted in complex conflicts and
 want to start over, you can recover with `git merge --abort`.