Message ID | pull.1122.git.1642888562.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | RFC: In-core git merge-tree ("Server side merges") | expand |
On Sat, Jan 22, 2022 at 10:56 PM Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote: > Updates since v2 (thanks to Christian, Dscho, Ramsay, and René for > suggestions and comments on v2): > > * Significant changes to output format: > * Flags no longer take a filename for additional output; they write to > stdout instead. > * More information included by default when there are conflicts (no need > to request it with additional flags, instead flags can be used to > suppress it). > * Provide (mode, oid, stage, file) tuples -- i.e. ls-files -u style of > information -- when there are conflicts. Add a flag to only list > conflicted files if that's preferred. The above changes seem good to me. > * Much more thorough manual for git-merge-tree.txt > * Renamed option from --real to --write-tree > * Accept an optional --trivial-merge option to get old style merge-tree > behavior > * Allow both --write-tree and --trivial-merge to be omitted since we can > deduce which from number of arguments I still think that it might be simpler and cleaner to leave 'git merge-tree' alone for now, and just add a new command named for example 'git write-merge-tree'. Later we can always add flags to 'git merge-tree' or add 'git trivial-merge-tree' as an alias for 'git merge-tree', and eventually slowly switch 'git merge-tree' to mean only 'git write-merge-tree' if that's where we want to go. > * Document exit code when the merge cannot be run (so we can distinguish > other error cases from conflicts) > * testcase cleanups: test_tick, early skip of test when using recursive > backend, variable renames, etc. > * various minor code cleanups > * Add a new --allow-unrelated-histories option (with same meaning as the > one used in git merge) The above changes seem good to me too. > Stuff intentionally NOT included, but which others seemed to feel strongly > about; they'd need to convince me more on these: > > * Any form of diff output[1] It's not a big issue for me to not include them right now as long as it's possible to add cli options later that add them. The reason is that I think in many cases when there are conflicts, the conflicts will be small and the user will want to see them. So it would be simpler to just have an option to show any conflict right away, rather than have the user launch another command (a diff-tree against which tree and with which options?). Thanks for working on this anyway!
Hi Christian, On Wed, 26 Jan 2022, Christian Couder wrote: > On Sat, Jan 22, 2022 at 10:56 PM Elijah Newren via GitGitGadget > <gitgitgadget@gmail.com> wrote: > > > * Accept an optional --trivial-merge option to get old style merge-tree > > behavior > > * Allow both --write-tree and --trivial-merge to be omitted since we can > > deduce which from number of arguments > > I still think that it might be simpler and cleaner to leave 'git > merge-tree' alone for now, and just add a new command named for > example 'git write-merge-tree'. That would assume that the original `git merge-tree` implementation was useful. That notion has been thoroughly refuted in the meantime, though. I am really opposed to introducing a new command here. Elijah took the best approach we can take here: save the `merge-tree` command by teaching it to do something useful. > Later we can always add flags to 'git merge-tree' or add 'git > trivial-merge-tree' as an alias for 'git merge-tree', and eventually > slowly switch 'git merge-tree' to mean only 'git write-merge-tree' if > that's where we want to go. I suggested before, and seem to need to repeat again, that we need to let ourselves be guided less by hypothetical scenarios, and more by actual, concrete use cases where the revamped `merge-tree` command is useful. And since I already provided some feedback based on my work from working on a server-side backend, I am fairly certain that we already have a pretty good idea where we want to go. > > Stuff intentionally NOT included, but which others seemed to feel strongly > > about; they'd need to convince me more on these: > > > > * Any form of diff output[1] > > It's not a big issue for me to not include them right now as long as > it's possible to add cli options later that add them. But why? That _so_ smells like a hypothetical scenario. We do not need the diffs. It is highly unlikely that the server-side wants to have diffs, and if a user does want the diffs, it is very, very easy to generate them by chaining low-level commands. So there is absolutely no need for `git merge-tree` to produce diffs. > The reason is that I think in many cases when there are conflicts, the > conflicts will be small and the user will want to see them. So it would > be simpler to just have an option to show any conflict right away, > rather than have the user launch another command (a diff-tree against > which tree and with which options?). That assumes that server-side merge UIs will present merge conflicts in the form of diffs containing merge conflict markers. Which I don't think will happen, like, ever. In short: I completely disagree that we should introduce a new command, and I also completely disagree that the `merge-tree` command should output any diffs. I do agree that we need to be mindful of what we actually need, and in that regard, I reiterate that we need to let concrete use cases guide us. As part of GitLab, you might be in an excellent position to look at GitLab's concrete server-side needs when it comes to use `git merge-tree` to perform merges. Ciao, Dscho
Hi Dscho, On Wed, Jan 26, 2022 at 1:03 PM Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > Hi Christian, > > On Wed, 26 Jan 2022, Christian Couder wrote: > > > On Sat, Jan 22, 2022 at 10:56 PM Elijah Newren via GitGitGadget > > <gitgitgadget@gmail.com> wrote: > > > > > * Accept an optional --trivial-merge option to get old style merge-tree > > > behavior > > > * Allow both --write-tree and --trivial-merge to be omitted since we can > > > deduce which from number of arguments > > > > I still think that it might be simpler and cleaner to leave 'git > > merge-tree' alone for now, and just add a new command named for > > example 'git write-merge-tree'. > > That would assume that the original `git merge-tree` implementation was > useful. That notion has been thoroughly refuted in the meantime, though. > > I am really opposed to introducing a new command here. Elijah took the > best approach we can take here: save the `merge-tree` command by teaching > it to do something useful. I think it's a question of point of view. If a command is completely useless, then most of the time it needs to die, not be "saved". We would need good statistics, but I doubt we have "saved" many useless commands before, compared to commands we have just killed. > > Later we can always add flags to 'git merge-tree' or add 'git > > trivial-merge-tree' as an alias for 'git merge-tree', and eventually > > slowly switch 'git merge-tree' to mean only 'git write-merge-tree' if > > that's where we want to go. > > I suggested before, and seem to need to repeat again, that we need to let > ourselves be guided less by hypothetical scenarios, and more by actual, > concrete use cases where the revamped `merge-tree` command is useful. Ok, see below. > And since I already provided some feedback based on my work from working > on a server-side backend, I am fairly certain that we already have a > pretty good idea where we want to go. > > > > Stuff intentionally NOT included, but which others seemed to feel strongly > > > about; they'd need to convince me more on these: > > > > > > * Any form of diff output[1] > > > > It's not a big issue for me to not include them right now as long as > > it's possible to add cli options later that add them. > > But why? That _so_ smells like a hypothetical scenario. > > We do not need the diffs. It is highly unlikely that the server-side wants > to have diffs, and if a user does want the diffs, it is very, very easy to > generate them by chaining low-level commands. > > So there is absolutely no need for `git merge-tree` to produce diffs. > > > The reason is that I think in many cases when there are conflicts, the > > conflicts will be small and the user will want to see them. So it would > > be simpler to just have an option to show any conflict right away, > > rather than have the user launch another command (a diff-tree against > > which tree and with which options?). > > That assumes that server-side merge UIs will present merge conflicts in > the form of diffs containing merge conflict markers. Which I don't think > will happen, like, ever. Please take a look at: https://docs.gitlab.com/ee/user/project/merge_requests/conflicts.html#resolve-conflicts-in-the-inline-editor As you can see in the image there are conflict markers in the file displayed by the server UI. > In short: I completely disagree that we should introduce a new command, > and I also completely disagree that the `merge-tree` command should output > any diffs. > > I do agree that we need to be mindful of what we actually need, and in > that regard, I reiterate that we need to let concrete use cases guide us. > As part of GitLab, you might be in an excellent position to look at > GitLab's concrete server-side needs when it comes to use `git merge-tree` > to perform merges. I hope I provided a concrete use case with the link above.
Hi Christian, On Wed, 26 Jan 2022, Christian Couder wrote: > On Wed, Jan 26, 2022 at 1:03 PM Johannes Schindelin > <Johannes.Schindelin@gmx.de> wrote: > > > > On Wed, 26 Jan 2022, Christian Couder wrote: > > > > > The reason is that I think in many cases when there are conflicts, > > > the conflicts will be small and the user will want to see them. So > > > it would be simpler to just have an option to show any conflict > > > right away, rather than have the user launch another command (a > > > diff-tree against which tree and with which options?). > > > > That assumes that server-side merge UIs will present merge conflicts in > > the form of diffs containing merge conflict markers. Which I don't think > > will happen, like, ever. > > Please take a look at: > > https://docs.gitlab.com/ee/user/project/merge_requests/conflicts.html#resolve-conflicts-in-the-inline-editor > > As you can see in the image there are conflict markers in the file > displayed by the server UI. Please note the difference between what I wrote above (present merge conflicts in the form of diffs containing merge conflict markers) and what is shown in the document you linked to (present a file annotated with merge conflict markers). There is no diff in that page. What's more: there are not only conflict markers in the editor, there is clearly a visual marker next to the line number that indicates that the editor has a fundamental understanding where the conflict markers are. Which means that the conflict markers must have been generated independently of Git rather than parsed in some random diff that was produced by Git. In other words: you are making my case for me that `git merge-tree` should not generate diff output because it would not even be used. > > In short: I completely disagree that we should introduce a new command, > > and I also completely disagree that the `merge-tree` command should output > > any diffs. > > > > I do agree that we need to be mindful of what we actually need, and in > > that regard, I reiterate that we need to let concrete use cases guide us. > > As part of GitLab, you might be in an excellent position to look at > > GitLab's concrete server-side needs when it comes to use `git merge-tree` > > to perform merges. > > I hope I provided a concrete use case with the link above. Sorry, I apparently was a bit unclear. In the context of discussing `git merge-tree`, a low-level Git command, when I talk about a user, I do not mean a human being, but a program that calls that command and parses its output. Corollary: by "use case" I refer to a concrete implementation of a server-side merge operation, I refer to backend code that currently calls into libgit2 to perform the merge, and would benefit from calling `git merge-tree` instead. Such a use case informs us about the type and amount of information that is required of the code that is currently being discussed in this mail thread. And I highly doubt that you will find such a use case that wants libgit2 (and later, `git merge-tree`) to generate diffs. Because _diffs_ are certainly _not_ what is consumed by the inline editor you referenced. Of course, I am still left guessing what the server-side needs concretely, because that is not at all obvious from the user-facing web site to which you sent me. What is needed is a good, hard look at the actual _code_, the code that calls into libgit2 to perform a merge, and that could instead spawn a `git merge-tree` process to accomplish the same thing. We need to get away from hypothetical scenarios. They're not helping. Ciao, Johannes
Hi Dscho, On Fri, Jan 28, 2022 at 1:58 PM Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > Hi Christian, > > On Wed, 26 Jan 2022, Christian Couder wrote: > > > On Wed, Jan 26, 2022 at 1:03 PM Johannes Schindelin > > <Johannes.Schindelin@gmx.de> wrote: > > > > > > On Wed, 26 Jan 2022, Christian Couder wrote: > > > > > > > The reason is that I think in many cases when there are conflicts, > > > > the conflicts will be small and the user will want to see them. So > > > > it would be simpler to just have an option to show any conflict > > > > right away, rather than have the user launch another command (a > > > > diff-tree against which tree and with which options?). > > > > > > That assumes that server-side merge UIs will present merge conflicts in > > > the form of diffs containing merge conflict markers. Which I don't think > > > will happen, like, ever. > > > > Please take a look at: > > > > https://docs.gitlab.com/ee/user/project/merge_requests/conflicts.html#resolve-conflicts-in-the-inline-editor > > > > As you can see in the image there are conflict markers in the file > > displayed by the server UI. > > Please note the difference between what I wrote above (present merge > conflicts in the form of diffs containing merge conflict markers) and what > is shown in the document you linked to (present a file annotated with > merge conflict markers). > > There is no diff in that page. The server UI could just get a diff with the conflicts inside instead of the full files with conflict inside, as the diff would be smaller to parse than the full files. So even if it's not shown, the diff could still be useful. Also just above the section of the link I sent, there is this section https://docs.gitlab.com/ee/user/project/merge_requests/conflicts.html#resolve-conflicts-in-interactive-mode where one can see diff markers in the image. There are no conflict markers in those images, but it's possible that a future UI could combine both a diff and conflict markers. Also please note that I don't absolutely require diffs. At the beginning of the paragraph from my original email that you quoted above, there was: "It's not a big issue for me to not include them right now as long as it's possible to add cli options later that add them." So I was just saying that the format and code should be flexible enough to be able to easily accommodate sending further data like diffs with additional options. I think it's a very reasonable request. So please don't make it a huge issue. You can always NACK a patch adding such an option later. > What's more: there are not only conflict markers in the editor, You don't see the ">>>>>>>"? > there is > clearly a visual marker next to the line number that indicates that the > editor has a fundamental understanding where the conflict markers are. Yeah, so this shows that those markers can be important for the editor. > Which means that the conflict markers must have been generated > independently of Git rather than parsed in some random diff that was > produced by Git. Why couldn't they be generated by Git and then just parsed from a diff in the future, even if that was not the case here? > In other words: you are making my case for me that `git merge-tree` should > not generate diff output because it would not even be used. The other link above in this email actually shows that diffs are used right now to resolve conflicts. > > > In short: I completely disagree that we should introduce a new command, > > > and I also completely disagree that the `merge-tree` command should output > > > any diffs. > > > > > > I do agree that we need to be mindful of what we actually need, and in > > > that regard, I reiterate that we need to let concrete use cases guide us. > > > As part of GitLab, you might be in an excellent position to look at > > > GitLab's concrete server-side needs when it comes to use `git merge-tree` > > > to perform merges. > > > > I hope I provided a concrete use case with the link above. > > Sorry, I apparently was a bit unclear. > > In the context of discussing `git merge-tree`, a low-level Git command, > when I talk about a user, I do not mean a human being, but a program that > calls that command and parses its output. So it could very well parse diffs containing conflict markers and show those conflict markers. [...] > Of course, I am still left guessing what the server-side needs concretely, > because that is not at all obvious from the user-facing web site to which > you sent me. What is needed is a good, hard look at the actual _code_, the > code that calls into libgit2 to perform a merge, and that could instead > spawn a `git merge-tree` process to accomplish the same thing. > > We need to get away from hypothetical scenarios. They're not helping. I am not even asking for a feature, just to make it possible to extend the output of a brand new command in an RFC with some possibly useful things, and you are making such requests... Please relax a bit on this.
Hi Christian, On Fri, 28 Jan 2022, Christian Couder wrote: > On Fri, Jan 28, 2022 at 1:58 PM Johannes Schindelin > <Johannes.Schindelin@gmx.de> wrote: > > > > On Wed, 26 Jan 2022, Christian Couder wrote: > > > > > On Wed, Jan 26, 2022 at 1:03 PM Johannes Schindelin > > > <Johannes.Schindelin@gmx.de> wrote: > > > > > > > > On Wed, 26 Jan 2022, Christian Couder wrote: > > > > > > > > > The reason is that I think in many cases when there are conflicts, > > > > > the conflicts will be small and the user will want to see them. So > > > > > it would be simpler to just have an option to show any conflict > > > > > right away, rather than have the user launch another command (a > > > > > diff-tree against which tree and with which options?). > > > > > > > > That assumes that server-side merge UIs will present merge conflicts in > > > > the form of diffs containing merge conflict markers. Which I don't think > > > > will happen, like, ever. > > > > > > Please take a look at: > > > > > > https://docs.gitlab.com/ee/user/project/merge_requests/conflicts.html#resolve-conflicts-in-the-inline-editor > > > > > > As you can see in the image there are conflict markers in the file > > > displayed by the server UI. > > > > Please note the difference between what I wrote above (present merge > > conflicts in the form of diffs containing merge conflict markers) and what > > is shown in the document you linked to (present a file annotated with > > merge conflict markers). > > > > There is no diff in that page. > > The server UI could just get a diff with the conflicts inside instead > of the full files with conflict inside, as the diff would be smaller > to parse than the full files. So even if it's not shown, the diff > could still be useful. You really need to get away from talking about this in hypothetical terms. > Also just above the section of the link I sent, there is this section > > https://docs.gitlab.com/ee/user/project/merge_requests/conflicts.html#resolve-conflicts-in-interactive-mode > > where one can see diff markers in the image. That's a side-by-side diff. Git cannot even produce those. > > What's more: there are not only conflict markers in the editor, > > You don't see the ">>>>>>>"? Yes, I do. And not only that. I also see that the editor knows very specifically where the conflict happens. And since any file can contain `>>>>>>>` _without_ it being a conflict marker, the editor most likely does not parse the output of Git that contains a conflict marker. At least I hope it does not because it would then very easily be confused by strings that look like conflict markers, but aren't. Think about our very own test suite, and why we specifically set `conflict-marker-size=32` for those files. Same reason why the server backend cannot simply ingest files with conflict markers and then hope to figure out which `>>>>>>>` are real conflict markers and which are not. > > there is clearly a visual marker next to the line number that > > indicates that the editor has a fundamental understanding where the > > conflict markers are. > > Yeah, so this shows that those markers can be important for the editor. Of course they are important! That's my point! > > In other words: you are making my case for me that `git merge-tree` should > > not generate diff output because it would not even be used. > > The other link above in this email actually shows that diffs are used > right now to resolve conflicts. It shows that Git was not used to generate the diff, is what it shows. I see that you are still trying to guess what the server-side needs actually are. It really is time to stop guessing. So I will keep challenging you to actually look at the GitLab code, to take a stab at teaching it to use `git merge-tree` to perform merges. And then to come back with what you learned. I guarantee you that that will be multiple times more useful than talking about it in hypotheticals. And you are in such an almost unique position to contribute to this patch series, to provide that very valuable feedback how `git merge-tree` could be improved to support actual, real-life server-side code that is currently in use! So why not make the most out of it? Ciao, Johannes
Hi Elijah, On Sat, 22 Jan 2022, Elijah Newren via GitGitGadget wrote: > Note1: Depends on en/remerge-diff (but no need to pick it up; it's still > RFC). For reasons, I will have to backport this on top of v2.33.1 ;-) > == Updates Log == > > Updates since v2 (thanks to Christian, Dscho, Ramsay, and René for > suggestions and comments on v2): > > * Significant changes to output format: > * Flags no longer take a filename for additional output; they write to > stdout instead. > * More information included by default when there are conflicts (no need > to request it with additional flags, instead flags can be used to > suppress it). > * Provide (mode, oid, stage, file) tuples -- i.e. ls-files -u style of > information -- when there are conflicts. Add a flag to only list > conflicted files if that's preferred. > * Much more thorough manual for git-merge-tree.txt > * Renamed option from --real to --write-tree > * Accept an optional --trivial-merge option to get old style merge-tree > behavior > * Allow both --write-tree and --trivial-merge to be omitted since we can > deduce which from number of arguments > * Document exit code when the merge cannot be run (so we can distinguish > other error cases from conflicts) > * testcase cleanups: test_tick, early skip of test when using recursive > backend, variable renames, etc. > * various minor code cleanups > * Add a new --allow-unrelated-histories option (with same meaning as the > one used in git merge) > * Rebased on top of en/remerge-diff to avoid a small conflict I am really happy with the way this patch series is going, and hope to be a lot more active on it in the near future. I've read through the current iteration and left a few suggestions, nothing major. Thank you so much for your outstanding work! Dscho
On Wed, Jan 26, 2022 at 12:48 AM Christian Couder <christian.couder@gmail.com> wrote: > > On Sat, Jan 22, 2022 at 10:56 PM Elijah Newren via GitGitGadget > <gitgitgadget@gmail.com> wrote: > > > Updates since v2 (thanks to Christian, Dscho, Ramsay, and René for > > suggestions and comments on v2): > > > > * Significant changes to output format: > > * Flags no longer take a filename for additional output; they write to > > stdout instead. > > * More information included by default when there are conflicts (no need > > to request it with additional flags, instead flags can be used to > > suppress it). > > * Provide (mode, oid, stage, file) tuples -- i.e. ls-files -u style of > > information -- when there are conflicts. Add a flag to only list > > conflicted files if that's preferred. > > The above changes seem good to me. :-) > > * Much more thorough manual for git-merge-tree.txt > > * Renamed option from --real to --write-tree > > * Accept an optional --trivial-merge option to get old style merge-tree > > behavior > > * Allow both --write-tree and --trivial-merge to be omitted since we can > > deduce which from number of arguments > > I still think that it might be simpler and cleaner to leave 'git > merge-tree' alone for now, and just add a new command named for > example 'git write-merge-tree'. Later we can always add flags to 'git > merge-tree' or add 'git trivial-merge-tree' as an alias for 'git > merge-tree', and eventually slowly switch 'git merge-tree' to mean > only 'git write-merge-tree' if that's where we want to go. Understood. Since we can't immediately kill the old merge-tree, I don't think there's a perfectly clean solution here, and it's totally understandable that different folks would have different opinions on which interim choice would be cleanest. What you suggest is reasonable, I just personally prefer the path in this series. > > * Document exit code when the merge cannot be run (so we can distinguish > > other error cases from conflicts) > > * testcase cleanups: test_tick, early skip of test when using recursive > > backend, variable renames, etc. > > * various minor code cleanups > > * Add a new --allow-unrelated-histories option (with same meaning as the > > one used in git merge) > > The above changes seem good to me too. Thanks for reading over things carefully and for providing many detailed, helpful comments! > > Stuff intentionally NOT included, but which others seemed to feel strongly > > about; they'd need to convince me more on these: > > > > * Any form of diff output[1] > > It's not a big issue for me to not include them right now as long as > it's possible to add cli options later that add them. My main concern is just that `merge-tree` remain a low-level tool and have machine-parseable output. I was a little worried that both you and Dscho wanted everything on stdout rather than in separate files, as the <Informational messages> part of the output is rather free-form. But since it's at the end, and has a machine-parseable beginning, it can just be slurped in and we're all good. The diff output raises my eyebrow because I'm worried we're losing this property. If there are clear usecases for adding more output, and we can do so without losing this machine-parseable property, I don't have a problem with adding an option for it. One analogy we might use here is that `git merge` provides a diffstat at the end. What you're asking is more than a diffstat, but might be considered similar-ish in nature. > The reason is > that I think in many cases when there are conflicts, the conflicts > will be small and the user will want to see them. I'm a little worried about the assumption here that conflict size is measurable and visible via diffs. That might be true in some cases, but a UI written with that assumption is going to be very confusing when hitting cases where that assumption does not hold. For example: * What if there is a binary file conflict, or a modify/delete or rename/delete conflict, or failed-to-merge submodule conflict, or a file location conflict? (For these, there is no diff relative to the first parent and hence this conflict would have no diff output for it)? * What if there was a simple file/directory conflict? A diff would show a rename (even when neither side did any renames), but not any conflict markers. * What if there was a rename/rename conflict (both sides renamed same file differently) or a distinct types conflict? The former results in three different conflicting files, none of them with conflict markers, while the latter results in two different conflicting files both without conflict markers? Showing individual per-file diffs totally loses all context here -- it'll show no-diff for one of the files, and totally new additions for the ones. Such a problem statement just seems fraught with edge cases to me, and suggests that the problem statement might be in need of revisiting. Don't read this as me closing the door on the possibility of diffs; I'm not trying to do that. I'm listing my misgivings about how I think they might be used (i.e. be careful if you're headed down this path as you might be digging yourself a never-ending support hole). You can also think of my comments as feedback to consider and address when you propose a future feature addition for adding diffs. If/when you propose such a feature, we'd probably be able to dive more into specifics and usecases at that time, which may or may not circumvent my concerns. > So it would be > simpler to just have an option to show any conflict right away, rather > than have the user launch another command (a diff-tree against which > tree and with which options?). Um, this part I'm not sure I get. I thought the reason for the diffs was performance -- you knew you wanted the diffs, and you wanted it done as part of the same process. But why would this be simpler? Your patch series included three different diffs, and the emails you pointed me at suggested all kinds of configurability. That suggests the merge-tree command would have to take the exact same options the user would supply to diff, and thus would have to be told all the same options, right? I don't see how this removes any complexity at all for the user. Unless...is the request in some way similar to merge's diffstat where there is always a very specific type of diff that is wanted and you aren't envisioning much flexibility in what kind of diff or what to diff against -- is that where the simplification comes from?
On Sat, Jan 29, 2022 at 8:04 AM Elijah Newren <newren@gmail.com> wrote: > > On Wed, Jan 26, 2022 at 12:48 AM Christian Couder > <christian.couder@gmail.com> wrote: > > > > On Sat, Jan 22, 2022 at 10:56 PM Elijah Newren via GitGitGadget > > <gitgitgadget@gmail.com> wrote: > > > Stuff intentionally NOT included, but which others seemed to feel strongly > > > about; they'd need to convince me more on these: > > > > > > * Any form of diff output[1] > > > > It's not a big issue for me to not include them right now as long as > > it's possible to add cli options later that add them. > > My main concern is just that `merge-tree` remain a low-level tool and > have machine-parseable output. I was a little worried that both you > and Dscho wanted everything on stdout rather than in separate files, > as the <Informational messages> part of the output is rather > free-form. But since it's at the end, and has a machine-parseable > beginning, it can just be slurped in and we're all good. The diff > output raises my eyebrow because I'm worried we're losing this > property. If there are clear usecases for adding more output, and we > can do so without losing this machine-parseable property, I don't have > a problem with adding an option for it. That's ok for me for now. I will certainly not work on adding options for diff output without any usecase. > One analogy we might use here is that `git merge` provides a diffstat > at the end. What you're asking is more than a diffstat, but might be > considered similar-ish in nature. > > > The reason is > > that I think in many cases when there are conflicts, the conflicts > > will be small and the user will want to see them. > > I'm a little worried about the assumption here that conflict size is > measurable and visible via diffs. That might be true in some cases, > but a UI written with that assumption is going to be very confusing > when hitting cases where that assumption does not hold. For example: > > * What if there is a binary file conflict, or a modify/delete or > rename/delete conflict, or failed-to-merge submodule conflict, or a > file location conflict? (For these, there is no diff relative to the > first parent and hence this conflict would have no diff output for > it)? > * What if there was a simple file/directory conflict? A diff would > show a rename (even when neither side did any renames), but not any > conflict markers. > * What if there was a rename/rename conflict (both sides renamed > same file differently) or a distinct types conflict? The former > results in three different conflicting files, none of them with > conflict markers, while the latter results in two different > conflicting files both without conflict markers? Showing individual > per-file diffs totally loses all context here -- it'll show no-diff > for one of the files, and totally new additions for the ones. In those cases we just tell users that they cannot resolve those conflicts in the user interface, see the following doc: https://docs.gitlab.com/ee/user/project/merge_requests/conflicts.html#conflicts-you-can-resolve-in-the-user-interface > Such a problem statement just seems fraught with edge cases to me, and > suggests that the problem statement might be in need of revisiting. Users understand that some kinds of conflicts cannot yet be resolved using a user interface. Maybe we will be able to make improvements so that more kinds of conflicts can be resolved in a UI in the future though. That's why a flexible and extensible output could help. > Don't read this as me closing the door on the possibility of diffs; > I'm not trying to do that. I'm listing my misgivings about how I > think they might be used (i.e. be careful if you're headed down this > path as you might be digging yourself a never-ending support hole). > You can also think of my comments as feedback to consider and address > when you propose a future feature addition for adding diffs. If/when > you propose such a feature, we'd probably be able to dive more into > specifics and usecases at that time, which may or may not circumvent > my concerns. I know that diffs, or any new single feature, will likely not be a silver bullet. > > So it would be > > simpler to just have an option to show any conflict right away, rather > > than have the user launch another command (a diff-tree against which > > tree and with which options?). > > Um, this part I'm not sure I get. I thought the reason for the diffs > was performance -- you knew you wanted the diffs, and you wanted it > done as part of the same process. But why would this be simpler? In the commit message of 4/12 you show an example of using it in simple scripts: NEWTREE=$(git merge-tree --write-tree $BRANCH1 $BRANCH2) test $? -eq 0 || die "There were conflicts..." ... So I think it would be simpler for someone interested in seeing the conflicts, like a script writer, or maybe someone using it manually for example as a dry run before performing a merge, to be able to get them right away from the command rather than to have to use another command (which means finding the right command, arguments and options for that) to get them. > Your patch series included three different diffs, and the emails you > pointed me at suggested all kinds of configurability. That suggests > the merge-tree command would have to take the exact same options the > user would supply to diff, and thus would have to be told all the same > options, right? I don't see how this removes any complexity at all > for the user. > > Unless...is the request in some way similar to merge's diffstat where > there is always a very specific type of diff that is wanted and you > aren't envisioning much flexibility in what kind of diff or what to > diff against -- is that where the simplification comes from? Well I just think the default diff output could be tailored for the most likely usecases and options made available later for more advanced usecases or users.
Hi Christian, On Sat, Jan 29, 2022 at 12:18 AM Christian Couder <christian.couder@gmail.com> wrote: > > On Sat, Jan 29, 2022 at 8:04 AM Elijah Newren <newren@gmail.com> wrote: > > > > On Wed, Jan 26, 2022 at 12:48 AM Christian Couder > > <christian.couder@gmail.com> wrote: > > > > > > On Sat, Jan 22, 2022 at 10:56 PM Elijah Newren via GitGitGadget > > > <gitgitgadget@gmail.com> wrote: > > > > > Stuff intentionally NOT included, but which others seemed to feel strongly > > > > about; they'd need to convince me more on these: > > > > > > > > * Any form of diff output[1] > > > > > > It's not a big issue for me to not include them right now as long as > > > it's possible to add cli options later that add them. > > > > My main concern is just that `merge-tree` remain a low-level tool and > > have machine-parseable output. I was a little worried that both you > > and Dscho wanted everything on stdout rather than in separate files, > > as the <Informational messages> part of the output is rather > > free-form. But since it's at the end, and has a machine-parseable > > beginning, it can just be slurped in and we're all good. The diff > > output raises my eyebrow because I'm worried we're losing this > > property. If there are clear usecases for adding more output, and we > > can do so without losing this machine-parseable property, I don't have > > a problem with adding an option for it. > > That's ok for me for now. I will certainly not work on adding options > for diff output without any usecase. > > > One analogy we might use here is that `git merge` provides a diffstat > > at the end. What you're asking is more than a diffstat, but might be > > considered similar-ish in nature. > > > > > The reason is > > > that I think in many cases when there are conflicts, the conflicts > > > will be small and the user will want to see them. > > > > I'm a little worried about the assumption here that conflict size is > > measurable and visible via diffs. That might be true in some cases, > > but a UI written with that assumption is going to be very confusing > > when hitting cases where that assumption does not hold. For example: > > > > * What if there is a binary file conflict, or a modify/delete or > > rename/delete conflict, or failed-to-merge submodule conflict, or a > > file location conflict? (For these, there is no diff relative to the > > first parent and hence this conflict would have no diff output for > > it)? > > * What if there was a simple file/directory conflict? A diff would > > show a rename (even when neither side did any renames), but not any > > conflict markers. > > * What if there was a rename/rename conflict (both sides renamed > > same file differently) or a distinct types conflict? The former > > results in three different conflicting files, none of them with > > conflict markers, while the latter results in two different > > conflicting files both without conflict markers? Showing individual > > per-file diffs totally loses all context here -- it'll show no-diff > > for one of the files, and totally new additions for the ones. > > In those cases we just tell users that they cannot resolve those > conflicts in the user interface, see the following doc: > > https://docs.gitlab.com/ee/user/project/merge_requests/conflicts.html#conflicts-you-can-resolve-in-the-user-interface So...I think you may have just convinced me that my fears were justified and that I should probably NAK any attempt to add diffs to the merge-tree command. I won't jump to conclusions but you've provided some pretty strong signal to me against going down that route. The list of limitations in the link you provide do mostly avoid the broken cases I listed above, but it enshrines those limitations on that webpage as fundamental rather than just as current implementation shortcomings. You may not be able to remove those limitations on that webpage without either expunging the diffs from the UI or exposing the brokenness of the various cases above. If you do propose a diff option in the future, come prepared to discuss how you'll avoid accidentally leading others down into paths with the same fundamental issues, and/or how the above types of conflicts might still be meaningfully handled. Also, the list of limitations you have may not be quite comprehensive enough to avoid all problems (though it certainly avoids most of them). Can I ask a couple clarifying questions about your list of limitations in that link? : * When that page says the file cannot already contain conflict markers, is the check performed on the version of the file in the two trees being merged, or is the check performed on the 2nd and 3rd index stage of the merge result (these are not equivalent checks, even if they often give the same answer)? * When that page says the file must already exist in the same path on both branches, is the check performed on by checking the path in the two trees being merged, or is the check performed on the 2nd and 3rd index stage of the merge result (again, these are not equivalent checks)? > > Such a problem statement just seems fraught with edge cases to me, and > > suggests that the problem statement might be in need of revisiting. > > Users understand that some kinds of conflicts cannot yet be resolved > using a user interface. Maybe we will be able to make improvements so > that more kinds of conflicts can be resolved in a UI in the future > though. That's why a flexible and extensible output could help. I think future improvements to handle more conflict types may well hinge on removing the diff-output-using portion of your interface; I think it may well be fundamentally incompatible. I agree we want to leave the output format open for extension, I'm just saying we have to be careful about what extensions are included and this one worries me. > > Don't read this as me closing the door on the possibility of diffs; > > I'm not trying to do that. I'm listing my misgivings about how I > > think they might be used (i.e. be careful if you're headed down this > > path as you might be digging yourself a never-ending support hole). > > You can also think of my comments as feedback to consider and address > > when you propose a future feature addition for adding diffs. If/when > > you propose such a feature, we'd probably be able to dive more into > > specifics and usecases at that time, which may or may not circumvent > > my concerns. > > I know that diffs, or any new single feature, will likely not be a > silver bullet. Sure that's fair; not being a silver bullet is fine. We do need to avoid providing a kryptonite bullet, though. > > > So it would be > > > simpler to just have an option to show any conflict right away, rather > > > than have the user launch another command (a diff-tree against which > > > tree and with which options?). > > > > Um, this part I'm not sure I get. I thought the reason for the diffs > > was performance -- you knew you wanted the diffs, and you wanted it > > done as part of the same process. But why would this be simpler? > > In the commit message of 4/12 you show an example of using it in simple scripts: > > NEWTREE=$(git merge-tree --write-tree $BRANCH1 $BRANCH2) > test $? -eq 0 || die "There were conflicts..." > ... > > So I think it would be simpler for someone interested in seeing the > conflicts, like a script writer, or maybe someone using it manually > for example as a dry run before performing a merge, to be able to get > them right away from the command... Okay, but the command already does that. When there are conflicts, NEWTREE won't actually be a tree; it'll be lots of lines of output. That's (part of) the reason for the exit status check. So users already get that information from the command. > ...rather than to have to use another > command (which means finding the right command, arguments and options > for that) to get them. As for finding the right arguments and options... > > Your patch series included three different diffs, and the emails you > > pointed me at suggested all kinds of configurability. That suggests > > the merge-tree command would have to take the exact same options the > > user would supply to diff, and thus would have to be told all the same > > options, right? I don't see how this removes any complexity at all > > for the user. > > > > Unless...is the request in some way similar to merge's diffstat where > > there is always a very specific type of diff that is wanted and you > > aren't envisioning much flexibility in what kind of diff or what to > > diff against -- is that where the simplification comes from? > > Well I just think the default diff output could be tailored for the > most likely usecases and options made available later for more > advanced usecases or users. Ah, so this may go against your earlier comments at [1] about a merge-tree on steroids and a huge array of diff options, or against your comments about diffs not being provided by default (also [1]). Because if you have that huge range of diff options, and the diffs are not provided by default, then it's not clear how you've simplified things because users would still need to figure out the right arguments and options to pass, it's just that the user would have to pass all (or maybe just most?) of those arguments and options to merge-tree instead of to diff. Or is the simpler UI you're discussing really just about not needing to include 1 argument, the name of the new toplevel tree, since that single argument could be implicit? I'm having a hard time buying the "simpler UI for script writers" angle of argument here, especially for script writers who should fully be able to look up the appropriate commands and use them. Your earlier arguments about performance being important (having both the merge & the diff in the same process) seemed much more convincing to me. But maybe I'm still just missing something about your "simpler" angle? [1] https://lore.kernel.org/git/CAP8UFD0wKnAg5oyMWchXysPTg3K9Vb4M1tRcPzPE81QM903pYg@mail.gmail.com/
On Sat, Jan 29, 2022 at 9:43 AM Elijah Newren <newren@gmail.com> wrote: > > Hi Christian, > > On Sat, Jan 29, 2022 at 12:18 AM Christian Couder > <christian.couder@gmail.com> wrote: > > > > On Sat, Jan 29, 2022 at 8:04 AM Elijah Newren <newren@gmail.com> wrote: > > > > > > On Wed, Jan 26, 2022 at 12:48 AM Christian Couder > > > <christian.couder@gmail.com> wrote: [...] > > > > The reason is > > > > that I think in many cases when there are conflicts, the conflicts > > > > will be small and the user will want to see them. > > > > > > I'm a little worried about the assumption here that conflict size is > > > measurable and visible via diffs. That might be true in some cases, > > > but a UI written with that assumption is going to be very confusing > > > when hitting cases where that assumption does not hold. For example: > > > > > > * What if there is a binary file conflict, or a modify/delete or > > > rename/delete conflict, or failed-to-merge submodule conflict, or a > > > file location conflict? (For these, there is no diff relative to the > > > first parent and hence this conflict would have no diff output for > > > it)? > > > * What if there was a simple file/directory conflict? A diff would > > > show a rename (even when neither side did any renames), but not any > > > conflict markers. > > > * What if there was a rename/rename conflict (both sides renamed > > > same file differently) or a distinct types conflict? The former > > > results in three different conflicting files, none of them with > > > conflict markers, while the latter results in two different > > > conflicting files both without conflict markers? Showing individual > > > per-file diffs totally loses all context here -- it'll show no-diff > > > for one of the files, and totally new additions for the ones. > > > > In those cases we just tell users that they cannot resolve those > > conflicts in the user interface, see the following doc: > > > > https://docs.gitlab.com/ee/user/project/merge_requests/conflicts.html#conflicts-you-can-resolve-in-the-user-interface > > So...I think you may have just convinced me that my fears were > justified and that I should probably NAK any attempt to add diffs to > the merge-tree command. I won't jump to conclusions but you've > provided some pretty strong signal to me against going down that > route. The list of limitations in the link you provide do mostly > avoid the broken cases I listed above, but it enshrines those > limitations on that webpage as fundamental rather than just as current > implementation shortcomings. You may not be able to remove those > limitations on that webpage without either expunging the diffs from > the UI or exposing the brokenness of the various cases above. > > If you do propose a diff option in the future, come prepared to > discuss how you'll avoid accidentally leading others down into paths > with the same fundamental issues, and/or how the above types of > conflicts might still be meaningfully handled. Actually, after having a few extra days to think about it, I thought of something that should have been obvious to me, given my other in-flight series that this depends upon... If you used the same trick that remerge-diff does to include the CONFLICT (and related messages) headers in the diff, then the kinds of conflicts that are normally either invisible or misleading/confusing to show via a diff would suddenly have the extra notices needed to explain them, and make this problem tractable. Further, it'd only make sense to do the special diff as part of the merge-tree process, since it has the conflict messages strmap needed to do this. And there's not all that much work that would be needed to take advantage of this, especially since this series already depends upon the remerge-diff series. So, maybe this is fine after all. > Also, the list of limitations you have may not be quite comprehensive > enough to avoid all problems (though it certainly avoids most of > them). Can I ask a couple clarifying questions about your list of > limitations in that link? : > > * When that page says the file cannot already contain conflict > markers, is the check performed on the version of the file in the two > trees being merged, or is the check performed on the 2nd and 3rd index > stage of the merge result (these are not equivalent checks, even if > they often give the same answer)? > * When that page says the file must already exist in the same path > on both branches, is the check performed on by checking the path in > the two trees being merged, or is the check performed on the 2nd and > 3rd index stage of the merge result (again, these are not equivalent > checks)? I am still curious about this either way.