[0/1] Object ID support for git merge-file

Message ID: 20231024195655.2413191-1-sandals@crustytoothpaste.net
Headers:
  Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A7AF53B7A8 for <git@vger.kernel.org>; Tue, 24 Oct 2023 19:58:06 +0000 (UTC)
  From: "brian m. carlson" <sandals@crustytoothpaste.net>
  To: <git@vger.kernel.org>
  Cc: Junio C Hamano <gitster@pobox.com>, Elijah Newren <newren@gmail.com>, Phillip Wood <phillip.wood123@gmail.com>
  Subject: [PATCH 0/1] Object ID support for git merge-file
  Date: Tue, 24 Oct 2023 19:56:54 +0000
  Message-ID: <20231024195655.2413191-1-sandals@crustytoothpaste.net>
  Precedence: bulk
  MIME-Version: 1.0
  Content-Transfer-Encoding: 8bit
Series: Object ID support for git merge-file
  [0/1] Object ID support for git merge-file
  [1/1] merge-file: add an option to process object IDs

brian m. carlson Oct. 24, 2023, 7:56 p.m. UTC

This series introduces an --object-id option to git merge-file such
that, instead of reading and writing from files on the system, it reads
from and writes to the object store using blobs.  This is in use at
GitHub to produce conflict diffs when a merge fails, and it seems
generally useful, so I'm sending it here.

The only tricky piece is the fact that we have to special-case the empty
blob since otherwise it isn't handled correctly.

brian m. carlson (1):
  merge-file: add an option to process object IDs

 Documentation/git-merge-file.txt | 20 +++++++++++
 builtin/merge-file.c             | 58 +++++++++++++++++++++++---------
 t/t6403-merge-file.sh            | 58 ++++++++++++++++++++++++++++++++
 3 files changed, 120 insertions(+), 16 deletions(-)
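
The cover letter above says the empty blob has to be special-cased but does not say why. As a minimal illustration of how a caller could recognize that blob at all, the following Python sketch computes a blob's object ID the way Git defines it (a hash over a "blob <size>\0" header followed by the content) and compares an ID against the empty blob's. The helper names blob_oid and is_empty_blob are hypothetical and are not part of the patch; the sketch assumes a SHA-1 repository.

```python
import hashlib


def blob_oid(content: bytes, algo: str = "sha1") -> str:
    """Return the Git object ID of a blob holding `content`.

    Git hashes the header b"blob <size>\\0" followed by the raw content.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.new(algo, header + content).hexdigest()


# Object ID of the zero-length blob in a SHA-1 repository.
EMPTY_BLOB_SHA1 = blob_oid(b"")


def is_empty_blob(oid: str) -> bool:
    """Hypothetical helper: true if `oid` names the empty blob (SHA-1 only)."""
    return oid.lower() == EMPTY_BLOB_SHA1


print(EMPTY_BLOB_SHA1)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

A full object ID is assumed here; abbreviated IDs or revision expressions would need to be resolved by Git first.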

Elijah Newren Oct. 29, 2023, 6:24 a.m. UTC | #1

On Tue, Oct 24, 2023 at 12:58 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> This series introduces an --object-id option to git merge-file such
> that, instead of reading and writing from files on the system, it reads
> from and writes to the object store using blobs.

This seems like a reasonable capability to want from such a plumbing command.

> This is in use at
> GitHub to produce conflict diffs when a merge fails, and it seems
> generally useful, so I'm sending it here.

But...wouldn't you already have the conflicts generated when doing the
merge and learning that it fails?  Why would you need to generate them
again?

(Also, generating them again may risk getting names munged for
conflict markers in edge cases involving renames.)

That said, even if I have questions about your particular use case, I
think the feature you are submitting here makes sense independently.

I left a few minor questions on the patch itself, but overall it looks
good to me.

Phillip Wood Oct. 29, 2023, 10:15 a.m. UTC | #2

On 29/10/2023 06:24, Elijah Newren wrote:
> On Tue, Oct 24, 2023 at 12:58 PM brian m. carlson
>> This is in use at
>> GitHub to produce conflict diffs when a merge fails, and it seems
>> generally useful, so I'm sending it here.
> 
> But...wouldn't you already have the conflicts generated when doing the
> merge and learning that it fails?  Why would you need to generate them
> again?

I was surprised by this as well, but as you say this seems like a 
useful addition independent of any specific use at GitHub.

Best Wishes

Phillip

Taylor Blau Oct. 30, 2023, 3:54 p.m. UTC | #3

On Sat, Oct 28, 2023 at 11:24:06PM -0700, Elijah Newren wrote:
> On Tue, Oct 24, 2023 at 12:58 PM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> >
> > This series introduces an --object-id option to git merge-file such
> > that, instead of reading and writing from files on the system, it reads
> > from and writes to the object store using blobs.
>
> This seems like a reasonable capability to want from such a plumbing command.

Agreed.

> > This is in use at
> > GitHub to produce conflict diffs when a merge fails, and it seems
> > generally useful, so I'm sending it here.
>
> But...wouldn't you already have the conflicts generated when doing the
> merge and learning that it fails?  Why would you need to generate them
> again?

brian would know better than I do, but I believe the reason is because
the "attempt this merge" RPC is handled separately from the "show me the
merge conflict(s) at xyz path". Those probably could be combined
(obviating the need for this patch), but doing so is probably rather
complicated.

Since this feature is generally useful for callers that haven't already
completed a tree-level merge and really just care about the result of
merging a single path, I don't have any objections here.

Thanks,
Taylor

brian m. carlson Oct. 30, 2023, 4:24 p.m. UTC | #4

On 2023-10-30 at 15:54:14, Taylor Blau wrote:
> On Sat, Oct 28, 2023 at 11:24:06PM -0700, Elijah Newren wrote:
> > But...wouldn't you already have the conflicts generated when doing the
> > merge and learning that it fails?  Why would you need to generate them
> > again?
> 
> brian would know better than I do, but I believe the reason is because
> the "attempt this merge" RPC is handled separately from the "show me the
> merge conflict(s) at xyz path". Those probably could be combined
> (obviating the need for this patch), but doing so is probably rather
> complicated.

That's correct.  They could in theory happen at different times, which
is why they're not linked.

Elijah Newren Oct. 30, 2023, 5:14 p.m. UTC | #5

On Mon, Oct 30, 2023 at 9:24 AM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2023-10-30 at 15:54:14, Taylor Blau wrote:
> > On Sat, Oct 28, 2023 at 11:24:06PM -0700, Elijah Newren wrote:
> > > But...wouldn't you already have the conflicts generated when doing the
> > > merge and learning that it fails?  Why would you need to generate them
> > > again?
> >
> > brian would know better than I do, but I believe the reason is because
> > the "attempt this merge" RPC is handled separately from the "show me the
> > merge conflict(s) at xyz path". Those probably could be combined
> > (obviating the need for this patch), but doing so is probably rather
> > complicated.
>
> That's correct.  They could in theory happen at different times, which
> is why they're not linked.

Maybe this is digging a little into "historical reasons" too much, but
this still seems a little funny.  If they happen at different times,
you still need multiple pieces of information remembered from the
merge operation in order for git-merge-file to be able to regenerate
the conflict correctly in general.  In particular, you need the OIDs
and the filenames.  Trying to regenerate a conflict without
remembering those from the merge step would only work for common
cases, but would be problematic in the face of either renames being
involved or recursive merges or both.  And if you need to remember
information from the merge step, then why not remember the actual
conflicts (or at least the tree OID generated by the merge operation,
which has the conflicts embedded within it)?

I know, I know, there's probably just historical cruft that needs
cleaning up, and I don't think any of this matters to the patch at
hand since it's independently useful.  It just sounds like a system
has been set up that has some rough edge cases caused by a poor
splitting.

> --
> brian m. carlson (he/him or they/them)
> Toronto, Ontario, CA
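
The point above about needing both the OIDs and the filenames can be made concrete with a small Python sketch. It assumes the caller recorded, for each of the three sides, the blob OID and the label it wants in the conflict markers, and it builds a git merge-file invocation from them using the --object-id option proposed in this series together with the existing -p and -L options. The Side class and both function names are hypothetical, and the exit-status handling reflects my reading of the git-merge-file documentation (conflict count, capped at 127, with other values treated as errors) rather than anything stated in this thread.

```python
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Side:
    """What has to be remembered per side: blob OID plus marker label."""
    oid: str
    label: str


def merge_file_argv(current: Side, base: Side, other: Side,
                    to_stdout: bool = True) -> list[str]:
    """Build the command line; order is current, base, other throughout."""
    argv = ["git", "merge-file", "--object-id"]
    if to_stdout:
        argv.append("-p")  # print the merged content instead of storing it
    for side in (current, base, other):
        argv += ["-L", side.label]  # label used in the conflict markers
    argv += [current.oid, base.oid, other.oid]
    return argv


def regenerate_conflict(current: Side, base: Side, other: Side,
                        repo: str = ".") -> tuple[int, str]:
    """Run the merge inside `repo`; return (conflict count, merged text)."""
    proc = subprocess.run(merge_file_argv(current, base, other),
                          cwd=repo, capture_output=True, text=True)
    # Assumption: 0..127 is a conflict count, anything else is an error.
    if not 0 <= proc.returncode <= 127:
        raise RuntimeError(proc.stderr.strip() or "git merge-file failed")
    return proc.returncode, proc.stdout
```

With renames, the three labels can differ from one another, which is why remembering only the OIDs would not be enough to reproduce the original conflict markers.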
