[0/6] Optionally restrict range-diff output to "left" or "right" range only

Message ID pull.869.git.1612469275.gitgitgadget@gmail.com (mailing list archive)

Message

Johannes Schindelin via GitGitGadget Feb. 4, 2021, 8:07 p.m. UTC
One of my quite common workflows is to see whether an ancient topic branch I
have lying about has made it into Git. Since my local commit OIDs have
nothing to do with the OIDs of the corresponding commits in git/git, my only
way is to fire up git range-diff ...upstream/master, but of course that
output contains way more commits than I care about.

To help this use case, here is a patch series that teaches git range-diff
the --left-only and --right-only options in the end, restricting the output
to those commits and commit pairs that correspond to the commits in the
first and the second range, respectively.

The first part of the series contains cleanup patches that are not strictly
related to the feature I implemented here, but since I already have them, I
figured I could just as well contribute them all together.

This patch series is based on js/range-diff-wo-dotdot.

Johannes Schindelin (6):
  range-diff: avoid leaking memory in two error code paths
  range-diff: libify the read_patches() function again
  range-diff: simplify code spawning `git log`
  range-diff: combine all options in a single data structure
  range-diff: move the diffopt initialization down one layer
  range-diff: offer --left-only/--right-only options

 Documentation/git-range-diff.txt |   9 +++
 builtin/log.c                    |  10 ++-
 builtin/range-diff.c             |  21 +++++--
 log-tree.c                       |   8 ++-
 range-diff.c                     | 101 +++++++++++++++++--------------
 range-diff.h                     |  12 +++-
 t/t3206-range-diff.sh            |  15 +++++
 7 files changed, 118 insertions(+), 58 deletions(-)


base-commit: 43718f6741a87f87bd400bdf5264394e980583c5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-869%2Fdscho%2Frange-diff-left-and-right-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-869/dscho/range-diff-left-and-right-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/869

Comments

Junio C Hamano Feb. 4, 2021, 10:41 p.m. UTC | #1
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> One of my quite common workflows is to see whether an ancient topic branch I
> have lying about has made it into Git. Since my local commit OIDs have
> nothing to do with the OIDs of the corresponding commits in git/git, my only
> way is to fire up git range-diff ...upstream/master, but of course that
> output contains way more commits than I care about.
>
> To help this use case, here is a patch series that teaches git range-diff
> the --left-only and --right-only options in the end, restricting the output
> to those commits and commit pairs that correspond to the commits in the
> first and the second range, respectively.

Makes sense.

Taylor Blau Feb. 4, 2021, 10:48 p.m. UTC | #2
On Thu, Feb 04, 2021 at 02:41:39PM -0800, Junio C Hamano wrote:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > One of my quite common workflows is to see whether an ancient topic branch I
> > have lying about has made it into Git. Since my local commit OIDs have
> > nothing to do with the OIDs of the corresponding commits in git/git, my only
> > way is to fire up git range-diff ...upstream/master, but of course that
> > output contains way more commits than I care about.
> >
> > To help this use case, here is a patch series that teaches git range-diff
> > the --left-only and --right-only options in the end, restricting the output
> > to those commits and commit pairs that correspond to the commits in the
> > first and the second range, respectively.
>
> Makes sense.

I'd add an additional use-case, which is ignoring new commits from
upstream when displaying a range-diff in rerolled patch series.

Oftentimes I'll find that the automatically-prepared range diff that
'git format-patch --cover-letter --range-diff' generates will include
new commits from upstream, so these new options should help me ignore
those in the output.

As an aside: I am curious if I'm missing something when you say the
"only way" is to ask for a 'git range-diff ...@{u}'. IIUC what you're
describing, I often resort to using 'git cherry' for that exact thing.
But, I may not be quite understanding your use-case (and why git-cherry
doesn't do what you want already).

My latter question is purely for satisfying my own curiosity; I don't
have any problem with a '--{left,right}-only' option in range-diff. From
my quick read of the patches, it all looks pretty sane to me.


Thanks,
Taylor

Junio C Hamano Feb. 5, 2021, 12:56 a.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

> On Thu, Feb 04, 2021 at 02:41:39PM -0800, Junio C Hamano wrote:
>> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
>> writes:
>>
>> > One of my quite common workflows is to see whether an ancient topic branch I
>> > have lying about has made it into Git. Since my local commit OIDs have
>> > nothing to do with the OIDs of the corresponding commits in git/git, my only
>> > way is to fire up git range-diff ...upstream/master, but of course that
>> > output contains way more commits than I care about.
>> > ...
>> Makes sense.
>
> I'd add an additional use-case, which is ignoring new commits from
> upstream when displaying a range-diff in rerolled patch series.
>
> Oftentimes I'll find that the automatically-prepared range diff that
> 'git format-patch --cover-letter --range-diff' generates will include
> new commits from upstream, so these new options should help me ignore
> those in the output.

Do you mean that the new round is based on an updated upstream
commit, while the old series was based on a bit older upstream?
After rebasing your topic, "range-diff @{1}..." would find the
updates in the base (made in the upstream) plus the new round of
your work on the right hand side of the symmetric range, while the
left hand side solely consists of your old round (unless the
upstream rewound their work, which should not happen).  But that
must not be it, I guess, because in such a case, among the commits
in @{1}..HEAD, we cannot (eh, at least range-diff cannot) tell which
one came from upstream and which one came from our fingers.

So I am a bit puzzled there.

> As an aside: I am curious if I'm missing something when you say the
> "only way" is to ask for a 'git range-diff ...@{u}'. IIUC what you're
> describing, I often resort to using 'git cherry' for that exact thing.
> But, I may not be quite understanding your use-case (and why git-cherry
> doesn't do what you want already).
>
> My latter question is purely for satisfying my own curiosity; I don't
> have any problem with a '--{left,right}-only' option in range-diff. From
> my quick read of the patches, it all looks pretty sane to me.

The question is addressed to Dscho, and I am also somewhat curious.
Perhaps the reason would be that the output from cherry is not as
easy to read as range-diff, without any post-processing.

I do find "range-diff ...@{u}" a bit too blunt and heavy a hammer
for that task, but as they say, when you are familiar with and fond
of a hammer, all tasks look like nails ;-).

Jeff King Feb. 5, 2021, 10:11 a.m. UTC | #4
On Thu, Feb 04, 2021 at 04:56:16PM -0800, Junio C Hamano wrote:

> > As an aside: I am curious if I'm missing something when you say the
> > "only way" is to ask for a 'git range-diff ...@{u}'. IIUC what you're
> > describing, I often resort to using 'git cherry' for that exact thing.
> > But, I may not be quite understanding your use-case (and why git-cherry
> > doesn't do what you want already).
> >
> > My latter question is purely for satisfying my own curiosity; I don't
> > have any problem with a '--{left,right}-only' option in range-diff. From
> > my quick read of the patches, it all looks pretty sane to me.
> 
> The question is addressed to Dscho, and I am also somewhat curious.
> Perhaps the reason would be that the output from cherry is not as
> easy to read as range-diff, without any post-processing.

I had the same curiosity; I'd use git-cherry (or rev-list --cherry) for
this.

I suspect the big difference is the quality of the matching. git-cherry
is purely looking at patch-ids. So it is quite likely to say "this was
not applied upstream" when what got applied differed slightly (e.g.,
fixups upstream, applied to a different base, etc). Whereas range-diff
has some cost heuristics for deciding that two patches are basically the
same thing.  So it would find more cases (and as a bonus, give you the
diff to see what tweaks were made upstream).
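
Editorial illustration (not part of the original message): a small sketch of
the difference described above, under the assumption that upstream applied
the topic's patch with a one-line tweak. The repository, branch names, file
names and the `write_lines` helper are hypothetical. Because the patch-ids no
longer match, `git cherry` should flag the topic commit with `+` (not
upstream), whereas `git range-diff` should still pair the two commits and
mark the pair with `!`.

```shell
#!/bin/sh
# Compare patch-id matching (git cherry) with range-diff's cost-based
# matching when upstream applied a slightly tweaked version of a patch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/upstream
git config user.name "A U Thor"
git config user.email "author@example.com"

# Hypothetical helper: print 20 lines, using "$1" as the text of line 10.
write_lines () {
	i=1
	while [ "$i" -le 20 ]; do
		if [ "$i" -eq 10 ]; then echo "$1"; else echo "line $i"; fi
		i=$((i + 1))
	done
}

echo base >base.txt
git add base.txt
git commit -q -m "base"

# Local version of the patch.
git checkout -q -b topic
write_lines "line 10" >notes.txt
git add notes.txt
git commit -q -m "add notes"

# Upstream: one unrelated commit, then the patch with a one-line fixup.
git checkout -q upstream
echo other >other.txt
git add other.txt
git commit -q -m "unrelated upstream work"
write_lines "line ten (tweaked upstream)" >notes.txt
git add notes.txt
git commit -q -m "add notes"

git checkout -q topic
cherry=$(git cherry upstream topic)          # patch-id comparison only
pairs=$(git range-diff topic...upstream)     # cost-based pairing
printf 'git cherry:\n%s\n\ngit range-diff:\n%s\n' "$cherry" "$pairs"
```

The 20-line file is deliberate: the pairing depends on the tweak being small
relative to the whole patch, so a much shorter patch might not be matched.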

It does make me wonder if it would be useful for rev-list, etc to have
an option to make "--cherry" use the more clever heuristics instead of
just a patch-id. It would never show the same diff output as range-diff,
but maybe more scripts would find the advanced heuristic useful.

I know it would probably make rebase's "ignore if in upstream" feature
less clunky when I rebase topics. But it would also make it more
dangerous! E.g., right now I see any upstream tweaks as potential
conflicts when I rebase, and I manually review them for sanity.

-Peff

Taylor Blau Feb. 5, 2021, 8:05 p.m. UTC | #5
On Thu, Feb 04, 2021 at 04:56:16PM -0800, Junio C Hamano wrote:
> > I'd add an additional use-case, which is ignoring new commits from
> > upstream when displaying a range-diff in rerolled patch series.
> >
> > Oftentimes I'll find that the automatically-prepared range diff that
> > 'git format-patch --cover-letter --range-diff' generates will include
> > new commits from upstream, so these new options should help me ignore
> > those in the output.
>
> Do you mean that the new round is based on an updated upstream
> commit, while the old series was based on a bit older upstream?
> After rebasing your topic, "range-diff @{1}..." would find the
> updates in the base (made in the upstream) plus the new round of
> your work on the right hand side of the symmetric range, while the
> left hand side solely consists of your old round (unless the
> upstream rewound their work, which should not happen).  But that
> must not be it, I guess, because in such a case, among the commits
> in @{1}..HEAD, we cannot (eh, at least range-diff cannot) tell which
> one came from upstream and which one came from our fingers.
>
> So I am a bit puzzled there.

I'm talking about a situation where a later re-roll is based off of a
newer upstream. But your judgement is right: upstream's updates look
like "new" commits on the right-hand side.

I have some scripts built around this, but they all boil down to passing
'--range-diff=@{1}' (where @{1} is the tip of the previous reroll) to
format-patch. See:

    https://github.com/ttaylorr/dotfiles/blob/work-gh/bin/git-mail#L8-L10

for details.

IIUC this series, I think I'd also want to start passing '--left-only'
to ignore the new commits from upstream in a range-diff, no?

Thanks,
Taylor

Johannes Schindelin Feb. 8, 2021, 10:36 p.m. UTC | #6
Hi Peff,

On Fri, 5 Feb 2021, Jeff King wrote:

> On Thu, Feb 04, 2021 at 04:56:16PM -0800, Junio C Hamano wrote:
>
> > > As an aside: I am curious if I'm missing something when you say the
> > > "only way" is to ask for a 'git range-diff ...@{u}'. IIUC what you're
> > > describing, I often resort to using 'git cherry' for that exact thing.
> > > But, I may not be quite understanding your use-case (and why git-cherry
> > > doesn't do what you want already).
> > >
> > > My latter question is purely for satisfying my own curiosity; I don't
> > > have any problem with a '--{left,right}-only' option in range-diff. From
> > > my quick read of the patches, it all looks pretty sane to me.
> >
> > The question is addressed to Dscho, and I am also somewhat curious.
> > Perhaps the reason would be that the output from cherry is not as
> > easy to read as range-diff, without any post-processing.
>
> I had the same curiosity; I'd use git-cherry (or rev-list --cherry) for
> this.
>
> I suspect the big difference is the quality of the matching. git-cherry
> is purely looking at patch-ids.

Indeed. Whenever I had tried `git cherry` in the past (which, admittedly,
has been with geometrically decreasing frequency given the results), it
completely failed to help me. And it's not only its reliance on perfect
matches of the diff _with context lines_, it is also that the commit
messages are completely ignored.

`git cherry`'s track record with me is so poor that I want to put this
line into all my Bash profiles:

	eval "$(set | sed -n '/^__git_main /,/^}$/{s/--list-cmds=list-mainporcelain[^)]*/& | grep -v ^cherry\$/;p}')"

> So it is quite likely to say "this was not applied upstream" when what
> got applied differed slightly (e.g., fixups upstream, applied to a
> different base, etc). Whereas range-diff has some cost heuristics for
> deciding that two patches are basically the same thing.  So it would
> find more cases (and as a bonus, give you the diff to see what tweaks
> were made upstream).
>
> It does make me wonder if it would be useful for rev-list, etc to have
> an option to make "--cherry" use the more clever heuristics instead of
> just a patch-id. It would never show the same diff output as range-diff,
> but maybe more scripts would find the advanced heuristic useful.
>
> I know it would probably make rebase's "ignore if in upstream" feature
> less clunky when I rebase topics. But it would also make it more
> dangerous! E.g., right now I see any upstream tweaks as potential
> conflicts when I rebase, and I manually review them for sanity.

Yeah, I thought the same when I read the paragraphs before this one. It
might sound convenient, but there _are_ false positives in `git
range-diff`'s output, therefore I would recommend never using
`git range-diff --left-only` or `[...] --right-only` with `-s`. IOW
_always_ inspect the differences.

Ciao,
Dscho