mbox series

[00/26] git-log: implement new --diff-merge options

Message ID 20201101193330.24775-1-sorganov@gmail.com (mailing list archive)
Headers show
Series git-log: implement new --diff-merge options | expand

Message

Sergey Organov Nov. 1, 2020, 7:33 p.m. UTC
These patch series implement new set of options governing the diff output
of merge commits, all under the umbrella of the single --diff-merges=<mode>
option. Most of the new options being synonyms for -m/-c/--cc options,
there is also additional functionality provided, allowing to get the format
of "-p --first-parent" without change in history traversal that
--first-parent option causes.

The net result of these series are the following new options:

--diff-merges=	 |  old equivalent
-----------------+----------------
first-parent     | --first-parent (only format implications)
separate         | -m
combined         | -c
dense-combined   | --cc

The series also cleanup logic of handling of diff merges options and fix an
issue found in the original implementation where logically mutually
exclusive options -m/-c/--cc failed to actually override each other.

The series start with the set of pure refactoring commits that are expected
to introduce no functional changes. These are all commits up to and
including:

"diff-merges: revise revs->diff flag handling"

The aim of these commits is to isolate options handling for diff merges so
that it could be easily understood and tweaked to ease introduction of the
new options.

Then the fix of -m/-c/-cc overriding issue follows, starting with a failing
test and followed by the fix.

Then follows a little bit of additional refactoring in order to prepare for
introduction of the new options, and finally the series are finished by the
implementation, testing, and documentation update for the new options.

Sergey Organov (26):
  revision: factor out parsing of diff-merge related options
  revision: factor out setup of diff-merge related settings
  revision: factor out initialization of diff-merge related settings
  revision: provide implementation for diff merges tweaks
  revision: move diff merges functions to its own diff-merges.c
  diff-merges: rename all functions to have common prefix
  diff-merges: move checks for first_parent_only out of the module
  diff-merges: rename diff_merges_default_to_enable() to match semantics
  diff-merges: re-arrange functions to match the order they are called
    in
  diff-merges: new function diff_merges_suppress()
  diff-merges: new function diff_merges_set_dense_combined_if_unset()
  diff-merges: introduce revs->first_parent_merges flag
  diff-merges: revise revs->diff flag handling
  t4013: support test_expect_failure through ':failure' magic
  t4013: add tests for -m failing to override -c/--cc
  diff-merges: fix -m to properly override -c/--cc
  diff-merges: split 'ignore_merges' field
  diff-merges: group diff-merge flags next to each other inside
    'rev_info'
  diff-merges: get rid of now empty diff_merges_init_revs()
  diff-merges: refactor opt settings into separate functions
  diff-merges: make -m/-c/--cc explicitly mutually exclusive
  diff-merges: implement new values for --diff-merges
  t4013: add test for --diff-merges=first-parent
  doc/git-log: describe new --diff-merges options
  doc/diff-generate-patch: mention new --diff-merges option
  doc/rev-list-options: document --first-parent implies
    --diff-merges=first-parent

 Documentation/diff-generate-patch.txt         |   6 +-
 Documentation/git-log.txt                     |  79 ++++---
 Documentation/rev-list-options.txt            |   3 +
 Makefile                                      |   1 +
 builtin/diff-files.c                          |   5 +-
 builtin/diff.c                                |   9 +-
 builtin/log.c                                 |  18 +-
 builtin/merge.c                               |   3 +-
 diff-merges.c                                 | 120 +++++++++++
 diff-merges.h                                 |  18 ++
 fmt-merge-msg.c                               |   3 +-
 log-tree.c                                    |  17 +-
 revision.c                                    |  38 +---
 revision.h                                    |   7 +-
 t/t4013-diff-various.sh                       |  10 +-
 t/t4013/diff.log_--cc_-m_-p_master            | 200 ++++++++++++++++++
 t/t4013/diff.log_-c_-m_-p_master              | 200 ++++++++++++++++++
 ...f.log_-p_--diff-merges=first-parent_master | 137 ++++++++++++
 18 files changed, 774 insertions(+), 100 deletions(-)
 create mode 100644 diff-merges.c
 create mode 100644 diff-merges.h
 create mode 100644 t/t4013/diff.log_--cc_-m_-p_master
 create mode 100644 t/t4013/diff.log_-c_-m_-p_master
 create mode 100644 t/t4013/diff.log_-p_--diff-merges=first-parent_master

Comments

Sergey Organov Dec. 8, 2020, 8:07 p.m. UTC | #1
Sergey Organov <sorganov@gmail.com> writes:


[...]

> The series also cleanup logic of handling of diff merges options and
> fix an issue found in the original implementation where logically
> mutually exclusive options -m/-c/--cc failed to actually override each
> other.

Working further on this, I've noticed very irregular interactions
between -m/-c/--cc and --oneline:

1. --oneline disables -m output for 'git log', and leaves -m output enabled
for 'git show':

$ /usr/bin/git show -n1 -m --oneline 2e673356aef | wc -l
80
$ /usr/bin/git log -n1 -m --oneline 2e673356aef | wc -l
1

2. For 'git log', --oneline disables -m output, and leaves -c/--cc output
enabled:

$ /usr/bin/git log -n1 -m --oneline 2e673356aef | wc -l
1
$ /usr/bin/git log -n1 -c --oneline 2e673356aef | wc -l
16
$ /usr/bin/git log -n1 --cc --oneline 2e673356aef | wc -l
16

The question is: what's the right interaction between --oneline and
-m/-c/--cc?

I tend to think they should be independent, so that --oneline doesn't
affect diff output, and then the only offender is -m.

What do you think?

$ /usr/bin/git --version
git version 2.25.1

Thanks,
-- Sergey Organov
Elijah Newren Dec. 8, 2020, 8:52 p.m. UTC | #2
Hi Sergey,

On Tue, Dec 8, 2020 at 12:07 PM Sergey Organov <sorganov@gmail.com> wrote:
>
> Sergey Organov <sorganov@gmail.com> writes:
>
>
> [...]
>
> > The series also cleanup logic of handling of diff merges options and
> > fix an issue found in the original implementation where logically
> > mutually exclusive options -m/-c/--cc failed to actually override each
> > other.
>
> Working further on this, I've noticed very irregular interactions
> between -m/-c/--cc and --oneline:
>
> 1. --oneline disables -m output for 'git log', and leaves -m output enabled
> for 'git show':
>
> $ /usr/bin/git show -n1 -m --oneline 2e673356aef | wc -l
> 80
> $ /usr/bin/git log -n1 -m --oneline 2e673356aef | wc -l
> 1

If you leave off --oneline, you'll note that git show produces a diff
and git log does not (regardless of whether 2e673356aef is a merge
commit or a regular commit).  So, I don't think this is related to
--oneline.

> 2. For 'git log', --oneline disables -m output, and leaves -c/--cc output
> enabled:
>
> $ /usr/bin/git log -n1 -m --oneline 2e673356aef | wc -l
> 1
> $ /usr/bin/git log -n1 -c --oneline 2e673356aef | wc -l
> 16
> $ /usr/bin/git log -n1 --cc --oneline 2e673356aef | wc -l
> 16
>
> The question is: what's the right interaction between --oneline and
> -m/-c/--cc?

I believe the right question is: Should -m be a no-op unless -p is
also specified?  In the past, --cc and -c were no-ops except when -p
was also specified.  It was somewhat unfriendly and surprising, and
thus was changed so that --cc and -c implied -p (and thus would cause
output for non-merge commits to be shown differently, namely shown
with a diff, in addition to affecting the type of diff shown for merge
commits).  I think -m was overlooked at the time.

> I tend to think they should be independent, so that --oneline doesn't
> affect diff output, and then the only offender is -m.

I agree that they should be independent, but I believe they are
already independent unless you have more evidence of weirdness
somewhere.  The differences you are seeing are due to -m, -c, and --cc
being handled differently, and I think we should probably just give -m
the same treatment that we give to -c and --cc (namely, make all three
imply -p).

Hope that helps,
Elijah
Sergey Organov Dec. 8, 2020, 10:30 p.m. UTC | #3
Elijah Newren <newren@gmail.com> writes:

> Hi Sergey,

Hi Elijah,

>
> On Tue, Dec 8, 2020 at 12:07 PM Sergey Organov <sorganov@gmail.com> wrote:
>>
>> Sergey Organov <sorganov@gmail.com> writes:
>>
>>
>> [...]
>>
>> > The series also cleanup logic of handling of diff merges options and
>> > fix an issue found in the original implementation where logically
>> > mutually exclusive options -m/-c/--cc failed to actually override each
>> > other.
>>
>> Working further on this, I've noticed very irregular interactions
>> between -m/-c/--cc and --oneline:
>>
>> 1. --oneline disables -m output for 'git log', and leaves -m output enabled
>> for 'git show':
>>
>> $ /usr/bin/git show -n1 -m --oneline 2e673356aef | wc -l
>> 80
>> $ /usr/bin/git log -n1 -m --oneline 2e673356aef | wc -l
>> 1
>
> If you leave off --oneline, you'll note that git show produces a diff
> and git log does not (regardless of whether 2e673356aef is a merge
> commit or a regular commit).  So, I don't think this is related to
> --oneline.

Yeah, looks exactly like this, thanks for correcting!

>
>> 2. For 'git log', --oneline disables -m output, and leaves -c/--cc output
>> enabled:
>>
>> $ /usr/bin/git log -n1 -m --oneline 2e673356aef | wc -l
>> 1
>> $ /usr/bin/git log -n1 -c --oneline 2e673356aef | wc -l
>> 16
>> $ /usr/bin/git log -n1 --cc --oneline 2e673356aef | wc -l
>> 16
>>
>> The question is: what's the right interaction between --oneline and
>> -m/-c/--cc?
>
> I believe the right question is: Should -m be a no-op unless -p is
> also specified?

Right.

> In the past, --cc and -c were no-ops except when -p
> was also specified.  It was somewhat unfriendly and surprising, and
> thus was changed so that --cc and -c implied -p (and thus would cause
> output for non-merge commits to be shown differently, namely shown
> with a diff, in addition to affecting the type of diff shown for merge
> commits).

Well, so one surprise has been replaced with another, supposedly more
friendly, right?

I mean, obviously, with --cc I don't ask for diffs for non-merge
commits, so it is still a surprise they are thrown at me.

> I think -m was overlooked at the time.

Looks like it was, but maybe there was rather an actual reason for not
implying -p by -m? Maybe Junio will tell?

>
>> I tend to think they should be independent, so that --oneline doesn't
>> affect diff output, and then the only offender is -m.
>
> I agree that they should be independent, but I believe they are
> already independent unless you have more evidence of weirdness
> somewhere.  The differences you are seeing are due to -m, -c, and --cc
> being handled differently, and I think we should probably just give -m
> the same treatment that we give to -c and --cc (namely, make all three
> imply -p).

I think that either all diff-merge options should imply -p, or none,
from the POV of least surprise.

However, it'd give us yet another challenge: for some time already,
--first-parent implies -m, that once it starts to imply -p, will result in

  git log --first-parent

suddenly producing diff output for everything.

One way out I see is to specify that implied -m/-c/--cc don't imply
-p, only explicit do.

Entirely different approach is to get rid of -m/-c/--cc implying -p, and
just produce diff output for merges independently on -p being provided
or not. This will give us additional functionality (ability to get diff
for merges, but not for regulars), and will get rid of all the related
surprises.

Thoughts?

Thanks,
-- Sergey Organov
Elijah Newren Dec. 8, 2020, 11:11 p.m. UTC | #4
On Tue, Dec 8, 2020 at 2:30 PM Sergey Organov <sorganov@gmail.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > Hi Sergey,
>
> Hi Elijah,
>
> >
> > On Tue, Dec 8, 2020 at 12:07 PM Sergey Organov <sorganov@gmail.com> wrote:
> >>
> >> Sergey Organov <sorganov@gmail.com> writes:
> >>
> >>
> >> [...]
> >>
> >> > The series also cleanup logic of handling of diff merges options and
> >> > fix an issue found in the original implementation where logically
> >> > mutually exclusive options -m/-c/--cc failed to actually override each
> >> > other.
> >>
> >> Working further on this, I've noticed very irregular interactions
> >> between -m/-c/--cc and --oneline:
> >>
> >> 1. --oneline disables -m output for 'git log', and leaves -m output enabled
> >> for 'git show':
> >>
> >> $ /usr/bin/git show -n1 -m --oneline 2e673356aef | wc -l
> >> 80
> >> $ /usr/bin/git log -n1 -m --oneline 2e673356aef | wc -l
> >> 1
> >
> > If you leave off --oneline, you'll note that git show produces a diff
> > and git log does not (regardless of whether 2e673356aef is a merge
> > commit or a regular commit).  So, I don't think this is related to
> > --oneline.
>
> Yeah, looks exactly like this, thanks for correcting!
>
> >
> >> 2. For 'git log', --oneline disables -m output, and leaves -c/--cc output
> >> enabled:
> >>
> >> $ /usr/bin/git log -n1 -m --oneline 2e673356aef | wc -l
> >> 1
> >> $ /usr/bin/git log -n1 -c --oneline 2e673356aef | wc -l
> >> 16
> >> $ /usr/bin/git log -n1 --cc --oneline 2e673356aef | wc -l
> >> 16
> >>
> >> The question is: what's the right interaction between --oneline and
> >> -m/-c/--cc?
> >
> > I believe the right question is: Should -m be a no-op unless -p is
> > also specified?
>
> Right.
>
> > In the past, --cc and -c were no-ops except when -p
> > was also specified.  It was somewhat unfriendly and surprising, and
> > thus was changed so that --cc and -c implied -p (and thus would cause
> > output for non-merge commits to be shown differently, namely shown
> > with a diff, in addition to affecting the type of diff shown for merge
> > commits).
>
> Well, so one surprise has been replaced with another, supposedly more
> friendly, right?
>
> I mean, obviously, with --cc I don't ask for diffs for non-merge
> commits, so it is still a surprise they are thrown at me.

Actually, that wasn't a side-effect but part of the intended change --
see https://lore.kernel.org/git/1440110591-12941-1-git-send-email-gitster@pobox.com/.

> > I think -m was overlooked at the time.
>
> Looks like it was, but maybe there was rather an actual reason for not
> implying -p by -m? Maybe Junio will tell?
>
> >
> >> I tend to think they should be independent, so that --oneline doesn't
> >> affect diff output, and then the only offender is -m.
> >
> > I agree that they should be independent, but I believe they are
> > already independent unless you have more evidence of weirdness
> > somewhere.  The differences you are seeing are due to -m, -c, and --cc
> > being handled differently, and I think we should probably just give -m
> > the same treatment that we give to -c and --cc (namely, make all three
> > imply -p).
>
> I think that either all diff-merge options should imply -p, or none,
> from the POV of least surprise.
>
> However, it'd give us yet another challenge: for some time already,
> --first-parent implies -m, that once it starts to imply -p, will result in
>
>   git log --first-parent
>
> suddenly producing diff output for everything.

That is definitely a pickle.

> One way out I see is to specify that implied -m/-c/--cc don't imply
> -p, only explicit do.
>
> Entirely different approach is to get rid of -m/-c/--cc implying -p, and
> just produce diff output for merges independently on -p being provided
> or not. This will give us additional functionality (ability to get diff
> for merges, but not for regulars), and will get rid of all the related
> surprises.
>
> Thoughts?

I was happy when I found out that --cc had changed to imply -p; I
guess I felt the same as Junio did with his rationale in the link I
posted above.  I've made --remerge-diff behave like --cc (i.e. it
implies -p), and I like it there too.  I use it both to turn on diffs
for merges, and to turn on diffs for regular commits without having to
specify the extra -p flag.  I guess I'm not sure why one would ever
want to see diffs for merges and not for normal commits.  Even in the
unusual case someone did, couldn't they just pass --merges (to strip
out the normal commits entirely)?

I may not have the best vantage point on this, though, because I
personally don't see enough utility in diffing a merge to just one of
its parents that it'd merit having an option to git-log, and yet we
clearly have two such options already (-m and --first-parent when
combined with -p).

But, there is at least one more way to get out of this pickle besides
the two options you listed above: we could make --first-parent be just
about commit limiting and not imply anything about diff behavior.
Honestly, I find it a little surprising that despite the fact that log
-p shows nothing for merge commits, that when I add --first-parent to
see a subset of commits I suddenly get weird, huge diffs shown for the
merges (yeah, yeah, I learned recently that it's documented behavior,
so it's not surprising anymore, just weird).  So, this wouldn't just
get rid of this new nasty pickle, but would remove another negative
surprise too.  If we're going to make a behavioral change, I'd rather
we fixed this side rather than the (IMO) nicely working --cc/-c side.


Hope that helps,
Elijah
Junio C Hamano Dec. 9, 2020, 1:17 a.m. UTC | #5
Elijah Newren <newren@gmail.com> writes:

> ... I guess I'm not sure why one would ever
> want to see diffs for merges and not for normal commits.  Even in the
> unusual case someone did, couldn't they just pass --merges (to strip
> out the normal commits entirely)?

Giving "--merges" would skip the single-strand-of-pearls commits
entirely, not even showing their log messages, so it won't be an
equivalent substitute.

> Honestly, I find it a little surprising that despite the fact that log
> -p shows nothing for merge commits, that when I add --first-parent to
> see a subset of commits I suddenly get weird, huge diffs shown for the
> merges (yeah, yeah, I learned recently that it's documented behavior,
> so it's not surprising anymore, just weird).

I view "--first-parent" as telling "git log" to pretend all merges
are single-parent commits (as if you did squash merges), and it is a
natural consequence if "log --first-parent -p" showed each commit
with its first parent, whether it is a merge or a single-parent
commit.
Elijah Newren Dec. 9, 2020, 3:06 a.m. UTC | #6
Hi Junio,

On Tue, Dec 8, 2020 at 5:17 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > ... I guess I'm not sure why one would ever
> > want to see diffs for merges and not for normal commits.  Even in the
> > unusual case someone did, couldn't they just pass --merges (to strip
> > out the normal commits entirely)?
>
> Giving "--merges" would skip the single-strand-of-pearls commits
> entirely, not even showing their log messages, so it won't be an
> equivalent substitute.

Right, I said that too in my parenthetical comment; --merges would
strip out the normal commits entirely.  As I said, I'm not sure why
anyone would ever want to see diffs for merges and not for normal
commits, the closest useful thing I can imagine is commit messages +
diffs for just merges, stripping the normal commits.  Is there a
usecase here (other than the motivating example of trying to remove an
inconsistency between -m and --cc output)?

I'd personally dislike needing to specify --cc and -p together when
today I can specify just --cc.  You said as much too at [1].  Have you
since changed your mind?

[1] https://lore.kernel.org/git/1440110591-12941-1-git-send-email-gitster@pobox.com/

> > Honestly, I find it a little surprising that despite the fact that log
> > -p shows nothing for merge commits, that when I add --first-parent to
> > see a subset of commits I suddenly get weird, huge diffs shown for the
> > merges (yeah, yeah, I learned recently that it's documented behavior,
> > so it's not surprising anymore, just weird).
>
> I view "--first-parent" as telling "git log" to pretend all merges
> are single-parent commits (as if you did squash merges), and it is a
> natural consequence if "log --first-parent -p" showed each commit
> with its first parent, whether it is a merge or a single-parent
> commit.

Alright, fair enough.  I had always viewed it as commit limiting only
(and thus why it looked weird to me), but I really don't use
--first-parent or -m much.

But that leaves Sergey's question unanswered: should the inconsistency
between --cc and -m (namely that only the former implies -p) be
removed?  If so, since --first-parent implies -m, what's the right way
to avoid --first-parent becoming weird?  (Allow explicit -m to imply
-p, but not an implicit -m that came from --first-parent?)
Junio C Hamano Dec. 9, 2020, 3:22 a.m. UTC | #7
Elijah Newren <newren@gmail.com> writes:

> ...  As I said, I'm not sure why
> anyone would ever want to see diffs for merges and not for normal
> commits, the closest useful thing I can imagine is commit messages +
> diffs for just merges, stripping the normal commits.

If I can run "git log --some-options master..next" (or more
realistically, over the range ko/next..next) to get individual
commits (without patch) and merges (only when --cc gives some
interesting nearby-changes), I would be very happy.  But is there a
set of options that lets me do so?
Elijah Newren Dec. 9, 2020, 3:31 a.m. UTC | #8
On Tue, Dec 8, 2020 at 7:22 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > ...  As I said, I'm not sure why
> > anyone would ever want to see diffs for merges and not for normal
> > commits, the closest useful thing I can imagine is commit messages +
> > diffs for just merges, stripping the normal commits.
>
> If I can run "git log --some-options master..next" (or more
> realistically, over the range ko/next..next) to get individual
> commits (without patch) and merges (only when --cc gives some
> interesting nearby-changes), I would be very happy.  But is there a
> set of options that lets me do so?

So, you're saying you changed your mind since five years ago?[1]  Or
that what you said five years ago is still valid, but you'd appreciate
more/different options that allow this new thing?

[1] https://lore.kernel.org/git/1440110591-12941-1-git-send-email-gitster@pobox.com/

We got to this point in this thread because Sergey suggested reverting
the above change (as one option) in order to bring more consistency
between -m and -c/--cc.
Junio C Hamano Dec. 9, 2020, 4:18 a.m. UTC | #9
Elijah Newren <newren@gmail.com> writes:

>> If I can run "git log --some-options master..next" (or more
>> realistically, over the range ko/next..next) to get individual
>> commits (without patch) and merges (only when --cc gives some
>> interesting nearby-changes), I would be very happy.  But is there a
>> set of options that lets me do so?
>
> So, you're saying you changed your mind since five years ago?[1]  Or
> that what you said five years ago is still valid, but you'd appreciate
> more/different options that allow this new thing?
>
> [1] https://lore.kernel.org/git/1440110591-12941-1-git-send-email-gitster@pobox.com/

Sorry, but I am not seeing in [1] anything that relates to the above
"want to see --cc patch for merge but just log message for single
parent commit". 5 years is a long time even in Git timescale, so I
would not be surprised if I changed my mind over time, but I am not
sure what opinion on the matter you think I expressed back then.

"git log --cc master..next" shows all commits' log messages, patch
for each single-parent commit, and combined-dense patch for each
merge.  There is no option to squelch the patch for only single
parent commits.  It may not be such a bad thing to have as an extra
option.

So, I think what I am saying is that ...

> > ...  As I said, I'm not sure why
> > anyone would ever want to see diffs for merges and not for normal
> > commits, the closest useful thing I can imagine is commit messages +
> > diffs for just merges, stripping the normal commits.

... I see use for such a feature (assuming that you didn't mean by
"diffs for merges" a regular "--first-parent -p" patch, but meant to
say "--cc" patch) in my workflow.  I'd review "log ko/next..next"
before deciding to push out the day's integration of 'next', and at
that point, I trust individual commits that came from contributors
well enough (otherwise I wouldn't be merging them to 'next'), but I
would appreciate the last chance to re-examine conflict resolutions
in merges.

It does not mean that I do not like the current behaviour that
"--cc" always implies "-p"; it is convenient.  It's just I find the
lack of feature slightly less than ideal, but I do not care deeply
enough to design how to express such a feature from the command
line.
Elijah Newren Dec. 9, 2020, 4:54 a.m. UTC | #10
On Tue, Dec 8, 2020 at 8:18 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> >> If I can run "git log --some-options master..next" (or more
> >> realistically, over the range ko/next..next) to get individual
> >> commits (without patch) and merges (only when --cc gives some
> >> interesting nearby-changes), I would be very happy.  But is there a
> >> set of options that lets me do so?
> >
> > So, you're saying you changed your mind since five years ago?[1]  Or
> > that what you said five years ago is still valid, but you'd appreciate
> > more/different options that allow this new thing?
> >
> > [1] https://lore.kernel.org/git/1440110591-12941-1-git-send-email-gitster@pobox.com/
>
> Sorry, but I am not seeing in [1] anything that relates to the above
> "want to see --cc patch for merge but just log message for single
> parent commit". 5 years is a long time even in Git timescale, so I
> would not be surprised if I changed my mind over time, but I am not
> sure what opinion on the matter you think I expressed back then.
>
> "git log --cc master..next" shows all commits' log messages, patch
> for each single-parent commit, and combined-dense patch for each
> merge.  There is no option to squelch the patch for only single
> parent commits.  It may not be such a bad thing to have as an extra
> option.
>
> So, I think what I am saying is that ...
>
> > > ...  As I said, I'm not sure why
> > > anyone would ever want to see diffs for merges and not for normal
> > > commits, the closest useful thing I can imagine is commit messages +
> > > diffs for just merges, stripping the normal commits.
>
> ... I see use for such a feature (assuming that you didn't mean by
> "diffs for merges" a regular "--first-parent -p" patch, but meant to
> say "--cc" patch) in my workflow.  I'd review "log ko/next..next"
> before deciding to push out the day's integration of 'next', and at
> that point, I trust individual commits that came from contributors
> well enough (otherwise I wouldn't be merging them to 'next'), but I
> would appreciate the last chance to re-examine conflict resolutions
> in merges.
>
> It does not mean that I do not like the current behaviour that
> "--cc" always implies "-p"; it is convenient.  It's just I find the
> lack of feature slightly less than ideal, but I do not care deeply
> enough to design how to express such a feature from the command
> line.

Okay, thanks for clarifying.  It sounds like you were focusing on the
tangentially related comment I made (diffs for merges and not for
normal commits) while I was focusing on Sergey's question (should we
revert --cc implies -p).  I was having a hard time understanding if
you were answering his question or not.  This last paragraph of yours
acknowledges the question, though you still avoid answering it.  :-)

However, even my focus was on a secondary question.  His real original
question is: -m and --cc are inconsistent -- one requires -p, while
the other doesn't.  Should that be fixed...and which option(s) should
change?  He gave two possibilities I didn't like.  I added a third
that you didn't like.  So...
Junio C Hamano Dec. 9, 2020, 5:24 a.m. UTC | #11
Elijah Newren <newren@gmail.com> writes:

>> It does not mean that I do not like the current behaviour that
>> "--cc" always implies "-p"; it is convenient.  It's just I find the
>> lack of feature slightly less than ideal, but I do not care deeply
>> enough to design how to express such a feature from the command
>> line.
>
> Okay, thanks for clarifying.  It sounds like you were focusing on the
> tangentially related comment I made (diffs for merges and not for
> normal commits) while I was focusing on Sergey's question (should we
> revert --cc implies -p).  I was having a hard time understanding if
> you were answering his question or not.  This last paragraph of yours
> acknowledges the question, though you still avoid answering it.  :-)

I do like the current behaviour that "--cc" always implies "-p"; it
is convenient.  I just wish there were a way to say "enable -p only
for merges" while taking advantage of the convenience.

So, no we should not stop "--cc" or "-c" implying "-p".

When the end-user gives "-m" on the command line of either "show" or
"log", it cleanly means the user is interested in "individual"
patches shown for each parent when showing a merge commit, so "-p"
implied by "-m" would make sense just as much as "-p" implied by
"--cc".  The same comment about the lack of "want -p to be implied
but only for merges" as an add-on feature applies here, though.

I suspect that the real reason why "-m" does not imply "-p" was
merely a historical implementation detail (e.g. imagine that we
wanted *not* to show things by default in a subcommand in "log"
family, but when "-p" is used, wanted it to mean "-m -p", or
something like that.  Setting "-m" on by default without explicitly
given from the command line, and making sure that m-bit does not
imply "-p", would be an easy-to-implement such a "when '-p' is
given, take it to mean as '-m -p'" _feature_).

If I were to decide now with hindsight, perhaps I'd make "--cc" and
"-m" imply "-p" only for merge commits, and the user can explicitly
give "--cc -p" and "-m -p" to ask patches for single-parent commits
to be shown as well.
Junio C Hamano Dec. 9, 2020, 6:40 a.m. UTC | #12
Junio C Hamano <gitster@pobox.com> writes:

A clarification and a correction.

> I suspect that the real reason why "-m" does not imply "-p" was
> merely a historical implementation detail...

Now I remember better.  The reason was pure oversight.

In the beginning, there was no patch output for merges.  As most
merges just resolve cleanly, and back then the first-parent chains
were treated as much much less special than we treat them today,
"git log -p" showed only patches for single-parent commits and
everybody was happy.  It could have been a possible alternative
design to show first-parent diff for a merge instead of showing no
patch, but because the traversal went to side branches, the changes
made by the merge to the mainline as a single big patch would have
been redundant---we would be seeing individual patches from the side
branch anyway.

Then later we introduced "-m -p"; since the first-parent chain was
not considered all that special, we treated each parent equally.
Nobody, not even Linus and I, thought it was useful by itself even
back then, but we didn't have anything better.

I think it was Paul Mackerras's "gitk" that invented the concept of
combined merges.  We liked it quite a lot, and added "-c" and "--cc"
soon after that, to the core git and kept polishing, until "gitk"
stopped combining the patches with each parent in tcl/tk script and
instead started telling "git" to show with "--cc".

By the time the change to make "--cc" imply "-p" was introduced, it
was pretty much given that "-m -p" was useful to anybody, unless you
are consuming these individual patches in a script or something like
that.  So simply I didn't even think of making "-m" imply "-p".  It
would be logical to make it so, but it would not add much practical
value, I would have to say.

> If I were to decide now with hindsight, perhaps I'd make "--cc" and
> "-m" imply "-p" only for merge commits, and the user can explicitly
> give "--cc -p" and "-m -p" to ask patches for single-parent commits
> to be shown as well.

After "now with hindsight", I need to add "and without having to
worry about backward compatibility issues" here.  IOW, the above is
not my recommendation.  It would be the other way around: "--cc"
implies "-p" for both merges and non-merges, "-m" implies "-p" for
both merges and non-merges.  It is acceptable to add a new option
"--no-patch-for-non-merge" so that the user can ask to see only the
combined diff for merges and no patches for individual commits.

Both "--no-patch-for-non-merge" option, and making "-m" imply "-p"
are very low priority from my point of view, though, since our users
(including me) lived without the former and have been happily using
"log --cc" for a long time, and we've written off the latter as
pretty much useless combination unless you are a script.
Sergey Organov Dec. 9, 2020, 1:34 p.m. UTC | #13
Elijah Newren <newren@gmail.com> writes:

> On Tue, Dec 8, 2020 at 8:18 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Elijah Newren <newren@gmail.com> writes:
>>
>> >> If I can run "git log --some-options master..next" (or more
>> >> realistically, over the range ko/next..next) to get individual
>> >> commits (without patch) and merges (only when --cc gives some
>> >> interesting nearby-changes), I would be very happy.  But is there a
>> >> set of options that lets me do so?
>> >
>> > So, you're saying you changed your mind since five years ago?[1]  Or
>> > that what you said five years ago is still valid, but you'd appreciate
>> > more/different options that allow this new thing?
>> >
>> > [1]
>> > https://lore.kernel.org/git/1440110591-12941-1-git-send-email-gitster@pobox.com/
>>
>> Sorry, but I am not seeing in [1] anything that relates to the above
>> "want to see --cc patch for merge but just log message for single
>> parent commit". 5 years is a long time even in Git timescale, so I
>> would not be surprised if I changed my mind over time, but I am not
>> sure what opinion on the matter you think I expressed back then.
>>
>> "git log --cc master..next" shows all commits' log messages, patch
>> for each single-parent commit, and combined-dense patch for each
>> merge.  There is no option to squelch the patch for only single
>> parent commits.  It may not be such a bad thing to have as an extra
>> option.
>>
>> So, I think what I am saying is that ...
>>
>> > > ...  As I said, I'm not sure why
>> > > anyone would ever want to see diffs for merges and not for normal
>> > > commits, the closest useful thing I can imagine is commit messages +
>> > > diffs for just merges, stripping the normal commits.
>>
>> ... I see use for such a feature (assuming that you didn't mean by
>> "diffs for merges" a regular "--first-parent -p" patch, but meant to
>> say "--cc" patch) in my workflow.  I'd review "log ko/next..next"
>> before deciding to push out the day's integration of 'next', and at
>> that point, I trust individual commits that came from contributors
>> well enough (otherwise I wouldn't be merging them to 'next'), but I
>> would appreciate the last chance to re-examine conflict resolutions
>> in merges.
>>
>> It does not mean that I do not like the current behaviour that
>> "--cc" always implies "-p"; it is convenient.  It's just I find the
>> lack of feature slightly less than ideal, but I do not care deeply
>> enough to design how to express such a feature from the command
>> line.
>
> Okay, thanks for clarifying.  It sounds like you were focusing on the
> tangentially related comment I made (diffs for merges and not for
> normal commits) while I was focusing on Sergey's question (should we
> revert --cc implies -p).  I was having a hard time understanding if
> you were answering his question or not.  This last paragraph of yours
> acknowledges the question, though you still avoid answering it.  :-)
>
> However, even my focus was on a secondary question.  His real original
> question is: -m and --cc are inconsistent -- one requires -p, while
> the other doesn't.  Should that be fixed...and which option(s) should
> change?  He gave two possibilities I didn't like.  I added a third
> that you didn't like.  So...

I believe you've misunderstood me slightly.

I didn't suggest bare reverting of the "-c/-cc imply -p" commit. I
rather suggested to modify current behavior to "-c/--cc enable diff
output for merge commits", then add "-m" to the mix, so that we finally
get:

  "-m/-c/--c enable diff output for merge commits".

And I should emphasize that what I mean differs from "-m/-c/--cc imply -p
for merge commits only" as Junio has put it in this discussion, even if
slightly, -- it won't imply -p, to avoid messing with --first-parent
that would imply -p through -m and enable diff for merges, that is not
what we want.

This is what I'd like to see, as it finally makes -m/-c/--cc (as well as
other --diff-merges options) focus on merge commits only and never
affect regular commits, -- the way it should be.

My alternative suggestion was to rather add "-m implies -p" to the
current state, resulting in:

  "-m/-c/--c imply -p".

However, the latter one has additional complication in handling of
"--first-parent" that currently implies -m that'd then imply -p and
suddenly give lengthy output on bare "git log --first-parent".

Fixing the latter is still possible by complicating of options handling
by specifying that implied options don't imply other options, only
explicit options do, but somehow I don't like this, -- too complex for
the job at hand. I'm still on position that option management in Git
could rather be done in much simpler manner, without need for such
complexities.

Thanks,
-- Sergey
Sergey Organov Dec. 9, 2020, 2:08 p.m. UTC | #14
Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
> A clarification and a correction.
>
>> I suspect that the real reason why "-m" does not imply "-p" was
>> merely a historical implementation detail...
>
> Now I remember better.  The reason was pure oversight.
>
> In the beginning, there was no patch output for merges.  As most
> merges just resolve cleanly, and back then the first-parent chains
> were treated as much much less special than we treat them today,
> "git log -p" showed only patches for single-parent commits and
> everybody was happy.  It could have been a possible alternative
> design to show first-parent diff for a merge instead of showing no
> patch, but because the traversal went to side branches, the changes
> made by the merge to the mainline as a single big patch would have
> been redundant---we would be seeing individual patches from the side
> branch anyway.
>
> Then later we introduced "-m -p"; since the first-parent chain was
> not considered all that special, we treated each parent equally.
> Nobody, not even Linus and I, thought it was useful by itself even
> back then, but we didn't have anything better.
>
> I think it was Paul Mackerras's "gitk" that invented the concept of
> combined merges.  We liked it quite a lot, and added "-c" and "--cc"
> soon after that, to the core git and kept polishing, until "gitk"
> stopped combining the patches with each parent in tcl/tk script and
> instead started telling "git" to show with "--cc".
>
> By the time the change to make "--cc" imply "-p" was introduced, it
> was pretty much given that "-m -p" was useful to anybody, unless you
> are consuming these individual patches in a script or something like
> that.  So simply I didn't even think of making "-m" imply "-p".  It
> would be logical to make it so, but it would not add much practical
> value, I would have to say.

... and then later the "--first-parent implies -m" change has been made,
that would't work as expected if -m implied -p in the fist place, as
it'd break "git log --first-parent".

>
>> If I were to decide now with hindsight, perhaps I'd make "--cc" and
>> "-m" imply "-p" only for merge commits, and the user can explicitly
>> give "--cc -p" and "-m -p" to ask patches for single-parent commits
>> to be shown as well.
>
> After "now with hindsight", I need to add "and without having to
> worry about backward compatibility issues" here.  IOW, the above is
> not my recommendation.  It would be the other way around: "--cc"
> implies "-p" for both merges and non-merges, "-m" implies "-p" for
> both merges and non-merges.  It is acceptable to add a new option
> "--no-patch-for-non-merge" so that the user can ask to see only the
> combined diff for merges and no patches for individual commits.

OK, so, do we decide that -c/--cc must continue to imply -p and thus
request diffs for everything?

If so, I can rather change --diff-merges=combined/dense-combined
behavior to /not/ imply -p, thus effectively making --cc a synonym for
"--diff-merges=dense-combined --patch", that will have zero backward
compatibility issues.

Looks like it'd have everything covered. Old options won't change their
behavior at all, and the new set of options will behave differently,
exactly if designed from scratch, providing new functionality.

>
> Both "--no-patch-for-non-merge" option, and making "-m" imply "-p"
> are very low priority from my point of view, though, since our users
> (including me) lived without the former and have been happily using
> "log --cc" for a long time, and we've written off the latter as
> pretty much useless combination unless you are a script.

I think my above suggestion covers all the worries without need for this
nasty "--no-patch-for-non-merge" option.

That said, -m is useless, period. It'd likely have some merit in
plumbing, but definitely not in porcelain. So I'm inclined to let it
rest in peace indeed, dying.

Thanks,
-- Sergey
Sergey Organov Dec. 9, 2020, 7:44 p.m. UTC | #15
Junio C Hamano <gitster@pobox.com> writes:

[...]

> By the time the change to make "--cc" imply "-p" was introduced, it
> was pretty much given that "-m -p" was useful to anybody, unless you
> are consuming these individual patches in a script or something like
> that.  So simply I didn't even think of making "-m" imply "-p".  It
> would be logical to make it so, but it would not add much practical
> value, I would have to say.

I need some help here.

Looking at the code and trying to follow the flow, I can't figure what
rev->diff flag is for? Why rev->diffopt.output_format, that actually
affects the output, is not enough?

My confusion originates from the fact that the code in revision.c sets
rev->diff to 1 for -c/--cc , while it doesn't set it for -m, and this
was the case *before* -c/--cc started to imply -p and -m.

It seems that the only place where rev->diff is tested is at the start
of log_tree_diff(), and even there its absence could be ignored when
rev->diffopt.flags.exit_with_status is set.

Is rev->diff an optimization, does it play another significant role, or
is it a remnant?

Thanks,
-- Sergey
Junio C Hamano Dec. 9, 2020, 8:53 p.m. UTC | #16
Sergey Organov <sorganov@gmail.com> writes:

>>> If I were to decide now with hindsight, perhaps I'd make "--cc" and
>>> "-m" imply "-p" only for merge commits, and the user can explicitly
>>> give "--cc -p" and "-m -p" to ask patches for single-parent commits
>>> to be shown as well.
>>
>> After "now with hindsight", I need to add "and without having to
>> worry about backward compatibility issues" here.  IOW, the above is
>> not my recommendation.  It would be the other way around: "--cc"
>> implies "-p" for both merges and non-merges, "-m" implies "-p" for
>> both merges and non-merges.  It is acceptable to add a new option
>> "--no-patch-for-non-merge" so that the user can ask to see only the
>> combined diff for merges and no patches for individual commits.
>
> OK, so, do we decide that -c/--cc must continue to imply -p and thus
> request diffs for everything?

My vote goes to keep the above behaviour as-is for compatibility,
and probably match what happens when -m is given instead of -c/--cc,
if somebody cares enough about "-c/--cc/-m discrepancy".

> That said, -m is useless, period. It'd likely have some merit in
> plumbing, but definitely not in porcelain. So I'm inclined to let it
> rest in peace indeed, dying.

That is fine by me as well.

I do not speak for others, though ;-)
Junio C Hamano Dec. 10, 2020, 6:12 a.m. UTC | #17
Sergey Organov <sorganov@gmail.com> writes:

> My confusion originates from the fact that the code in revision.c sets
> rev->diff to 1 for -c/--cc , while it doesn't set it for -m, and this
> was the case *before* -c/--cc started to imply -p and -m.

Yes, all of this was from cd2bdc53 (Common option parsing for "git
log --diff" and friends, 2006-04-14).  We can see that "-m" is not
treated among the first class citizen in the output of "git show" on
the commit.  Namely, "-m" alone is merely a modifier for "-p" and
does not cause a diff to be generated (in other words, it only
affects the output if used together with "-p").

"git show cd2bdc53:git.c" would give you how "git log" looked like
back then, and how rev.diff field is used.

    static int cmd_log(int argc, const char **argv, char **envp)
    {
    ...

	prepare_revision_walk(&rev);
	setup_pager();
	while ((commit = get_revision(&rev)) != NULL) {
		if (shown && rev.diff && rev.commit_format != CMIT_FMT_ONELINE)
			putchar('\n');

We grab the next commit to show, and if we have already shown
something in the previous iteration, and if we are told to produce
patch output, we put an extra blank line after the patch for the
previous commit.  We omit that extra blank when showing the log
message in oneline format.  And then ...

		fputs(commit_prefix, stdout);
		if (rev.abbrev_commit && rev.abbrev)
			fputs(find_unique_abbrev(commit->object.sha1, rev.abbrev),
			      stdout);
		else
			fputs(sha1_to_hex(commit->object.sha1), stdout);
		if (rev.parents) {
    ...
		}
		if (rev.commit_format == CMIT_FMT_ONELINE)
			putchar(' ');
		else
			putchar('\n');
		pretty_print_commit(rev.commit_format, commit, ~0, buf,
				    LOGSIZE, rev.abbrev);
		printf("%s\n", buf);

... after showing the log message, if we were told to produce diff,

		if (rev.diff)
			log_tree_commit(&rev, commit);

we ask log_tree_commit() to show the patch.

		shown = 1;
		free(commit->buffer);
		commit->buffer = NULL;
	}
    ...

I think the code these days have most of the per-commit logic moved
to log_tree_commit() compared to the code we see above, but the
check at the beginning of log_tree_diff() we have, i.e.

    static int log_tree_diff(struct rev_info *opt, struct commit *...
    {
	int showed_log;
	struct commit_list *parents;
	struct object_id *oid;

	if (!opt->diff && !opt->diffopt.flags.exit_with_status)
		return 0;

directly corresponds to the "if rev.diff is true, then call
log_tree_commit()" in the 2006 code.
Elijah Newren Dec. 10, 2020, 7:26 a.m. UTC | #18
On Wed, Dec 9, 2020 at 11:44 AM Sergey Organov <sorganov@gmail.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> [...]
>
> > By the time the change to make "--cc" imply "-p" was introduced, it
> > was pretty much given that "-m -p" was useful to anybody, unless you
> > are consuming these individual patches in a script or something like
> > that.  So simply I didn't even think of making "-m" imply "-p".  It
> > would be logical to make it so, but it would not add much practical
> > value, I would have to say.
>
> I need some help here.
>
> Looking at the code and trying to follow the flow, I can't figure what
> rev->diff flag is for? Why rev->diffopt.output_format, that actually
> affects the output, is not enough?

I'm not a revision walking expert, but to the best of my understanding...

Showing a diff is not the only reason you might need to compute one.
You may also need to compute them if you are filtering commits by
pathspec (-- $filename), using the pickaxe (-S foo), checking if
commits are cherry-picks (--cherry-mark), checking for commits with
certain type of file changes (--diff-filter=A), selecting commits that
modified a certain function (-L :funcname:filename --no-patch), or
others I've overlooked.  None of these cause a diff to be shown.  I
don't know if all these set rev->diff to 1 or if they special case
some other way, but I suspect that rev->diff exists as a shorthand for
"need a diff", so that the code can check for it without having to
check a half dozen special conditions.

>> My confusion originates from the fact that the code in revision.c sets
> rev->diff to 1 for -c/--cc , while it doesn't set it for -m, and this
> was the case *before* -c/--cc started to imply -p and -m.
>
> It seems that the only place where rev->diff is tested is at the start
> of log_tree_diff(), and even there its absence could be ignored when
> rev->diffopt.flags.exit_with_status is set.

rev->diffopt.flags.exit_with_status seems quite unlikely to be set,
though.  That setting was added with the --exit-code flag to git log
in 2008 (in the pm/log-exit-code topic), but was never documented
(other than to say it's incompatible with --check), the commit message
adding it doesn't say what behavior was intended, and the commit
message which added it added no regression tests either.  I know what
diff --exit-code does, but I'm really not sure what git-log's
--exit-code does (random guess: sets the exit code to an OR of what
git diff would have shown for any one of the commits shown?).  Since I
don't know and can't even figure it out looking at the commits in
question, I suspect there aren't too many users out there using it.
As such, I suspect rev->diffopt.flags.exit_with_status will be 0 most
of the time and that the relevant check at the top of log_tree_diff()
really is the "if (!opt->diff)" part of it.

> Is rev->diff an optimization, does it play another significant role, or
> is it a remnant?
Sergey Organov Dec. 10, 2020, 8:10 p.m. UTC | #19
Elijah Newren <newren@gmail.com> writes:


[...]

> As such, I suspect rev->diffopt.flags.exit_with_status will be 0 most
> of the time and that the relevant check at the top of log_tree_diff()
> really is the "if (!opt->diff)" part of it.

Right, but *why* is it needed? If I do set opt->diff to 1 but don't set
opt->diffopt.output_format I get no diff, so the question essentially
is: which caller of log_tree_diff() would need to set
opt->diffopt.output_format but leave opt->diff to be 0, and why?

Even if such caller exists, it will get no diff, so it seems to be
entirely pointless.

>
>> Is rev->diff an optimization, does it play another significant role, or
>> is it a remnant?

I still don't see what's the correct answer to this question, sorry.

Well, I did a crush-test: commented-out entire if() at the beginning of
the log_tree_diff() and ran all the tests. The result was entirely
surprising: all the tests pass except one:

Test Summary Report
-------------------
./t1410-reflog.sh                                  (Wstat: 256 Tests: 22 Failed: 1)
  Failed test:  4
  Non-zero exit status: 1
Files=912, Tests=23039, 278 wallclock secs (10.48 usr  2.12 sys + 1034.98 cusr 650.30 csys = 1697.88 CPU)
Result: FAIL

And now I wonder how comes none of specific test are able to notice the
difference, yet seemingly unrelated test fails?

Thanks,
-- Sergey
Junio C Hamano Dec. 10, 2020, 9:15 p.m. UTC | #20
Sergey Organov <sorganov@gmail.com> writes:

> Well, I did a crush-test: commented-out entire if() at the beginning of
> the log_tree_diff() and ran all the tests.

Meaning "if we are not told to show diff and if we are not told to
indicate with the exit code whether there was any difference, we do
not have to do any of the following" was skipped?  

For most cases we do use the absense of "diff" as an optimization,
but I wouldn't be surprised if a caller of log_tree_commit() that
does not set "we want diff" bit leaves the other members of rev_info
that need to be set correctly for the body of the function, which is
essentially a repeated call into diff_tree machinerly, in order to
work correctly.
Junio C Hamano Dec. 10, 2020, 9:26 p.m. UTC | #21
Elijah Newren <newren@gmail.com> writes:

> ....  I know what
> diff --exit-code does, but I'm really not sure what git-log's
> --exit-code does (random guess: sets the exit code to an OR of what
> git diff would have shown for any one of the commits shown?).

The end of cmd_log_walk() does try to return diff_result_code(), but
diff_flush() can reset .has_changes to 0 or set it to 1 each
iteration, so I do not think it is "set .has_changes to 0, and never
reset to 0, but flip it to 1 if any change is found", which would
have had some chance of giving us "OR'ed together" semantics.