
revisions(7): clarify that most commands take a single revision range

Message ID xmqqv97g2svd.fsf@gitster.g (mailing list archive)
State Accepted
Commit 83a689d8addae2b54257ca5547b34a7de2b8f17d

Commit Message

Junio C Hamano May 18, 2021, 11:17 a.m. UTC
Sometimes new people are confused by how a revision "range" works,
in that it is not a random collection of commits but a set of
commits that are all connected to each other, and most Git commands
work on a single such "range".

Give an example to clarify it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * So, here it is in a proper patch form, with an extended
   description and illustration.

 Documentation/revisions.txt | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

Comments

Bagas Sanjaya May 20, 2021, 2:27 a.m. UTC | #1
On 18/05/21 18.17, Junio C Hamano wrote:
> +Commands that are specifically designed to take two distinct ranges
> +(e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
> +they are exceptions.  Unless otherwise noted, all "git" commands
> +that operate on a set of commits work on a single revision range.
> +In other words, writing two "two-dot range notation" next to each
> +other, e.g.
> +
> +    $ git log A..B C..D
> +
> +does *not* specify two revision ranges for most commands.  Instead
> +it will name a single connected set of commits, i.e. those that are
> +reachable from either B or D but are reachable from neither A or C.
> +In a linear history like this:
> +
> +    ---A---B---o---o---C---D
> +

So "git log A..B C..D" is same as "A..D", right?
Junio C Hamano May 20, 2021, 4:58 a.m. UTC | #2
Bagas Sanjaya <bagasdotme@gmail.com> writes:

> On 18/05/21 18.17, Junio C Hamano wrote:
>> ...
>> +In other words, writing two "two-dot range notation" next to each
>> +other, e.g.
>> +
>> +    $ git log A..B C..D
>> +
>> +does *not* specify two revision ranges for most commands.  Instead
>> +it will name a single connected set of commits, i.e. those that are
>> +reachable from either B or D but are reachable from neither A or C.
>> +In a linear history like this:
>> +
>> +    ---A---B---o---o---C---D
>> +
>
> So "git log A..B C..D" is same as "A..D", right?

A..B C..D is equivalent to ^A ^C B D, and in order to be part of the
set it represents, a commit must not be reachable from A, must not
be reachable from C, and must be reachable from B or D.

In the picture, A, B and two o's are all reachable from C, therefore
are not part of the set A..B C..D represents.  Neither is C, as it
is reachable from C.  That leaves only D in the resulting range.

A..D is a set of connected five commits, B o o C D in the above
picture.

So, no.
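
The reasoning above can be checked in a scratch repository. What follows is only a minimal sketch, assuming empty commits are good enough to stand in for real ones; the `mk` helper and the tag names `o1` and `o2` (for the two unnamed "o" commits in the picture) are made up for illustration.

```shell
#!/bin/sh
# Sketch: build the linear history ---A---B---o---o---C---D in a
# throwaway repository and compare the range expressions discussed above.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper: create an empty commit and tag it with its label.
mk() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c commit.gpgsign=false commit -q --allow-empty -m "$1"
    git tag "$1"
}
for label in A B o1 o2 C D; do mk "$label"; done

# Collect the commit subjects, oldest first, on a single line.
subjects() {
    git log --reverse --format=%s "$@" | paste -s -d ' ' -
}

two_ranges=$(subjects A..B C..D)   # two dotted ranges side by side
explicit=$(subjects ^A ^C B D)     # the same request spelled out
one_range=$(subjects A..D)         # what the question assumed

echo "A..B C..D -> $two_ranges"
echo "^A ^C B D -> $explicit"
echo "A..D      -> $one_range"
```

With the history built this way, the first two lines should both show only D, while the last one shows B o1 o2 C D.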

The confusion we often see goes more like "The set A..B contains B
(and nothing else), and C..D contains D (and nothing else), hence
'git log A..B C..D' would show B and D".  But that is not what
happens because "git log" (like most other commands) takes just a
"range" that is "A..B C..D", which is a set of connected commits
each of whose member is reachable from one of the "positive"
endpoints (like B and D) and is not reachable from any of the
"negative" endpoints (like A and C).
Junio C Hamano May 20, 2021, 5:02 a.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> Bagas Sanjaya <bagasdotme@gmail.com> writes:
>
>> On 18/05/21 18.17, Junio C Hamano wrote:
>>> ...
>>> +In other words, writing two "two-dot range notation" next to each
>>> +other, e.g.
>>> +
>>> +    $ git log A..B C..D
>>> +
>>> +does *not* specify two revision ranges for most commands.  Instead
>>> +it will name a single connected set of commits, i.e. those that are
>>> +reachable from either B or D but are reachable from neither A or C.
>>> +In a linear history like this:
>>> +
>>> +    ---A---B---o---o---C---D
>>> +
>>
>> So "git log A..B C..D" is same as "A..D", right?
>
> A..B C..D is equivalent to ^A ^C B D, and in order to be part of the
> set it represents, a commit must not be reachable from A, must not
> be reachable from C, and must be reachable from B or D.
>
> In the picture, A, B and two o's are all reachable from C, therefore
> are not part of the set A..B C..D represents.  Neither is C, as it
> is reachable from C.  That leaves only D in the resulting range.
>
> A..D is a set of connected five commits, B o o C D in the above
> picture.
>
> So, no.
>
> The confusion we often see goes more like "The set A..B contains B
> (and nothing else), and C..D contains D (and nothing else), hence
> 'git log A..B C..D' would show B and D".  But that is not what
> happens because "git log" (like most other commands) takes just a
> "range" that is "A..B C..D", which is a set of connected commits
> each of whose member is reachable from one of the "positive"
> endpoints (like B and D) and is not reachable from any of the
> "negative" endpoints (like A and C).

Well, apparently the proposed text may have failed to educate you
about what a "revision range" is and how it works, so it is not good
enough, so I'll postpone merging the change down further and see if
somebody else can come up with a better description.

Thanks.
Bagas Sanjaya May 20, 2021, 5:26 a.m. UTC | #4
On 20/05/21 12.02, Junio C Hamano wrote:
>> The confusion we often see goes more like "The set A..B contains B
>> (and nothing else), and C..D contains D (and nothing else), hence
>> 'git log A..B C..D' would show B and D".  But that is not what
>> happens because "git log" (like most other commands) takes just a
>> "range" that is "A..B C..D", which is a set of connected commits
>> each of whose member is reachable from one of the "positive"
>> endpoints (like B and D) and is not reachable from any of the
>> "negative" endpoints (like A and C).
> 
> Well, apparently the proposed text may have failed to educate you
> about what a "revision range" is and how it works, so it is not good
> enough, so I'll postpone merging the change down further and see if
> somebody else can come up with a better description.
> 
> Thanks.
> 

From the Pro Git book [1]:
> The most common range specification is the double-dot syntax. This basically asks Git to resolve a range of commits that are reachable from one commit but aren’t reachable from another.
> Say you want to see what is in your experiment branch that hasn’t yet been merged into your master branch. You can ask Git to show you a log of just those commits with master..experiment — that means “all commits reachable from experiment that aren’t reachable from master.” 
> If, on the other hand, you want to see the opposite — all commits in master that aren’t in experiment — you can reverse the branch names. experiment..master shows you everything in master not reachable from experiment

So in the first case, git log master..experiment shows all commits that
are only on experiment, while git log experiment..master shows all commits
that are only on master.

These two forms are often confused by Git users, who run the latter
when they want the semantics of the former.
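
A small sketch of the difference, assuming a scratch repository in which both branches have diverged from a common commit. The commit subjects (`base`, `M1`, `E1`, `E2`) and the `commit` helper are invented for the illustration; only the branch names come from the Pro Git passage.

```shell
#!/bin/sh
# Sketch: master and experiment diverge from a shared "base" commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # do not rely on the default name

# Hypothetical helper: create an empty commit with the given subject.
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c commit.gpgsign=false commit -q --allow-empty -m "$1"
}

commit base
git branch experiment          # experiment starts at "base"
commit M1                      # only on master
git checkout -q experiment
commit E1                      # only on experiment
commit E2                      # only on experiment

only_experiment=$(git log --reverse --format=%s master..experiment |
    paste -s -d ' ' -)
only_master=$(git log --reverse --format=%s experiment..master |
    paste -s -d ' ' -)

echo "master..experiment -> $only_experiment"
echo "experiment..master -> $only_master"
```

Here master..experiment should list E1 E2 and experiment..master should list M1; the shared `base` commit appears in neither because it is reachable from both branches.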

I have CC'ed Scott Chacon because he wrote the description of revision
ranges in the Pro Git book. Let's see what he thinks.

[1]: https://git-scm.com/book/en/v2/Git-Tools-Revision-Selection
Elijah Newren May 20, 2021, 4:40 p.m. UTC | #5
On Wed, May 19, 2021 at 7:28 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On 18/05/21 18.17, Junio C Hamano wrote:
> > +Commands that are specifically designed to take two distinct ranges
> > +(e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
> > +they are exceptions.  Unless otherwise noted, all "git" commands
> > +that operate on a set of commits work on a single revision range.
> > +In other words, writing two "two-dot range notation" next to each
> > +other, e.g.
> > +
> > +    $ git log A..B C..D
> > +
> > +does *not* specify two revision ranges for most commands.  Instead
> > +it will name a single connected set of commits, i.e. those that are
> > +reachable from either B or D but are reachable from neither A or C.
> > +In a linear history like this:
> > +
> > +    ---A---B---o---o---C---D
> > +

Why did you snip the part of Junio's text that immediately followed, which said:

+because A and B are reachable from C, the revision range specified
+by these two dotted ranges is a single commit D.

Is this sentence hard to parse or confusing in some way?  I thought
this sentence would have made it pretty clear that the answer to this
question:

>
> So "git log A..B C..D" is same as "A..D", right?

was 'no', so I'm curious if that particular final sentence's wording
could be improved.
Elijah Newren May 20, 2021, 4:45 p.m. UTC | #6
On Wed, May 19, 2021 at 10:03 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Bagas Sanjaya <bagasdotme@gmail.com> writes:
> >
> >> On 18/05/21 18.17, Junio C Hamano wrote:
> >>> ...
> >>> +In other words, writing two "two-dot range notation" next to each
> >>> +other, e.g.
> >>> +
> >>> +    $ git log A..B C..D
> >>> +
> >>> +does *not* specify two revision ranges for most commands.  Instead
> >>> +it will name a single connected set of commits, i.e. those that are
> >>> +reachable from either B or D but are reachable from neither A or C.
> >>> +In a linear history like this:
> >>> +
> >>> +    ---A---B---o---o---C---D
> >>> +
> >>
> >> So "git log A..B C..D" is same as "A..D", right?
> >
> > A..B C..D is equivalent to ^A ^C B D, and in order to be part of the
> > set it represents, a commit must not be reachable from A, must not
> > be reachable from C, and must be reachable from B or D.
> >
> > In the picture, A, B and two o's are all reachable from C, therefore
> > are not part of the set A..B C..D represents.  Neither is C, as it
> > is reachable from C.  That leaves only D in the resulting range.
> >
> > A..D is a set of connected five commits, B o o C D in the above
> > picture.
> >
> > So, no.
> >
> > The confusion we often see goes more like "The set A..B contains B
> > (and nothing else), and C..D contains D (and nothing else), hence
> > 'git log A..B C..D' would show B and D".  But that is not what
> > happens because "git log" (like most other commands) takes just a
> > "range" that is "A..B C..D", which is a set of connected commits
> > each of whose member is reachable from one of the "positive"
> > endpoints (like B and D) and is not reachable from any of the
> > "negative" endpoints (like A and C).
>
> Well, apparently the proposed text may have failed to educate you
> about what a "revision range" is and how it works, so it is not good
> enough, so I'll postpone merging the change down further and see if
> somebody else can come up with a better description.
>
> Thanks.

I think it's helpful and would have answered questions for users that
I've had to manually explain to folks a few times, so while it may not
be optimal, I do think your description is an improvement to the docs.
That said, it can't hurt to see if we can find out what caused Bagas'
confusion and see if we can improve it, but I wouldn't hold it up
indefinitely if no better wording comes along.
Eric Sunshine May 20, 2021, 4:53 p.m. UTC | #7
On Thu, May 20, 2021 at 12:45 PM Elijah Newren <newren@gmail.com> wrote:
> On Wed, May 19, 2021 at 10:03 PM Junio C Hamano <gitster@pobox.com> wrote:
> > Well, apparently the proposed text may have failed to educate you
> > about what a "revision range" is and how it works, so it is not good
> > enough, so I'll postpone merging the change down further and see if
> > somebody else can come up with a better description.
> >
> > Thanks.
>
> I think it's helpful and would have answered questions for users that
> I've had to manually explain to folks a few times, so while it may not
> be optimal, I do think your description is an improvement to the docs.
> That said, it can't hurt to see if we can find out what caused Bagas'
> confusion and see if we can improve it, but I wouldn't hold it up
> indefinitely if no better wording comes along.

For what it's worth, as a person who is far from being a
revision-range expert (and who doesn't typically think about them), I
found the proposed text illuminating and clearly written. I learned
from it. So, I agree with Elijah[1] that it is a good improvement to
have (even if it's not perfect for every reader).

[1]: Extended LInear Jump AHead
Felipe Contreras May 21, 2021, 7:19 p.m. UTC | #8
Junio C Hamano wrote:
> +Commands that are specifically designed to take two distinct ranges
> +(e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
> +they are exceptions.  Unless otherwise noted, all "git" commands

Not sure why "git" is in quotes.

> +that operate on a set of commits work on a single revision range.
> +In other words, writing two "two-dot range notation" next to each
> +other, e.g.
> +
> +    $ git log A..B C..D
> +
> +does *not* specify two revision ranges for most commands.  Instead
> +it will name a single connected set of commits, i.e. those that are
> +reachable from either B or D but are reachable from neither A or C.
> +In a linear history like this:
> +
> +    ---A---B---o---o---C---D
> +
> +because A and B are reachable from C, the revision range specified
> +by these two dotted ranges is a single commit D.

  For example, if you have a linear history like this:

    ---A---B---C---D---E---F

  Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3
  commits, but doing A..F B..E will not retrieve 8 commits. Instead the
  starting point A gets overridden by B, and the ending point of E by F,
  effectively becoming B..F.

  With more complex graphs the result is not so simple and might result
  in two disconnected sets of commits, but that is considered a single
  revision range.
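
The counts in this example can be verified with a short script. This is a sketch under the assumption that six empty commits tagged A through F model the linear history; the `mk` helper is hypothetical.

```shell
#!/bin/sh
# Sketch: build ---A---B---C---D---E---F and count what each
# range expression selects.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper: create an empty commit and tag it with its label.
mk() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c commit.gpgsign=false commit -q --allow-empty -m "$1"
    git tag "$1"
}
for label in A B C D E F; do mk "$label"; done

count_af=$(git rev-list --count A..F)
count_be=$(git rev-list --count B..E)
count_both=$(git rev-list --count A..F B..E)

# Full commit lists, to compare the combined form against B..F.
combined=$(git rev-list A..F B..E)
b_to_f=$(git rev-list B..F)

echo "A..F      selects $count_af commits"
echo "B..E      selects $count_be commits"
echo "A..F B..E selects $count_both commits"
[ "$combined" = "$b_to_f" ] && echo "A..F B..E lists the same commits as B..F"
```

The combined form selects 4 commits (C, D, E and F), not 5 + 3 = 8, because B and everything reachable from it is excluded no matter which dotted range mentioned it.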

Patch

diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
index d9169c062e..f5f17b65a1 100644
--- a/Documentation/revisions.txt
+++ b/Documentation/revisions.txt
@@ -260,6 +260,9 @@  any of the given commits.
 A commit's reachable set is the commit itself and the commits in
 its ancestry chain.
 
+There are several notations to specify a set of connected commits
+(called a "revision range"), illustrated below.
+
 
 Commit Exclusions
 ~~~~~~~~~~~~~~~~~
@@ -294,6 +297,26 @@  is a shorthand for 'HEAD..origin' and asks "What did the origin do since
 I forked from them?"  Note that '..' would mean 'HEAD..HEAD' which is an
 empty range that is both reachable and unreachable from HEAD.
 
+Commands that are specifically designed to take two distinct ranges
+(e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
+they are exceptions.  Unless otherwise noted, all "git" commands
+that operate on a set of commits work on a single revision range.
+In other words, writing two "two-dot range notation" next to each
+other, e.g.
+
+    $ git log A..B C..D
+
+does *not* specify two revision ranges for most commands.  Instead
+it will name a single connected set of commits, i.e. those that are
+reachable from either B or D but are reachable from neither A or C.
+In a linear history like this:
+
+    ---A---B---o---o---C---D
+
+because A and B are reachable from C, the revision range specified
+by these two dotted ranges is a single commit D.
+
+
 Other <rev>{caret} Parent Shorthand Notations
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Three other shorthands exist, particularly useful for merge commits,