mbox series

[RFC,v1,0/1] ls-remote: inconsistency from the order of args and opts

Message ID cover.1642129840.git.dyroneteng@gmail.com (mailing list archive)
Headers show
Series ls-remote: inconsistency from the order of args and opts | expand

Message

Teng Long Jan. 14, 2022, 4:24 a.m. UTC
It has a different results by executing "git ls-remote -heads origin"
and "git ls-remote origin --heads". A test for reproduction is as
follows:

@@ -0,0 +1,17 @@
+#!/bin/sh
+                                                                                                           
+test_description='test ls-tree for reproduction'
+
+. ./test-lib.sh
+
+
+test_must_fail 'Exchange the order of "--heads" and <remote>' '
+    git --version &&
+    git init "test.git" &&
+    test_commit -C "test.git" one &&
+    git -C "test.git" ls-remote --heads ./. > result.1 &&
+    git -C "test.git" ls-remote ./. --heads > result.2 &&
+    test_cmp result.1 result.2
+'
+
+test_done
--
2.34.1.391.g9ef3d6f133

"git -C "test.git" ls-remote ./. --heads" returns nothing here because
"ls-remote" supports to treat args as patterns to filter the refs list
, and "--heads" here as the pattern does not match any.

If a repo exists a ref which is named as "refs/heads/--heads", execute
`mgit ls-remote origin --heads` will return ("mgit" is a alias means
the git binary that is built by "master" in my env):
                                                                                                            
       0319b78d7c31e5203d79157fe82960db88cc4a54        refs/heads/--heads

If we execute again in terms of this patch by `git ls-remote origin --heads`:

      0319b78d7c31e5203d79157fe82960db88cc4a54        refs/heads/--heads
      0319b78d7c31e5203d79157fe82960db88cc4a54        refs/heads/master

And if we hope to let "--heads" to be treated as a "pattern", we can execute
`git ls-remote origin --heads -- "--heads"`:

      0319b78d7c31e5203d79157fe82960db88cc4a54        refs/heads/--heads

Personally, I think the second option is probably a little better, but there
are also possible backward compatibility issues, such as users ending branch
names with "-heads" and so on.

So I sent this patch to try to figure out if this was a requirement of the
original design, or if it was something that could really be improved, or if
we can find a better way to deal this issue with compatibility.

Thanks.

Teng Long (1):
  ls-remote: Make the output independent of the order of opts and
    <remote>

 builtin/ls-remote.c  |  3 +--
 t/t5512-ls-remote.sh | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+), 2 deletions(-)

Range-diff against v0:
-:  ---------- > 1:  9ef3d6f133 ls-remote: Make the output independent of the order of opts and <remote>

Comments

Junio C Hamano Jan. 14, 2022, 5:47 a.m. UTC | #1
Teng Long <dyroneteng@gmail.com> writes:

> +test_must_fail 'Exchange the order of "--heads" and <remote>' '
> +    git --version &&
> +    git init "test.git" &&
> +    test_commit -C "test.git" one &&
> +    git -C "test.git" ls-remote --heads ./. > result.1 &&
> +    git -C "test.git" ls-remote ./. --heads > result.2 &&

I would say that this is working exactly as designed.  As with the
unix tradition, after the command name, first come options
(e.g. "--heads", "-v", etc. that begin with a dash or two dashes),
then arguments like "origin", "master", "." that are not dashed
options/flags.

Then among the arguments, we generally take revs first and then
pathspecs.  "git help cli" explicitly mentions this, because it is
specific to "git" command suite, but it does not mention "dashed
options/flags first and then args", primarily because, at least back
when the documentation was written, this was taken as granted, iow,
those who wrote the "gitcli" documentation thought it was a common
knowledge among users that did not need to be spelled out.

Apparently, it is not a common knowledge at least for you (and
probably others).  Perhaps we should add a paragraph to the cli help
and explicitly mention "options first and then args", before we go
on to say "among args, revs first and then pathspecs".
Junio C Hamano Jan. 14, 2022, 6:27 a.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Teng Long <dyroneteng@gmail.com> writes:
>
>> +test_must_fail 'Exchange the order of "--heads" and <remote>' '
>> +    git --version &&
>> +    git init "test.git" &&
>> +    test_commit -C "test.git" one &&
>> +    git -C "test.git" ls-remote --heads ./. > result.1 &&
>> +    git -C "test.git" ls-remote ./. --heads > result.2 &&
>
> I would say that this is working exactly as designed.  As with the
> unix tradition, after the command name, first come options
> (e.g. "--heads", "-v", etc. that begin with a dash or two dashes),
> then arguments like "origin", "master", "." that are not dashed
> options/flags.

I failed to say one important thing (I was again fooled by the "it
is too obvious to say" that led "gitcli" not to mention this, too).

"dashed-options first and then args" rule means that generally we
scan the command line from left to right one by one, and as long as
the one we are looking at begins with "-", we try to interpret it as
a dashed option, possibly eat the next one as an argument given to
the option (e.g. "--abbrev 7"), and keep repeating, until the one we
are at right now does not begin with "-".  And then everything after
that, we do not interpret it as an option.  That is how "--heads" on
the second example above, since we have seen . and took it as an
argument (not a dashed option) is considered as a pattern.

> ...
> Apparently, it is not a common knowledge at least for you (and
> probably others).  Perhaps we should add a paragraph to the cli help
> and explicitly mention "options first and then args", before we go
> on to say "among args, revs first and then pathspecs".
Teng Long Jan. 14, 2022, 6:42 a.m. UTC | #3
On Fri, Jan 14, 2022 at 1:47 PM Junio C Hamano <gitster@pobox.com> wrote:

> Apparently, it is not a common knowledge at least for you (and
> probably others).  Perhaps we should add a paragraph to the cli help
> and explicitly mention "options first and then args", before we go
> on to say "among args, revs first and then pathspecs".

It's much clearer now, thanks for the detailed answer.

Another question, if I want to follow your advice and add a short
paragraph in git CLI document, should this patch continue in the
current RFC patchset or launch a new patchset?

Thanks.
Ævar Arnfjörð Bjarmason Jan. 14, 2022, 7:57 p.m. UTC | #4
On Thu, Jan 13 2022, Junio C Hamano wrote:

> Teng Long <dyroneteng@gmail.com> writes:
>
>> +test_must_fail 'Exchange the order of "--heads" and <remote>' '
>> +    git --version &&
>> +    git init "test.git" &&
>> +    test_commit -C "test.git" one &&
>> +    git -C "test.git" ls-remote --heads ./. > result.1 &&
>> +    git -C "test.git" ls-remote ./. --heads > result.2 &&
>
> I would say that this is working exactly as designed.  As with the
> unix tradition, after the command name, first come options
> (e.g. "--heads", "-v", etc. that begin with a dash or two dashes),
> then arguments like "origin", "master", "." that are not dashed
> options/flags.
>
> Then among the arguments, we generally take revs first and then
> pathspecs.  "git help cli" explicitly mentions this, because it is
> specific to "git" command suite, but it does not mention "dashed
> options/flags first and then args", primarily because, at least back
> when the documentation was written, this was taken as granted, iow,
> those who wrote the "gitcli" documentation thought it was a common
> knowledge among users that did not need to be spelled out.
>
> Apparently, it is not a common knowledge at least for you (and
> probably others).  Perhaps we should add a paragraph to the cli help
> and explicitly mention "options first and then args", before we go
> on to say "among args, revs first and then pathspecs".

I don't think this summary is accurate.

We have multiple commands that are in GNU-fashion loose about whether
you provide options first before no-option args, or after. E.g. we
accept both of:

    git push --dry-run <remote> <ref>

And:

    git push <remote> <ref> --dry-run

The "tradition" you're referring to accurately summarizes how POSIX
specifies that things should work.

But when GNU came around its option parser was generally happy to accept
options and args in either order. E.g. these both work with GNU
coreutils, but the latter will fail on FreeBSD and various other
capital-U UNIX-es:

    touch foo; rm -v foo
    touch foo; rm foo -v

The relevant documentation being
https://www.gnu.org/prep/standards/standards.html#Standards-for-Command-Line-Interfaces;
specifically:

    "Note that the GNU version of getopt will normally permit options
    anywhere among the arguments unless the special argument ‘--’ is
    used. This is not what POSIX specifies; it is a GNU extension."

Now, as to git's behavior we mostly follow the GNU style. The default
behavior of parse_options() is to accept the looser GNU style. In this
case we're explicitly passing the PARSE_OPT_STOP_AT_NON_OPTION flag.

But why is that?

Most things that do that are doing so because we're parsing things in
two passes, i.e. for commands that have sub-commands, e.g.:

    git commit-graph [opts] [subcommand] [subcommand-opts]

In those cases we use PARSE_OPT_STOP_AT_NON_OPTION on our first pass, so
that we'll stop at "subcommand", then we'll generally continue the
parsing as of "[subcommand]" without that flag.

The general exception to that rule, as you note, being things that take
pathspecs, although for some of those we'll apply other custom
disambiguation.

In this case a look at ba5f28bf79e (ls-remote: use parse-options api,
2016-01-19) will show that it was faithfully converted from an ad-hoc
custom option parser using the "BSD-like" option parsing discussed
above.

I really don't see why we wouldn't take this patch given that the
command has one parse_options() invocation (so no subcommand
complexity), and doesn't accept pathspecs etc.

We should aim to make the command line UX consistent when possible, not
slavishly bug-for-bug follow whatever the behavior of the
pre-parse_options() code was.

We can't do that in some cases, but this case seems like a pretty clear
case where simply removing the PARSE_OPT_STOP_AT_NON_OPTION flag would
be an improvement. It's not ambiguous, and makes invocations that
currently (and inconsistently v.s. our usual behavior) error out work.
Junio C Hamano Jan. 14, 2022, 8:42 p.m. UTC | #5
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> We have multiple commands that are in GNU-fashion loose about whether
> you provide options first before no-option args, or after. E.g. we
> accept both of:
>
>     git push --dry-run <remote> <ref>
>
> And:
>
>     git push <remote> <ref> --dry-run

Yes, but I consider that a bug that we cannot fix due to backward
compatibility issues.

That is why my preference is to encourage users to stick to the
POSIX way in gltcli, just like we recommend "stuck" form of options
its parameter.

> But when GNU came around its option parser was generally happy to accept
> options and args in either order. E.g. these both work with GNU
> coreutils, but the latter will fail on FreeBSD and various other
> capital-U UNIX-es:
>
>     touch foo; rm -v foo
>     touch foo; rm foo -v

Yes, among the harm GNU has done on mankind, this is one of the
biggest ones.  We shouldn't waste our engineering time to support
more of them in our tools.

As long as users stick to the recommended "dashed options first and
then args, among which revs come first and then pathspecs", they
will be fine.
Ævar Arnfjörð Bjarmason Jan. 14, 2022, 8:57 p.m. UTC | #6
nOn Fri, Jan 14 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> We have multiple commands that are in GNU-fashion loose about whether
>> you provide options first before no-option args, or after. E.g. we
>> accept both of:
>>
>>     git push --dry-run <remote> <ref>
>>
>> And:
>>
>>     git push <remote> <ref> --dry-run
>
> Yes, but I consider that a bug that we cannot fix due to backward
> compatibility issues.
>
> That is why my preference is to encourage users to stick to the
> POSIX way in gltcli, just like we recommend "stuck" form of options
> its parameter.
>
>> But when GNU came around its option parser was generally happy to accept
>> options and args in either order. E.g. these both work with GNU
>> coreutils, but the latter will fail on FreeBSD and various other
>> capital-U UNIX-es:
>>
>>     touch foo; rm -v foo
>>     touch foo; rm foo -v

This is only an approximate list, but:
    
    $ git grep -C3 'parse_options' -- 'builtin/*.c'|grep -c PARSE_OPT_STOP_AT_NON_OPTION
    16
    $ git grep -C3 'parse_options' -- 'builtin/*.c'|grep -c -F ', 0);'
    101

The GNU-like behavior is far more common in our codebase, and I think
it's less surprising if commands work the same way for consistency.

I manually looked through the PARSE_OPT_STOP_AT_NON_OPTION cases, and I
think this is the only one that's using it for no good reason. The
others (e.g. "git config") would become ambiguous or error out as a
result.

> Yes, among the harm GNU has done on mankind, this is one of the
> biggest ones.  We shouldn't waste our engineering time to support
> more of them in our tools.
>
> As long as users stick to the recommended "dashed options first and
> then args, among which revs come first and then pathspecs", they
> will be fine.

I find it quite useful. E.g. if you typo a command or forget/want to remove an option:

    git push origin HEAD --dry-run

You can just (under readline) do C-p M-DEL, instead of the equivalent
navigating back a few words, or having to use more advanced readline
features like ^--dry-run^^ or whatever.

Anecdotally, I've been surprised by the amount of regular terminal users
whose readline skills pretty much and at using the arrow keys to make
command corrections. I think this GNU UX decision has probably saved
several accumulated man-lifetimes by now :)
brian m. carlson Jan. 14, 2022, 9:12 p.m. UTC | #7
On 2022-01-14 at 19:57:17, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jan 13 2022, Junio C Hamano wrote:
> 
> > Teng Long <dyroneteng@gmail.com> writes:
> >
> >> +test_must_fail 'Exchange the order of "--heads" and <remote>' '
> >> +    git --version &&
> >> +    git init "test.git" &&
> >> +    test_commit -C "test.git" one &&
> >> +    git -C "test.git" ls-remote --heads ./. > result.1 &&
> >> +    git -C "test.git" ls-remote ./. --heads > result.2 &&
> >
> > I would say that this is working exactly as designed.  As with the
> > unix tradition, after the command name, first come options
> > (e.g. "--heads", "-v", etc. that begin with a dash or two dashes),
> > then arguments like "origin", "master", "." that are not dashed
> > options/flags.
> >
> > Then among the arguments, we generally take revs first and then
> > pathspecs.  "git help cli" explicitly mentions this, because it is
> > specific to "git" command suite, but it does not mention "dashed
> > options/flags first and then args", primarily because, at least back
> > when the documentation was written, this was taken as granted, iow,
> > those who wrote the "gitcli" documentation thought it was a common
> > knowledge among users that did not need to be spelled out.
> >
> > Apparently, it is not a common knowledge at least for you (and
> > probably others).  Perhaps we should add a paragraph to the cli help
> > and explicitly mention "options first and then args", before we go
> > on to say "among args, revs first and then pathspecs".
> 
> I don't think this summary is accurate.
> 
> We have multiple commands that are in GNU-fashion loose about whether
> you provide options first before no-option args, or after. E.g. we
> accept both of:
> 
>     git push --dry-run <remote> <ref>
> 
> And:
> 
>     git push <remote> <ref> --dry-run
> 
> The "tradition" you're referring to accurately summarizes how POSIX
> specifies that things should work.
>
> But when GNU came around its option parser was generally happy to accept
> options and args in either order. E.g. these both work with GNU
> coreutils, but the latter will fail on FreeBSD and various other
> capital-U UNIX-es:
> 
>     touch foo; rm -v foo
>     touch foo; rm foo -v

Yes, POSIX specifies this is how it should work because it avoids
ambiguity.  According to POSIX, -v is a file, and that's a valid name on
Unix.  If GNU rm fails to delete that file or provide a diagnostic about
why it didn't, that's a bug.

In some cases, we do allow the GNU behavior of providing options
anywhere on the command line, but we don't when it causes ambiguity,
like in this case.  I think we should document the current behavior, but
I also think it's a given when working on Unix because many tools don't
work that way.  For example, test and find don't permit arbitrary
location of options and arguments and they are found on all Unix
systems.  You can't write "test foo -f".

And to prove that this is ambiguous, I provide you the following
example:

$ git update-ref refs/heads/--symref HEAD
$ git ls-remote . --symref
1ffcbaa1a5f10c9f706314d77f88de20a4a498c2        refs/heads/--symref

That prints something very different if I write "git ls-remote --symref
.".  And it is actually the case that people write this kind of syntax
in scripts relying on the current behavior and then those scripts get
used in a variety of situations with arbitrary ref names, so this should
continue to work this way.  I believe a former employer may have these
kinds of scripts, for example.

I'm not opposed to us building new tools which support the GNU behavior,
but I don't think we should change tools where we have the existing
behavior because it does lead to breakage in some scripting situations.
Junio C Hamano Jan. 14, 2022, 9:52 p.m. UTC | #8
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>>> But when GNU came around its option parser was generally happy to accept
>>> options and args in either order. E.g. these both work with GNU
>>> coreutils, but the latter will fail on FreeBSD and various other
>>> capital-U UNIX-es:
>>>
>>>     touch foo; rm -v foo
>>>     touch foo; rm foo -v
>
> This is only an approximate list, but:

Don't waste your time on this, and instead spend on something more
useful.  What I gave wrt gitcli.txt in an earlier message is final.
Ævar Arnfjörð Bjarmason Jan. 15, 2022, 12:13 a.m. UTC | #9
On Fri, Jan 14 2022, brian m. carlson wrote:

> [[PGP Signed Part:Undecided]]
> On 2022-01-14 at 19:57:17, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Thu, Jan 13 2022, Junio C Hamano wrote:
>> 
>> > Teng Long <dyroneteng@gmail.com> writes:
>> >
>> >> +test_must_fail 'Exchange the order of "--heads" and <remote>' '
>> >> +    git --version &&
>> >> +    git init "test.git" &&
>> >> +    test_commit -C "test.git" one &&
>> >> +    git -C "test.git" ls-remote --heads ./. > result.1 &&
>> >> +    git -C "test.git" ls-remote ./. --heads > result.2 &&
>> >
>> > I would say that this is working exactly as designed.  As with the
>> > unix tradition, after the command name, first come options
>> > (e.g. "--heads", "-v", etc. that begin with a dash or two dashes),
>> > then arguments like "origin", "master", "." that are not dashed
>> > options/flags.
>> >
>> > Then among the arguments, we generally take revs first and then
>> > pathspecs.  "git help cli" explicitly mentions this, because it is
>> > specific to "git" command suite, but it does not mention "dashed
>> > options/flags first and then args", primarily because, at least back
>> > when the documentation was written, this was taken as granted, iow,
>> > those who wrote the "gitcli" documentation thought it was a common
>> > knowledge among users that did not need to be spelled out.
>> >
>> > Apparently, it is not a common knowledge at least for you (and
>> > probably others).  Perhaps we should add a paragraph to the cli help
>> > and explicitly mention "options first and then args", before we go
>> > on to say "among args, revs first and then pathspecs".
>> 
>> I don't think this summary is accurate.
>> 
>> We have multiple commands that are in GNU-fashion loose about whether
>> you provide options first before no-option args, or after. E.g. we
>> accept both of:
>> 
>>     git push --dry-run <remote> <ref>
>> 
>> And:
>> 
>>     git push <remote> <ref> --dry-run
>> 
>> The "tradition" you're referring to accurately summarizes how POSIX
>> specifies that things should work.
>>
>> But when GNU came around its option parser was generally happy to accept
>> options and args in either order. E.g. these both work with GNU
>> coreutils, but the latter will fail on FreeBSD and various other
>> capital-U UNIX-es:
>> 
>>     touch foo; rm -v foo
>>     touch foo; rm foo -v
>
> Yes, POSIX specifies this is how it should work because it avoids
> ambiguity.  According to POSIX, -v is a file, and that's a valid name on
> Unix.  If GNU rm fails to delete that file or provide a diagnostic about
> why it didn't, that's a bug.

It's not a bug that its default behavior isn't to slavishly follow
POSIX, but if you'd like you can turn it on:
    
    $ touch -- file -v; rm -v file -v
    removed 'file'
    $ touch -- file -v; POSIXLY_CORRECT=1 rm -v file -v
    removed 'file'
    removed '-v'

There's a good reason for this departure in behavior: If you have a
single file called '-v' POSIX doesn't provide any way to remove it other
than something like:

    rm ./-v

Which can be painful when scripting things. I.e. you'll need to munge
the name itself, v.s. consistently supporting "--":

    rm -- -v

> In some cases, we do allow the GNU behavior of providing options
> anywhere on the command line, but we don't when it causes ambiguity,
> like in this case.  I think we should document the current behavior, but
> I also think it's a given when working on Unix because many tools don't
> work that way.  For example, test and find don't permit arbitrary
> location of options and arguments and they are found on all Unix
> systems.  You can't write "test foo -f".

Yes, *nix systems are all over the place with this. And then e.g. "dd"
and the like accept arguments whose syntax pre-dates "-"-prefixed
arguments.

I'm only talking about how our command collection should behave, and how
we should explain its behavior, or what behavior we might prefer.

> And to prove that this is ambiguous, I provide you the following
> example:
>
> $ git update-ref refs/heads/--symref HEAD
> $ git ls-remote . --symref
> 1ffcbaa1a5f10c9f706314d77f88de20a4a498c2        refs/heads/--symref
>
> That prints something very different if I write "git ls-remote --symref
> .".  And it is actually the case that people write this kind of syntax
> in scripts relying on the current behavior and then those scripts get
> used in a variety of situations with arbitrary ref names, so this should
> continue to work this way.  I believe a former employer may have these
> kinds of scripts, for example.
>
> I'm not opposed to us building new tools which support the GNU behavior,
> but I don't think we should change tools where we have the existing
> behavior because it does lead to breakage in some scripting situations.

Yes, my claim that it's "not ambiguous" in
<220114.867db2rs0n.gmgdl@evledraar.gmail.com> isn't correct, as you
point out.

I suspect those cases are probably too obscure to worry about in
practice, but yes, we might not want to convert any existing command due
to those.

But we should really be recommending use of -- or --end-of-options
whenever possible, as it can be hard to know in Git's CLI whether
options after args are accepted or not.
Junio C Hamano Jan. 15, 2022, 12:25 a.m. UTC | #10
Teng Long <dyroneteng@gmail.com> writes:

> On Fri, Jan 14, 2022 at 1:47 PM Junio C Hamano <gitster@pobox.com> wrote:
>
>> Apparently, it is not a common knowledge at least for you (and
>> probably others).  Perhaps we should add a paragraph to the cli help
>> and explicitly mention "options first and then args", before we go
>> on to say "among args, revs first and then pathspecs".
>
> It's much clearer now, thanks for the detailed answer.
>
> Another question, if I want to follow your advice and add a short
> paragraph in git CLI document, should this patch continue in the
> current RFC patchset or launch a new patchset?

If I were you, I would retract the ls-remote topic and create a
brand new topic that is about clarifying/enhancing the gitcli.txt
file, which has nothing to do with ls-remote, because the command
line option/argument convention is not specific to a single command.
Ævar Arnfjörð Bjarmason Jan. 15, 2022, 12:34 a.m. UTC | #11
On Fri, Jan 14 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>>>> But when GNU came around its option parser was generally happy to accept
>>>> options and args in either order. E.g. these both work with GNU
>>>> coreutils, but the latter will fail on FreeBSD and various other
>>>> capital-U UNIX-es:
>>>>
>>>>     touch foo; rm -v foo
>>>>     touch foo; rm foo -v
>>
>> This is only an approximate list, but:
>
> Don't waste your time on this, and instead spend on something more
> useful.  What I gave wrt gitcli.txt in an earlier message is final.

Whatever we do with git-ls-remote (which I don't really care all that
much about) gitcli should really be documenting how our tooling behaves.

Which is what I was mainly pointing out upthread, that your summary of
options before other types of args omitted that many utilities support
the reverse. Or perhaps you were only describing an asthetic choice
(which I don't think is worth debating). I'm just talking about what the
ground truth is.

What do you think about something like this to clear this up?:

diff --git a/Documentation/gitcli.txt b/Documentation/gitcli.txt
index 92e4ba6a2fa..b1387c4fe68 100644
--- a/Documentation/gitcli.txt
+++ b/Documentation/gitcli.txt
@@ -19,6 +19,13 @@ Many commands take revisions (most often "commits", but sometimes
 "tree-ish", depending on the context and command) and paths as their
 arguments.  Here are the rules:
 
+ * Options are (almost) universally accpted before other types of
+   arguments, e.g. `git cat-file -t HEAD` or `git push --dry-run
+   origin`, but in the case of those commands a GNU-style `git
+   cat-file HEAD -t` and `git push origin --dry-run` would work just
+   as well. The reverse is often not true, many commands do not accept
+   options after non-option arguments.
+
  * Revisions come first and then paths.
    E.g. in `git diff v1.0 v2.0 arch/x86 include/asm-x86`,
    `v1.0` and `v2.0` are revisions and `arch/x86` and `include/asm-x86`
Junio C Hamano Jan. 15, 2022, 12:50 a.m. UTC | #12
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> But we should really be recommending use of -- or --end-of-options
> whenever possible, as it can be hard to know in Git's CLI whether
> options after args are accepted or not.

Way before doing so, we should recommend the "options and then args"
order.  Users do not have to know which subcommands accepts options
after args.
Junio C Hamano Jan. 15, 2022, 1:01 a.m. UTC | #13
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Which is what I was mainly pointing out upthread, that your summary of
> options before other types of args omitted that many utilities support
> the reverse. Or perhaps you were only describing an asthetic choice
> (which I don't think is worth debating). I'm just talking about what the
> ground truth is.

The ground truth is that it is unlikely that we can fix some of our
commands so that they stop taking options that are given after args,
because of inertia.  But we can teach users especially the new ones
to always use the canonical order to sidestep the whole "some
subcommands imitate misguided GNUism to make it ambiguous, some
don't" problem.

> What do you think about something like this to clear this up?:

As we should aspire to fix the misguided "options can come still
after we saw args" eventually (don't talk back on this point to
waste any more of my time on a release day), I do not think it is a
good idea to say "reverse is often not true" and stopping there.

It will mislead people to think these "not true" commands should
somehow be updated to the GNUism in the future.  It's the other way
around.

> + * Options are (almost) universally accpted before other types of
> +   arguments, e.g. `git cat-file -t HEAD` or `git push --dry-run
> +   origin`, but in the case of those commands a GNU-style `git
> +   cat-file HEAD -t` and `git push origin --dry-run` would work just
> +   as well. The reverse is often not true, many commands do not accept
> +   options after non-option arguments.

  * A subcommand may take dashed options (which may take their own
    arguments, e.g. "--max-parents 2") and arguments.  You SHOULD
    give dashed options first and then arguments.  Some commands may
    accept dashed options after you have already gave non-option
    arguments (which may make the command ambiguous), but you should
    not rely on it (because eventually we may find a way to fix
    these ambiguity by enforcing the "options then args" rule).
Ævar Arnfjörð Bjarmason Jan. 15, 2022, 1:02 a.m. UTC | #14
On Fri, Jan 14 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> But we should really be recommending use of -- or --end-of-options
>> whenever possible, as it can be hard to know in Git's CLI whether
>> options after args are accepted or not.
>
> Way before doing so, we should recommend the "options and then args"
> order.  Users do not have to know which subcommands accepts options
> after args.

If you want to script git to do e.g.:

    touch -- foo -r
    git add foo -r
    git rm foo -r
    git status foo -r
    git log foo -r

You do need to know that those won't work, if that "foo -r" is from
e.g. scripted arguments that those commands will accept or interpret
that -r as an argument.

Hence suggesting that the user just add "--" to resolve the ambiguity,
as gitcli already discusses.
Junio C Hamano Jan. 15, 2022, 1:19 a.m. UTC | #15
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Hence suggesting that the user just add "--" to resolve the ambiguity,
> as gitcli already discusses.

Sadly, requiring "--end-of-options" is a solution for a problem that
we didn't have to create.  If we didn't take options written after
arg,

    git rm --end-of-options foo -r
    git rm foo ./-r

to force "-r" to be interpreted as a filename wouldn't have been
necessary.  The presence of "foo" before "-r" would have been
sufficient.

I agree with you that, unfortunately, we'd need to teach a way (i.e.
"--" or "--end-of-options") to defeat this misguided GNUism in some
commands.  Even if users stick to "options and then args", sadly,
they need to know it.

But we do already explain "--" and "--end-of-options" in gitcli.txt
so we should be OK.
Teng Long Jan. 17, 2022, 6:27 a.m. UTC | #16
On Fri, Jan 14, 2022 at 12:24 PM Teng Long <dyroneteng@gmail.com> wrote:

> So I sent this patch to try to figure out if this was a requirement of the
> original design, or if it was something that could really be improved, or if
> we can find a better way to deal this issue with compatibility.

I really appreciate your comments and explanations.

Now I've cleared that up, personally. It's not a bad thing that there are some
conventions for use, the point is whether the rules are clear to others. So,
after that, I will try to modify a little in "git-cli.txt" with a new patchset.

Thanks.