diff mbox series

[3/4] clone: validate --origin option before use

Message ID 0dff8cd66930130ffd5f0d7d068ad3ed47cd1c81.1599848727.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series clone: allow configurable default for -o/--origin | expand

Commit Message

Philippe Blain via GitGitGadget Sept. 11, 2020, 6:25 p.m. UTC
From: Sean Barag <sean@barag.org>

Providing a bad origin name to `git clone` currently reports an
'invalid refspec' error instead of a more explicit message explaining
that the `--origin` option was malformed.  Per Junio, it's been this way
since 8434c2f1 (Build in clone, 2008-04-27).  This patch reintroduces
validation for the provided `--origin` option, but notably _doesn't_
include a multi-level check (e.g. "foo/bar") that was present in the
original `git-clone.sh`.  It seems `git remote` allows multi-level
remote names, so applying that same validation in `git clone` seems
reasonable.

Signed-off-by: Sean Barag <sean@barag.org>
Thanks-to: Junio C Hamano <gitster@pobox.com>
---
 builtin/clone.c          | 7 +++++++
 t/t5606-clone-options.sh | 8 ++++++++
 2 files changed, 15 insertions(+)

Comments

Derrick Stolee Sept. 11, 2020, 7:24 p.m. UTC | #1
On 9/11/2020 2:25 PM, Sean Barag via GitGitGadget wrote:
> +	strbuf_addf(&resolved_refspec, "refs/heads/test:refs/remotes/%s/test", option_origin);
> +	if (!valid_fetch_refspec(resolved_refspec.buf))
> +		/* TRANSLATORS: %s will be the user-provided --origin / -o option */
> +		die(_("'%s' is not a valid origin name"), option_origin);

Looking at this again, I'm not sure the translators note is
necessary. Also, I would say "is not a valid remote name".
That makes the string align with the already-translated string
in builtin/remote.c.

This code is duplicated from builtin/remote.c, so I'd rather
see this be a helper method in refspec.c and have both
builtin/clone.c and builtin/remote.c call that helper.

Here is the helper:

void valid_remote_name(const char *name)
{
	int result;
	struct strbuf refspec = STRBUF_INIT;
	strbuf_addf(&refspec, "refs/heads/test:refs/remotes/%s/test", name);
	result = valid_fetch_refspec(refspec.buf);
	strbuf_release(&refspec);
	return result;
}

And here is the use in builtin/clone.c:

	if (!valid_remote_name(option_origin))
		die(_("'%s' is not a valid remote name"), option_origin);

and in builtin/remote.c:

	if (!valid_remote_name(name))
		die(_("'%s' is not a valid remote name"), name);

> +test_expect_success 'rejects invalid -o/--origin' '
> +
> +	test_expect_code 128 git clone -o "bad...name" parent clone-bad-name 2>err &&
> +	test_debug "cat err" &&
> +	test_i18ngrep "'\''bad...name'\'' is not a valid origin name" err
> +
> +'
> +

Double newlines here! I personally appreciate newlines to
spread out content, but it doesn't fit our guidelines.

Thanks,
-Stolee
Junio C Hamano Sept. 11, 2020, 8:39 p.m. UTC | #2
"Sean Barag via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Sean Barag <sean@barag.org>
>
> Providing a bad origin name to `git clone` currently reports an
> 'invalid refspec' error instead of a more explicit message explaining
> that the `--origin` option was malformed.  Per Junio, it's been this way
> since 8434c2f1 (Build in clone, 2008-04-27).  

If you know it as a fact that it has been this way since the first
version of rewritten "git clone", don't blame others.

A micronit.  We describe the current status in present tense when
presenting the problem to be solved, so "currently" can be dropped.

> This patch reintroduces

When presenting the solution, we write as if we are giving an order
to a patch monkey to "make the code like so".

"This patch reintroduces" -> "Reintroduce".  

> validation for the provided `--origin` option, but notably _doesn't_
> include a multi-level check (e.g. "foo/bar") that was present in the
> original `git-clone.sh`.  It seems `git remote` allows multi-level
> remote names, so applying that same validation in `git clone` seems
> reasonable.

Even though I suspect "git remote" is being overly loose and
careless here, I am not sure if it is a good idea to tighten it
retroactively.  But if this is meant as a bugfix for misconversion
made in 8434c2f1, we should be as strict as the original.  I dunno.

> +	strbuf_addf(&resolved_refspec, "refs/heads/test:refs/remotes/%s/test", option_origin);
> +	if (!valid_fetch_refspec(resolved_refspec.buf))

If we reintroduce pre-8434c2f1 check, then we would want

	if (!valid_fetch_refspec(...) || strchr(option_origin, '/'))

or something like that?

> +		/* TRANSLATORS: %s will be the user-provided --origin / -o option */
> +		die(_("'%s' is not a valid origin name"), option_origin);

I am not sure if this translator comment is helping.

The message makes it sound as if "%s" here is used to switch between
'-o' or '--origin'.  If it said "'%s' will be the value given to
--origin/-o option", it would become much less confusing.

> +	strbuf_release(&resolved_refspec);
> +
>  	repo_name = argv[0];
>  
>  	path = get_repo_path(repo_name, &is_bundle);
> diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
> index d20a78f84b..c865f96def 100755
> --- a/t/t5606-clone-options.sh
> +++ b/t/t5606-clone-options.sh
> @@ -19,6 +19,14 @@ test_expect_success 'clone -o' '
>  
>  '
>  
> +test_expect_success 'rejects invalid -o/--origin' '
> +
> +	test_expect_code 128 git clone -o "bad...name" parent clone-bad-name 2>err &&
> +	test_debug "cat err" &&
> +	test_i18ngrep "'\''bad...name'\'' is not a valid origin name" err
> +
> +'

... and we can also test for "bad/name" here in another test.

>  test_expect_success 'disallows --bare with --origin' '
>  
>  	test_expect_code 128 git clone -o foo --bare parent clone-bare-o 2>err &&

Thanks.
Sean Barag Sept. 16, 2020, 4:28 p.m. UTC | #3
On Fri, 11 Sep 2020 at 15:24:15 -0400, Derrick Stolee writes:
> On 9/11/2020 2:25 PM, Sean Barag via GitGitGadget wrote:
> > +	strbuf_addf(&resolved_refspec, "refs/heads/test:refs/remotes/%s/test", option_origin);
> > +	if (!valid_fetch_refspec(resolved_refspec.buf))
> > +		/* TRANSLATORS: %s will be the user-provided --origin / -o option */
> > +		die(_("'%s' is not a valid origin name"), option_origin);
>
> Looking at this again, I'm not sure the translators note is
> necessary. Also, I would say "is not a valid remote name".
> That makes the string align with the already-translated string
> in builtin/remote.c.

Makes perfect sense!  My original intention was to provide a separate
use-case specific message, but you're right: "is not a valid remote
name" (as in `builtin/remote.c`) is still very clear.

> This code is duplicated from builtin/remote.c, so I'd rather
> see this be a helper method in refspec.c and have both
> builtin/clone.c and builtin/remote.c call that helper.
>
> Here is the helper:
>
> void valid_remote_name(const char *name)
> {
> 	int result;
> 	struct strbuf refspec = STRBUF_INIT;
> 	strbuf_addf(&refspec, "refs/heads/test:refs/remotes/%s/test", name);
> 	result = valid_fetch_refspec(refspec.buf);
> 	strbuf_release(&refspec);
> 	return result;
> }
>
> And here is the use in builtin/clone.c:
>
> 	if (!valid_remote_name(option_origin))
> 		die(_("'%s' is not a valid remote name"), option_origin);
>
> and in builtin/remote.c:
>
> 	if (!valid_remote_name(name))
> 		die(_("'%s' is not a valid remote name"), name);

That's a fantastic idea -- thanks for the suggestion!  I'll do that for
v2.

> > +test_expect_success 'rejects invalid -o/--origin' '
> > +
> > +	test_expect_code 128 git clone -o "bad...name" parent clone-bad-name 2>err &&
> > +	test_debug "cat err" &&
> > +	test_i18ngrep "'\''bad...name'\'' is not a valid origin name" err
> > +
> > +'
> > +
>
> Double newlines here! I personally appreciate newlines to
> spread out content, but it doesn't fit our guidelines.

Darn, I missed this one.  Thanks for the heads-up :)

--
Sean
Sean Barag Sept. 16, 2020, 5:11 p.m. UTC | #4
On Fri, 11 Sep 2020 at 13:39:20 -0700, Junio C Hamano writes:
> > From: Sean Barag <sean@barag.org>
> >
> > Providing a bad origin name to `git clone` currently reports an
> > 'invalid refspec' error instead of a more explicit message
> > explaining that the `--origin` option was malformed.  Per Junio,
> > it's been this way since 8434c2f1 (Build in clone, 2008-04-27).
>
> If you know it as a fact that it has been this way since the first
> version of rewritten "git clone", don't blame others.

Oh gosh, I'm so sorry!  I didn't mean for that to read as blaming, was
just trying to cite you as my source.  On a second read, it comes across
more "blame-y" than I originally intended.  I'll fix this in v2 (and
have learned a valuable lesson :) ).

> A micronit.  We describe the current status in present tense when
> presenting the problem to be solved, so "currently" can be dropped.
>
> > This patch reintroduces
>
> When presenting the solution, we write as if we are giving an order
> to a patch monkey to "make the code like so".
>
> "This patch reintroduces" -> "Reintroduce".

Great to know!  Thanks again for helping a newbie fit in.  Will fix in
v2.

> > validation for the provided `--origin` option, but notably _doesn't_
> > include a multi-level check (e.g. "foo/bar") that was present in the
> > original `git-clone.sh`.  It seems `git remote` allows multi-level
> > remote names, so applying that same validation in `git clone` seems
> > reasonable.
>
> Even though I suspect "git remote" is being overly loose and
> careless here, I am not sure if it is a good idea to tighten it
> retroactively.  But if this is meant as a bugfix for misconversion
> made in 8434c2f1, we should be as strict as the original.  I dunno.


To be honest, I'm struggling to decide which route to go.  It seems
like multilevel fetch and push refspecs are allowed as far back as
46220ca100 (remote.c: Fix overtight refspec validation, 2008-03-20) or
ef00d150e4 (Tighten refspec processing, 2008-03-17), so perhaps
removing the multilevel check in 8434c2f1 (Build in clone, 2008-04-27)
was intentional?  If removing that check in 8434c2f1 was a mistake and
we reintroduce it, that's probably a breaking change for some users.
What sort of accommodations would I need to include in this patchset to
ease that pain for users?

> > +	strbuf_addf(&resolved_refspec, "refs/heads/test:refs/remotes/%s/test", option_origin);
> > +	if (!valid_fetch_refspec(resolved_refspec.buf))
>
> If we reintroduce pre-8434c2f1 check, then we would want
>
> 	if (!valid_fetch_refspec(...) || strchr(option_origin, '/'))
>
> or something like that?

Absolutely!  Happy to include the multilevel check if you think it's the
right path forward (see above).

> > +		/* TRANSLATORS: %s will be the user-provided --origin / -o option */
> > +		die(_("'%s' is not a valid origin name"), option_origin);
>
> I am not sure if this translator comment is helping.
>
> The message makes it sound as if "%s" here is used to switch between
> '-o' or '--origin'.  If it said "'%s' will be the value given to
> --origin/-o option", it would become much less confusing.

Agreed.  I plan to take Derrick's suggestion [1] and use the same
" is not a valid remote name" message from `builtin/remote.c`, which
should make that translator comment a non-issue.

[1] https://lore.kernel.org/git/bf0107fb-2a6c-68d3-df24-72c6a9df6182@gmail.com/


I can't stress enough how sorry I am for the improper blame, and how
much I appreciate your help!
--
Sean
Sean Barag Sept. 21, 2020, 4:13 p.m. UTC | #5
On Wed, 16 Sep 2020 at 10:11:51 -0700, Sean Barag writes:
> > > validation for the provided `--origin` option, but notably
> > > _doesn't_ include a multi-level check (e.g. "foo/bar") that was
> > > present in the original `git-clone.sh`.  It seems `git remote`
> > > allows multi-level remote names, so applying that same validation
> > > in `git clone` seems reasonable.
> >
> > Even though I suspect "git remote" is being overly loose and
> > careless here, I am not sure if it is a good idea to tighten it
> > retroactively.  But if this is meant as a bugfix for misconversion
> > made in 8434c2f1, we should be as strict as the original.  I dunno.
>
>
> To be honest, I'm struggling to decide which route to go.  It seems
> like multilevel fetch and push refspecs are allowed as far back as
> 46220ca100 (remote.c: Fix overtight refspec validation, 2008-03-20) or
> ef00d150e4 (Tighten refspec processing, 2008-03-17), so perhaps
> removing the multilevel check in 8434c2f1 (Build in clone, 2008-04-27)
> was intentional?  If removing that check in 8434c2f1 was a mistake and
> we reintroduce it, that's probably a breaking change for some users.
> What sort of accommodations would I need to include in this patchset
> to ease that pain for users?

Thinking about this more, I'd be very surprised as a `git` user if
introducing a new config option broke backwards compatibility.  Other
`git` UIs have mixed support for slashes in remote names:

* GitHub Desktop has an open issue regarding remote names that contain
  slashes: https://github.com/desktop/desktop/issues/3618
* Sublime Merge (as-of build 2032) renders remote names with slashes as
  a tree of remotes, e.g.:

      $ git remote -v
      foo/bar     /tmp/example_repo
      foo/baz     /tmp/example_repo2

  is rendered with as a collapsible tree, roughly:

      REMOTES (2)
      v foo
        bar
        baz

* `tig` (2.4.1) renders remote names with slashes identically to those
  without slashes

Retroactively tightening those rules would cause more harm than good
(both for end-users and for downstream projects), especially with no
safe way to fix existing /-containing remote names.  I'm going to keep
the multilevel check out of this patchset (thus continuing to allowing
multilevel remote names), but if anyone feels strongly either way please
let me know! :)

Sean
diff mbox series

Patch

diff --git a/builtin/clone.c b/builtin/clone.c
index bf095815f0..1cd62d0001 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -967,6 +967,7 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	const struct ref *ref;
 	struct strbuf key = STRBUF_INIT;
 	struct strbuf default_refspec = STRBUF_INIT;
+	struct strbuf resolved_refspec = STRBUF_INIT;
 	struct strbuf branch_top = STRBUF_INIT, reflog_msg = STRBUF_INIT;
 	struct transport *transport = NULL;
 	const char *src_ref_prefix = "refs/heads/";
@@ -1011,6 +1012,12 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (!option_origin)
 		option_origin = "origin";
 
+	strbuf_addf(&resolved_refspec, "refs/heads/test:refs/remotes/%s/test", option_origin);
+	if (!valid_fetch_refspec(resolved_refspec.buf))
+		/* TRANSLATORS: %s will be the user-provided --origin / -o option */
+		die(_("'%s' is not a valid origin name"), option_origin);
+	strbuf_release(&resolved_refspec);
+
 	repo_name = argv[0];
 
 	path = get_repo_path(repo_name, &is_bundle);
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index d20a78f84b..c865f96def 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -19,6 +19,14 @@  test_expect_success 'clone -o' '
 
 '
 
+test_expect_success 'rejects invalid -o/--origin' '
+
+	test_expect_code 128 git clone -o "bad...name" parent clone-bad-name 2>err &&
+	test_debug "cat err" &&
+	test_i18ngrep "'\''bad...name'\'' is not a valid origin name" err
+
+'
+
 test_expect_success 'disallows --bare with --origin' '
 
 	test_expect_code 128 git clone -o foo --bare parent clone-bare-o 2>err &&