builtin/push: call set_refspecs after validating remote

Message ID	20240708140350.622986-1-karthik.188@gmail.com (mailing list archive)
State	Superseded
Headers	show Received: from mail-wr1-f44.google.com (mail-wr1-f44.google.com [209.85.221.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0FDC1365 for <git@vger.kernel.org>; Mon, 8 Jul 2024 14:03:54 +0000 (UTC) From: Karthik Nayak <karthik.188@gmail.com> To: karthik.188@gmail.com Cc: git@vger.kernel.org, peff@peff.net Subject: [PATCH] builtin/push: call set_refspecs after validating remote Date: Mon, 8 Jul 2024 16:03:50 +0200 Message-ID: <20240708140350.622986-1-karthik.188@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	builtin/push: call set_refspecs after validating remote \| expand builtin/push: call set_refspecs after validating remote

karthik nayak July 8, 2024, 2:03 p.m. UTC

Since 9badf97c4 (remote: allow resetting url list), we reset the remote
URL if the provided URL is empty. This means any caller of
`remotes_remote_get()` would now get a NULL remote.

The 'builtin/push.c' code, calls 'set_refspecs' before validating the
remote. This worked earlier since we would get a remote, albeit with an
empty URL. With the new changes, we get a NULL remote and this crashes.

Do a simple fix by doing remote validation first and also add a test to
validate the bug fix.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---

I noticed that this was breaking on master. We run tests on Git master
for Gitaly at GitLab and I noticed a SEFAULT. I could also reproduce the
bug by simply doing 'git push "" refs/heads/master' on master, next and
seen. 

 builtin/push.c         | 7 ++++---
 t/t5529-push-errors.sh | 8 ++++++++
 2 files changed, 12 insertions(+), 3 deletions(-)

Jeff King July 8, 2024, 11:32 p.m. UTC | #1

On Mon, Jul 08, 2024 at 04:03:50PM +0200, Karthik Nayak wrote:

> Since 9badf97c4 (remote: allow resetting url list), we reset the remote
> URL if the provided URL is empty. This means any caller of
> `remotes_remote_get()` would now get a NULL remote.
> 
> The 'builtin/push.c' code, calls 'set_refspecs' before validating the
> remote. This worked earlier since we would get a remote, albeit with an
> empty URL. With the new changes, we get a NULL remote and this crashes.

Interesting. I think this was always a bit buggy, in the sense that the
some of the code was prepared for pushremote_get() to return NULL, but
the set_refspecs() call was not. So in theory _any_ problem with the
remote that caused pushremote_get() to bail out would be a problem. But
in practice, I'm not sure there was a way to do so, since the remote.c
code usually falls back on the given name as the url if needed, rather
than returning NULL.

And 9badf97c4 does something a bit unexpected here, since the fallback
calls the same add_url() function that we feed the config values to, and
so it respects the same "empty means reset" logic. Which means that an
empty-string remote name will no longer fall back in that way.

It's a little surprising that we hit the "empty means reset" logic here.
I wonder if that fallback path should be avoiding it. OTOH, an empty
string URL is nonsense, and it's not going to work, so maybe returning a
NULL remote is a good thing. The issue here is mostly just calling BUG()
for something that is bogus input.

> Do a simple fix by doing remote validation first and also add a test to
> validate the bug fix.

OK, so we push it further down, past the "if (!remote)" check in the
caller. I think that's a good fix. It does make me wonder why
set_refspecs() does not simply take the remote struct in the first
place? I.e.:

diff --git a/builtin/push.c b/builtin/push.c
index 992f603de7..ae787f1f63 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -96,9 +96,8 @@ static void refspec_append_mapped(struct refspec *refspec, const char *ref,
 	refspec_append(refspec, ref);
 }

-static void set_refspecs(const char **refs, int nr, const char *repo)
+static void set_refspecs(const char **refs, int nr, struct remote *remote)
 {
-	struct remote *remote = NULL;
 	struct ref *local_refs = NULL;
 	int i;

@@ -127,12 +126,6 @@ static void set_refspecs(const char **refs, int nr, const char *repo)
 			if (count_refspec_match(ref, local_refs, &matched) != 1) {
 				refspec_append(&rs, ref);
 			} else {
-				/* lazily grab remote */
-				if (!remote)
-					remote = remote_get(repo);
-				if (!remote)
-					BUG("must get a remote for repo '%s'", repo);
-
 				refspec_append_mapped(&rs, ref, remote, matched);
 			}
 		} else
@@ -648,7 +641,7 @@ int cmd_push(int argc, const char **argv, const char *prefix)
 	}

 	if (argc > 0)
-		set_refspecs(argv + 1, argc - 1, repo);
+		set_refspecs(argv + 1, argc - 1, remote);

 	if (remote->mirror)
 		flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);

which is now possible after your patch. Note that set_refspecs()
currently calls remote_get(), but the caller will use pushremote_get()
to get the remote. I think that set_refspecs() is actually wrong here,
but it doesn't matter in practice because "repo" is always non-NULL at
this point.

But with the patch above, this kind of error would be impossible to
trigger (but again, your patch is still necessary to get the ordering
right in the first place).

> I noticed that this was breaking on master. We run tests on Git master
> for Gitaly at GitLab and I noticed a SEFAULT. I could also reproduce the
> bug by simply doing 'git push "" refs/heads/master' on master, next and
> seen. 

I don't see a segfault, but rather a BUG() that triggers SIGABRT. Does
that match what you see?

> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
>  TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  
> @@ -38,6 +41,11 @@ test_expect_success 'detect missing sha1 expressions early' '
>  	test_cmp expect rp-ran
>  '
>  
> +test_expect_success 'detect empty remote' '
> +	test_must_fail git push "" main 2> stderr &&
> +	grep "fatal: bad repository ''" stderr
> +'

The test makes sense. Your single-quotes are not doing what you expect,
though (they are closing and re-opening the outer test body, so end up
as the empty string). You can use $SQ$SQ instead (I'm also working on
patches to allow you to specify the body as a here-doc to avoid exactly
this kind of situation, but I don't think we should depend on that yet).

I was a little surprised you needed to use the name "main" and not just
"HEAD" or something neutral (avoiding the INITIAL_BRANCH_NAME stuff).
But we only hit the problematic code path when we match a local name.

Also, minor style nit: use "2>stderr" with no space.

Anyway, thanks for catching my bug! I think your fix is the right
approach, and we just need to adjust the test. I do think we should do
the patch I suggested above on top. I'd be happy if you want to add it
in to your series, but let me know if you want me to write a commit
message or just send it separately afterwards.

-Peff

Jeff King July 8, 2024, 11:33 p.m. UTC | #2

On Mon, Jul 08, 2024 at 12:17:38PM -0700, Junio C Hamano wrote:

> > This worked earlier since we would get a remote, albeit with an
> > empty URL. With the new changes, we get a NULL remote and this crashes.
> 
> You'd really really need to clarify what you mean by "a NULL remote"
> if you want the proposed log message and the change to be
> understood.  The change made by 9badf97c (remote: allow resetting
> url list, 2024-06-14), as far as I can tell, can make the strvecs
> that hold URL and pushURL in a remote structure empty, but it does
> not otherwise destroy the remote structure, or nullify a pointer
> that points at the remote structure.  So I am completely lost here.

I was somewhat confused how it could happen, too. ;) I left more
comments elsewhere in the thread, but the gist of it is that an empty
remote name confuses the "when there are no urls, fall back to the
remote name as a url" code.

-Peff

karthik nayak July 9, 2024, 9:05 a.m. UTC | #3

Jeff King <peff@peff.net> writes:

> On Mon, Jul 08, 2024 at 04:03:50PM +0200, Karthik Nayak wrote:
>
>> Since 9badf97c4 (remote: allow resetting url list), we reset the remote
>> URL if the provided URL is empty. This means any caller of
>> `remotes_remote_get()` would now get a NULL remote.
>>
>> The 'builtin/push.c' code, calls 'set_refspecs' before validating the
>> remote. This worked earlier since we would get a remote, albeit with an
>> empty URL. With the new changes, we get a NULL remote and this crashes.
>
> Interesting. I think this was always a bit buggy, in the sense that the
> some of the code was prepared for pushremote_get() to return NULL, but
> the set_refspecs() call was not. So in theory _any_ problem with the
> remote that caused pushremote_get() to bail out would be a problem. But
> in practice, I'm not sure there was a way to do so, since the remote.c
> code usually falls back on the given name as the url if needed, rather
> than returning NULL.
>

Yup, that seems right.

> And 9badf97c4 does something a bit unexpected here, since the fallback
> calls the same add_url() function that we feed the config values to, and
> so it respects the same "empty means reset" logic. Which means that an
> empty-string remote name will no longer fall back in that way.
>
> It's a little surprising that we hit the "empty means reset" logic here.
> I wonder if that fallback path should be avoiding it. OTOH, an empty
> string URL is nonsense, and it's not going to work, so maybe returning a
> NULL remote is a good thing. The issue here is mostly just calling BUG()
> for something that is bogus input.
>

Yup, that's the explanation that I should've added, thanks. This is a
good summary.

>> Do a simple fix by doing remote validation first and also add a test to
>> validate the bug fix.
>
> OK, so we push it further down, past the "if (!remote)" check in the
> caller. I think that's a good fix. It does make me wonder why
> set_refspecs() does not simply take the remote struct in the first
> place? I.e.:
>
> diff --git a/builtin/push.c b/builtin/push.c
> index 992f603de7..ae787f1f63 100644
> --- a/builtin/push.c
> +++ b/builtin/push.c
> @@ -96,9 +96,8 @@ static void refspec_append_mapped(struct refspec *refspec, const char *ref,
>  	refspec_append(refspec, ref);
>  }
>
> -static void set_refspecs(const char **refs, int nr, const char *repo)
> +static void set_refspecs(const char **refs, int nr, struct remote *remote)
>  {
> -	struct remote *remote = NULL;
>  	struct ref *local_refs = NULL;
>  	int i;
>
> @@ -127,12 +126,6 @@ static void set_refspecs(const char **refs, int nr, const char *repo)
>  			if (count_refspec_match(ref, local_refs, &matched) != 1) {
>  				refspec_append(&rs, ref);
>  			} else {
> -				/* lazily grab remote */
> -				if (!remote)
> -					remote = remote_get(repo);
> -				if (!remote)
> -					BUG("must get a remote for repo '%s'", repo);
> -
>  				refspec_append_mapped(&rs, ref, remote, matched);
>  			}
>  		} else
> @@ -648,7 +641,7 @@ int cmd_push(int argc, const char **argv, const char *prefix)
>  	}
>
>  	if (argc > 0)
> -		set_refspecs(argv + 1, argc - 1, repo);
> +		set_refspecs(argv + 1, argc - 1, remote);
>
>  	if (remote->mirror)
>  		flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
>
> which is now possible after your patch. Note that set_refspecs()
> currently calls remote_get(), but the caller will use pushremote_get()
> to get the remote. I think that set_refspecs() is actually wrong here,
> but it doesn't matter in practice because "repo" is always non-NULL at
> this point.
>
> But with the patch above, this kind of error would be impossible to
> trigger (but again, your patch is still necessary to get the ordering
> right in the first place).
>

I think this is a great addition on top of my patch, I will add it in.

>> I noticed that this was breaking on master. We run tests on Git master
>> for Gitaly at GitLab and I noticed a SEFAULT. I could also reproduce the
>> bug by simply doing 'git push "" refs/heads/master' on master, next and
>> seen.
>
> I don't see a segfault, but rather a BUG() that triggers SIGABRT. Does
> that match what you see?
>

Yes, it is indeed SIGABRT and not SEGFAULT. Will correct my message.

>> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>> +
>>  TEST_PASSES_SANITIZE_LEAK=true
>>  . ./test-lib.sh
>>
>> @@ -38,6 +41,11 @@ test_expect_success 'detect missing sha1 expressions early' '
>>  	test_cmp expect rp-ran
>>  '
>>
>> +test_expect_success 'detect empty remote' '
>> +	test_must_fail git push "" main 2> stderr &&
>> +	grep "fatal: bad repository ''" stderr
>> +'
>
> The test makes sense. Your single-quotes are not doing what you expect,
> though (they are closing and re-opening the outer test body, so end up
> as the empty string). You can use $SQ$SQ instead (I'm also working on
> patches to allow you to specify the body as a here-doc to avoid exactly
> this kind of situation, but I don't think we should depend on that yet).
>

Good catch. I'm wondering how it worked though, since I wrote the test
before the fix and used the test to validate the failure and the fix.

> I was a little surprised you needed to use the name "main" and not just
> "HEAD" or something neutral (avoiding the INITIAL_BRANCH_NAME stuff).
> But we only hit the problematic code path when we match a local name.
>

Exactly. I'll add a comment though, to make it clearer

> Also, minor style nit: use "2>stderr" with no space.
>

Thanks, will change.

>
> Anyway, thanks for catching my bug! I think your fix is the right
> approach, and we just need to adjust the test. I do think we should do
> the patch I suggested above on top. I'd be happy if you want to add it
> in to your series, but let me know if you want me to write a commit
> message or just send it separately afterwards.
>
> -Peff

I will add it in. Thanks for your help!

Jeff King July 9, 2024, 9:59 a.m. UTC | #4

On Tue, Jul 09, 2024 at 05:05:58AM -0400, Karthik Nayak wrote:

> >> +test_expect_success 'detect empty remote' '
> >> +	test_must_fail git push "" main 2> stderr &&
> >> +	grep "fatal: bad repository ''" stderr
> >> +'
> >
> > The test makes sense. Your single-quotes are not doing what you expect,
> > though (they are closing and re-opening the outer test body, so end up
> > as the empty string). You can use $SQ$SQ instead (I'm also working on
> > patches to allow you to specify the body as a here-doc to avoid exactly
> > this kind of situation, but I don't think we should depend on that yet).
> 
> Good catch. I'm wondering how it worked though, since I wrote the test
> before the fix and used the test to validate the failure and the fix.

You end up grepping for the sub-string "fatal: bad repository ",
which still matches. It's just not quite as careful as what you
intended.

-Peff

karthik nayak July 9, 2024, 9:59 a.m. UTC | #5

Junio C Hamano <gitster@pobox.com> writes:

> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> Since 9badf97c4 (remote: allow resetting url list),
>
> Please do not be original in places where it shouldn't matter.  Use
> "git show -s --format=reference" that includes the datestamp to help
> readers judge how old the problem is.
>

Will do.

>> we reset the remote
>> URL if the provided URL is empty. This means any caller of
>> `remotes_remote_get()` would now get a NULL remote.
>
> "NULL remote" meaning?
>
> If you have this:
>
>  [remote "multi"]
> 	url = wrong-one
> 	url = wrong-two
> 	url =
>
> and ask "remotes_remote_get()" to give you the remote "multi", you'd
> get a remote whose URL array has no elements.  Is that what you are
> referring to?

I was referring to the 'struct remote *' being 'NULL'. I think Jeff
explained this in his email reply to this.

>> The 'builtin/push.c' code, calls 'set_refspecs' before validating the
>> remote.
>
> There is a comment about "lazily grab remote", so it is very
> understandable.
>
>> This worked earlier since we would get a remote, albeit with an
>> empty URL. With the new changes, we get a NULL remote and this crashes.
>
> You'd really really need to clarify what you mean by "a NULL remote"
> if you want the proposed log message and the change to be
> understood.  The change made by 9badf97c (remote: allow resetting
> url list, 2024-06-14), as far as I can tell, can make the strvecs
> that hold URL and pushURL in a remote structure empty, but it does
> not otherwise destroy the remote structure, or nullify a pointer
> that points at the remote structure.  So I am completely lost here.

I will amend the commit to the following:

Since 9badf97c42 (remote: allow resetting url list, 2024-06-14), we
reset the remote URL if the provided URL is empty. When a user of
'remotes_remote_get' tries to fetch a remote with an empty repo name,
the function initializes the remote via 'make_remote'. But the remote is
still not a valid remote, since the URL is empty, so it tries to add the
URL alias using 'add_url_alias'. This in-turn will call 'add_url', but
since the URL is empty we call 'strvec_clear' on the `remote->url`. Back
in 'remotes_remote_get', we again check if the remote is valid, which
fails, so we return 'NULL' for the 'struct remote *' value.

The 'builtin/push.c' code, calls 'set_refspecs' before validating the
remote. This worked earlier since we would get a remote, albeit with an
empty URL. With the new changes, we get a NULL remote this causes the
check for remote to fail and raises the BUG.

---

I hope this makes it clearer

builtin/push: call set_refspecs after validating remote

Commit Message

Comments

Patch