diff mbox series

connect: also update offset for features without values

Message ID pull.1091.git.git.1631970872884.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series connect: also update offset for features without values | expand

Commit Message

Andrzej Hunt Sept. 18, 2021, 1:14 p.m. UTC
From: Andrzej Hunt <andrzej@ahunt.org>

parse_feature_value() does not update offset if the feature being
searched for does not specify a value. A loop that uses
parse_feature_value() to find a feature which was specified without a
value therefore might never exit (such loops will typically use
next_server_feature_value() as opposed to parse_feature_value() itself).
This usually isn't an issue: there's no point in using
next_server_feature_value() to search for repeated instances of the same
capability unless that capability typically specifies a value - but a
broken server could send a response that omits the value for a feature
even when we are expecting a value.

Therefore we add an offset update calculation for the no-value case,
which helps ensure that loops using next_server_feature_value() will
always terminate.

next_server_feature_value(), and the offset calculation, were first
added in 2.28 in:
  2c6a403d96 (connect: add function to parse multiple v1 capability values, 2020-05-25)

Thanks to Peff for authoring the test.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
---
    connect: also update offset for features without values
    
    This is a small patch to avoid an infinite loop which can occur when a
    broken server forgets to include a value when specifying symref in the
    capabilities list.
    
    Thanks to Peff for writing the test.
    
    Note: I modified the test by adding and object-format=... to the
    injected server response, because the oid that we're using is the
    default hash (which will be e.g. sha256 for some CI jobs), but our
    protocol handler assumes sha1 unless a different hash has been
    explicitly specified. I'm open to alternative suggestions.
    
    ATB,
    
    Andrzej

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1091%2Fahunt%2Fconnectloop-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1091/ahunt/connectloop-v1
Pull-Request: https://github.com/git/git/pull/1091

 connect.c                      |  2 ++
 t/t5704-protocol-violations.sh | 13 +++++++++++++
 2 files changed, 15 insertions(+)


base-commit: 186eaaae567db501179c0af0bf89b34cbea02c26

Comments

Taylor Blau Sept. 18, 2021, 3:53 p.m. UTC | #1
Hi Andrzej,

On Sat, Sep 18, 2021 at 01:14:32PM +0000, Andrzej Hunt via GitGitGadget wrote:
> From: Andrzej Hunt <andrzej@ahunt.org>

Thanks for writing this patch. I have seen a copy of this on the
security list, but the modified version here looks good to me, too. I
left a few notes throughout.

Recapping our discussion on the security list, we decided that this
didn't merit an embargoed release because a misbehaving server can still
cause a client to hang if it simply printed half of its ref
advertisement. So this issue isn't new, but fixing this instance of it
is good nonetheless.

> parse_feature_value() does not update offset if the feature being
> searched for does not specify a value. A loop that uses
> parse_feature_value() to find a feature which was specified without a
> value therefore might never exit (such loops will typically use
> next_server_feature_value() as opposed to parse_feature_value() itself).
> This usually isn't an issue: there's no point in using
> next_server_feature_value() to search for repeated instances of the same
> capability unless that capability typically specifies a value - but a
> broken server could send a response that omits the value for a feature
> even when we are expecting a value.

It may be worth adding a little detail here. parse_feature_value takes
an offset, and uses it to seek past the point in features_list that
we've already seen. But if we get a value-less feature, then offset is
never updated, and we'll keep parsing the same thing over and over in a
loop.

(I know that you know all of that, but I think it is worth spelling out
a little more clearly in the patch message).

> Therefore we add an offset update calculation for the no-value case,
> which helps ensure that loops using next_server_feature_value() will
> always terminate.

> next_server_feature_value(), and the offset calculation, were first
> added in 2.28 in:
>   2c6a403d96 (connect: add function to parse multiple v1 capability values, 2020-05-25)

This line wrapping is a little odd, but not a big deal.

>
> Thanks to Peff for authoring the test.
>
> Co-authored-by: Jeff King <peff@peff.net>
> Signed-off-by: Jeff King <peff@peff.net>
> Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
> ---
>     connect: also update offset for features without values
>
>     This is a small patch to avoid an infinite loop which can occur when a
>     broken server forgets to include a value when specifying symref in the
>     capabilities list.
>
>     Thanks to Peff for writing the test.
>
>     Note: I modified the test by adding and object-format=... to the
>     injected server response, because the oid that we're using is the
>     default hash (which will be e.g. sha256 for some CI jobs), but our
>     protocol handler assumes sha1 unless a different hash has been
>     explicitly specified. I'm open to alternative suggestions.
>
>     ATB,
>
>     Andrzej
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1091%2Fahunt%2Fconnectloop-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1091/ahunt/connectloop-v1
> Pull-Request: https://github.com/git/git/pull/1091
>
>  connect.c                      |  2 ++
>  t/t5704-protocol-violations.sh | 13 +++++++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/connect.c b/connect.c
> index aff13a270e6..eaf7d6d2618 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -557,6 +557,8 @@ const char *parse_feature_value(const char *feature_list, const char *feature, i
>  			if (!*value || isspace(*value)) {
>  				if (lenp)
>  					*lenp = 0;
> +				if (offset)
> +					*offset = found + len - feature_list;

The critical piece :-). Since feature_list is a superset of found, this
is perfectly safe. It calculates first the offset of the found string
within feature_list, and then adds the length of the feature name.

I would have found this easier to read if it were spelled out as:

    *offset = found - features_list + len;

which is the same thing but follows the order of how I spelled out this
expression in English. But the way you wrote it matches how
parse_feature_value() sets the offset when there is a value, so I think
it's worth being consistent with that.

> diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
> index 5c941949b98..34538cebf01 100755
> --- a/t/t5704-protocol-violations.sh
> +++ b/t/t5704-protocol-violations.sh
> @@ -32,4 +32,17 @@ test_expect_success 'extra delim packet in v2 fetch args' '
>  	test_i18ngrep "expected flush after fetch arguments" err
>  '
>
> +test_expect_success 'bogus symref in v0 capabilities' '
> +	test_commit foo &&
> +	oid=$(git rev-parse HEAD) &&
> +	{
> +		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
> +			test-tool pkt-line pack-raw-stdin &&

I'm actually really happy with this modification to add the non-empty
object-format after the broken "symref" part, since it ensures that your
offset calculation is right (and that we can continue to parse features
with or without values after a value-less one).

> +		printf "0000"
> +	} >input &&
> +	git ls-remote --upload-pack="cat input ;:" . >actual &&
> +	printf "%s\tHEAD\n" "$oid" >expect &&
> +	test_cmp expect actual
> +'

Looks great to me.

Thanks,
Taylor
brian m. carlson Sept. 18, 2021, 5:18 p.m. UTC | #2
On 2021-09-18 at 13:14:32, Andrzej Hunt via GitGitGadget wrote:
> From: Andrzej Hunt <andrzej@ahunt.org>
> 
> parse_feature_value() does not update offset if the feature being
> searched for does not specify a value. A loop that uses
> parse_feature_value() to find a feature which was specified without a
> value therefore might never exit (such loops will typically use
> next_server_feature_value() as opposed to parse_feature_value() itself).
> This usually isn't an issue: there's no point in using
> next_server_feature_value() to search for repeated instances of the same
> capability unless that capability typically specifies a value - but a
> broken server could send a response that omits the value for a feature
> even when we are expecting a value.
> 
> Therefore we add an offset update calculation for the no-value case,
> which helps ensure that loops using next_server_feature_value() will
> always terminate.
> 
> next_server_feature_value(), and the offset calculation, were first
> added in 2.28 in:
>   2c6a403d96 (connect: add function to parse multiple v1 capability values, 2020-05-25)
> 
> Thanks to Peff for authoring the test.

Thanks to both of you for the patch and test.

>     Note: I modified the test by adding and object-format=... to the
>     injected server response, because the oid that we're using is the
>     default hash (which will be e.g. sha256 for some CI jobs), but our
>     protocol handler assumes sha1 unless a different hash has been
>     explicitly specified. I'm open to alternative suggestions.

This is a fine solution, I think.

>     ATB,
>     
>     Andrzej
> 
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1091%2Fahunt%2Fconnectloop-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1091/ahunt/connectloop-v1
> Pull-Request: https://github.com/git/git/pull/1091
> 
>  connect.c                      |  2 ++
>  t/t5704-protocol-violations.sh | 13 +++++++++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/connect.c b/connect.c
> index aff13a270e6..eaf7d6d2618 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -557,6 +557,8 @@ const char *parse_feature_value(const char *feature_list, const char *feature, i
>  			if (!*value || isspace(*value)) {
>  				if (lenp)
>  					*lenp = 0;
> +				if (offset)
> +					*offset = found + len - feature_list;

Yeah, this seems sensible.  A few lines above, we compute the value
offset ("value") as "found + len", where "found" starts as
"feature_list", so this will be either the space following this value or
the NUL byte at the end of the string.  That means that we'll make
progress next time because strstr will start searching from that point.

(I'm sure all of this is obvious to you, but I'm just mentioning it to
ensure that my understanding of the code is the same as everyone
else's.)
Jeff King Sept. 18, 2021, 10:05 p.m. UTC | #3
On Sat, Sep 18, 2021 at 11:53:00AM -0400, Taylor Blau wrote:

> > +test_expect_success 'bogus symref in v0 capabilities' '
> > +	test_commit foo &&
> > +	oid=$(git rev-parse HEAD) &&
> > +	{
> > +		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
> > +			test-tool pkt-line pack-raw-stdin &&
> 
> I'm actually really happy with this modification to add the non-empty
> object-format after the broken "symref" part, since it ensures that your
> offset calculation is right (and that we can continue to parse features
> with or without values after a value-less one).

I don't think it quite does that, though. If I understand the parsing
code correctly, it walks through the list looking for entries for a
_particular_ capability. I.e., it will look for any "symref" entries,
advancing the offset counter. And then separately it will start again
looking for any object-format entries, with a brand-new offset counter
starting at 0.

So if you want to confirm that the parsing continues after the
unexpected entry, you'd want a second symref entry, and then to make
sure it was correctly parsed.  Perhaps something like this:

diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
index 34538cebf0..98d7f4981a 100755
--- a/t/t5704-protocol-violations.sh
+++ b/t/t5704-protocol-violations.sh
@@ -35,13 +35,15 @@ test_expect_success 'extra delim packet in v2 fetch args' '
 test_expect_success 'bogus symref in v0 capabilities' '
 	test_commit foo &&
 	oid=$(git rev-parse HEAD) &&
+	dst=refs/heads/foo &&
 	{
-		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
+		printf "%s HEAD\0symref object-format=%s symref=HEAD:%s\n" \
+			"$oid" "$GIT_DEFAULT_HASH" "$dst" |
 			test-tool pkt-line pack-raw-stdin &&
 		printf "0000"
 	} >input &&
-	git ls-remote --upload-pack="cat input ;:" . >actual &&
-	printf "%s\tHEAD\n" "$oid" >expect &&
+	git ls-remote --symref --upload-pack="cat input ;:" . >actual &&
+	printf "ref: %s\tHEAD\n%s\tHEAD\n" "$dst" "$oid" >expect &&
 	test_cmp expect actual
 '
 

I don't think it's hugely important (after all, this is something that
the server isn't supposed to send in the first place). But given that we
did make it work correctly (and the original on the security list
didn't), it's not too bad to test it on top.

Swapping out the "printf >expect" for a here-doc might make it a bit
more readable. I used printf because of the tab handling, but:

  tab=$(printf "\t")
  cat >expect <<-EOF
  ref: ${dst}${tab}HEAD
  ${oid}${tab}HEAD
  EOF

isn't too bad.

-Peff
Taylor Blau Sept. 18, 2021, 10:35 p.m. UTC | #4
On Sat, Sep 18, 2021 at 06:05:17PM -0400, Jeff King wrote:
> On Sat, Sep 18, 2021 at 11:53:00AM -0400, Taylor Blau wrote:
>
> > > +test_expect_success 'bogus symref in v0 capabilities' '
> > > +	test_commit foo &&
> > > +	oid=$(git rev-parse HEAD) &&
> > > +	{
> > > +		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
> > > +			test-tool pkt-line pack-raw-stdin &&
> >
> > I'm actually really happy with this modification to add the non-empty
> > object-format after the broken "symref" part, since it ensures that your
> > offset calculation is right (and that we can continue to parse features
> > with or without values after a value-less one).
>
> I don't think it quite does that, though. If I understand the parsing
> code correctly, it walks through the list looking for entries for a
> _particular_ capability. I.e., it will look for any "symref" entries,
> advancing the offset counter. And then separately it will start again
> looking for any object-format entries, with a brand-new offset counter
> starting at 0.

Ah; you're absolutely right. We call next_server_feature_value from
annotate_refs_with_symref_info() and server_supports_hash(), each of
which initializes their own offset from zero.

> So if you want to confirm that the parsing continues after the
> unexpected entry, you'd want a second symref entry, and then to make
> sure it was correctly parsed.  Perhaps something like this:
>
> [...]

Yeah, I agree that would exercise it, and I also agree that it isn't
hugely important. But this patch does make an effort to handle that
case, so it's probably worth testing.

Thanks,
Taylor
Eric Sunshine Sept. 19, 2021, 1:02 a.m. UTC | #5
On Sat, Sep 18, 2021 at 6:05 PM Jeff King <peff@peff.net> wrote:
> Swapping out the "printf >expect" for a here-doc might make it a bit
> more readable. I used printf because of the tab handling, but:
>
>   tab=$(printf "\t")
>   cat >expect <<-EOF
>   ref: ${dst}${tab}HEAD
>   ${oid}${tab}HEAD
>   EOF
>
> isn't too bad.

Or just use q_to_tab():

    q_to_tab >expect <<-EOF
    ref: ${dst}QHEAD
    ${oid}QHEAD
    EOF

However, the typical use-case for q_to_tab() is when we need a leading
or trailing TAB character. When TAB is embedded within the line, we
often just use a literal TAB character; indeed, many tests in the
suite do exactly that, so that would be an even simpler option.
Jeff King Sept. 19, 2021, 2:20 a.m. UTC | #6
On Sat, Sep 18, 2021 at 09:02:37PM -0400, Eric Sunshine wrote:

> On Sat, Sep 18, 2021 at 6:05 PM Jeff King <peff@peff.net> wrote:
> > Swapping out the "printf >expect" for a here-doc might make it a bit
> > more readable. I used printf because of the tab handling, but:
> >
> >   tab=$(printf "\t")
> >   cat >expect <<-EOF
> >   ref: ${dst}${tab}HEAD
> >   ${oid}${tab}HEAD
> >   EOF
> >
> > isn't too bad.
> 
> Or just use q_to_tab():
> 
>     q_to_tab >expect <<-EOF
>     ref: ${dst}QHEAD
>     ${oid}QHEAD
>     EOF
> 
> However, the typical use-case for q_to_tab() is when we need a leading
> or trailing TAB character.

Ah, yeah, I forgot we had that. I _thought_ we had a variable ($HT or
something) for this, but it looks like we only define and use it in a
few scripts.

I'm not sure using q_to_tab() is all that readable here, because it
blends into the HEAD token.

> When TAB is embedded within the line, we
> often just use a literal TAB character; indeed, many tests in the
> suite do exactly that, so that would be an even simpler option.

Yeah, that'd probably be OK. I usually shy away from embedded tabs
because they can cause confusion in editors. But we have them already,
and this kind of expected output is not touched all that often.

-Peff
Eric Sunshine Sept. 19, 2021, 2:53 a.m. UTC | #7
On Sat, Sep 18, 2021 at 10:20 PM Jeff King <peff@peff.net> wrote:
> On Sat, Sep 18, 2021 at 09:02:37PM -0400, Eric Sunshine wrote:
> > Or just use q_to_tab():
>
> Ah, yeah, I forgot we had that. I _thought_ we had a variable ($HT or
> something) for this, but it looks like we only define and use it in a
> few scripts.

Perhaps you were thinking about the LF or SQ variables defined by t/test-lib.sh?
Junio C Hamano Sept. 19, 2021, 7:12 a.m. UTC | #8
Jeff King <peff@peff.net> writes:

> Ah, yeah, I forgot we had that. I _thought_ we had a variable ($HT or
> something) for this, but it looks like we only define and use it in a
> few scripts.

I do not mind seeing a patch that consolidates them to test-lib.sh
and make HT sit next to SQ and LF.

> Yeah, that'd probably be OK. I usually shy away from embedded tabs
> because they can cause confusion in editors. But we have them already,
> and this kind of expected output is not touched all that often.

Hmph, I do shy away from trailing whitespaces and we cannot do
leading tabs when doing <<-EOF, but I haven't seen much problem with
them in the middle of a line.

It does become annoying when they happen to be at the 7th column
because it is hard to tell (even with one-letter-at-a-time move
command in your editor) if it is a SP or HT, but q-to-tab in the
middle of lines would make the result much harder to read and the
cure is worse than the disease.
Junio C Hamano Sept. 23, 2021, 9:20 p.m. UTC | #9
"Andrzej Hunt via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/connect.c b/connect.c
> index aff13a270e6..eaf7d6d2618 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -557,6 +557,8 @@ const char *parse_feature_value(const char *feature_list, const char *feature, i
>  			if (!*value || isspace(*value)) {
>  				if (lenp)
>  					*lenp = 0;
> +				if (offset)
> +					*offset = found + len - feature_list;
>  				return value;
>  			}
>  			/* feature with a value (e.g., "agent=git/1.2.3") */
> diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
> index 5c941949b98..34538cebf01 100755
> --- a/t/t5704-protocol-violations.sh
> +++ b/t/t5704-protocol-violations.sh
> @@ -32,4 +32,17 @@ test_expect_success 'extra delim packet in v2 fetch args' '
>  	test_i18ngrep "expected flush after fetch arguments" err
>  '
>  
> +test_expect_success 'bogus symref in v0 capabilities' '
> +	test_commit foo &&
> +	oid=$(git rev-parse HEAD) &&
> +	{
> +		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
> +			test-tool pkt-line pack-raw-stdin &&
> +		printf "0000"
> +	} >input &&
> +	git ls-remote --upload-pack="cat input ;:" . >actual &&
> +	printf "%s\tHEAD\n" "$oid" >expect &&
> +	test_cmp expect actual
> +'
> +
>  test_done
>
> base-commit: 186eaaae567db501179c0af0bf89b34cbea02c26

I've been seeing an occasional and not-reliably-reproducible test
failure from t5704 in 'seen' these days---since this is the only
commit that touches t5704, I am suspecting if there is something
racy about it, but I am coming up empty after staring at it for a
few minutes.

Building 87446480 (connect: also update offset for features without
values, 2021-09-18), which is an application of the patch directly on
top of v2.33.0, and doing

    $ cd t
    $ while sh t5704-*.sh; do :; done

I can get it fail in a dozen iterations or so when the box is
loaded, so it does seem timing dependent.
Jeff King Sept. 23, 2021, 9:38 p.m. UTC | #10
On Thu, Sep 23, 2021 at 02:20:28PM -0700, Junio C Hamano wrote:

> > +test_expect_success 'bogus symref in v0 capabilities' '
> > +	test_commit foo &&
> > +	oid=$(git rev-parse HEAD) &&
> > +	{
> > +		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
> > +			test-tool pkt-line pack-raw-stdin &&
> > +		printf "0000"
> > +	} >input &&
> > +	git ls-remote --upload-pack="cat input ;:" . >actual &&
> > +	printf "%s\tHEAD\n" "$oid" >expect &&
> > +	test_cmp expect actual
> > +'
> > +
> >  test_done
> >
> > base-commit: 186eaaae567db501179c0af0bf89b34cbea02c26
> 
> I've been seeing an occasional and not-reliably-reproducible test
> failure from t5704 in 'seen' these days---since this is the only
> commit that touches t5704, I am suspecting if there is something
> racy about it, but I am coming up empty after staring at it for a
> few minutes.
> 
> Building 87446480 (connect: also update offset for features without
> values, 2021-09-18), which is an application of the patch directly on
> top of v2.33.0, and doing
> 
>     $ cd t
>     $ while sh t5704-*.sh; do :; done
> 
> I can get it fail in a dozen iterations or so when the box is
> loaded, so it does seem timing dependent.

I think the problem is that our fake upload-pack exits immediately, so
ls-remote gets SIGPIPE. In a v0 conversation, ls-remote expects to say
"0000" to indicate that it's not interested in fetching anything (in v2,
it doesn't bother, since fetching would be a separate request that it
just declines to make).

This seems to fix it:

diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
index 34538cebf0..0983c2b507 100755
--- a/t/t5704-protocol-violations.sh
+++ b/t/t5704-protocol-violations.sh
@@ -40,7 +40,7 @@ test_expect_success 'bogus symref in v0 capabilities' '
 			test-tool pkt-line pack-raw-stdin &&
 		printf "0000"
 	} >input &&
-	git ls-remote --upload-pack="cat input ;:" . >actual &&
+	git ls-remote --upload-pack="cat input; read junk;:" . >actual &&
 	printf "%s\tHEAD\n" "$oid" >expect &&
 	test_cmp expect actual
 '

-Peff
Junio C Hamano Sept. 23, 2021, 9:52 p.m. UTC | #11
Jeff King <peff@peff.net> writes:

> I think the problem is that our fake upload-pack exits immediately, so
> ls-remote gets SIGPIPE. In a v0 conversation, ls-remote expects to say
> "0000" to indicate that it's not interested in fetching anything (in v2,
> it doesn't bother, since fetching would be a separate request that it
> just declines to make).

Ah, Makes sense---the usual SIGPIPE problem ;-)

> This seems to fix it:
>
> diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
> index 34538cebf0..0983c2b507 100755
> --- a/t/t5704-protocol-violations.sh
> +++ b/t/t5704-protocol-violations.sh
> @@ -40,7 +40,7 @@ test_expect_success 'bogus symref in v0 capabilities' '
>  			test-tool pkt-line pack-raw-stdin &&
>  		printf "0000"
>  	} >input &&
> -	git ls-remote --upload-pack="cat input ;:" . >actual &&
> +	git ls-remote --upload-pack="cat input; read junk;:" . >actual &&
>  	printf "%s\tHEAD\n" "$oid" >expect &&
>  	test_cmp expect actual
>  '

Yup.  In the original thread there was some further back-and-forth
about further improving the test, if I recall correctly; has the
issue been settled there, or is everybody happy with the above
version?

Thanks.
Jeff King Sept. 23, 2021, 10:02 p.m. UTC | #12
On Thu, Sep 23, 2021 at 02:52:53PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I think the problem is that our fake upload-pack exits immediately, so
> > ls-remote gets SIGPIPE. In a v0 conversation, ls-remote expects to say
> > "0000" to indicate that it's not interested in fetching anything (in v2,
> > it doesn't bother, since fetching would be a separate request that it
> > just declines to make).
> 
> Ah, Makes sense---the usual SIGPIPE problem ;-)

Yes, though it definitely took some head-scratching for me to see where
it was. ;)

Doing: "./t5704-* --stress" made it pretty clear. It fails almost
immediately, and mentions SIGPIPE (well, exit code 141, but by now I
have that one memorized).

> > This seems to fix it:
> >
> > diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
> > index 34538cebf0..0983c2b507 100755
> > --- a/t/t5704-protocol-violations.sh
> > +++ b/t/t5704-protocol-violations.sh
> > @@ -40,7 +40,7 @@ test_expect_success 'bogus symref in v0 capabilities' '
> >  			test-tool pkt-line pack-raw-stdin &&
> >  		printf "0000"
> >  	} >input &&
> > -	git ls-remote --upload-pack="cat input ;:" . >actual &&
> > +	git ls-remote --upload-pack="cat input; read junk;:" . >actual &&
> >  	printf "%s\tHEAD\n" "$oid" >expect &&
> >  	test_cmp expect actual
> >  '
> 
> Yup.  In the original thread there was some further back-and-forth
> about further improving the test, if I recall correctly; has the
> issue been settled there, or is everybody happy with the above
> version?

I think the change I showed earlier (to use ls-remote --symref) is worth
doing. There was lots of discussion about how to format a tab, but in
the end I don't think it really matters.

So here's that patch again, with this race fix on top, which could be
squashed in, and then I hope we can call it good.

diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
index 34538cebf0..bc393d7c31 100755
--- a/t/t5704-protocol-violations.sh
+++ b/t/t5704-protocol-violations.sh
@@ -35,13 +35,15 @@ test_expect_success 'extra delim packet in v2 fetch args' '
 test_expect_success 'bogus symref in v0 capabilities' '
 	test_commit foo &&
 	oid=$(git rev-parse HEAD) &&
+	dst=refs/heads/foo &&
 	{
-		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
+		printf "%s HEAD\0symref object-format=%s symref=HEAD:%s\n" \
+			"$oid" "$GIT_DEFAULT_HASH" "$dst" |
 			test-tool pkt-line pack-raw-stdin &&
 		printf "0000"
 	} >input &&
-	git ls-remote --upload-pack="cat input ;:" . >actual &&
-	printf "%s\tHEAD\n" "$oid" >expect &&
+	git ls-remote --symref --upload-pack="cat input; read junk;:" . >actual &&
+	printf "ref: %s\tHEAD\n%s\tHEAD\n" "$dst" "$oid" >expect &&
 	test_cmp expect actual
 '
Andrzej Hunt Sept. 26, 2021, 3:14 p.m. UTC | #13
On 18/09/2021 17:53, Taylor Blau wrote:
>> parse_feature_value() does not update offset if the feature being
>> searched for does not specify a value. A loop that uses
>> parse_feature_value() to find a feature which was specified without a
>> value therefore might never exit (such loops will typically use
>> next_server_feature_value() as opposed to parse_feature_value() itself).
>> This usually isn't an issue: there's no point in using
>> next_server_feature_value() to search for repeated instances of the same
>> capability unless that capability typically specifies a value - but a
>> broken server could send a response that omits the value for a feature
>> even when we are expecting a value.
> 
> It may be worth adding a little detail here. parse_feature_value takes
> an offset, and uses it to seek past the point in features_list that
> we've already seen. But if we get a value-less feature, then offset is
> never updated, and we'll keep parsing the same thing over and over in a
> loop.
> 
> (I know that you know all of that, but I think it is worth spelling out
> a little more clearly in the patch message).

Good point - I've tried to improve this for V2 (I've mostly just copied 
your description verbatim).

> 
>> Therefore we add an offset update calculation for the no-value case,
>> which helps ensure that loops using next_server_feature_value() will
>> always terminate.
> 
>> next_server_feature_value(), and the offset calculation, were first
>> added in 2.28 in:
>>    2c6a403d96 (connect: add function to parse multiple v1 capability values, 2020-05-25)
> 
> This line wrapping is a little odd, but not a big deal.

I'll fix this for V2 - I think I tried too hard to make this look nice, 
but putting the reference inline does look better (and I've now realised 
this seems to be the usual way of doing things here).

ATB,

Andrzej
Andrzej Hunt Sept. 26, 2021, 3:16 p.m. UTC | #14
On 24/09/2021 00:02, Jeff King wrote:
> On Thu, Sep 23, 2021 at 02:52:53PM -0700, Junio C Hamano wrote:
> 
>> Jeff King <peff@peff.net> writes:
>>
>>> I think the problem is that our fake upload-pack exits immediately, so
>>> ls-remote gets SIGPIPE. In a v0 conversation, ls-remote expects to say
>>> "0000" to indicate that it's not interested in fetching anything (in v2,
>>> it doesn't bother, since fetching would be a separate request that it
>>> just declines to make).
>>
>> Ah, Makes sense---the usual SIGPIPE problem ;-)
> 
> Yes, though it definitely took some head-scratching for me to see where
> it was. ;)
> 
> Doing: "./t5704-* --stress" made it pretty clear. It fails almost
> immediately, and mentions SIGPIPE (well, exit code 141, but by now I
> have that one memorized).
> 
>>> This seems to fix it:
>>>
>>> diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
>>> index 34538cebf0..0983c2b507 100755
>>> --- a/t/t5704-protocol-violations.sh
>>> +++ b/t/t5704-protocol-violations.sh
>>> @@ -40,7 +40,7 @@ test_expect_success 'bogus symref in v0 capabilities' '
>>>   			test-tool pkt-line pack-raw-stdin &&
>>>   		printf "0000"
>>>   	} >input &&
>>> -	git ls-remote --upload-pack="cat input ;:" . >actual &&
>>> +	git ls-remote --upload-pack="cat input; read junk;:" . >actual &&
>>>   	printf "%s\tHEAD\n" "$oid" >expect &&
>>>   	test_cmp expect actual
>>>   '
>>
>> Yup.  In the original thread there was some further back-and-forth
>> about further improving the test, if I recall correctly; has the
>> issue been settled there, or is everybody happy with the above
>> version?
> 
> I think the change I showed earlier (to use ls-remote --symref) is worth
> doing. There was lots of discussion about how to format a tab, but in
> the end I don't think it really matters.
> 
> So here's that patch again, with this race fix on top, which could be
> squashed in, and then I hope we can call it good.

Thanks again for doing the actual work - I'll send out a V2 with your 
changes squashed in, along with an attempt at improving the commit 
message (as discussed in reply to Taylor's comments).

> 
> diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
> index 34538cebf0..bc393d7c31 100755
> --- a/t/t5704-protocol-violations.sh
> +++ b/t/t5704-protocol-violations.sh
> @@ -35,13 +35,15 @@ test_expect_success 'extra delim packet in v2 fetch args' '
>   test_expect_success 'bogus symref in v0 capabilities' '
>   	test_commit foo &&
>   	oid=$(git rev-parse HEAD) &&
> +	dst=refs/heads/foo &&
>   	{
> -		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
> +		printf "%s HEAD\0symref object-format=%s symref=HEAD:%s\n" \
> +			"$oid" "$GIT_DEFAULT_HASH" "$dst" |
>   			test-tool pkt-line pack-raw-stdin &&
>   		printf "0000"
>   	} >input &&
> -	git ls-remote --upload-pack="cat input ;:" . >actual &&
> -	printf "%s\tHEAD\n" "$oid" >expect &&
> +	git ls-remote --symref --upload-pack="cat input; read junk;:" . >actual &&
> +	printf "ref: %s\tHEAD\n%s\tHEAD\n" "$dst" "$oid" >expect &&
>   	test_cmp expect actual
>   '
>   
>
diff mbox series

Patch

diff --git a/connect.c b/connect.c
index aff13a270e6..eaf7d6d2618 100644
--- a/connect.c
+++ b/connect.c
@@ -557,6 +557,8 @@  const char *parse_feature_value(const char *feature_list, const char *feature, i
 			if (!*value || isspace(*value)) {
 				if (lenp)
 					*lenp = 0;
+				if (offset)
+					*offset = found + len - feature_list;
 				return value;
 			}
 			/* feature with a value (e.g., "agent=git/1.2.3") */
diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
index 5c941949b98..34538cebf01 100755
--- a/t/t5704-protocol-violations.sh
+++ b/t/t5704-protocol-violations.sh
@@ -32,4 +32,17 @@  test_expect_success 'extra delim packet in v2 fetch args' '
 	test_i18ngrep "expected flush after fetch arguments" err
 '
 
+test_expect_success 'bogus symref in v0 capabilities' '
+	test_commit foo &&
+	oid=$(git rev-parse HEAD) &&
+	{
+		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
+			test-tool pkt-line pack-raw-stdin &&
+		printf "0000"
+	} >input &&
+	git ls-remote --upload-pack="cat input ;:" . >actual &&
+	printf "%s\tHEAD\n" "$oid" >expect &&
+	test_cmp expect actual
+'
+
 test_done