diff mbox series

[v2,2/2] connect, protocol: log negotiated protocol version

Message ID d138b2615c1af288627b9a2f8f7dd3d2fed184d7.1628115065.git.steadmon@google.com (mailing list archive)
State New, archived
Headers show
Series log negotiated protocol version. | expand

Commit Message

Josh Steadmon Aug. 4, 2021, 10:17 p.m. UTC
It is useful for performance monitoring and debugging purposes to know
the wire protocol used for remote operations. This may differ from the
version set in local configuration due to differences in version and/or
configuration between the server and the client. Therefore, log the
negotiated wire protocol version via trace2, for both clients and
servers.

Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 connect.c                             |  3 +++
 protocol.c                            |  3 +++
 t/t5705-session-id-in-capabilities.sh | 12 ++++++++++++
 3 files changed, 18 insertions(+)

Comments

Eric Sunshine Aug. 4, 2021, 10:28 p.m. UTC | #1
On Wed, Aug 4, 2021 at 6:17 PM Josh Steadmon <steadmon@google.com> wrote:
> It is useful for performance monitoring and debugging purposes to know
> the wire protocol used for remote operations. This may differ from the
> version set in local configuration due to differences in version and/or
> configuration between the server and the client. Therefore, log the
> negotiated wire protocol version via trace2, for both clients and
> servers.
>
> Signed-off-by: Josh Steadmon <steadmon@google.com>
> ---
> diff --git a/t/t5705-session-id-in-capabilities.sh b/t/t5705-session-id-in-capabilities.sh
> @@ -73,6 +74,17 @@ do
> +       test_expect_success "client & server log negotiated version (v${PROTO})" '
> +               test_when_finished "rm -rf local tr2-client-events tr2-server-events" &&
> +               cp -r "$LOCAL_PRISTINE" local &&
> +               GIT_TRACE2_EVENT="$(pwd)/tr2-client-events" \
> +               git -c protocol.version=$PROTO -C local fetch \
> +                       --upload-pack "GIT_TRACE2_EVENT=\"$(pwd)/tr2-server-events\" git-upload-pack" \
> +                       origin &&
> +               test "$(grep \"key\":\"negotiated-version\",\"value\":\"$PROTO\" tr2-client-events)" &&
> +               test "$(grep \"key\":\"negotiated-version\",\"value\":\"$PROTO\" tr2-server-events)"
> +       '
>  done

What are these `test` commands actually testing? Did you mean `test
-n`? Or, even better, just plain `grep` (not within a
command-substitution)?
Ævar Arnfjörð Bjarmason Aug. 4, 2021, 11:40 p.m. UTC | #2
On Wed, Aug 04 2021, Josh Steadmon wrote:

> It is useful for performance monitoring and debugging purposes to know
> the wire protocol used for remote operations. This may differ from the
> version set in local configuration due to differences in version and/or
> configuration between the server and the client. Therefore, log the
> negotiated wire protocol version via trace2, for both clients and
> servers.
>
> Signed-off-by: Josh Steadmon <steadmon@google.com>
> ---

I know Taylor asked you to change it to a string from in int upthread in
<YQmxSxTswHE/gTet@nand.local>, but I really don't see the point. But am
willing to be convinced otherwise.

It seems to me that both of these codepaths will never usefully use this
new "UNKNOWN_VERSION" string you added, i.e.:

>  connect.c                             |  3 +++
>  protocol.c                            |  3 +++
>  t/t5705-session-id-in-capabilities.sh | 12 ++++++++++++
>  3 files changed, 18 insertions(+)
>
> diff --git a/connect.c b/connect.c
> index 70b13389ba..5f0e113625 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -150,6 +150,9 @@ enum protocol_version discover_version(struct packet_reader *reader)
>  		break;
>  	}
>  
> +	trace2_data_string("transfer", NULL, "negotiated-version",
> +			   format_protocol_version(version));

Right after this.

>  	switch (version) {
>  	case protocol_v2:
>  		process_capabilities_v2(reader);

We'll die here with BUG("unknown protocol version") if it's unknown..

> diff --git a/protocol.c b/protocol.c
> index 7ec7ce896e..f52dc2d7a2 100644
> --- a/protocol.c
> +++ b/protocol.c
> @@ -87,6 +87,9 @@ enum protocol_version determine_protocol_version_server(void)
>  		string_list_clear(&list, 0);
>  	}
>  
> +	trace2_data_string("transfer", NULL, "negotiated-version",
> +			   format_protocol_version(version));
> +

And this code is simply unreachable as far as logging this
"UNKNOWN_VERSION" string goes. If we did have an unknown version we'd
die right above this with:

    die("server is speaking an unknown protocol")

And if we did not have a "version " at all we'd default to protocol_v0
here, i.e. we either die already on an unknown version, or we don't log
"UNKNOWN_VERSION" at all.

>  	return version;
>  }
>  
> diff --git a/t/t5705-session-id-in-capabilities.sh b/t/t5705-session-id-in-capabilities.sh
> index f1d189d5bc..88871c59b5 100755
> --- a/t/t5705-session-id-in-capabilities.sh
> +++ b/t/t5705-session-id-in-capabilities.sh
> @@ -40,6 +40,7 @@ do
>  		test -z "$(grep \"key\":\"server-sid\" tr2-client-events)" &&
>  		test -z "$(grep \"key\":\"client-sid\" tr2-server-events)"
>  	'
> +
>  done
>  
>  test_expect_success 'enable SID advertisement' '
> @@ -73,6 +74,17 @@ do
>  		grep \"key\":\"server-sid\" tr2-client-events &&
>  		grep \"key\":\"client-sid\" tr2-server-events
>  	'
> +
> +	test_expect_success "client & server log negotiated version (v${PROTO})" '
> +		test_when_finished "rm -rf local tr2-client-events tr2-server-events" &&
> +		cp -r "$LOCAL_PRISTINE" local &&
> +		GIT_TRACE2_EVENT="$(pwd)/tr2-client-events" \
> +		git -c protocol.version=$PROTO -C local fetch \
> +			--upload-pack "GIT_TRACE2_EVENT=\"$(pwd)/tr2-server-events\" git-upload-pack" \
> +			origin &&
> +		test "$(grep \"key\":\"negotiated-version\",\"value\":\"$PROTO\" tr2-client-events)" &&
> +		test "$(grep \"key\":\"negotiated-version\",\"value\":\"$PROTO\" tr2-server-events)"
> +	'
>  done
>  
>  test_done

So given the above I think you can come up with trace2 output where we
log "UNKNOWN_VERSION", it just seems rather useless. We'll hit a BUG()
anyway, which we also trace2 log. In terms of anyone who collect logs
surely they'll first care about logged BUG(), and second about any
version aggregation involved in such a BUG(), and it's not a big deal if
the report of versions doesn't include the "UNKNOWN VERSION" to go with
such a one-off bug.

But perhaps you and Taylor really do have a use-case for this, hence the
"willing to be convinced otherwise". I suspect the desire to log
"<unknown>" came from an assumption that we did so in any recoverable
non-BUG() case, which we won't do.
Taylor Blau Aug. 5, 2021, 1:26 a.m. UTC | #3
On Thu, Aug 05, 2021 at 01:40:51AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Wed, Aug 04 2021, Josh Steadmon wrote:
>
> > It is useful for performance monitoring and debugging purposes to know
> > the wire protocol used for remote operations. This may differ from the
> > version set in local configuration due to differences in version and/or
> > configuration between the server and the client. Therefore, log the
> > negotiated wire protocol version via trace2, for both clients and
> > servers.
> >
> > Signed-off-by: Josh Steadmon <steadmon@google.com>
> > ---
>
> I know Taylor asked you to change it to a string from in int upthread in
> <YQmxSxTswHE/gTet@nand.local>, but I really don't see the point. But am
> willing to be convinced otherwise.

The conversion to log a string instead of an integer is necessary if
Josh wants to write "<unknown>" instead of -1. To me, that seemed
clearer, and I like that it makes the trace2 representation for a
protocol version separate from the protocol_version enum.

> It seems to me that both of these codepaths will never usefully use this
> new "UNKNOWN_VERSION" string you added, i.e.:
>
> >  connect.c                             |  3 +++
> >  protocol.c                            |  3 +++
> >  t/t5705-session-id-in-capabilities.sh | 12 ++++++++++++
> >  3 files changed, 18 insertions(+)
> >
> > diff --git a/connect.c b/connect.c
> > index 70b13389ba..5f0e113625 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -150,6 +150,9 @@ enum protocol_version discover_version(struct packet_reader *reader)
> >  		break;
> >  	}
> >
> > +	trace2_data_string("transfer", NULL, "negotiated-version",
> > +			   format_protocol_version(version));
>
> Right after this.
>
> >  	switch (version) {
> >  	case protocol_v2:
> >  		process_capabilities_v2(reader);
>
> We'll die here with BUG("unknown protocol version") if it's unknown..

Good eyes. In fact, the second switch statement shouldn't even need a
case-arm for protocol_unknown_version (but has it to cover all
enumerated values).

I didn't realize before that the unknown case really is dead code, so
we'll never log "<unknown>". And since the mapping from protocol_version
to string is identical for known values, we could probably do without
it.

And I don't much care either way. I think the benefit is really pretty
slim, and arguably my code is just adding unnecessary overhead. So I'm
happy to go with or without it, but I'd be rather sad to spend much more
of our collective time discussing it.

Thanks,
Taylor
Ævar Arnfjörð Bjarmason Aug. 5, 2021, 2:47 a.m. UTC | #4
On Wed, Aug 04 2021, Taylor Blau wrote:

> On Thu, Aug 05, 2021 at 01:40:51AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Wed, Aug 04 2021, Josh Steadmon wrote:
>>
>> > It is useful for performance monitoring and debugging purposes to know
>> > the wire protocol used for remote operations. This may differ from the
>> > version set in local configuration due to differences in version and/or
>> > configuration between the server and the client. Therefore, log the
>> > negotiated wire protocol version via trace2, for both clients and
>> > servers.
>> >
>> > Signed-off-by: Josh Steadmon <steadmon@google.com>
>> > ---
>>
>> I know Taylor asked you to change it to a string from in int upthread in
>> <YQmxSxTswHE/gTet@nand.local>, but I really don't see the point. But am
>> willing to be convinced otherwise.
>
> The conversion to log a string instead of an integer is necessary if
> Josh wants to write "<unknown>" instead of -1. To me, that seemed
> clearer, and I like that it makes the trace2 representation for a
> protocol version separate from the protocol_version enum.

Yes, having a magic -1 value would be bad, but since it seems we'll
never get it in practice...

>> It seems to me that both of these codepaths will never usefully use this
>> new "UNKNOWN_VERSION" string you added, i.e.:
>>
>> >  connect.c                             |  3 +++
>> >  protocol.c                            |  3 +++
>> >  t/t5705-session-id-in-capabilities.sh | 12 ++++++++++++
>> >  3 files changed, 18 insertions(+)
>> >
>> > diff --git a/connect.c b/connect.c
>> > index 70b13389ba..5f0e113625 100644
>> > --- a/connect.c
>> > +++ b/connect.c
>> > @@ -150,6 +150,9 @@ enum protocol_version discover_version(struct packet_reader *reader)
>> >  		break;
>> >  	}
>> >
>> > +	trace2_data_string("transfer", NULL, "negotiated-version",
>> > +			   format_protocol_version(version));
>>
>> Right after this.
>>
>> >  	switch (version) {
>> >  	case protocol_v2:
>> >  		process_capabilities_v2(reader);
>>
>> We'll die here with BUG("unknown protocol version") if it's unknown..
>
> Good eyes. In fact, the second switch statement shouldn't even need a
> case-arm for protocol_unknown_version (but has it to cover all
> enumerated values).

I didn't check if crafting an unknown version will be found earlier, or
if we'll actually reach that "unknown" case.

> I didn't realize before that the unknown case really is dead code, so
> we'll never log "<unknown>". And since the mapping from protocol_version
> to string is identical for known values, we could probably do without
> it.
>
> And I don't much care either way. I think the benefit is really pretty
> slim, and arguably my code is just adding unnecessary overhead. So I'm
> happy to go with or without it, but I'd be rather sad to spend much more
> of our collective time discussing it.

Yeah, I just think if we can be sure it's an integer *and* a valid
version when we log it, people writing future log summarizing code will
thank us, i.e. just 0, 1, 2, and in the future maybe 3, ..., but not -1
or "<unknown>" or other values we'll trust die() etc. to handle.
Josh Steadmon Aug. 6, 2021, 9:15 p.m. UTC | #5
On 2021.08.04 18:28, Eric Sunshine wrote:
> On Wed, Aug 4, 2021 at 6:17 PM Josh Steadmon <steadmon@google.com> wrote:
> > It is useful for performance monitoring and debugging purposes to know
> > the wire protocol used for remote operations. This may differ from the
> > version set in local configuration due to differences in version and/or
> > configuration between the server and the client. Therefore, log the
> > negotiated wire protocol version via trace2, for both clients and
> > servers.
> >
> > Signed-off-by: Josh Steadmon <steadmon@google.com>
> > ---
> > diff --git a/t/t5705-session-id-in-capabilities.sh b/t/t5705-session-id-in-capabilities.sh
> > @@ -73,6 +74,17 @@ do
> > +       test_expect_success "client & server log negotiated version (v${PROTO})" '
> > +               test_when_finished "rm -rf local tr2-client-events tr2-server-events" &&
> > +               cp -r "$LOCAL_PRISTINE" local &&
> > +               GIT_TRACE2_EVENT="$(pwd)/tr2-client-events" \
> > +               git -c protocol.version=$PROTO -C local fetch \
> > +                       --upload-pack "GIT_TRACE2_EVENT=\"$(pwd)/tr2-server-events\" git-upload-pack" \
> > +                       origin &&
> > +               test "$(grep \"key\":\"negotiated-version\",\"value\":\"$PROTO\" tr2-client-events)" &&
> > +               test "$(grep \"key\":\"negotiated-version\",\"value\":\"$PROTO\" tr2-server-events)"
> > +       '
> >  done
> 
> What are these `test` commands actually testing? Did you mean `test
> -n`? Or, even better, just plain `grep` (not within a
> command-substitution)?

Yes, sorry about that, just plain grep is best here. This was due to a
quick copy & paste modification, I should have spent a bit more time
thinking about the test case.

Will be fixed in V3.
Josh Steadmon Aug. 6, 2021, 9:22 p.m. UTC | #6
On 2021.08.05 04:47, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Aug 04 2021, Taylor Blau wrote:
> 
> > I didn't realize before that the unknown case really is dead code, so
> > we'll never log "<unknown>". And since the mapping from protocol_version
> > to string is identical for known values, we could probably do without
> > it.
> >
> > And I don't much care either way. I think the benefit is really pretty
> > slim, and arguably my code is just adding unnecessary overhead. So I'm
> > happy to go with or without it, but I'd be rather sad to spend much more
> > of our collective time discussing it.
> 
> Yeah, I just think if we can be sure it's an integer *and* a valid
> version when we log it, people writing future log summarizing code will
> thank us, i.e. just 0, 1, 2, and in the future maybe 3, ..., but not -1
> or "<unknown>" or other values we'll trust die() etc. to handle.

Sounds good, in V3 I will switch back to logging the enum value
directly, and will make sure we don't log anything if the version is
unknown. Thanks both for the discussion!
diff mbox series

Patch

diff --git a/connect.c b/connect.c
index 70b13389ba..5f0e113625 100644
--- a/connect.c
+++ b/connect.c
@@ -150,6 +150,9 @@  enum protocol_version discover_version(struct packet_reader *reader)
 		break;
 	}
 
+	trace2_data_string("transfer", NULL, "negotiated-version",
+			   format_protocol_version(version));
+
 	switch (version) {
 	case protocol_v2:
 		process_capabilities_v2(reader);
diff --git a/protocol.c b/protocol.c
index 7ec7ce896e..f52dc2d7a2 100644
--- a/protocol.c
+++ b/protocol.c
@@ -87,6 +87,9 @@  enum protocol_version determine_protocol_version_server(void)
 		string_list_clear(&list, 0);
 	}
 
+	trace2_data_string("transfer", NULL, "negotiated-version",
+			   format_protocol_version(version));
+
 	return version;
 }
 
diff --git a/t/t5705-session-id-in-capabilities.sh b/t/t5705-session-id-in-capabilities.sh
index f1d189d5bc..88871c59b5 100755
--- a/t/t5705-session-id-in-capabilities.sh
+++ b/t/t5705-session-id-in-capabilities.sh
@@ -40,6 +40,7 @@  do
 		test -z "$(grep \"key\":\"server-sid\" tr2-client-events)" &&
 		test -z "$(grep \"key\":\"client-sid\" tr2-server-events)"
 	'
+
 done
 
 test_expect_success 'enable SID advertisement' '
@@ -73,6 +74,17 @@  do
 		grep \"key\":\"server-sid\" tr2-client-events &&
 		grep \"key\":\"client-sid\" tr2-server-events
 	'
+
+	test_expect_success "client & server log negotiated version (v${PROTO})" '
+		test_when_finished "rm -rf local tr2-client-events tr2-server-events" &&
+		cp -r "$LOCAL_PRISTINE" local &&
+		GIT_TRACE2_EVENT="$(pwd)/tr2-client-events" \
+		git -c protocol.version=$PROTO -C local fetch \
+			--upload-pack "GIT_TRACE2_EVENT=\"$(pwd)/tr2-server-events\" git-upload-pack" \
+			origin &&
+		test "$(grep \"key\":\"negotiated-version\",\"value\":\"$PROTO\" tr2-client-events)" &&
+		test "$(grep \"key\":\"negotiated-version\",\"value\":\"$PROTO\" tr2-server-events)"
+	'
 done
 
 test_done