diff mbox series

[v2] fetch-pack: write effective filter to trace2

Message ID 20220718170027.3993042-1-jonathantanmy@google.com (mailing list archive)
State Superseded
Headers show
Series [v2] fetch-pack: write effective filter to trace2 | expand

Commit Message

Jonathan Tan July 18, 2022, 5 p.m. UTC
Administrators of a managed Git environment (like the one at $DAYJOB)
might want to quantify the performance change of fetches with and
without filters from the client's point of view, and also detect if a
server does not support it. Therefore, log the filter information being
sent to the server whenever a fetch (or clone) occurs. Note that this is
not necessarily the same as what's specified on the CLI, because during
a fetch, the configured filter is used whenever a filter is not
specified on the CLI.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
Thanks everyone for your comments. I've renamed the keys and
distinguished the case in which the server does not support filters
following Jeff's suggestions.

A few small discussion points:
 - whether we need the print_verbose now that we have traces
 - should we log if no filter is specified
 - if yes, what key should it be logged under (I used "filter/none" for
   now)
---
 fetch-pack.c | 40 ++++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 14 deletions(-)

Comments

Junio C Hamano July 18, 2022, 7:47 p.m. UTC | #1
Jonathan Tan <jonathantanmy@google.com> writes:

> +static void write_and_trace_filter(struct fetch_pack_args *args,
> +				   struct strbuf *req_buf,
> +				   int server_supports_filter)
> +{
> +	if (args->filter_options.choice) {
> +		const char *spec =
> +			expand_list_objects_filter_spec(&args->filter_options);
> +		if (server_supports_filter) {
> +			print_verbose(args, _("Server supports filter"));
> +			packet_buf_write(req_buf, "filter %s", spec);
> +			trace2_data_string("fetch", the_repository,
> +					   "filter/effective", spec);
> +		} else {
> +			warning("filtering not recognized by server, ignoring");
> +			trace2_data_string("fetch", the_repository,
> +					   "filter/unsupported", spec);
> +		}
> +	} else {
> +		trace2_data_string("fetch", the_repository,
> +				   "filter/none", "");
> +	}
> +}

The previous round already had the same issue, but this makes it
even worse by introducing a function that makes it clear that it
mixes quite unrelated two features (i.e. write the filter to the
other end, which MUST be done for correct operation of the protocol,
and write a log to trace2, which may not be even necessary when we
are not logging at all).

Can we make the code a bit better structured, I have to wonder, by
having two separate phases of processing, one for the operation
proper, and the other for the logging/debugging?

In a sense, we can say that the only purpose this helper function is
to tell the server end what the filter we use is by renaming it; it
is OK to have debugging statements and logging code as part of the
implementation of such a function.

I actually like that direction better.  A helper function may exist
*ONLY* to trace, in which case, having "trace" in its name would
make perfect sense.  A helper function may exist to perform the real
work, but it may log what it did to perform the real work as well.
I think the latter shouldn't have "trace" in its name at all, or our
helpers will all be called do_FOO_and_trace(), do_BAR_and_debug(),
etc., which is nonsense.  Just calling do_FOO() and do_BAR(), and
then having them log or trace as needed, is fine.

Will queue, anyway.

Thanks.
Junio C Hamano July 25, 2022, 6:56 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> +static void write_and_trace_filter(struct fetch_pack_args *args,
>> +				   struct strbuf *req_buf,
>> +				   int server_supports_filter)
>> +{
>> +...
>> +}
>
> The previous round already had the same issue, but this makes it
> even worse by introducing a function that makes it clear that it
> mixes quite unrelated two features (i.e. write the filter to the
> other end, which MUST be done for correct operation of the protocol,
> and write a log to trace2, which may not be even necessary when we
> are not logging at all).
> ...
> In a sense, we can say that the only purpose this helper function is
> to tell the server end what the filter we use is by renaming it; it
> is OK to have debugging statements and logging code as part of the
> implementation of such a function.
>
> I actually like that direction better.  A helper function may exist
> *ONLY* to trace, in which case, having "trace" in its name would
> make perfect sense.  A helper function may exist to perform the real
> work, but it may log what it did to perform the real work as well.
> I think the latter shouldn't have "trace" in its name at all, or our
> helpers will all be called do_FOO_and_trace(), do_BAR_and_debug(),
> etc., which is nonsense.  Just calling do_FOO() and do_BAR(), and
> then having them log or trace as needed, is fine.

After waiting for a week, I still haven't seen any correction to
this patch, but do you want to give the helper function a bit more
sensible name in an updated patch, perhaps say "send_filter()" or
something?

Otherwise the topic looked good.

Thanks.
Jonathan Tan July 26, 2022, 4:28 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:
> After waiting for a week, I still haven't seen any correction to
> this patch, but do you want to give the helper function a bit more
> sensible name in an updated patch, perhaps say "send_filter()" or
> something?
> 
> Otherwise the topic looked good.
> 
> Thanks.

Thanks. Your proposed name looks good. I've sent out a v3.
diff mbox series

Patch

diff --git a/fetch-pack.c b/fetch-pack.c
index cb6647d657..68820f9a1a 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -292,6 +292,29 @@  static void mark_tips(struct fetch_negotiator *negotiator,
 	return;
 }
 
+static void write_and_trace_filter(struct fetch_pack_args *args,
+				   struct strbuf *req_buf,
+				   int server_supports_filter)
+{
+	if (args->filter_options.choice) {
+		const char *spec =
+			expand_list_objects_filter_spec(&args->filter_options);
+		if (server_supports_filter) {
+			print_verbose(args, _("Server supports filter"));
+			packet_buf_write(req_buf, "filter %s", spec);
+			trace2_data_string("fetch", the_repository,
+					   "filter/effective", spec);
+		} else {
+			warning("filtering not recognized by server, ignoring");
+			trace2_data_string("fetch", the_repository,
+					   "filter/unsupported", spec);
+		}
+	} else {
+		trace2_data_string("fetch", the_repository,
+				   "filter/none", "");
+	}
+}
+
 static int find_common(struct fetch_negotiator *negotiator,
 		       struct fetch_pack_args *args,
 		       int fd[2], struct object_id *result_oid,
@@ -389,11 +412,7 @@  static int find_common(struct fetch_negotiator *negotiator,
 			packet_buf_write(&req_buf, "deepen-not %s", s->string);
 		}
 	}
-	if (server_supports_filtering && args->filter_options.choice) {
-		const char *spec =
-			expand_list_objects_filter_spec(&args->filter_options);
-		packet_buf_write(&req_buf, "filter %s", spec);
-	}
+	write_and_trace_filter(args, &req_buf, server_supports_filtering);
 	packet_buf_flush(&req_buf);
 	state_len = req_buf.len;
 
@@ -1323,15 +1342,8 @@  static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
 		die(_("Server does not support shallow requests"));
 
 	/* Add filter */
-	if (server_supports_feature("fetch", "filter", 0) &&
-	    args->filter_options.choice) {
-		const char *spec =
-			expand_list_objects_filter_spec(&args->filter_options);
-		print_verbose(args, _("Server supports filter"));
-		packet_buf_write(&req_buf, "filter %s", spec);
-	} else if (args->filter_options.choice) {
-		warning("filtering not recognized by server, ignoring");
-	}
+	write_and_trace_filter(args, &req_buf,
+			       server_supports_feature("fetch", "filter", 0));
 
 	if (server_supports_feature("fetch", "packfile-uris", 0)) {
 		int i;