Message ID | 20230627195251.1973421-2-calvinwan@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Introduce Git Standard Library | expand |
Calvin Wan wrote: > As a library boundary, wrapper.c should not directly log trace2 > statistics, but instead provide those statistics upon > request. Therefore, move the trace2 logging code to trace2.[ch.]. This > also allows wrapper.c to not be dependent on trace2.h and repository.h. > ... > diff --git a/trace2.h b/trace2.h > index 4ced30c0db..689e9a4027 100644 > --- a/trace2.h > +++ b/trace2.h > @@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason); > > const char *trace2_session_id(void); > > +/* > + * Writes out trace statistics for fsync > + */ > +void trace_git_fsync_stats(void); > + This function does not belong in 'trace2.h', IMO. The purpose of that file is to contain the generic API for Trace2 (e.g., 'trace2_printf()', 'trace2_region_(enter|exit)'), whereas this function is effectively a wrapper around a specific invocation of that API. You note in the commit message that "wrapper.c should not directly log trace2 statistics" with the reasoning of "[it's] a library boundary," but I suspect the unstated underlying reason is "because it tracks 'count_fsync_*' in static variables." This case would be better handled, then, by replacing the usage in 'wrapper.c' with a new Trace2 counter (API introduced in [1]). That keeps this usage consistent with the API already established for Trace2, rather than starting an unsustainable trend of creating ad-hoc, per-metric wrappers in 'trace2.[c|h]'. An added note re: the commit message - it's extremely important that functions _anywhere in Git_ are able to use the Trace2 API directly. A developer could reasonably want to measure performance, keep track of an interesting metric, log when a region is entered in the larger trace, capture error information, etc. for any function, regardless of where in falls in the internal library organization. To that end, I think either the commit message should be rephrased to remove that statement (if the issue is really "we're using a static variable and we want to avoid that"), or the libification effort should be updated to accommodate use of Trace2 anywhere in Git. [1] https://lore.kernel.org/git/pull.1373.v4.git.1666618868.gitgitgadget@gmail.com/
> This function does not belong in 'trace2.h', IMO. The purpose of that file > is to contain the generic API for Trace2 (e.g., 'trace2_printf()', > 'trace2_region_(enter|exit)'), whereas this function is effectively a > wrapper around a specific invocation of that API. > > You note in the commit message that "wrapper.c should not directly log > trace2 statistics" with the reasoning of "[it's] a library boundary," but I > suspect the unstated underlying reason is "because it tracks 'count_fsync_*' > in static variables." This case would be better handled, then, by replacing > the usage in 'wrapper.c' with a new Trace2 counter (API introduced in [1]). > That keeps this usage consistent with the API already established for > Trace2, rather than starting an unsustainable trend of creating ad-hoc, > per-metric wrappers in 'trace2.[c|h]'. The underlying reason is for removing the trace2 dependency from wrapper.c so that when git-std-lib is compiled, there isn't a missing object for trace_git_fsync_stats(), resulting in a compilation error. However I do agree that the method I chose to do so by creating an ad-hoc wrapper is unsustainable and I will come up with a better method for doing so. > > An added note re: the commit message - it's extremely important that > functions _anywhere in Git_ are able to use the Trace2 API directly. A > developer could reasonably want to measure performance, keep track of an > interesting metric, log when a region is entered in the larger trace, > capture error information, etc. for any function, regardless of where in > falls in the internal library organization. I don't quite agree that functions _anywhere in Git_ are able to use the Trace2 API directly for the same reason that we don't have the ability to log functions in external libraries -- logging common, low-level functionality creates an unnecessary amount of log churn and those logs generally contain practically useless information. However, that does not mean that all of the functions in git-std-lib fall into that category (usage has certain functions definitely worth logging). This means that files like usage.c could instead be separated into its own library and git-std-lib would only contain files that we deem "should never be logged". > To that end, I think either the > commit message should be rephrased to remove that statement (if the issue is > really "we're using a static variable and we want to avoid that"), or the > libification effort should be updated to accommodate use of Trace2 anywhere > in Git. Besides potentially redrawing the boundaries of git-std-lib to accommodate Trace2, we're also looking into the possibility of stubbing out tracing in git-std-lib so that it and other libraries can be built and tested, and then when Trace2 is turned into a library, it's full functionality can be linked to.
Calvin Wan wrote: >> An added note re: the commit message - it's extremely important that >> functions _anywhere in Git_ are able to use the Trace2 API directly. A >> developer could reasonably want to measure performance, keep track of an >> interesting metric, log when a region is entered in the larger trace, >> capture error information, etc. for any function, regardless of where in >> falls in the internal library organization. > > I don't quite agree that functions _anywhere in Git_ are able to use > the Trace2 API directly for the same reason that we don't have the > ability to log functions in external libraries -- logging common, > low-level functionality creates an unnecessary amount of log churn and > those logs generally contain practically useless information. That may be true in your use cases, but it isn't in mine and may not be for others'. In fact, I was just using these exact fsync metrics a couple weeks ago to do some performance analysis; I could easily imagine doing something similar for another "low level" function. It's unreasonable - and unfair to future development - to make an absolute declaration about "what's useful vs. useless" and use that decision to justify severely limiting our future flexibility on the matter. > However, > that does not mean that all of the functions in git-std-lib fall into > that category (usage has certain functions definitely worth logging). > This means that files like usage.c could instead be separated into its > own library and git-std-lib would only contain files that we deem > "should never be logged". How do you make that determination? What about if/when someone realizes, somewhere down the line, that one of those "should never be logged" files would actually benefit from some aggregate metric, e.g. a Trace2 timer? This isn't a case of extracting an extraneous dependency (where a function really doesn't _need_ something it has access to); tracing & logging is a core functionality in Git, and should not be artificially constrained in the name of organization. >> To that end, I think either the >> commit message should be rephrased to remove that statement (if the issue is >> really "we're using a static variable and we want to avoid that"), or the >> libification effort should be updated to accommodate use of Trace2 anywhere >> in Git. > > Besides potentially redrawing the boundaries of git-std-lib to > accommodate Trace2, we're also looking into the possibility of > stubbing out tracing in git-std-lib so that it and other libraries can > be built and tested, and then when Trace2 is turned into a library, > it's full functionality can be linked to. If that allows you to meet your libification goals without limiting Trace2's accessibility throughout the codebase, that works for me.
On 6/27/23 3:52 PM, Calvin Wan wrote: > As a library boundary, wrapper.c should not directly log trace2 > statistics, but instead provide those statistics upon > request. Therefore, move the trace2 logging code to trace2.[ch.]. This > also allows wrapper.c to not be dependent on trace2.h and repository.h. > > Signed-off-by: Calvin Wan <calvinwan@google.com> > --- > trace2.c | 13 +++++++++++++ > trace2.h | 5 +++++ > wrapper.c | 17 ++++++----------- > wrapper.h | 4 ++-- > 4 files changed, 26 insertions(+), 13 deletions(-) > > diff --git a/trace2.c b/trace2.c > index 0efc4e7b95..f367a1ce31 100644 > --- a/trace2.c > +++ b/trace2.c > @@ -915,3 +915,16 @@ const char *trace2_session_id(void) > { > return tr2_sid_get(); > } > + > +static void log_trace_fsync_if(const char *key) > +{ > + intmax_t value = get_trace_git_fsync_stats(key); > + if (value) > + trace2_data_intmax("fsync", the_repository, key, value); > +} > + > +void trace_git_fsync_stats(void) > +{ > + log_trace_fsync_if("fsync/writeout-only"); > + log_trace_fsync_if("fsync/hardware-flush"); > +} > diff --git a/trace2.h b/trace2.h > index 4ced30c0db..689e9a4027 100644 > --- a/trace2.h > +++ b/trace2.h > @@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason); > > const char *trace2_session_id(void); > > +/* > + * Writes out trace statistics for fsync > + */ > +void trace_git_fsync_stats(void); > + > #endif /* TRACE2_H */ Sorry to be late to this party, but none of this belongs in trace2.[ch]. As Victoria stated, you can/should use the new "timers and counters" feature in Trace2 to collect and log these stats. And then you don't need specific log_trace_* functions or wrappers -- just use the trace2_timer_start()/stop() or trace2_counter_add() functions as necessary around the various fsync operations. I haven't really followed the lib-ification effort, so I'm just going to GUESS that all of the Trace2_ and tr2_ prefixed functions and data structures will need to be in the lowest-level .a so that it can be called from the main .exe and any other .a's between them. Jeff
diff --git a/trace2.c b/trace2.c index 0efc4e7b95..f367a1ce31 100644 --- a/trace2.c +++ b/trace2.c @@ -915,3 +915,16 @@ const char *trace2_session_id(void) { return tr2_sid_get(); } + +static void log_trace_fsync_if(const char *key) +{ + intmax_t value = get_trace_git_fsync_stats(key); + if (value) + trace2_data_intmax("fsync", the_repository, key, value); +} + +void trace_git_fsync_stats(void) +{ + log_trace_fsync_if("fsync/writeout-only"); + log_trace_fsync_if("fsync/hardware-flush"); +} diff --git a/trace2.h b/trace2.h index 4ced30c0db..689e9a4027 100644 --- a/trace2.h +++ b/trace2.h @@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason); const char *trace2_session_id(void); +/* + * Writes out trace statistics for fsync + */ +void trace_git_fsync_stats(void); + #endif /* TRACE2_H */ diff --git a/wrapper.c b/wrapper.c index 22be9812a7..bd7f0a9752 100644 --- a/wrapper.c +++ b/wrapper.c @@ -6,9 +6,7 @@ #include "config.h" #include "gettext.h" #include "object.h" -#include "repository.h" #include "strbuf.h" -#include "trace2.h" static intmax_t count_fsync_writeout_only; static intmax_t count_fsync_hardware_flush; @@ -600,16 +598,13 @@ int git_fsync(int fd, enum fsync_action action) } } -static void log_trace_fsync_if(const char *key, intmax_t value) +intmax_t get_trace_git_fsync_stats(const char *key) { - if (value) - trace2_data_intmax("fsync", the_repository, key, value); -} - -void trace_git_fsync_stats(void) -{ - log_trace_fsync_if("fsync/writeout-only", count_fsync_writeout_only); - log_trace_fsync_if("fsync/hardware-flush", count_fsync_hardware_flush); + if (!strcmp(key, "fsync/writeout-only")) + return count_fsync_writeout_only; + if (!strcmp(key, "fsync/hardware-flush")) + return count_fsync_hardware_flush; + return 0; } static int warn_if_unremovable(const char *op, const char *file, int rc) diff --git a/wrapper.h b/wrapper.h index c85b1328d1..db1bc109ed 100644 --- a/wrapper.h +++ b/wrapper.h @@ -88,9 +88,9 @@ enum fsync_action { int git_fsync(int fd, enum fsync_action action); /* - * Writes out trace statistics for fsync using the trace2 API. + * Returns trace statistics for fsync using the trace2 API. */ -void trace_git_fsync_stats(void); +intmax_t get_trace_git_fsync_stats(const char *key); /* * Preserves errno, prints a message, but gives no warning for ENOENT.
As a library boundary, wrapper.c should not directly log trace2 statistics, but instead provide those statistics upon request. Therefore, move the trace2 logging code to trace2.[ch.]. This also allows wrapper.c to not be dependent on trace2.h and repository.h. Signed-off-by: Calvin Wan <calvinwan@google.com> --- trace2.c | 13 +++++++++++++ trace2.h | 5 +++++ wrapper.c | 17 ++++++----------- wrapper.h | 4 ++-- 4 files changed, 26 insertions(+), 13 deletions(-)