Message ID | c19c1dcc592186d8da2857692f4ebbfe35adfac0.1706664145.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Enrich Trailer API | expand |
"Linus Arver via GitGitGadget" <gitgitgadget@gmail.com> writes: > Make this struct private by putting its definition inside trailer.c. > This has two benefits: > > (1) it makes the surface area of the public facing > interface (trailer.h) smaller, and > > (2) external API users are unable to peer inside this struct (because > it is only ever exposed as an opaque pointer). At the cost of an extra pointer dereference every time the member of the struct is accessed, plus the cost of allocation/deallocation. Which may not be a huge deal, but I wonder if the approach to name the member of the outer struct with "private" that seems to be used in other parts of the code (e.g. the .private_size member in the hashmap structure, the .refs_private member in the repository structure) or even a documented convention (e.g. raw_object_store), might be more appropriate here. If Coccinelle works well (which we may be having some trouble with --- cf. <xmqq1q9ybsnl.fsf@gitster.g>), we should be able to catch external accesses without having to hide the implementation details via an extra pointer dereference. > @@ -176,11 +176,12 @@ static void interpret_trailers(const struct process_trailer_options *opts, > strbuf_release(&trailer_block); > > free_trailers(&head); > - trailer_info_release(&info); > > /* Print the lines after the trailers as is */ > if (!opts->only_trailers) > - fwrite(sb.buf + info.trailer_block_end, 1, sb.len - info.trailer_block_end, outfile); > + fwrite(sb.buf + trailer_block_end(info), 1, sb.len - trailer_block_end(info), outfile); > + > + trailer_info_release(info); Interesting. Is this an indenendent bugfix even if we decided not to take this patch? No, I have not fully decided to be negative on the move this entire patch makes (even though I am leaning towards saying so). Just hypothetically, even if we wanted to keep "info" here as a structure and not a pointer to an opaque structure, doesn't this hunk fix a real bug? Well, technically, not quite, because the members referenced in that if (.only_trailers) block are still live in the info struct. But it still smells wrong to access info.* after calling _release() on it, and this fix should come before "info" is turned from an instance to a pointer, I would say. > diff --git a/trailer.h b/trailer.h > index a7599067acc..e19ddf84e64 100644 > --- a/trailer.h > +++ b/trailer.h > @@ -4,6 +4,8 @@ > #include "list.h" > #include "strbuf.h" > > +struct trailer_info; > + > enum trailer_where { > WHERE_DEFAULT, > WHERE_END, > @@ -29,27 +31,6 @@ int trailer_set_where(enum trailer_where *item, const char *value); > int trailer_set_if_exists(enum trailer_if_exists *item, const char *value); > int trailer_set_if_missing(enum trailer_if_missing *item, const char *value); > > +size_t trailer_block_start(struct trailer_info *info); > +size_t trailer_block_end(struct trailer_info *info); > +int blank_line_before_trailer_block(struct trailer_info *info); And we need new accessors, which is a good change regardless of the answer to the "do we really want an extra pointer dereference? Shouldn't the existing 'private' and 'internal' label we see below sufficient?" question. > @@ -142,7 +123,7 @@ struct trailer_iterator { > > /* private */ > struct { > - struct trailer_info info; > + struct trailer_info *info; > size_t cur; > } internal; > };
Junio C Hamano <gitster@pobox.com> writes: > "Linus Arver via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> Make this struct private by putting its definition inside trailer.c. >> This has two benefits: >> >> (1) it makes the surface area of the public facing >> interface (trailer.h) smaller, and >> >> (2) external API users are unable to peer inside this struct (because >> it is only ever exposed as an opaque pointer). > > At the cost of an extra pointer dereference every time the member of > the struct is accessed, plus the cost of allocation/deallocation. Added to commit message for v4, thanks. > Which may not be a huge deal, but I wonder if the approach to name > the member of the outer struct with "private" that seems to be used > in other parts of the code (e.g. the .private_size member in the > hashmap structure, the .refs_private member in the repository > structure) or even a documented convention (e.g. raw_object_store), > might be more appropriate here. Having gotten my hands dirty with the pimpl idiom, I think possibly the best thing about it is that the compiler can tell me exactly where every Git subsystem outside of trailer.c is accessing members inside the (newly private) struct. So it's great for checking existing usage of how things are used already, day-to-day. Not being well-versed in shortlog or sequencer internals, these warnings from the compiler for these functions (trying to peer through the opaque pointer) have been very informative. I think it would be reasonable to drop the idiom if the additional performance costs (pointer dereference, allocation/deallocation) are too much to bear. For the trailer subsystem I don't see an immediate need to focus on performance though, so I think it's fine for now. > If Coccinelle works well (which we > may be having some trouble with --- cf. <xmqq1q9ybsnl.fsf@gitster.g>), > we should be able to catch external accesses without having to hide > the implementation details via an extra pointer dereference. That's true. But that would only work for trailer API users inside the Git codebase. Eventually I envision projects external to Git using <trailer.h>, and in that scenario we would _really_ not want to let these external users peek inside structs with members named "private" or "internal" in name only. Murphy's law suggests those external users will (naughtily) depend on the internals, and that would place an undue burden on us if we ever wanted to make large changes to these (public) structs because we might break them. Another thing to consider is that if our API users ever want to get more flexibility out of it, they would immediately come to us if we used the pimpl idiom (our users are bound by exactly how we define the API). If the structs were public, then they would be free to implement new functionality to extend the API on their own, without ever letting us know. If there are folks making small (unofficial) extensions to the API, I want that to happen _in_ this project, not outside. In summary what I'm saying is that this idiom gives very strong guidance for desining an API for both Git-internal and external-to-Git usage. I think for at least the trailer subsystem it's worth trying out. I've pointed out some of the same points in the cover letter also. >> @@ -176,11 +176,12 @@ static void interpret_trailers(const struct process_trailer_options *opts, >> strbuf_release(&trailer_block); >> >> free_trailers(&head); >> - trailer_info_release(&info); >> >> /* Print the lines after the trailers as is */ >> if (!opts->only_trailers) >> - fwrite(sb.buf + info.trailer_block_end, 1, sb.len - info.trailer_block_end, outfile); >> + fwrite(sb.buf + trailer_block_end(info), 1, sb.len - trailer_block_end(info), outfile); >> + >> + trailer_info_release(info); > > Interesting. Is this an indenendent bugfix even if we decided not > to take this patch? Nice catch! > No, I have not fully decided to be negative on > the move this entire patch makes (even though I am leaning towards > saying so). Just hypothetically, even if we wanted to keep "info" > here as a structure and not a pointer to an opaque structure, > doesn't this hunk fix a real bug? > > Well, technically, not quite, because the members referenced in that > if (.only_trailers) block are still live in the info struct. But it > still smells wrong to access info.* after calling _release() on it, > and this fix should come before "info" is turned from an instance to > a pointer, I would say. I've promoted this change into a bugfix patch as the very first patch for the next reroll. Thanks. >> diff --git a/trailer.h b/trailer.h >> index a7599067acc..e19ddf84e64 100644 >> --- a/trailer.h >> +++ b/trailer.h >> @@ -4,6 +4,8 @@ >> #include "list.h" >> #include "strbuf.h" >> >> +struct trailer_info; >> + >> enum trailer_where { >> WHERE_DEFAULT, >> WHERE_END, >> @@ -29,27 +31,6 @@ int trailer_set_where(enum trailer_where *item, const char *value); >> int trailer_set_if_exists(enum trailer_if_exists *item, const char *value); >> int trailer_set_if_missing(enum trailer_if_missing *item, const char *value); >> >> +size_t trailer_block_start(struct trailer_info *info); >> +size_t trailer_block_end(struct trailer_info *info); >> +int blank_line_before_trailer_block(struct trailer_info *info); > > And we need new accessors, which is a good change regardless of the > answer to the "do we really want an extra pointer dereference? > Shouldn't the existing 'private' and 'internal' label we see below > sufficient?" question. I am very surprised with your response here, because I was expecting the complete opposite reaction from reviewers --- something like Hmph, we have to write accessor functions now when before we could just reach inside the struct directly? Isn't this just adding needless verbosity? (perhaps with more dissatisfaction). Something tells me you meant "bad change" but accidentally wrote "good change". Am I correct?
Linus Arver <linusa@google.com> writes: >>> +size_t trailer_block_start(struct trailer_info *info); >>> +size_t trailer_block_end(struct trailer_info *info); >>> +int blank_line_before_trailer_block(struct trailer_info *info); >> >> And we need new accessors, which is a good change regardless of the >> answer to the "do we really want an extra pointer dereference? >> Shouldn't the existing 'private' and 'internal' label we see below >> sufficient?" question. > > I am very surprised with your response here, because I was expecting the > complete opposite reaction from reviewers --- something like > > Hmph, we have to write accessor functions now when before we could > just reach inside the struct directly? Isn't this just adding > needless verbosity? > > (perhaps with more dissatisfaction). Something tells me you meant "bad > change" but accidentally wrote "good change". Am I correct? I often make typoes and screw up my double negations, but not this time. While I still suspect that the extra secrecy afforded by using a pointer to an opaque structure is something unnecessary between friends, as coding convention, it would be a good change to introduce accessors and have them used by the API users, i.e. code outside the API implementation. You can still "git grep" for references to the ".trailer_private" (or whatever the "private" members are named by convention) member outside the trailer "client code" to see if there are people who are too intimate than they should be that way, as they should all be using the published accessors.
diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c index 5352ee65bd1..9e6ed6b65e2 100644 --- a/builtin/interpret-trailers.c +++ b/builtin/interpret-trailers.c @@ -141,7 +141,7 @@ static void interpret_trailers(const struct process_trailer_options *opts, LIST_HEAD(head); struct strbuf sb = STRBUF_INIT; struct strbuf trailer_block = STRBUF_INIT; - struct trailer_info info; + struct trailer_info *info; FILE *outfile = stdout; trailer_config_init(); @@ -151,13 +151,13 @@ static void interpret_trailers(const struct process_trailer_options *opts, if (opts->in_place) outfile = create_in_place_tempfile(file); - parse_trailers(opts, &info, sb.buf, &head); + info = parse_trailers(opts, sb.buf, &head); /* Print the lines before the trailers */ if (!opts->only_trailers) - fwrite(sb.buf, 1, info.trailer_block_start, outfile); + fwrite(sb.buf, 1, trailer_block_start(info), outfile); - if (!opts->only_trailers && !info.blank_line_before_trailer) + if (!opts->only_trailers && !blank_line_before_trailer_block(info)) fprintf(outfile, "\n"); @@ -176,11 +176,12 @@ static void interpret_trailers(const struct process_trailer_options *opts, strbuf_release(&trailer_block); free_trailers(&head); - trailer_info_release(&info); /* Print the lines after the trailers as is */ if (!opts->only_trailers) - fwrite(sb.buf + info.trailer_block_end, 1, sb.len - info.trailer_block_end, outfile); + fwrite(sb.buf + trailer_block_end(info), 1, sb.len - trailer_block_end(info), outfile); + + trailer_info_release(info); if (opts->in_place) if (rename_tempfile(&trailers_tempfile, file)) diff --git a/trailer.c b/trailer.c index 5bcc9b0006c..63774cd068d 100644 --- a/trailer.c +++ b/trailer.c @@ -11,6 +11,27 @@ * Copyright (c) 2013, 2014 Christian Couder <chriscool@tuxfamily.org> */ +struct trailer_info { + /* + * True if there is a blank line before the location pointed to by + * trailer_block_start. + */ + int blank_line_before_trailer; + + /* + * Offsets to the trailer block start and end positions in the input + * string. If no trailer block is found, these are both set to the + * "true" end of the input (find_end_of_log_message()). + */ + size_t trailer_block_start, trailer_block_end; + + /* + * Array of trailers found. + */ + char **trailers; + size_t trailer_nr; +}; + struct conf_info { char *name; char *key; @@ -1025,20 +1046,72 @@ void format_trailers(const struct process_trailer_options *opts, } } +static struct trailer_info *trailer_info_new(void) +{ + struct trailer_info *info = xcalloc(1, sizeof(*info)); + return info; +} + +static struct trailer_info *trailer_info_get(const struct process_trailer_options *opts, + const char *str) +{ + struct trailer_info *info = trailer_info_new(); + size_t end_of_log_message = 0, trailer_block_start = 0; + struct strbuf **trailer_lines, **ptr; + char **trailer_strings = NULL; + size_t nr = 0, alloc = 0; + char **last = NULL; + + trailer_config_init(); + + end_of_log_message = find_end_of_log_message(str, opts->no_divider); + trailer_block_start = find_trailer_block_start(str, end_of_log_message); + + trailer_lines = strbuf_split_buf(str + trailer_block_start, + end_of_log_message - trailer_block_start, + '\n', + 0); + for (ptr = trailer_lines; *ptr; ptr++) { + if (last && isspace((*ptr)->buf[0])) { + struct strbuf sb = STRBUF_INIT; + strbuf_attach(&sb, *last, strlen(*last), strlen(*last)); + strbuf_addbuf(&sb, *ptr); + *last = strbuf_detach(&sb, NULL); + continue; + } + ALLOC_GROW(trailer_strings, nr + 1, alloc); + trailer_strings[nr] = strbuf_detach(*ptr, NULL); + last = find_separator(trailer_strings[nr], separators) >= 1 + ? &trailer_strings[nr] + : NULL; + nr++; + } + strbuf_list_free(trailer_lines); + + info->blank_line_before_trailer = ends_with_blank_line(str, + trailer_block_start); + info->trailer_block_start = trailer_block_start; + info->trailer_block_end = end_of_log_message; + info->trailers = trailer_strings; + info->trailer_nr = nr; + + return info; +} + /* * Parse trailers in "str", populating the trailer info and "head" * linked list structure. */ -void parse_trailers(const struct process_trailer_options *opts, - struct trailer_info *info, - const char *str, - struct list_head *head) +struct trailer_info *parse_trailers(const struct process_trailer_options *opts, + const char *str, + struct list_head *head) { + struct trailer_info *info; struct strbuf tok = STRBUF_INIT; struct strbuf val = STRBUF_INIT; size_t i; - trailer_info_get(info, str, opts); + info = trailer_info_get(opts, str); for (i = 0; i < info->trailer_nr; i++) { int separator_pos; @@ -1062,6 +1135,8 @@ void parse_trailers(const struct process_trailer_options *opts, strbuf_detach(&val, NULL)); } } + + return info; } void free_trailers(struct list_head *trailers) @@ -1073,47 +1148,19 @@ void free_trailers(struct list_head *trailers) } } -void trailer_info_get(struct trailer_info *info, const char *str, - const struct process_trailer_options *opts) +size_t trailer_block_start(struct trailer_info *info) { - size_t end_of_log_message = 0, trailer_block_start = 0; - struct strbuf **trailer_lines, **ptr; - char **trailer_strings = NULL; - size_t nr = 0, alloc = 0; - char **last = NULL; - - trailer_config_init(); - - end_of_log_message = find_end_of_log_message(str, opts->no_divider); - trailer_block_start = find_trailer_block_start(str, end_of_log_message); + return info->trailer_block_start; +} - trailer_lines = strbuf_split_buf(str + trailer_block_start, - end_of_log_message - trailer_block_start, - '\n', - 0); - for (ptr = trailer_lines; *ptr; ptr++) { - if (last && isspace((*ptr)->buf[0])) { - struct strbuf sb = STRBUF_INIT; - strbuf_attach(&sb, *last, strlen(*last), strlen(*last)); - strbuf_addbuf(&sb, *ptr); - *last = strbuf_detach(&sb, NULL); - continue; - } - ALLOC_GROW(trailer_strings, nr + 1, alloc); - trailer_strings[nr] = strbuf_detach(*ptr, NULL); - last = find_separator(trailer_strings[nr], separators) >= 1 - ? &trailer_strings[nr] - : NULL; - nr++; - } - strbuf_list_free(trailer_lines); +size_t trailer_block_end(struct trailer_info *info) +{ + return info->trailer_block_end; +} - info->blank_line_before_trailer = ends_with_blank_line(str, - trailer_block_start); - info->trailer_block_start = trailer_block_start; - info->trailer_block_end = end_of_log_message; - info->trailers = trailer_strings; - info->trailer_nr = nr; +int blank_line_before_trailer_block(struct trailer_info *info) +{ + return info->blank_line_before_trailer; } void trailer_info_release(struct trailer_info *info) @@ -1122,6 +1169,7 @@ void trailer_info_release(struct trailer_info *info) for (i = 0; i < info->trailer_nr; i++) free(info->trailers[i]); free(info->trailers); + free(info); } void format_trailers_from_commit(const struct process_trailer_options *opts, @@ -1129,30 +1177,30 @@ void format_trailers_from_commit(const struct process_trailer_options *opts, struct strbuf *out) { LIST_HEAD(head); - struct trailer_info info; - - parse_trailers(opts, &info, msg, &head); + struct trailer_info *info = parse_trailers(opts, msg, &head); /* If we want the whole block untouched, we can take the fast path. */ if (!opts->only_trailers && !opts->unfold && !opts->filter && !opts->separator && !opts->key_only && !opts->value_only && !opts->key_value_separator) { - strbuf_add(out, msg + info.trailer_block_start, - info.trailer_block_end - info.trailer_block_start); + strbuf_add(out, msg + info->trailer_block_start, + info->trailer_block_end - info->trailer_block_start); } else format_trailers(opts, &head, out); free_trailers(&head); - trailer_info_release(&info); + trailer_info_release(info); } void trailer_iterator_init(struct trailer_iterator *iter, const char *msg) { struct process_trailer_options opts = PROCESS_TRAILER_OPTIONS_INIT; + struct trailer_info *internal = trailer_info_new(); strbuf_init(&iter->key, 0); strbuf_init(&iter->val, 0); opts.no_divider = 1; - trailer_info_get(&iter->internal.info, msg, &opts); + iter->internal.info = internal; + iter->internal.info = trailer_info_get(&opts, msg); iter->internal.cur = 0; } @@ -1160,8 +1208,8 @@ int trailer_iterator_advance(struct trailer_iterator *iter) { char *line; int separator_pos; - if (iter->internal.cur < iter->internal.info.trailer_nr) { - line = iter->internal.info.trailers[iter->internal.cur++]; + if (iter->internal.cur < iter->internal.info->trailer_nr) { + line = iter->internal.info->trailers[iter->internal.cur++]; separator_pos = find_separator(line, separators); iter->is_trailer = (separator_pos > 0); @@ -1178,7 +1226,7 @@ int trailer_iterator_advance(struct trailer_iterator *iter) void trailer_iterator_release(struct trailer_iterator *iter) { - trailer_info_release(&iter->internal.info); + trailer_info_release(iter->internal.info); strbuf_release(&iter->val); strbuf_release(&iter->key); } diff --git a/trailer.h b/trailer.h index a7599067acc..e19ddf84e64 100644 --- a/trailer.h +++ b/trailer.h @@ -4,6 +4,8 @@ #include "list.h" #include "strbuf.h" +struct trailer_info; + enum trailer_where { WHERE_DEFAULT, WHERE_END, @@ -29,27 +31,6 @@ int trailer_set_where(enum trailer_where *item, const char *value); int trailer_set_if_exists(enum trailer_if_exists *item, const char *value); int trailer_set_if_missing(enum trailer_if_missing *item, const char *value); -struct trailer_info { - /* - * True if there is a blank line before the location pointed to by - * trailer_block_start. - */ - int blank_line_before_trailer; - - /* - * Offsets to the trailer block start and end positions in the input - * string. If no trailer block is found, these are both set to the - * "true" end of the input (find_end_of_log_message()). - */ - size_t trailer_block_start, trailer_block_end; - - /* - * Array of trailers found. - */ - char **trailers; - size_t trailer_nr; -}; - /* * A list that represents newly-added trailers, such as those provided * with the --trailer command line option of git-interpret-trailers. @@ -89,13 +70,13 @@ void parse_trailers_from_command_line_args(struct list_head *arg_head, void process_trailers_lists(struct list_head *head, struct list_head *arg_head); -void parse_trailers(const struct process_trailer_options *opts, - struct trailer_info *info, - const char *str, - struct list_head *head); +struct trailer_info *parse_trailers(const struct process_trailer_options *opts, + const char *str, + struct list_head *head); -void trailer_info_get(struct trailer_info *info, const char *str, - const struct process_trailer_options *opts); +size_t trailer_block_start(struct trailer_info *info); +size_t trailer_block_end(struct trailer_info *info); +int blank_line_before_trailer_block(struct trailer_info *info); void trailer_info_release(struct trailer_info *info); @@ -142,7 +123,7 @@ struct trailer_iterator { /* private */ struct { - struct trailer_info info; + struct trailer_info *info; size_t cur; } internal; };