Message ID | xmqqfskdieqz.fsf@gitster.g (mailing list archive) |
---|---|
State | Accepted |
Commit | 5a5ea141e7d5317cd049cb1e93b9127c1cec04bb |
Headers | show |
Series | revision: mark blobs needed for resolve-undo as reachable | expand |
On 6/9/2022 7:44 PM, Junio C Hamano wrote: > + struct string_list *resolve_undo = istate->resolve_undo; > + > + if (!resolve_undo) > + return 0; > + > + for_each_string_list_item(item, resolve_undo) { I see this is necessary since for_each_string_list_item() does not handle NULL lists. After attempting to allow it to handle NULL lists, I see that the compiler complains about the cases where it would _never_ be NULL, so that change appears to be impossible. The patch looks good. I liked the comments for the three phases of the test. Thanks, -Stolee
Derrick Stolee <derrickstolee@github.com> writes: > On 6/9/2022 7:44 PM, Junio C Hamano wrote: > >> + struct string_list *resolve_undo = istate->resolve_undo; >> + >> + if (!resolve_undo) >> + return 0; >> + >> + for_each_string_list_item(item, resolve_undo) { > > I see this is necessary since for_each_string_list_item() does not > handle NULL lists. After attempting to allow it to handle NULL > lists, I see that the compiler complains about the cases where it > would _never_ be NULL, so that change appears to be impossible. Heh, no such deep thought went into this. I just copied what is done in builtin/ls-files.c::show_ru_info() that grabs these blobs and does something interesting and the only difference is this does something else interesting. We _could_ refactor these to into "take a callback and iterate over resolve-undo information" shell plus two callback functions, one to print them and the other to mark them still reachable, but the iterator being relatively short, I doubt that it is worth it. > The patch looks good. I liked the comments for the three phases > of the test. Thanks.
On Mon, Jun 13 2022, Derrick Stolee wrote: > On 6/9/2022 7:44 PM, Junio C Hamano wrote: > >> + struct string_list *resolve_undo = istate->resolve_undo; >> + >> + if (!resolve_undo) >> + return 0; >> + >> + for_each_string_list_item(item, resolve_undo) { > > I see this is necessary since for_each_string_list_item() does > not handle NULL lists. After attempting to allow it to handle > NULL lists, I see that the compiler complains about the cases > where it would _never_ be NULL, so that change appears to be > impossible. > > The patch looks good. I liked the comments for the three phases > of the test. I think it's probably good to keep for_each_string_list_item() implemented the way it is, given that all existing callers of it feed non-NULL lists to it. But why is it impossible to make it handle NULL lists? This works for me, and passes the tests: diff --git a/builtin/fsck.c b/builtin/fsck.c index 4b17ccc3f44..4160bb50ad0 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -763,9 +763,6 @@ static int fsck_resolve_undo(struct index_state *istate) struct string_list_item *item; struct string_list *resolve_undo = istate->resolve_undo; - if (!resolve_undo) - return 0; - for_each_string_list_item(item, resolve_undo) { const char *path = item->string; struct resolve_undo_info *ru = item->util; diff --git a/string-list.h b/string-list.h index d5a744e1438..aa15aea8c2b 100644 --- a/string-list.h +++ b/string-list.h @@ -143,7 +143,7 @@ int for_each_string_list(struct string_list *list, /** Iterate over each item, as a macro. */ #define for_each_string_list_item(item,list) \ - for (item = (list)->items; \ + for (item = (((list) && (list)->items) ? ((list)->items) : NULL); \ item && item < (list)->items + (list)->nr; \ ++item) Perhaps you tried to convert it to a do/while macro with an "if", which won't work as we need the "for" to be used for the subsequent {} (or a single statement).. But under the hood this is all just syntax sugar for "goto", so if we can avoid dereferencing "list" we're good to go...
On Thu, Jun 09, 2022 at 04:44:20PM -0700, Junio C Hamano wrote: > The resolve-undo extension was added to the index in cfc5789a > (resolve-undo: record resolved conflicts in a new index extension > section, 2009-12-25). This extension records the blob object names > and their modes of conflicted paths when the path gets resolved > (e.g. with "git add"), to allow "undoing" the resolution with > "checkout -m path". These blob objects should be guarded from > garbage-collection while we have the resolve-undo information in the > index (otherwise unresolve operation may try to use a blob object > that has already been pruned away). > > But the code called from mark_reachable_objects() for the index > forgets to do so. Teach add_index_objects_to_pending() helper to > also add objects referred to by the resolve-undo extension. Nice find! > Also make matching changes to "fsck", which has code that is fairly > similar to the reachability stuff, but have parallel implementations > for all these stuff, which may (or may not) someday want to be unified. I wasn't sure what the change in fsck was when skimming the diffstat before reading your patch message, but makes sense. I'm glad that you included this, too. > +static void add_resolve_undo_to_pending(struct index_state *istate, struct rev_info *revs) > +{ > + struct string_list_item *item; > + struct string_list *resolve_undo = istate->resolve_undo; > + > + if (!resolve_undo) > + return; > + > + for_each_string_list_item(item, resolve_undo) { > + const char *path = item->string; > + struct resolve_undo_info *ru = item->util; > + int i; > + > + if (!ru) > + continue; > + for (i = 0; i < 3; i++) { > + struct blob *blob; > + > + if (!ru->mode[i] || !S_ISREG(ru->mode[i])) > + continue; > + > + blob = lookup_blob(revs->repo, &ru->oid[i]); > + if (!blob) { > + warning(_("resolve-undo records `%s` which is missing"), > + oid_to_hex(&ru->oid[i])); > + continue; > + } > + add_pending_object_with_path(revs, &blob->object, "", > + ru->mode[i], path); > + } > + } > +} This implementation looks good to my eyes. > @@ -1718,6 +1752,8 @@ static void do_add_index_objects_to_pending(struct rev_info *revs, > add_cache_tree(istate->cache_tree, revs, &path, flags); > strbuf_release(&path); > } > + > + add_resolve_undo_to_pending(istate, revs); > } Great; this fixes the bug for cruft packs, too, whose reachable pack is generated with `--indexed-objects`, so the cruft pack will no longer contain the resolve-undo objects. > +test_expect_success 'resolve-undo keeps blobs from gc' ' Very thorough. Thanks! Taylor
On 6/13/2022 8:24 PM, Ævar Arnfjörð Bjarmason wrote: > > On Mon, Jun 13 2022, Derrick Stolee wrote: > >> On 6/9/2022 7:44 PM, Junio C Hamano wrote: >> >>> + struct string_list *resolve_undo = istate->resolve_undo; >>> + >>> + if (!resolve_undo) >>> + return 0; >>> + >>> + for_each_string_list_item(item, resolve_undo) { >> >> I see this is necessary since for_each_string_list_item() does >> not handle NULL lists. After attempting to allow it to handle >> NULL lists, I see that the compiler complains about the cases >> where it would _never_ be NULL, so that change appears to be >> impossible. >> >> The patch looks good. I liked the comments for the three phases >> of the test. > > I think it's probably good to keep for_each_string_list_item() > implemented the way it is, given that all existing callers of it feed > non-NULL lists to it. We are talking right now about an example where it would be cleaner to allow a NULL value. This guarded example also exists in http.c (we would still need to guard on NULL options): /* Add additional headers here */ if (options && options->extra_headers) { const struct string_list_item *item; for_each_string_list_item(item, options->extra_headers) { headers = curl_slist_append(headers, item->string); } } These guarded examples in ref_filter_match() would be greatly simplified: if (exclude_patterns && exclude_patterns->nr) { for_each_string_list_item(item, exclude_patterns) { if (match_ref_pattern(refname, item)) return 0; } } if (include_patterns && include_patterns->nr) { for_each_string_list_item(item, include_patterns) { if (match_ref_pattern(refname, item)) return 1; } return 0; } if (exclude_patterns_config && exclude_patterns_config->nr) { for_each_string_list_item(item, exclude_patterns_config) { if (match_ref_pattern(refname, item)) return 0; } } (The include_patterns check would still be needed for that extra return 0; in the middle.) There are more examples, but I'll stop listing them here. > But why is it impossible to make it handle NULL lists? This works for > me, and passes the tests: > /** Iterate over each item, as a macro. */ > #define for_each_string_list_item(item,list) \ > - for (item = (list)->items; \ > + for (item = (((list) && (list)->items) ? ((list)->items) : NULL); \ I thinks I had something like for ((list) && item = (list)->items; (list) && item && ... but even with your suggestion, I get this compiler error: In file included from convert.h:8, from cache.h:10, from apply.c:10: apply.c: In function ‘write_out_results’: string-list.h:146:22: error: the address of ‘cpath’ will always evaluate as ‘true’ [-Werror=address] 146 | for (item = ((list) && (list)->items) ? (list)->items : NULL; \ | ^ apply.c:4652:25: note: in expansion of macro ‘for_each_string_list_item’ 4652 | for_each_string_list_item(item, &cpath) | (along with many other examples). Junio is right that we would need to convert this into a method with a function pointer instead of a for_each_* macro. That's quite a big lift for some small convenience for the callers. Thanks, -Stolee
On Tue, Jun 14, 2022 at 10:35:10AM -0400, Derrick Stolee wrote: > I thinks I had something like > > for ((list) && item = (list)->items; (list) && item && ... I wrote something like this: --- >8 --- diff --git a/string-list.h b/string-list.h index d5a744e143..425abc55f4 100644 --- a/string-list.h +++ b/string-list.h @@ -143,7 +143,7 @@ int for_each_string_list(struct string_list *list, /** Iterate over each item, as a macro. */ #define for_each_string_list_item(item,list) \ - for (item = (list)->items; \ + for (item = (list) ? (list)->items : NULL; \ item && item < (list)->items + (list)->nr; \ ++item) --- 8< --- > but even with your suggestion, I get this compiler error: ...so did I. Though I'm not sure I understand the compiler's warning here. Surely the thing being passed as list in the macro expansion _won't_ always evaluate to non-NULL, will it? Thanks, Taylor
On Tue, Jun 14, 2022 at 10:02:32PM -0400, Taylor Blau wrote: > --- >8 --- > > diff --git a/string-list.h b/string-list.h > index d5a744e143..425abc55f4 100644 > --- a/string-list.h > +++ b/string-list.h > @@ -143,7 +143,7 @@ int for_each_string_list(struct string_list *list, > > /** Iterate over each item, as a macro. */ > #define for_each_string_list_item(item,list) \ > - for (item = (list)->items; \ > + for (item = (list) ? (list)->items : NULL; \ > item && item < (list)->items + (list)->nr; \ > ++item) > > --- 8< --- > > > but even with your suggestion, I get this compiler error: > > ...so did I. Though I'm not sure I understand the compiler's warning > here. Surely the thing being passed as list in the macro expansion > _won't_ always evaluate to non-NULL, will it? In the general case, no, but in this specific expansion of the macro, it is passing the address of a local variable (&cpath), which will never be NULL. The compiler is overeager here; the check is indeed pointless in this expansion, but warning on useless macro-expanded code isn't helpful, since other macro users need it. Hiding it in a function seems to work, even with -O2 inlining, like: diff --git a/string-list.h b/string-list.h index d5a744e143..b28b135e11 100644 --- a/string-list.h +++ b/string-list.h @@ -141,9 +141,14 @@ void string_list_clear_func(struct string_list *list, string_list_clear_func_t c int for_each_string_list(struct string_list *list, string_list_each_func_t func, void *cb_data); +static inline struct string_list_item *string_list_first_item(const struct string_list *list) +{ + return list ? list->items : NULL; +} + /** Iterate over each item, as a macro. */ #define for_each_string_list_item(item,list) \ - for (item = (list)->items; \ + for (item = string_list_first_item(list); \ item && item < (list)->items + (list)->nr; \ ++item) -Peff
Derrick Stolee <derrickstolee@github.com> writes: > Junio is right that we would need to convert this into a method with a > function pointer instead of a for_each_* macro. That's quite a big lift > for some small convenience for the callers. Heh, I never said we would need to. I only said we _could_ do such a change but I do not see the need to.
On Tue, Jun 14, 2022 at 11:48:20PM -0400, Jeff King wrote: > On Tue, Jun 14, 2022 at 10:02:32PM -0400, Taylor Blau wrote: > > > --- >8 --- > > > > diff --git a/string-list.h b/string-list.h > > index d5a744e143..425abc55f4 100644 > > --- a/string-list.h > > +++ b/string-list.h > > @@ -143,7 +143,7 @@ int for_each_string_list(struct string_list *list, > > > > /** Iterate over each item, as a macro. */ > > #define for_each_string_list_item(item,list) \ > > - for (item = (list)->items; \ > > + for (item = (list) ? (list)->items : NULL; \ > > item && item < (list)->items + (list)->nr; \ > > ++item) > > > > --- 8< --- > > > > > but even with your suggestion, I get this compiler error: > > > > ...so did I. Though I'm not sure I understand the compiler's warning > > here. Surely the thing being passed as list in the macro expansion > > _won't_ always evaluate to non-NULL, will it? > > In the general case, no, but in this specific expansion of the macro, it > is passing the address of a local variable (&cpath), which will never be > NULL. The compiler is overeager here; the check is indeed pointless in > this expansion, but warning on useless macro-expanded code isn't > helpful, since other macro users need it. Ah, that makes sense. The compiler is warning us that the macro-expanded version of for_each_string_list_item() has a ternary expression that will never evaluate its right-hand side in cases where it can prove the second argument to the macro is non-NULL. > Hiding it in a function seems to work, even with -O2 inlining, like: > > diff --git a/string-list.h b/string-list.h > index d5a744e143..b28b135e11 100644 > --- a/string-list.h > +++ b/string-list.h > @@ -141,9 +141,14 @@ void string_list_clear_func(struct string_list *list, string_list_clear_func_t c > int for_each_string_list(struct string_list *list, > string_list_each_func_t func, void *cb_data); > > +static inline struct string_list_item *string_list_first_item(const struct string_list *list) > +{ > + return list ? list->items : NULL; > +} > + > /** Iterate over each item, as a macro. */ > #define for_each_string_list_item(item,list) \ > - for (item = (list)->items; \ > + for (item = string_list_first_item(list); \ > item && item < (list)->items + (list)->nr; \ > ++item) That works, nice. I don't really want to mess up the tree too much this close to a release, but this sort of clean-up seems good to do. I know Stolee identified a handful of spots that would benefit from it. Some good #leftoverbits, I guess :-). Thanks, Taylor
On Tue, Jun 14 2022, Derrick Stolee wrote: > On 6/13/2022 8:24 PM, Ævar Arnfjörð Bjarmason wrote: >> >> On Mon, Jun 13 2022, Derrick Stolee wrote: >> >>> On 6/9/2022 7:44 PM, Junio C Hamano wrote: >>> >>>> + struct string_list *resolve_undo = istate->resolve_undo; >>>> + >>>> + if (!resolve_undo) >>>> + return 0; >>>> + >>>> + for_each_string_list_item(item, resolve_undo) { >>> >>> I see this is necessary since for_each_string_list_item() does >>> not handle NULL lists. After attempting to allow it to handle >>> NULL lists, I see that the compiler complains about the cases >>> where it would _never_ be NULL, so that change appears to be >>> impossible. >>> >>> The patch looks good. I liked the comments for the three phases >>> of the test. >> >> I think it's probably good to keep for_each_string_list_item() >> implemented the way it is, given that all existing callers of it feed >> non-NULL lists to it. > > We are talking right now about an example where it would be cleaner to > allow a NULL value. First, I've read the thread and I see the compile error you ran into below, and Jeff King's downthread workaround, so we could do this, and I'm not at all opposed... > This guarded example also exists in http.c (we would still need to guard > on NULL options): > > /* Add additional headers here */ > if (options && options->extra_headers) { > const struct string_list_item *item; > for_each_string_list_item(item, options->extra_headers) { > headers = curl_slist_append(headers, item->string); > } > } I think this and probably many other examples you can find are ones where we should just stop using a maybe-NULL "struct string_list", as in the WIP (but segfaulting) patch below, but I think you get the idea. I.e. in that case we're passing an "options" struct that can be NULL, but could just have an initializer. I.e. we should probably just have a non-NULL options with sensible defaults, which would also allow for changing the adjacent "options && options->no_cache" etc. code to just "options->no_cache". > These guarded examples in ref_filter_match() would be greatly simplified: > > if (exclude_patterns && exclude_patterns->nr) { > for_each_string_list_item(item, exclude_patterns) { > if (match_ref_pattern(refname, item)) > return 0; > } > } > > if (include_patterns && include_patterns->nr) { > for_each_string_list_item(item, include_patterns) { > if (match_ref_pattern(refname, item)) > return 1; > } > return 0; > } > > if (exclude_patterns_config && exclude_patterns_config->nr) { > for_each_string_list_item(item, exclude_patterns_config) { > if (match_ref_pattern(refname, item)) > return 0; > } > } First, regardless of any bigger change, I think all of those except the middle one can drop the "nr" check. But more generally, isn't something like [2] below a better change than chasing the case of "what if it's NULL?". Very WIP of course, and I just checked if it compiled, there's probably more lurking bugs, but I think you get the idea. One commit (of mine) on "master" that goes in that direction is 0000e81811b (builtin/remote.c: add and use SHOW_INFO_INIT, 2021-10-01). > (The include_patterns check would still be needed for that extra > return 0; in the middle.) > > There are more examples, but I'll stop listing them here. Maybe there's better ones, but from my own past spelunking into "struct string_list" users I've mostly found cases like 0000e81811b and the below. I.e. sure, handling NULL was a hassle, but the underlying problem was usually *why is it NULL*. I.e. can't we just have the list be 0..N, why do we need NULL, 0..N? 1. diff --git a/http.c b/http.c index 11c6f69facd..c2fa2b78db8 100644 --- a/http.c +++ b/http.c @@ -1808,6 +1808,7 @@ static int http_request(const char *url, struct strbuf buf = STRBUF_INIT; const char *accept_language; int ret; + const struct string_list_item *item; slot = get_active_slot(); curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1); @@ -1844,12 +1845,8 @@ static int http_request(const char *url, headers = curl_slist_append(headers, buf.buf); /* Add additional headers here */ - if (options && options->extra_headers) { - const struct string_list_item *item; - for_each_string_list_item(item, options->extra_headers) { - headers = curl_slist_append(headers, item->string); - } - } + for_each_string_list_item(item, &options->extra_headers) + headers = curl_slist_append(headers, item->string); curl_easy_setopt(slot->curl, CURLOPT_URL, url); curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers); @@ -2055,7 +2052,7 @@ static char *fetch_pack_index(unsigned char *hash, const char *base_url) strbuf_addf(&buf, "%s.temp", sha1_pack_index_name(hash)); tmp = strbuf_detach(&buf, NULL); - if (http_get_file(url, tmp, NULL) != HTTP_OK) { + if (http_get_file(url, tmp, NULL /* segfaults due to this */) != HTTP_OK) { error("Unable to get pack index %s", url); FREE_AND_NULL(tmp); } diff --git a/http.h b/http.h index ba303cfb372..41596fbf443 100644 --- a/http.h +++ b/http.h @@ -144,7 +144,7 @@ struct http_get_options { * request. The strings in the list must not be freed until after the * request has completed. */ - struct string_list *extra_headers; + struct string_list extra_headers; }; /* Return values for http_get_*() */ diff --git a/remote-curl.c b/remote-curl.c index 67f178b1120..30235577487 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -449,7 +449,6 @@ static struct discovery *discover_refs(const char *service, int for_push) struct strbuf refs_url = STRBUF_INIT; struct strbuf effective_url = STRBUF_INIT; struct strbuf protocol_header = STRBUF_INIT; - struct string_list extra_headers = STRING_LIST_INIT_DUP; struct discovery *last = last_discovery; int http_ret, maybe_smart = 0; struct http_get_options http_options; @@ -478,16 +477,18 @@ static struct discovery *discover_refs(const char *service, int for_push) if (version == protocol_v2 && !strcmp("git-receive-pack", service)) version = protocol_v0; + /* antipattern, we should have an initializer for "struct http_get_options"... */ + memset(&http_options, 0, sizeof(http_options)); + string_list_init_nodup(&http_options.extra_headers); + /* Add the extra Git-Protocol header */ if (get_protocol_http_header(version, &protocol_header)) - string_list_append(&extra_headers, protocol_header.buf); + string_list_append(&http_options.extra_headers, protocol_header.buf); - memset(&http_options, 0, sizeof(http_options)); http_options.content_type = &type; http_options.charset = &charset; http_options.effective_url = &effective_url; http_options.base_url = &url; - http_options.extra_headers = &extra_headers; http_options.initial_request = 1; http_options.no_cache = 1; @@ -538,7 +539,8 @@ static struct discovery *discover_refs(const char *service, int for_push) strbuf_release(&effective_url); strbuf_release(&buffer); strbuf_release(&protocol_header); - string_list_clear(&extra_headers, 0); + /*... and a proper destructor... */ + string_list_clear(&http_options.extra_headers, 0); last_discovery = last; return last; } 2. diff --git a/builtin/log.c b/builtin/log.c index 88a5e98875a..4af1503887e 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -168,12 +168,11 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix, struct userformat_want w; int quiet = 0, source = 0, mailmap; static struct line_opt_callback_data line_cb = {NULL, NULL, STRING_LIST_INIT_DUP}; - static struct string_list decorate_refs_exclude = STRING_LIST_INIT_NODUP; - static struct string_list decorate_refs_exclude_config = STRING_LIST_INIT_NODUP; - static struct string_list decorate_refs_include = STRING_LIST_INIT_NODUP; - struct decoration_filter decoration_filter = {&decorate_refs_include, - &decorate_refs_exclude, - &decorate_refs_exclude_config}; + struct decoration_filter decoration_filter = { + .include_ref_pattern = STRING_LIST_INIT_NODUP, + .exclude_ref_pattern = STRING_LIST_INIT_NODUP, + .exclude_ref_config_pattern = STRING_LIST_INIT_NODUP, + }; static struct revision_sources revision_sources; const struct option builtin_log_options[] = { @@ -181,9 +180,9 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix, OPT_BOOL(0, "source", &source, N_("show source")), OPT_BOOL(0, "use-mailmap", &mailmap, N_("use mail map file")), OPT_ALIAS(0, "mailmap", "use-mailmap"), - OPT_STRING_LIST(0, "decorate-refs", &decorate_refs_include, + OPT_STRING_LIST(0, "decorate-refs", &decoration_filter.include_ref_pattern, N_("pattern"), N_("only decorate refs that match <pattern>")), - OPT_STRING_LIST(0, "decorate-refs-exclude", &decorate_refs_exclude, + OPT_STRING_LIST(0, "decorate-refs-exclude", &decoration_filter.exclude_ref_pattern, N_("pattern"), N_("do not decorate refs that match <pattern>")), OPT_CALLBACK_F(0, "decorate", NULL, NULL, N_("decorate options"), PARSE_OPT_OPTARG, decorate_callback), @@ -272,7 +271,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix, if (config_exclude) { struct string_list_item *item; for_each_string_list_item(item, config_exclude) - string_list_append(&decorate_refs_exclude_config, + string_list_append(&decoration_filter.exclude_ref_config_pattern, item->string); } diff --git a/log-tree.c b/log-tree.c index d0ac0a6327a..f7d475f652f 100644 --- a/log-tree.c +++ b/log-tree.c @@ -104,32 +104,23 @@ static int ref_filter_match(const char *refname, const struct decoration_filter *filter) { struct string_list_item *item; - const struct string_list *exclude_patterns = filter->exclude_ref_pattern; - const struct string_list *include_patterns = filter->include_ref_pattern; - const struct string_list *exclude_patterns_config = - filter->exclude_ref_config_pattern; - if (exclude_patterns && exclude_patterns->nr) { - for_each_string_list_item(item, exclude_patterns) { - if (match_ref_pattern(refname, item)) - return 0; - } - } + for_each_string_list_item(item, &filter->exclude_ref_pattern) + if (match_ref_pattern(refname, item)) + return 0; - if (include_patterns && include_patterns->nr) { - for_each_string_list_item(item, include_patterns) { + + if (filter->include_ref_pattern.nr) { + for_each_string_list_item(item, &filter->include_ref_pattern) { if (match_ref_pattern(refname, item)) return 1; + return 0; } - return 0; } - if (exclude_patterns_config && exclude_patterns_config->nr) { - for_each_string_list_item(item, exclude_patterns_config) { - if (match_ref_pattern(refname, item)) - return 0; - } - } + for_each_string_list_item(item, &filter->exclude_ref_config_pattern) + if (match_ref_pattern(refname, item)) + return 0; return 1; } @@ -202,13 +193,13 @@ void load_ref_decorations(struct decoration_filter *filter, int flags) if (!decoration_loaded) { if (filter) { struct string_list_item *item; - for_each_string_list_item(item, filter->exclude_ref_pattern) { + for_each_string_list_item(item, &filter->exclude_ref_pattern) { normalize_glob_ref(item, NULL, item->string); } - for_each_string_list_item(item, filter->include_ref_pattern) { + for_each_string_list_item(item, &filter->include_ref_pattern) { normalize_glob_ref(item, NULL, item->string); } - for_each_string_list_item(item, filter->exclude_ref_config_pattern) { + for_each_string_list_item(item, &filter->exclude_ref_config_pattern) { normalize_glob_ref(item, NULL, item->string); } }
On Thu, Jun 09, 2022 at 04:44:20PM -0700, Junio C Hamano wrote: > +static int fsck_resolve_undo(struct index_state *istate) > +{ > + struct string_list_item *item; > + struct string_list *resolve_undo = istate->resolve_undo; > + > + if (!resolve_undo) > + return 0; > + > + for_each_string_list_item(item, resolve_undo) { > + const char *path = item->string; > + struct resolve_undo_info *ru = item->util; > + int i; > + > + if (!ru) > + continue; > + for (i = 0; i < 3; i++) { > + struct object *obj; > + > + if (!ru->mode[i] || !S_ISREG(ru->mode[i])) > + continue; > + > + obj = parse_object(the_repository, &ru->oid[i]); parse_object() can return NULL ... > + if (!obj) { ... and here is the if statement to show an error in that case ... > + error(_("%s: invalid sha1 pointer in resolve-undo"), > + oid_to_hex(&ru->oid[i])); > + errors_found |= ERROR_REFS; > + } > + obj->flags |= USED; ... but then there is this line which might dereference that NULL pointer. Perhaps all we would need is a 'continue' at the end of that 'if (!obj)' block, or an else block for the last three statements, which should result in the same control flow? Dunno. > + fsck_put_object_name(&fsck_walk_options, &ru->oid[i], > + ":(%d):%s", i, path); > + mark_object_reachable(obj); > + } > + } > + return 0; > +}
SZEDER Gábor <szeder.dev@gmail.com> writes: >> + for (i = 0; i < 3; i++) { >> + struct object *obj; >> + >> + if (!ru->mode[i] || !S_ISREG(ru->mode[i])) >> + continue; >> + >> + obj = parse_object(the_repository, &ru->oid[i]); > > parse_object() can return NULL ... > >> + if (!obj) { > > ... and here is the if statement to show an error in that case ... > >> + error(_("%s: invalid sha1 pointer in resolve-undo"), >> + oid_to_hex(&ru->oid[i])); >> + errors_found |= ERROR_REFS; >> + } >> + obj->flags |= USED; > > ... but then there is this line which might dereference that NULL > pointer. > > Perhaps all we would need is a 'continue' at the end of that 'if > (!obj)' block, or an else block for the last three statements, which > should result in the same control flow? Dunno. Thanks for spotting. Looking at how fsck_cache_tree() and fsck_walk_tree() handles missing object, it sounds like the right approach to continue after setting the errors_found bit. >> + fsck_put_object_name(&fsck_walk_options, &ru->oid[i], >> + ":(%d):%s", i, path); >> + mark_object_reachable(obj); >> + } >> + } >> + return 0; >> +}
diff --git a/builtin/fsck.c b/builtin/fsck.c index 9e54892311..4b17ccc3f4 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -19,6 +19,7 @@ #include "decorate.h" #include "packfile.h" #include "object-store.h" +#include "resolve-undo.h" #include "run-command.h" #include "worktree.h" @@ -757,6 +758,42 @@ static int fsck_cache_tree(struct cache_tree *it) return err; } +static int fsck_resolve_undo(struct index_state *istate) +{ + struct string_list_item *item; + struct string_list *resolve_undo = istate->resolve_undo; + + if (!resolve_undo) + return 0; + + for_each_string_list_item(item, resolve_undo) { + const char *path = item->string; + struct resolve_undo_info *ru = item->util; + int i; + + if (!ru) + continue; + for (i = 0; i < 3; i++) { + struct object *obj; + + if (!ru->mode[i] || !S_ISREG(ru->mode[i])) + continue; + + obj = parse_object(the_repository, &ru->oid[i]); + if (!obj) { + error(_("%s: invalid sha1 pointer in resolve-undo"), + oid_to_hex(&ru->oid[i])); + errors_found |= ERROR_REFS; + } + obj->flags |= USED; + fsck_put_object_name(&fsck_walk_options, &ru->oid[i], + ":(%d):%s", i, path); + mark_object_reachable(obj); + } + } + return 0; +} + static void mark_object_for_connectivity(const struct object_id *oid) { struct object *obj = lookup_unknown_object(the_repository, oid); @@ -938,6 +975,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) } if (active_cache_tree) fsck_cache_tree(active_cache_tree); + fsck_resolve_undo(&the_index); } check_connectivity(); diff --git a/revision.c b/revision.c index 7d435f8048..cd78627b20 100644 --- a/revision.c +++ b/revision.c @@ -33,6 +33,7 @@ #include "bloom.h" #include "json-writer.h" #include "list-objects-filter-options.h" +#include "resolve-undo.h" volatile show_early_output_fn_t show_early_output; @@ -1690,6 +1691,39 @@ static void add_cache_tree(struct cache_tree *it, struct rev_info *revs, } +static void add_resolve_undo_to_pending(struct index_state *istate, struct rev_info *revs) +{ + struct string_list_item *item; + struct string_list *resolve_undo = istate->resolve_undo; + + if (!resolve_undo) + return; + + for_each_string_list_item(item, resolve_undo) { + const char *path = item->string; + struct resolve_undo_info *ru = item->util; + int i; + + if (!ru) + continue; + for (i = 0; i < 3; i++) { + struct blob *blob; + + if (!ru->mode[i] || !S_ISREG(ru->mode[i])) + continue; + + blob = lookup_blob(revs->repo, &ru->oid[i]); + if (!blob) { + warning(_("resolve-undo records `%s` which is missing"), + oid_to_hex(&ru->oid[i])); + continue; + } + add_pending_object_with_path(revs, &blob->object, "", + ru->mode[i], path); + } + } +} + static void do_add_index_objects_to_pending(struct rev_info *revs, struct index_state *istate, unsigned int flags) @@ -1718,6 +1752,8 @@ static void do_add_index_objects_to_pending(struct rev_info *revs, add_cache_tree(istate->cache_tree, revs, &path, flags); strbuf_release(&path); } + + add_resolve_undo_to_pending(istate, revs); } void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags) diff --git a/t/t2030-unresolve-info.sh b/t/t2030-unresolve-info.sh index f691e6d903..2d8c70b03a 100755 --- a/t/t2030-unresolve-info.sh +++ b/t/t2030-unresolve-info.sh @@ -194,4 +194,75 @@ test_expect_success 'rerere forget (add-add conflict)' ' test_i18ngrep "no remembered" actual ' +test_expect_success 'resolve-undo keeps blobs from gc' ' + git checkout -f main && + + # First make sure we do not have any cruft left in the object store + git repack -a -d && + git prune --expire=now && + git prune-packed && + git gc --prune=now && + git fsck --unreachable >cruft && + test_must_be_empty cruft && + + # Now add three otherwise unreferenced blob objects to the index + git reset --hard && + B1=$(echo "resolve undo test data 1" | git hash-object -w --stdin) && + B2=$(echo "resolve undo test data 2" | git hash-object -w --stdin) && + B3=$(echo "resolve undo test data 3" | git hash-object -w --stdin) && + git update-index --add --index-info <<-EOF && + 100644 $B1 1 frotz + 100644 $B2 2 frotz + 100644 $B3 3 frotz + EOF + + # These three blob objects are reachable (only) from the index + git fsck --unreachable >cruft && + test_must_be_empty cruft && + # and they should be protected from GC + git gc --prune=now && + git cat-file -e $B1 && + git cat-file -e $B2 && + git cat-file -e $B3 && + + # Now resolve the conflicted path + B0=$(echo "resolve undo test data 0" | git hash-object -w --stdin) && + git update-index --add --cacheinfo 100644,$B0,frotz && + + # These three blob objects are now reachable only from the resolve-undo + git fsck --unreachable >cruft && + test_must_be_empty cruft && + + # and they should survive GC + git gc --prune=now && + git cat-file -e $B0 && + git cat-file -e $B1 && + git cat-file -e $B2 && + git cat-file -e $B3 && + + # Now we switch away, which nukes resolve-undo, and + # blobs B0..B3 would become dangling. fsck should + # notice that they are now unreachable. + git checkout -f side && + git fsck --unreachable >cruft && + sort cruft >actual && + sort <<-EOF >expect && + unreachable blob $B0 + unreachable blob $B1 + unreachable blob $B2 + unreachable blob $B3 + EOF + test_cmp expect actual && + + # And they should go away when gc runs. + git gc --prune=now && + git fsck --unreachable >cruft && + test_must_be_empty cruft && + + test_must_fail git cat-file -e $B0 && + test_must_fail git cat-file -e $B1 && + test_must_fail git cat-file -e $B2 && + test_must_fail git cat-file -e $B3 +' + test_done
The resolve-undo extension was added to the index in cfc5789a (resolve-undo: record resolved conflicts in a new index extension section, 2009-12-25). This extension records the blob object names and their modes of conflicted paths when the path gets resolved (e.g. with "git add"), to allow "undoing" the resolution with "checkout -m path". These blob objects should be guarded from garbage-collection while we have the resolve-undo information in the index (otherwise unresolve operation may try to use a blob object that has already been pruned away). But the code called from mark_reachable_objects() for the index forgets to do so. Teach add_index_objects_to_pending() helper to also add objects referred to by the resolve-undo extension. Also make matching changes to "fsck", which has code that is fairly similar to the reachability stuff, but have parallel implementations for all these stuff, which may (or may not) someday want to be unified. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- builtin/fsck.c | 38 +++++++++++++++++++++ revision.c | 36 ++++++++++++++++++++ t/t2030-unresolve-info.sh | 71 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 145 insertions(+)