Message ID | dc46d39611df4ebd90d9308364d887e638c1bc30.1601044119.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | CMake and Visual Studio | expand |
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > When compiling Git in Visual C, we do not have the luxury of > support for `typeof()`, and therefore `OFFSETOF_VAR()` unfortunately > has to fall back to pointer arithmetic. Sigh. Short of changing the signature of hashmap_put_entry() and friends to also take the type of these variables, I do not see any kosher way to reimplement the users of OFFSETOF_VAR() to help compilers without typeof() offhand. As a one-time annotation, the unfortunate noise we see in this patch may be tolerable, but what may make this approach unsustainable is that average programmers would not know, without compiling with that particular compiler, if their new variable that points at a hash_entry needs to have an oterwise unnecessary initialization. Also the variables that are left uninitialized by this patch may later require such an initialization. Of course, it does not help that this workaround relies on an undefined behaviour, as you pointed out. > When compiling code using the `hashmap_for_each_entry()` macro in Debug > mode, this leads to the "run-time check failure #3" because the variable > passed as `var` are not initialized, yet we calculate the pointer > difference `&(var->member)-var`. Whoa, wait. If it is just that macro, can we perhaps do something like the attached patch? > > This "run-time check failure" causes a scary dialog to pop up. > > Work around this by initializing the respective variables. > > Note: according to the C standard, performing pointer arithmetic > with `NULL` is not exactly well-defined, but it seems to work > here, and it is at least better than performing pointer arithmetic > with an uninitialized pointer. hashmap.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hashmap.h b/hashmap.h index ef220de4c6..49cd8a8e92 100644 --- a/hashmap.h +++ b/hashmap.h @@ -449,7 +449,7 @@ static inline struct hashmap_entry *hashmap_iter_first(struct hashmap *map, * containing a @member which is a "struct hashmap_entry" */ #define hashmap_for_each_entry(map, iter, var, member) \ - for (var = hashmap_iter_first_entry_offset(map, iter, \ + for (var = NULL, var = hashmap_iter_first_entry_offset(map, iter, \ OFFSETOF_VAR(var, member)); \ var; \ var = hashmap_iter_next_entry_offset(iter, \
Junio C Hamano <gitster@pobox.com> writes: > Whoa, wait. If it is just that macro, can we perhaps do something > like the attached patch? I looked at all the uses of OFFSETOF_VAR() and I think the one used for hashmap_for_each_entry() is the only instance that 'var' given to it can legitimately be uninitialized, if typeof() were available. Here are the findings. #define hashmap_put_entry(map, keyvar, member) \ container_of_or_null_offset(hashmap_put(map, &(keyvar)->member), \ OFFSETOF_VAR(keyvar, member)) The keyvar is a pointer to the entry being placed in the map; it must hold a valid one so the pointer-diff implementation of OFFSETOF_VAR() should work fine, or we are putting garbage in to the map. #define hashmap_remove_entry(map, keyvar, member, keydata) \ container_of_or_null_offset( \ hashmap_remove(map, &(keyvar)->member, keydata), \ OFFSETOF_VAR(keyvar, member)) The keyvar is used to match against an existing entry in the map to be removed---it must have a valid value. #define hashmap_for_each_entry(map, iter, var, member) \ for (var = hashmap_iter_first_entry_offset(map, iter, \ OFFSETOF_VAR(var, member)); \ var; \ var = hashmap_iter_next_entry_offset(iter, \ OFFSETOF_VAR(var, member))) This, as you discovered, can be fed an uninitialized var and the first thing it does is to use OFFSETOF_VAR() on it in order to call hashmap_iter_first_entry_offset(). After that, i.e. when we called that function to start the loop, var is defined and we would be OK. The trick I suggested is to initialize var to NULL before making the call to hashmap_iter_first_entry_offset(), i.e. for (var = NULL, \ var = hashmap_iter_first_entry_offset(map, iter, \ OFFSETOF_VAR(var, member)); \ #define hashmap_get_entry(map, keyvar, member, keydata) \ container_of_or_null_offset( \ hashmap_get(map, &(keyvar)->member, keydata), \ OFFSETOF_VAR(keyvar, member)) Must be OK for the same reason _put_entry() is OK. #define hashmap_get_next_entry(map, var, member) \ container_of_or_null_offset(hashmap_get_next(map, &(var)->member), \ OFFSETOF_VAR(var, member)) This tries to go to the next-equal-pointer starting from var, so var must be valid already. So, perhaps the attached may be a viable replacement that would be more futureproof with less maintenance cost, I suspect. Thanks. --- >8 ----- cut here ----- >8 --- Subject: hashmap_for_each_entry(): workaround MSVC's runtime check failure #3 The OFFSETOF_VAR(var, member) macro is implemented in terms of offsetof(typeof(*var), member) with compilers that know typeof(), but its fallback implemenation compares &(var->member) and (var) and count the distance in bytes, i.e. ((uintptr_t)&(var)->member - (uintptr_t)(var)) MSVC's runtime check, when fed an uninitialized 'var', flags this as a use of an uninitialized variable (and that is legit---uninitialized contents of 'var' is subtracted) in a debug build. After auditing all 6 uses of OFFSETOF_VAR(), 1 of them does feed a potentially uninitialized 'var' to the macro in the beginning of the for() loop: #define hashmap_for_each_entry(map, iter, var, member) \ for (var = hashmap_iter_first_entry_offset(map, iter, \ OFFSETOF_VAR(var, member)); \ var; \ var = hashmap_iter_next_entry_offset(iter, \ OFFSETOF_VAR(var, member))) We can work around this by making sure that var has _some_ value when OFFSETOF_VAR() is called. Strictly speaking, it invites undefined behaviour to use NULL here if we end up with pointer comparison, but MSVC runtime seems to be happy with it, and most other systems have typeof() and don't even need pointer comparison fallback code. --- hashmap.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git c/hashmap.h w/hashmap.h index ef220de4c6..b011b394fe 100644 --- c/hashmap.h +++ w/hashmap.h @@ -449,7 +449,8 @@ static inline struct hashmap_entry *hashmap_iter_first(struct hashmap *map, * containing a @member which is a "struct hashmap_entry" */ #define hashmap_for_each_entry(map, iter, var, member) \ - for (var = hashmap_iter_first_entry_offset(map, iter, \ + for (var = NULL, /* for systems without typeof */ \ + var = hashmap_iter_first_entry_offset(map, iter, \ OFFSETOF_VAR(var, member)); \ var; \ var = hashmap_iter_next_entry_offset(iter, \
Hi Junio, On Sat, 26 Sep 2020, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > Whoa, wait. If it is just that macro, can we perhaps do something > > like the attached patch? > > I looked at all the uses of OFFSETOF_VAR() and I think the one used > for hashmap_for_each_entry() is the only instance that 'var' given > to it can legitimately be uninitialized, if typeof() were available. Thank you for doing all that leg work. TBH I didn't even think about looking further, after having run a couple tests manually that I thought were exhaustive in exercising this type of code pattern. > Here are the findings. > > #define hashmap_put_entry(map, keyvar, member) \ > container_of_or_null_offset(hashmap_put(map, &(keyvar)->member), \ > OFFSETOF_VAR(keyvar, member)) > > The keyvar is a pointer to the entry being placed in the map; it > must hold a valid one so the pointer-diff implementation of > OFFSETOF_VAR() should work fine, or we are putting garbage in to the > map. > > #define hashmap_remove_entry(map, keyvar, member, keydata) \ > container_of_or_null_offset( \ > hashmap_remove(map, &(keyvar)->member, keydata), \ > OFFSETOF_VAR(keyvar, member)) > > The keyvar is used to match against an existing entry in the map to > be removed---it must have a valid value. > > #define hashmap_for_each_entry(map, iter, var, member) \ > for (var = hashmap_iter_first_entry_offset(map, iter, \ > OFFSETOF_VAR(var, member)); \ > var; \ > var = hashmap_iter_next_entry_offset(iter, \ > OFFSETOF_VAR(var, member))) > > This, as you discovered, can be fed an uninitialized var and the > first thing it does is to use OFFSETOF_VAR() on it in order to call > hashmap_iter_first_entry_offset(). After that, i.e. when we called > that function to start the loop, var is defined and we would be OK. > > The trick I suggested is to initialize var to NULL before making the > call to hashmap_iter_first_entry_offset(), i.e. > > for (var = NULL, \ > var = hashmap_iter_first_entry_offset(map, iter, \ > OFFSETOF_VAR(var, member)); \ > > #define hashmap_get_entry(map, keyvar, member, keydata) \ > container_of_or_null_offset( \ > hashmap_get(map, &(keyvar)->member, keydata), \ > OFFSETOF_VAR(keyvar, member)) > > Must be OK for the same reason _put_entry() is OK. > > #define hashmap_get_next_entry(map, var, member) \ > container_of_or_null_offset(hashmap_get_next(map, &(var)->member), \ > OFFSETOF_VAR(var, member)) > > This tries to go to the next-equal-pointer starting from var, so var > must be valid already. > > So, perhaps the attached may be a viable replacement that would be > more futureproof with less maintenance cost, I suspect. Definitely much nicer to maintain, and easier to verify. In my hands, this works better than my manual touch-ups of _all_ the call sites. So I replaced my patch with yours (adding your SOB). Ciao, Dscho > Thanks. > > --- >8 ----- cut here ----- >8 --- > Subject: hashmap_for_each_entry(): workaround MSVC's runtime check failure #3 > > The OFFSETOF_VAR(var, member) macro is implemented in terms of > offsetof(typeof(*var), member) with compilers that know typeof(), > but its fallback implemenation compares &(var->member) and (var) and > count the distance in bytes, i.e. > > ((uintptr_t)&(var)->member - (uintptr_t)(var)) > > MSVC's runtime check, when fed an uninitialized 'var', flags this as > a use of an uninitialized variable (and that is legit---uninitialized > contents of 'var' is subtracted) in a debug build. > > After auditing all 6 uses of OFFSETOF_VAR(), 1 of them does feed a > potentially uninitialized 'var' to the macro in the beginning of the > for() loop: > > #define hashmap_for_each_entry(map, iter, var, member) \ > for (var = hashmap_iter_first_entry_offset(map, iter, \ > OFFSETOF_VAR(var, member)); \ > var; \ > var = hashmap_iter_next_entry_offset(iter, \ > OFFSETOF_VAR(var, member))) > > We can work around this by making sure that var has _some_ value > when OFFSETOF_VAR() is called. Strictly speaking, it invites > undefined behaviour to use NULL here if we end up with pointer > comparison, but MSVC runtime seems to be happy with it, and most > other systems have typeof() and don't even need pointer comparison > fallback code. > > --- > hashmap.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git c/hashmap.h w/hashmap.h > index ef220de4c6..b011b394fe 100644 > --- c/hashmap.h > +++ w/hashmap.h > @@ -449,7 +449,8 @@ static inline struct hashmap_entry *hashmap_iter_first(struct hashmap *map, > * containing a @member which is a "struct hashmap_entry" > */ > #define hashmap_for_each_entry(map, iter, var, member) \ > - for (var = hashmap_iter_first_entry_offset(map, iter, \ > + for (var = NULL, /* for systems without typeof */ \ > + var = hashmap_iter_first_entry_offset(map, iter, \ > OFFSETOF_VAR(var, member)); \ > var; \ > var = hashmap_iter_next_entry_offset(iter, \ > >
diff --git a/attr.c b/attr.c index a826b2ef1f..b4fde37877 100644 --- a/attr.c +++ b/attr.c @@ -160,7 +160,7 @@ static void all_attrs_init(struct attr_hashmap *map, struct attr_check *check) * field and fill each entry with its corresponding git_attr. */ if (size != check->all_attrs_nr) { - struct attr_hash_entry *e; + struct attr_hash_entry *e = NULL; struct hashmap_iter iter; REALLOC_ARRAY(check->all_attrs, size); diff --git a/blame.c b/blame.c index b475bfa1c0..e9879a772e 100644 --- a/blame.c +++ b/blame.c @@ -450,7 +450,7 @@ static int fingerprint_similarity(struct fingerprint *a, struct fingerprint *b) { int intersection = 0; struct hashmap_iter iter; - const struct fingerprint_entry *entry_a, *entry_b; + const struct fingerprint_entry *entry_a, *entry_b = NULL; hashmap_for_each_entry(&b->map, &iter, entry_b, entry /* member name */) { @@ -469,7 +469,7 @@ static void fingerprint_subtract(struct fingerprint *a, struct fingerprint *b) { struct hashmap_iter iter; struct fingerprint_entry *entry_a; - const struct fingerprint_entry *entry_b; + const struct fingerprint_entry *entry_b = NULL; hashmap_iter_init(&b->map, &iter); diff --git a/bloom.c b/bloom.c index 1a573226e7..ee45e9ccce 100644 --- a/bloom.c +++ b/bloom.c @@ -221,7 +221,7 @@ struct bloom_filter *get_bloom_filter(struct repository *r, if (diffopt.num_changes <= max_changes) { struct hashmap pathmap; - struct pathmap_hash_entry *e; + struct pathmap_hash_entry *e = NULL; struct hashmap_iter iter; hashmap_init(&pathmap, pathmap_cmp, NULL, 0); diff --git a/builtin/describe.c b/builtin/describe.c index 7668591d57..8b281cf426 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -332,7 +332,7 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst) if (!have_util) { struct hashmap_iter iter; struct commit *c; - struct commit_name *n; + struct commit_name *n = NULL; init_commit_names(&commit_names); hashmap_for_each_entry(&names, &iter, n, diff --git a/builtin/difftool.c b/builtin/difftool.c index 7ac432b881..a1527ea01c 100644 --- a/builtin/difftool.c +++ b/builtin/difftool.c @@ -344,7 +344,7 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, FILE *fp; struct hashmap working_tree_dups, submodules, symlinks2; struct hashmap_iter iter; - struct pair_entry *entry; + struct pair_entry *entry = NULL; struct index_state wtindex; struct checkout lstate, rstate; int rc, flags = RUN_GIT_CMD, err = 0; diff --git a/builtin/fast-import.c b/builtin/fast-import.c index 1bf50a73dc..72154383c3 100644 --- a/builtin/fast-import.c +++ b/builtin/fast-import.c @@ -498,7 +498,7 @@ static void invalidate_pack_id(unsigned int id) unsigned long lu; struct tag *t; struct hashmap_iter iter; - struct object_entry *e; + struct object_entry *e = NULL; hashmap_for_each_entry(&object_table, &iter, e, ent) { if (e->pack_id == id) diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 4003f4d13a..fcd87da036 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -67,7 +67,7 @@ static int sparse_checkout_list(int argc, const char **argv) if (pl.use_cone_patterns) { int i; - struct pattern_entry *pe; + struct pattern_entry *pe = NULL; struct hashmap_iter iter; struct string_list sl = STRING_LIST_INIT_DUP; @@ -153,7 +153,7 @@ static char *escaped_pattern(char *pattern) static void write_cone_to_file(FILE *fp, struct pattern_list *pl) { int i; - struct pattern_entry *pe; + struct pattern_entry *pe = NULL; struct hashmap_iter iter; struct string_list sl = STRING_LIST_INIT_DUP; struct strbuf parent_pattern = STRBUF_INIT; @@ -465,7 +465,7 @@ static void add_patterns_cone_mode(int argc, const char **argv, struct pattern_list *pl) { struct strbuf buffer = STRBUF_INIT; - struct pattern_entry *pe; + struct pattern_entry *pe = NULL; struct hashmap_iter iter; struct pattern_list existing; char *sparse_filename = get_sparse_checkout_filename(); diff --git a/config.c b/config.c index 2bdff4457b..83c72dd6e6 100644 --- a/config.c +++ b/config.c @@ -1953,7 +1953,7 @@ void git_configset_init(struct config_set *cs) void git_configset_clear(struct config_set *cs) { - struct config_set_element *entry; + struct config_set_element *entry = NULL; struct hashmap_iter iter; if (!cs->hash_initialized) return; diff --git a/merge-recursive.c b/merge-recursive.c index d0214335a7..11ea550b0d 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -2151,8 +2151,8 @@ static void handle_directory_level_conflicts(struct merge_options *opt, struct tree *merge) { struct hashmap_iter iter; - struct dir_rename_entry *head_ent; - struct dir_rename_entry *merge_ent; + struct dir_rename_entry *head_ent = NULL; + struct dir_rename_entry *merge_ent = NULL; struct string_list remove_from_head = STRING_LIST_INIT_NODUP; struct string_list remove_from_merge = STRING_LIST_INIT_NODUP; @@ -2221,7 +2221,7 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs) { struct hashmap *dir_renames; struct hashmap_iter iter; - struct dir_rename_entry *entry; + struct dir_rename_entry *entry = NULL; int i; /* @@ -2590,7 +2590,7 @@ static struct string_list *get_renames(struct merge_options *opt, int i; struct hashmap collisions; struct hashmap_iter iter; - struct collision_entry *e; + struct collision_entry *e = NULL; struct string_list *renames; compute_collisions(&collisions, dir_renames, pairs); @@ -2862,7 +2862,7 @@ static void initial_cleanup_rename(struct diff_queue_struct *pairs, struct hashmap *dir_renames) { struct hashmap_iter iter; - struct dir_rename_entry *e; + struct dir_rename_entry *e = NULL; hashmap_for_each_entry(dir_renames, &iter, e, ent /* member name */) { diff --git a/name-hash.c b/name-hash.c index fb526a3775..a3f710b2f8 100644 --- a/name-hash.c +++ b/name-hash.c @@ -706,7 +706,7 @@ void adjust_dirname_case(struct index_state *istate, char *name) struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int icase) { - struct cache_entry *ce; + struct cache_entry *ce = NULL; unsigned int hash = memihash(name, namelen); lazy_init_name_hash(istate); diff --git a/revision.c b/revision.c index 067030e64c..232de3f6f5 100644 --- a/revision.c +++ b/revision.c @@ -132,7 +132,7 @@ static void paths_and_oids_init(struct hashmap *map) static void paths_and_oids_clear(struct hashmap *map) { struct hashmap_iter iter; - struct path_and_oids_entry *entry; + struct path_and_oids_entry *entry = NULL; hashmap_for_each_entry(map, &iter, entry, ent /* member name */) { oidset_clear(&entry->trees); @@ -215,7 +215,7 @@ void mark_trees_uninteresting_sparse(struct repository *r, unsigned has_interesting = 0, has_uninteresting = 0; struct hashmap map; struct hashmap_iter map_iter; - struct path_and_oids_entry *entry; + struct path_and_oids_entry *entry = NULL; struct object_id *oid; struct oidset_iter iter; diff --git a/submodule-config.c b/submodule-config.c index c569e22aa3..662b9d9c09 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -89,7 +89,7 @@ static void free_one_config(struct submodule_entry *entry) static void submodule_cache_clear(struct submodule_cache *cache) { struct hashmap_iter iter; - struct submodule_entry *entry; + struct submodule_entry *entry = NULL; if (!cache->initialized) return; diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c index f38706216f..2bde90309b 100644 --- a/t/helper/test-hashmap.c +++ b/t/helper/test-hashmap.c @@ -162,7 +162,7 @@ int cmd__hashmap(int argc, const char **argv) while (strbuf_getline(&line, stdin) != EOF) { char *cmd, *p1 = NULL, *p2 = NULL; unsigned int hash = 0; - struct test_entry *entry; + struct test_entry *entry = NULL; /* break line into command and up to two parameters */ cmd = strtok(line.buf, DELIM); diff --git a/t/helper/test-lazy-init-name-hash.c b/t/helper/test-lazy-init-name-hash.c index cd1b4c9736..2cb1fa3d8c 100644 --- a/t/helper/test-lazy-init-name-hash.c +++ b/t/helper/test-lazy-init-name-hash.c @@ -29,8 +29,8 @@ static void dump_run(void) char name[FLEX_ARRAY]; }; - struct dir_entry *dir; - struct cache_entry *ce; + struct dir_entry *dir = NULL; + struct cache_entry *ce = NULL; read_cache(); if (single) {