Message ID | 33984eeaabbfbcfd4b9d3903549d8b7d6c4ced7e.1585726172.git.liu.denton@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix -Wmaybe-uninitialized warnings under -O0 | expand |
Hi Carlos, On Wed, Apr 01, 2020 at 01:01:41AM -0700, Carlo Arenas wrote: > On Wed, Apr 1, 2020 at 12:31 AM Denton Liu <liu.denton@gmail.com> wrote: > > > Nonetheless, these warnings block the compilation process when > > DEVELOPER=1 is enabled (which enables -Werror). > > > > you can also alternatively use DEVOPTS=no-error Thanks for the tip. By the way, it seems like your message was sent with HTML so it didn't hit the list. -Denton > > Carlo
On Wed, Apr 01, 2020 at 03:30:16AM -0400, Denton Liu wrote: > When compiling Git under -O0, gcc (Arch Linux 9.3.0-1) 9.3.0 produces > many -Wmaybe-uninitialized warnings. These are false positives since > when Git is compiled under -O2, gcc is smart enough to see that the > code paths that use these variables all initialise them beforehand. > Nonetheless, these warnings block the compilation process when > DEVELOPER=1 is enabled (which enables -Werror). > > Fix these warnings by initializing these variables with dummy values (0, > -1 or NULL as appropriate). Hmph. I almost always compile with -O0 and have been using gcc 9.3.0 since it was packaged for Debian a few weeks ago, but I don't see any of these warnings. The current version in Debian unstable is 9.3.0-8, which picks up some extra patches from the upstream gcc-9 branch. But even if I download a snapshot of the original 9.3.0 release, it builds fine. So why does your version behave differently? And if this is a temporary state for a buggy version of gcc (that may be fixed in the next point release), is it worth changing our source code to appease it? -Peff
Hi Peff, On Wed, Apr 01, 2020 at 05:52:55AM -0400, Jeff King wrote: > On Wed, Apr 01, 2020 at 03:30:16AM -0400, Denton Liu wrote: > > > When compiling Git under -O0, gcc (Arch Linux 9.3.0-1) 9.3.0 produces > > many -Wmaybe-uninitialized warnings. These are false positives since > > when Git is compiled under -O2, gcc is smart enough to see that the > > code paths that use these variables all initialise them beforehand. > > Nonetheless, these warnings block the compilation process when > > DEVELOPER=1 is enabled (which enables -Werror). > > > > Fix these warnings by initializing these variables with dummy values (0, > > -1 or NULL as appropriate). > > Hmph. I almost always compile with -O0 and have been using gcc 9.3.0 > since it was packaged for Debian a few weeks ago, but I don't see any of > these warnings. > > The current version in Debian unstable is 9.3.0-8, which picks up some > extra patches from the upstream gcc-9 branch. But even if I download a > snapshot of the original 9.3.0 release, it builds fine. > > So why does your version behave differently? And if this is a temporary > state for a buggy version of gcc (that may be fixed in the next point > release), is it worth changing our source code to appease it? A correction to the earlier message... It seems like I wasn't reporting the correct settings. I was actually compiling with -Og, not -O0 (whoops!). I tested it with gcc-8 and it seems like it also reports the same problem. Also, -O1 reports warnings as well. I guess the question is do we only support -O0 and -O2 or should we support the levels in between as well? Sorry for the confusion, Denton > > -Peff
On Wed, Apr 01, 2020 at 10:06:43AM -0400, Denton Liu wrote: > > So why does your version behave differently? And if this is a temporary > > state for a buggy version of gcc (that may be fixed in the next point > > release), is it worth changing our source code to appease it? > > A correction to the earlier message... It seems like I wasn't reporting > the correct settings. I was actually compiling with -Og, not -O0 > (whoops!). > > I tested it with gcc-8 and it seems like it also reports the same > problem. Also, -O1 reports warnings as well. Ah, OK, I can reproduce easily with -Og (up through gcc-10). Most of them don't trigger with -O1; just the one in ref-filter.c. That one's interesting. We have: int ret = 0; ... if (...) ... else ret = for_each_fullref_in_pattern(...); ... return ret; So we'd either have 0 or an assigned return. But the bug is actually in for_each_fullref_in_pattern(), which does this: int ret; /* uninitialized! */ /* a bunch of early return conditionals */ if (...) return ...; for_each_string_list_item(...) { ret = for_each_fullref_in(...); } return ret; but that will return an uninitialized value when there are no patterns. I doubt we have such a case, but that may explain why -O0 does not complain (it assumes "in_pattern" will return a useful value) and -O2 does not (it is able to figure out that it always does), but -O1 only inlines part of it. Curiously, -Og _does_ find the correct function. Your patch silences it, but is it doing the right thing? It sets "ret = 0", but we haven't actually iterated anything. Should it be an error instead? -Peff
On Fri, Apr 03, 2020 at 10:04:47AM -0400, Jeff King wrote: > Ah, OK, I can reproduce easily with -Og (up through gcc-10). Most of > them don't trigger with -O1; just the one in ref-filter.c. I guess I should have been more clear since the -O1 and -Og locations are different. -O1 complains in filter_refs(). By the way, that function's handling of filter->kind seems very sketchy to me. It does: int ret = 0; if (!filter->kind) die("filter_refs: invalid type"); else { /* * For common cases where we need only branches or remotes or tags, * we only iterate through those refs. If a mix of refs is needed, * we iterate over all refs and filter out required refs with the help * of filter_ref_kind(). */ if (filter->kind == FILTER_REFS_BRANCHES) ret = for_each_fullref_in("refs/heads/", ...); else if (filter->kind == FILTER_REFS_REMOTES) ret = for_each_fullref_in("refs/remotes/", ...); else if (filter->kind == FILTER_REFS_TAGS) ret = for_each_fullref_in("refs/tags/", ...); else if (filter->kind & FILTER_REFS_ALL) ret = for_each_fullref_in_pattern(filter, ...); if (!ret && (filter->kind & FILTER_REFS_DETACHED_HEAD)) head_ref(...); } So filter->kind is sometimes treated like a bitfield and sometimes not. I can set it to (ALL & DETACHED_HEAD) to get something useful, but not (BRANCHES & HEAD). The up-front check tries to complain if you didn't ask for anything, but there are other flags like INCLUDE_BROKEN that would cause "!filter->kind" to be false, but still not produce any output. And shouldn't we be checking the return value of head_ref() like the others? All of this is outside the scope of our current discussion, and untangling it might be messy (because it could touch the callers). I just wanted to document my findings for now. :) -Peff
Hi Peff, On Fri, Apr 03, 2020 at 10:04:47AM -0400, Jeff King wrote: > On Wed, Apr 01, 2020 at 10:06:43AM -0400, Denton Liu wrote: > > > > So why does your version behave differently? And if this is a temporary > > > state for a buggy version of gcc (that may be fixed in the next point > > > release), is it worth changing our source code to appease it? > > > > A correction to the earlier message... It seems like I wasn't reporting > > the correct settings. I was actually compiling with -Og, not -O0 > > (whoops!). > > > > I tested it with gcc-8 and it seems like it also reports the same > > problem. Also, -O1 reports warnings as well. > > Ah, OK, I can reproduce easily with -Og (up through gcc-10). Most of > them don't trigger with -O1; just the one in ref-filter.c. > > That one's interesting. We have: > > int ret = 0; > ... > if (...) > ... > else > ret = for_each_fullref_in_pattern(...); > ... > return ret; > > So we'd either have 0 or an assigned return. But the bug is actually in > for_each_fullref_in_pattern(), which does this: > > int ret; /* uninitialized! */ > > /* a bunch of early return conditionals */ > if (...) > return ...; > > for_each_string_list_item(...) { > ret = for_each_fullref_in(...); This loop is missing a bit of important context: if (ret) break; > } > > return ret; > > but that will return an uninitialized value when there are no patterns. > I doubt we have such a case, but that may explain why -O0 does not > complain (it assumes "in_pattern" will return a useful value) and -O2 > does not (it is able to figure out that it always does), but -O1 only > inlines part of it. > > Curiously, -Og _does_ find the correct function. > > Your patch silences it, but is it doing the right thing? It sets "ret = > 0", but we haven't actually iterated anything. Should it be an error > instead? I understood the semantics of for_each_fullref_in_pattern() (in the non-early return case) to be "for each item, iterate and compute a value; if that value is non-zero return it. If not found, return zero". The missing context above is important since, without it, we lose the semantics. If I'm understanding the above correctly then, studying this function in a vacuum, it would make sense to assign a default value of 0 since if the for operates on an empty list, the function should consider the item vacuously not found and we should return 0 as a result. This was the type of analysis I applied to the other changes. I'll admit that I studied the other functions in a vacuum as well since these seemed to be superficial warnings (since they aren't triggered with -O{0,2}) which indicated to me that the code was correct except for some "cosmetic" errors. I dunno, perhaps this is my inexperience showing through. Thanks, Denton P.S. Do we want to quash the -O3 warnings as well? > -Peff
On Sat, Apr 04, 2020 at 08:07:43AM -0400, Denton Liu wrote: > > for_each_string_list_item(...) { > > ret = for_each_fullref_in(...); > > This loop is missing a bit of important context: > > if (ret) > break; > Yes, I omitted it because it's not relevant to whether ret is initialized or not (i.e., if we enter the loop then it always is. But I think it is to the argument you make below. > > Your patch silences it, but is it doing the right thing? It sets "ret = > > 0", but we haven't actually iterated anything. Should it be an error > > instead? > > I understood the semantics of for_each_fullref_in_pattern() (in the > non-early return case) to be "for each item, iterate and compute a > value; if that value is non-zero return it. If not found, return zero". > The missing context above is important since, without it, we lose the > semantics. > > If I'm understanding the above correctly then, studying this function in > a vacuum, it would make sense to assign a default value of 0 since if > the for operates on an empty list, the function should consider the item > vacuously not found and we should return 0 as a result. I think the break on "ret" is better understood as "if we saw an error, return it; otherwise keep going". If we were given zero patterns, is that a noop success, or is it an error? This should never happen because we cover that case earlier, so it would be a bug in find_longest_prefixes. Perhaps: diff --git a/ref-filter.c b/ref-filter.c index b1812cb69a..b358567663 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -1964,6 +1964,8 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter, } find_longest_prefixes(&prefixes, filter->name_patterns); + if (!prefixes.nr) + BUG("no common ref prefix?"); for_each_string_list_item(prefix, &prefixes) { ret = for_each_fullref_in(prefix->string, cb, cb_data, broken); is worth doing to document that assumption. Ideally that would then let the compiler know that we'll always enter the loop, but it doesn't seem to be quite the clever. So I dunno. I think it probably doesn't matter between "0" and "-1", because this case really shouldn't be reachable. > This was the type of analysis I applied to the other changes. I'll admit > that I studied the other functions in a vacuum as well since these > seemed to be superficial warnings (since they aren't triggered with > -O{0,2}) which indicated to me that the code was correct except for some > "cosmetic" errors. I dunno, perhaps this is my inexperience showing > through. Yeah, the code is correct in this case, and I don't think the uninitialized case can be reached. But the subtle linkage here is that patterns[0] being non-NULL means that find_longest_prefixes() will never return a zero-length list, which means we'll always enter that loop at least once. > P.S. Do we want to quash the -O3 warnings as well? They're probably worth looking at. I've periodically swept through and fixed them, as recently as last September (try "git log --grep=-O3"). But new ones seem to have cropped up. I'm not sure if they were introduced in the code or if the compiler got smarter (or dumber). Just skimming the output, I see: In function ‘ll_binary_merge’, inlined from ‘ll_xdl_merge’ at ll-merge.c:115:10, inlined from ‘ll_union_merge’ at ll-merge.c:151:9: ll-merge.c:74:4: error: ‘%s’ directive argument is null [-Werror=format-overflow=] 74 | warning("Cannot merge binary files: %s (%s vs. %s)", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 75 | path, name1, name2); | ~~~~~~~~~~~~~~~~~~~ This one seems likely to be accurate (and just uncovered by more aggressive inlining). The union_merge function passes in NULL filenames, and probably could trigger this warning on binary input. trace2/tr2_dst.c: In function ‘tr2_dst_get_trace_fd.part.0’: trace2/tr2_dst.c:296:10: error: ‘fd’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 296 | dst->fd = fd; | ~~~~~~~~^~~~ trace2/tr2_dst.c:229:6: note: ‘fd’ was declared here 229 | int fd; | ^~ This one looks like a false positive. The tr2 code does something unusual for our code base: on error it returns errno. And I think the compiler is not aware that errno would not be zero after a syscall failure. It might be worth changing the return value to match our usual pattern (return -1, and the caller can look in errno), which would appease the compiler as a side effect. revision.c: In function ‘do_add_index_objects_to_pending’: revision.c:322:22: error: array subscript [1, 2147483647] is outside array bounds of ‘char[1]’ [-Werror=array-bounds] 322 | if (0 < len && name[len] && buf.len) | ~~~~^~~~~ This one is confusing. I imagine the char[1] is the empty string we frequently pass to add_pending_object() and friends. And the "len" here comes from interpret_branch_name(name, ...). So somehow the compiler doesn't quite know that the length we'll return is going to be less than or equal to the string length we pass in. I don't know how to fix that aside from this, which isn't great (btw, it looks like a lot of this code could stand to switch to using ssize_t instead of int; there are probably weird errors if you managed to somehow feed a 2GB branch name). diff --git a/revision.c b/revision.c index 8136929e23..9bc398bec1 100644 --- a/revision.c +++ b/revision.c @@ -317,9 +317,10 @@ static void add_pending_object_with_path(struct rev_info *revs, revs->no_walk = 0; if (revs->reflog_info && obj->type == OBJ_COMMIT) { struct strbuf buf = STRBUF_INIT; - int len = interpret_branch_name(name, 0, &buf, 0); + size_t namelen = strlen(name); + int len = interpret_branch_name(name, namelen, &buf, 0); - if (0 < len && name[len] && buf.len) + if (len == namelen && buf.len) strbuf_addstr(&buf, name + len); add_reflog_for_walk(revs->reflog_info, (struct commit *)obj, -Peff
diff --git a/builtin/branch.c b/builtin/branch.c index d8297f80ff..3669fba546 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -455,7 +455,7 @@ static void print_current_branch_name(void) { int flags; const char *refname = resolve_ref_unsafe("HEAD", 0, NULL, &flags); - const char *shortname; + const char *shortname = NULL; if (!refname) die(_("could not resolve HEAD")); else if (!(flags & REF_ISSYMREF)) diff --git a/diff.c b/diff.c index f2cfbf2214..99a35774d7 100644 --- a/diff.c +++ b/diff.c @@ -3263,7 +3263,7 @@ static void emit_binary_diff_body(struct diff_options *o, void *delta; void *deflated; void *data; - unsigned long orig_size; + unsigned long orig_size = 0; unsigned long delta_size; unsigned long deflate_size; unsigned long data_size; diff --git a/fast-import.c b/fast-import.c index b8b65a801c..0509c0b92f 100644 --- a/fast-import.c +++ b/fast-import.c @@ -1829,7 +1829,7 @@ static void parse_original_identifier(void) static int parse_data(struct strbuf *sb, uintmax_t limit, uintmax_t *len_res) { - const char *data; + const char *data = NULL; strbuf_reset(sb); if (!skip_prefix(command_buf.buf, "data ", &data)) @@ -2719,7 +2719,7 @@ static void parse_new_commit(const char *arg) static void parse_new_tag(const char *arg) { static struct strbuf msg = STRBUF_INIT; - const char *from; + const char *from = NULL; char *tagger; struct branch *s; struct tag *t; diff --git a/http-backend.c b/http-backend.c index ec3144b444..1091cdf2cd 100644 --- a/http-backend.c +++ b/http-backend.c @@ -252,7 +252,7 @@ static void http_config(void) static struct rpc_service *select_service(struct strbuf *hdr, const char *name) { - const char *svc_name; + const char *svc_name = NULL; struct rpc_service *svc = NULL; int i; diff --git a/ref-filter.c b/ref-filter.c index b1812cb69a..6abd48f81a 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -1938,7 +1938,7 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter, { struct string_list prefixes = STRING_LIST_INIT_DUP; struct string_list_item *prefix; - int ret; + int ret = 0; if (!filter->match_as_path) { /* diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 4458a0f69c..f37c6d19a6 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -627,7 +627,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs) /* If the file has a header line, process it: */ if (snapshot->buf < snapshot->eof && *snapshot->buf == '#') { - char *tmp, *p, *eol; + char *tmp, *p = NULL, *eol; struct string_list traits = STRING_LIST_INIT_NODUP; eol = memchr(snapshot->buf, '\n', diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c index 799fc00aa1..d82dcb83a3 100644 --- a/t/helper/test-ref-store.c +++ b/t/helper/test-ref-store.c @@ -19,7 +19,7 @@ static unsigned int arg_flags(const char *arg, const char *name) static const char **get_store(const char **argv, struct ref_store **refs) { - const char *gitdir; + const char *gitdir = NULL; if (!argv[0]) { die("ref store required"); diff --git a/trace2/tr2_dst.c b/trace2/tr2_dst.c index ae052a07fe..910301a53d 100644 --- a/trace2/tr2_dst.c +++ b/trace2/tr2_dst.c @@ -227,7 +227,7 @@ static int tr2_dst_try_unix_domain_socket(struct tr2_dst *dst, { unsigned int uds_try = 0; int fd; - int e; + int e = 0; const char *path = NULL; /* diff --git a/trailer.c b/trailer.c index 0c414f2fed..cac9d0a4d9 100644 --- a/trailer.c +++ b/trailer.c @@ -507,7 +507,7 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) struct arg_item *item; struct conf_info *conf; char *name = NULL; - enum trailer_info_type type; + enum trailer_info_type type = -1; int i; if (!skip_prefix(conf_key, "trailer.", &trailer_item))
When compiling Git under -O0, gcc (Arch Linux 9.3.0-1) 9.3.0 produces many -Wmaybe-uninitialized warnings. These are false positives since when Git is compiled under -O2, gcc is smart enough to see that the code paths that use these variables all initialise them beforehand. Nonetheless, these warnings block the compilation process when DEVELOPER=1 is enabled (which enables -Werror). Fix these warnings by initializing these variables with dummy values (0, -1 or NULL as appropriate). Signed-off-by: Denton Liu <liu.denton@gmail.com> --- builtin/branch.c | 2 +- diff.c | 2 +- fast-import.c | 4 ++-- http-backend.c | 2 +- ref-filter.c | 2 +- refs/packed-backend.c | 2 +- t/helper/test-ref-store.c | 2 +- trace2/tr2_dst.c | 2 +- trailer.c | 2 +- 9 files changed, 10 insertions(+), 10 deletions(-)