Fix -Wmaybe-uninitialized warnings under -O0

Message ID	33984eeaabbfbcfd4b9d3903549d8b7d6c4ced7e.1585726172.git.liu.denton@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=RItT=5R=vger.kernel.org=git-owner@kernel.org> From: Denton Liu <liu.denton@gmail.com> To: Git Mailing List <git@vger.kernel.org> Subject: [PATCH] Fix -Wmaybe-uninitialized warnings under -O0 Date: Wed, 1 Apr 2020 03:30:16 -0400 Message-Id: <33984eeaabbfbcfd4b9d3903549d8b7d6c4ced7e.1585726172.git.liu.denton@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: git-owner@vger.kernel.org Precedence: bulk
Series	Fix -Wmaybe-uninitialized warnings under -O0 \| expand Fix -Wmaybe-uninitialized warnings under -O0

Denton Liu April 1, 2020, 7:30 a.m. UTC

When compiling Git under -O0, gcc (Arch Linux 9.3.0-1) 9.3.0 produces
many -Wmaybe-uninitialized warnings. These are false positives since
when Git is compiled under -O2, gcc is smart enough to see that the
code paths that use these variables all initialise them beforehand.
Nonetheless, these warnings block the compilation process when
DEVELOPER=1 is enabled (which enables -Werror).

Fix these warnings by initializing these variables with dummy values (0,
-1 or NULL as appropriate).

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 builtin/branch.c          | 2 +-
 diff.c                    | 2 +-
 fast-import.c             | 4 ++--
 http-backend.c            | 2 +-
 ref-filter.c              | 2 +-
 refs/packed-backend.c     | 2 +-
 t/helper/test-ref-store.c | 2 +-
 trace2/tr2_dst.c          | 2 +-
 trailer.c                 | 2 +-
 9 files changed, 10 insertions(+), 10 deletions(-)

Denton Liu April 1, 2020, 9:05 a.m. UTC | #1

Hi Carlos,

On Wed, Apr 01, 2020 at 01:01:41AM -0700, Carlo Arenas wrote:
> On Wed, Apr 1, 2020 at 12:31 AM Denton Liu <liu.denton@gmail.com> wrote:
> 
> > Nonetheless, these warnings block the compilation process when
> > DEVELOPER=1 is enabled (which enables -Werror).
> >
> 
> you can also alternatively use DEVOPTS=no-error

Thanks for the tip. By the way, it seems like your message was sent with
HTML so it didn't hit the list.

-Denton

> 
> Carlo

Jeff King April 1, 2020, 9:52 a.m. UTC | #2

On Wed, Apr 01, 2020 at 03:30:16AM -0400, Denton Liu wrote:

> When compiling Git under -O0, gcc (Arch Linux 9.3.0-1) 9.3.0 produces
> many -Wmaybe-uninitialized warnings. These are false positives since
> when Git is compiled under -O2, gcc is smart enough to see that the
> code paths that use these variables all initialise them beforehand.
> Nonetheless, these warnings block the compilation process when
> DEVELOPER=1 is enabled (which enables -Werror).
> 
> Fix these warnings by initializing these variables with dummy values (0,
> -1 or NULL as appropriate).

Hmph. I almost always compile with -O0 and have been using gcc 9.3.0
since it was packaged for Debian a few weeks ago, but I don't see any of
these warnings.

The current version in Debian unstable is 9.3.0-8, which picks up some
extra patches from the upstream gcc-9 branch. But even if I download a
snapshot of the original 9.3.0 release, it builds fine.

So why does your version behave differently? And if this is a temporary
state for a buggy version of gcc (that may be fixed in the next point
release), is it worth changing our source code to appease it?

-Peff

Denton Liu April 1, 2020, 2:06 p.m. UTC | #3

Hi Peff,

On Wed, Apr 01, 2020 at 05:52:55AM -0400, Jeff King wrote:
> On Wed, Apr 01, 2020 at 03:30:16AM -0400, Denton Liu wrote:
> 
> > When compiling Git under -O0, gcc (Arch Linux 9.3.0-1) 9.3.0 produces
> > many -Wmaybe-uninitialized warnings. These are false positives since
> > when Git is compiled under -O2, gcc is smart enough to see that the
> > code paths that use these variables all initialise them beforehand.
> > Nonetheless, these warnings block the compilation process when
> > DEVELOPER=1 is enabled (which enables -Werror).
> > 
> > Fix these warnings by initializing these variables with dummy values (0,
> > -1 or NULL as appropriate).
> 
> Hmph. I almost always compile with -O0 and have been using gcc 9.3.0
> since it was packaged for Debian a few weeks ago, but I don't see any of
> these warnings.
> 
> The current version in Debian unstable is 9.3.0-8, which picks up some
> extra patches from the upstream gcc-9 branch. But even if I download a
> snapshot of the original 9.3.0 release, it builds fine.
> 
> So why does your version behave differently? And if this is a temporary
> state for a buggy version of gcc (that may be fixed in the next point
> release), is it worth changing our source code to appease it?

A correction to the earlier message... It seems like I wasn't reporting
the correct settings. I was actually compiling with -Og, not -O0
(whoops!).

I tested it with gcc-8 and it seems like it also reports the same
problem. Also, -O1 reports warnings as well.

I guess the question is do we only support -O0 and -O2 or should we
support the levels in between as well?

Sorry for the confusion,

Denton

> 
> -Peff

Jeff King April 3, 2020, 2:04 p.m. UTC | #4

On Wed, Apr 01, 2020 at 10:06:43AM -0400, Denton Liu wrote:

> > So why does your version behave differently? And if this is a temporary
> > state for a buggy version of gcc (that may be fixed in the next point
> > release), is it worth changing our source code to appease it?
> 
> A correction to the earlier message... It seems like I wasn't reporting
> the correct settings. I was actually compiling with -Og, not -O0
> (whoops!).
> 
> I tested it with gcc-8 and it seems like it also reports the same
> problem. Also, -O1 reports warnings as well.

Ah, OK, I can reproduce easily with -Og (up through gcc-10). Most of
them don't trigger with -O1; just the one in ref-filter.c.

That one's interesting. We have:

  int ret = 0;
  ...
  if (...)
         ...
  else
         ret = for_each_fullref_in_pattern(...);
  ...
  return ret;

So we'd either have 0 or an assigned return. But the bug is actually in
for_each_fullref_in_pattern(), which does this:

  int ret; /* uninitialized! */

  /* a bunch of early return conditionals */
  if (...)
    return ...;

  for_each_string_list_item(...) {
    ret = for_each_fullref_in(...);
  }

  return ret;

but that will return an uninitialized value when there are no patterns.
I doubt we have such a case, but that may explain why -O0 does not
complain (it assumes "in_pattern" will return a useful value) and -O2
does not (it is able to figure out that it always does), but -O1 only
inlines part of it.

Curiously, -Og _does_ find the correct function.

Your patch silences it, but is it doing the right thing? It sets "ret =
0", but we haven't actually iterated anything. Should it be an error
instead?

-Peff

Jeff King April 3, 2020, 2:38 p.m. UTC | #5

On Fri, Apr 03, 2020 at 10:04:47AM -0400, Jeff King wrote:

> Ah, OK, I can reproduce easily with -Og (up through gcc-10). Most of
> them don't trigger with -O1; just the one in ref-filter.c.

I guess I should have been more clear since the -O1 and -Og locations
are different. -O1 complains in filter_refs().

By the way, that function's handling of filter->kind seems very sketchy
to me. It does:

  int ret = 0;
  if (!filter->kind)
                die("filter_refs: invalid type");
  else {
          /*
           * For common cases where we need only branches or remotes or tags,
           * we only iterate through those refs. If a mix of refs is needed,
           * we iterate over all refs and filter out required refs with the help
           * of filter_ref_kind().
           */
          if (filter->kind == FILTER_REFS_BRANCHES)
                  ret = for_each_fullref_in("refs/heads/", ...);
          else if (filter->kind == FILTER_REFS_REMOTES)
                  ret = for_each_fullref_in("refs/remotes/", ...);
          else if (filter->kind == FILTER_REFS_TAGS)
                  ret = for_each_fullref_in("refs/tags/", ...);
          else if (filter->kind & FILTER_REFS_ALL)
                  ret = for_each_fullref_in_pattern(filter, ...);
          if (!ret && (filter->kind & FILTER_REFS_DETACHED_HEAD))
                  head_ref(...);
  }

So filter->kind is sometimes treated like a bitfield and sometimes not.
I can set it to (ALL & DETACHED_HEAD) to get something useful, but not
(BRANCHES & HEAD). The up-front check tries to complain if you didn't
ask for anything, but there are other flags like INCLUDE_BROKEN that
would cause "!filter->kind" to be false, but still not produce any
output.

And shouldn't we be checking the return value of head_ref() like the
others?

All of this is outside the scope of our current discussion, and
untangling it might be messy (because it could touch the callers). I
just wanted to document my findings for now. :)

-Peff

Denton Liu April 4, 2020, 12:07 p.m. UTC | #6

Hi Peff,

On Fri, Apr 03, 2020 at 10:04:47AM -0400, Jeff King wrote:
> On Wed, Apr 01, 2020 at 10:06:43AM -0400, Denton Liu wrote:
> 
> > > So why does your version behave differently? And if this is a temporary
> > > state for a buggy version of gcc (that may be fixed in the next point
> > > release), is it worth changing our source code to appease it?
> > 
> > A correction to the earlier message... It seems like I wasn't reporting
> > the correct settings. I was actually compiling with -Og, not -O0
> > (whoops!).
> > 
> > I tested it with gcc-8 and it seems like it also reports the same
> > problem. Also, -O1 reports warnings as well.
> 
> Ah, OK, I can reproduce easily with -Og (up through gcc-10). Most of
> them don't trigger with -O1; just the one in ref-filter.c.
> 
> That one's interesting. We have:
> 
>   int ret = 0;
>   ...
>   if (...)
>          ...
>   else
>          ret = for_each_fullref_in_pattern(...);
>   ...
>   return ret;
> 
> So we'd either have 0 or an assigned return. But the bug is actually in
> for_each_fullref_in_pattern(), which does this:
> 
>   int ret; /* uninitialized! */
> 
>   /* a bunch of early return conditionals */
>   if (...)
>     return ...;
> 
>   for_each_string_list_item(...) {
>     ret = for_each_fullref_in(...);

This loop is missing a bit of important context:

	if (ret)
		break;

>   }
> 
>   return ret;
> 
> but that will return an uninitialized value when there are no patterns.
> I doubt we have such a case, but that may explain why -O0 does not
> complain (it assumes "in_pattern" will return a useful value) and -O2
> does not (it is able to figure out that it always does), but -O1 only
> inlines part of it.
> 
> Curiously, -Og _does_ find the correct function.
> 
> Your patch silences it, but is it doing the right thing? It sets "ret =
> 0", but we haven't actually iterated anything. Should it be an error
> instead?

I understood the semantics of for_each_fullref_in_pattern() (in the
non-early return case) to be "for each item, iterate and compute a
value; if that value is non-zero return it. If not found, return zero".
The missing context above is important since, without it, we lose the
semantics.

If I'm understanding the above correctly then, studying this function in
a vacuum, it would make sense to assign a default value of 0 since if
the for operates on an empty list, the function should consider the item
vacuously not found and we should return 0 as a result.

This was the type of analysis I applied to the other changes. I'll admit
that I studied the other functions in a vacuum as well since these
seemed to be superficial warnings (since they aren't triggered with
-O{0,2}) which indicated to me that the code was correct except for some
"cosmetic" errors. I dunno, perhaps this is my inexperience showing
through.

Thanks,

Denton

P.S. Do we want to quash the -O3 warnings as well?

> -Peff

Jeff King April 4, 2020, 2:21 p.m. UTC | #7

On Sat, Apr 04, 2020 at 08:07:43AM -0400, Denton Liu wrote:

> >   for_each_string_list_item(...) {
> >     ret = for_each_fullref_in(...);
> 
> This loop is missing a bit of important context:
> 
> 	if (ret)
> 		break;
> 

Yes, I omitted it because it's not relevant to whether ret is
initialized or not (i.e., if we enter the loop then it always is.
But I think it is to the argument you make below.

> > Your patch silences it, but is it doing the right thing? It sets "ret =
> > 0", but we haven't actually iterated anything. Should it be an error
> > instead?
> 
> I understood the semantics of for_each_fullref_in_pattern() (in the
> non-early return case) to be "for each item, iterate and compute a
> value; if that value is non-zero return it. If not found, return zero".
> The missing context above is important since, without it, we lose the
> semantics.
> 
> If I'm understanding the above correctly then, studying this function in
> a vacuum, it would make sense to assign a default value of 0 since if
> the for operates on an empty list, the function should consider the item
> vacuously not found and we should return 0 as a result.

I think the break on "ret" is better understood as "if we saw an error,
return it; otherwise keep going".

If we were given zero patterns, is that a noop success, or is it an
error? This should never happen because we cover that case earlier, so
it would be a bug in find_longest_prefixes. Perhaps:

diff --git a/ref-filter.c b/ref-filter.c
index b1812cb69a..b358567663 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1964,6 +1964,8 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 	}

 	find_longest_prefixes(&prefixes, filter->name_patterns);
+	if (!prefixes.nr)
+		BUG("no common ref prefix?");

 	for_each_string_list_item(prefix, &prefixes) {
 		ret = for_each_fullref_in(prefix->string, cb, cb_data, broken);

is worth doing to document that assumption. Ideally that would then let
the compiler know that we'll always enter the loop, but it doesn't seem
to be quite the clever.

So I dunno. I think it probably doesn't matter between "0" and "-1",
because this case really shouldn't be reachable.

> This was the type of analysis I applied to the other changes. I'll admit
> that I studied the other functions in a vacuum as well since these
> seemed to be superficial warnings (since they aren't triggered with
> -O{0,2}) which indicated to me that the code was correct except for some
> "cosmetic" errors. I dunno, perhaps this is my inexperience showing
> through.

Yeah, the code is correct in this case, and I don't think the
uninitialized case can be reached. But the subtle linkage here is that
patterns[0] being non-NULL means that find_longest_prefixes() will never
return a zero-length list, which means we'll always enter that loop at
least once.

> P.S. Do we want to quash the -O3 warnings as well?

They're probably worth looking at. I've periodically swept through and
fixed them, as recently as last September (try "git log --grep=-O3").
But new ones seem to have cropped up. I'm not sure if they were
introduced in the code or if the compiler got smarter (or dumber).

Just skimming the output, I see:

  In function ‘ll_binary_merge’,
      inlined from ‘ll_xdl_merge’ at ll-merge.c:115:10,
      inlined from ‘ll_union_merge’ at ll-merge.c:151:9:
  ll-merge.c:74:4: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
     74 |    warning("Cannot merge binary files: %s (%s vs. %s)",
        |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     75 |     path, name1, name2);
        |     ~~~~~~~~~~~~~~~~~~~

This one seems likely to be accurate (and just uncovered by more
aggressive inlining). The union_merge function passes in NULL filenames,
and probably could trigger this warning on binary input.

  trace2/tr2_dst.c: In function ‘tr2_dst_get_trace_fd.part.0’:
  trace2/tr2_dst.c:296:10: error: ‘fd’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    296 |  dst->fd = fd;
        |  ~~~~~~~~^~~~
  trace2/tr2_dst.c:229:6: note: ‘fd’ was declared here
    229 |  int fd;
        |      ^~

This one looks like a false positive. The tr2 code does something
unusual for our code base: on error it returns errno. And I think the
compiler is not aware that errno would not be zero after a syscall
failure. It might be worth changing the return value to match our usual
pattern (return -1, and the caller can look in errno), which would
appease the compiler as a side effect.

  revision.c: In function ‘do_add_index_objects_to_pending’:
  revision.c:322:22: error: array subscript [1, 2147483647] is outside array bounds of ‘char[1]’ [-Werror=array-bounds]
    322 |   if (0 < len && name[len] && buf.len)
        |                  ~~~~^~~~~

This one is confusing. I imagine the char[1] is the empty string we
frequently pass to add_pending_object() and friends. And the "len" here
comes from interpret_branch_name(name, ...). So somehow the compiler
doesn't quite know that the length we'll return is going to be less than
or equal to the string length we pass in.

I don't know how to fix that aside from this, which isn't great (btw, it
looks like a lot of this code could stand to switch to using ssize_t
instead of int; there are probably weird errors if you managed to
somehow feed a 2GB branch name).

diff --git a/revision.c b/revision.c
index 8136929e23..9bc398bec1 100644
--- a/revision.c
+++ b/revision.c
@@ -317,9 +317,10 @@ static void add_pending_object_with_path(struct rev_info *revs,
 		revs->no_walk = 0;
 	if (revs->reflog_info && obj->type == OBJ_COMMIT) {
 		struct strbuf buf = STRBUF_INIT;
-		int len = interpret_branch_name(name, 0, &buf, 0);
+		size_t namelen = strlen(name);
+		int len = interpret_branch_name(name, namelen, &buf, 0);

-		if (0 < len && name[len] && buf.len)
+		if (len == namelen && buf.len)
 			strbuf_addstr(&buf, name + len);
 		add_reflog_for_walk(revs->reflog_info,
 				    (struct commit *)obj,

-Peff

Fix -Wmaybe-uninitialized warnings under -O0

Commit Message

Comments

Patch