Kbuild: Move -Wmaybe-uninitialized to W=1

Message ID	20140707105339.GA4776@pd.tnic (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-kbuild-owner@kernel.org> Date: Mon, 7 Jul 2014 12:53:39 +0200 From: Borislav Petkov <bp@alien8.de> To: Sam Ravnborg <sam@ravnborg.org> Cc: lkml <linux-kernel@vger.kernel.org>, Michael Matz <matz@suse.de>, linux-kbuild@vger.kernel.org, x86-ml <x86@kernel.org> Subject: Re: [PATCH] Kbuild: Move -Wmaybe-uninitialized to W=1 Message-ID: <20140707105339.GA4776@pd.tnic> References: <20140616132045.GE8170@pd.tnic> <20140616211405.GA7914@ravnborg.org> <20140624213835.GD15068@pd.tnic> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20140624213835.GD15068@pd.tnic> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kbuild-owner@vger.kernel.org Precedence: bulk

Borislav Petkov July 7, 2014, 10:53 a.m. UTC

Ok, here's a v2 which goes ontop of

http://lkml.kernel.org/r/1404175346-12330-1-git-send-email-behanw@converseincode.com

After this, all warnings stuff should be back to normal.

---
From: Borislav Petkov <bp@suse.de>
Subject: [PATCH -v2] Kbuild: Move -Wmaybe-uninitialized to W=1

This warning is enabled by -Wall or -Wextra, says the gcc manpage. It
also says that gcc cannot always know whether the warning is issued
correctly:

"These warnings are made optional because GCC is not smart enough to see
all the reasons why the code might be correct in spite of appearing to
have an error."

And, as expected, it fires for perfectly valid use cases, thus making it
not really useful. Let's move it to the W=1 bunch in case people want to
enable it with the additional checks.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 Makefile                   | 5 ++++-
 scripts/Makefile           | 1 +
 scripts/Makefile.extrawarn | 1 +
 3 files changed, 6 insertions(+), 1 deletion(-)

Paul Bolle July 8, 2014, 9:25 a.m. UTC | #1

On Mon, 2014-07-07 at 12:53 +0200, Borislav Petkov wrote:
> This warning is enabled by -Wall or -Wextra, says the gcc manpage. It
> also says that gcc cannot always know whether the warning is issued
> correctly:
> 
> "These warnings are made optional because GCC is not smart enough to see
> all the reasons why the code might be correct in spite of appearing to
> have an error."
> 
> And, as expected, it fires for perfectly valid use cases, thus making it
> not really useful. Let's move it to the W=1 bunch in case people want to
> enable it with the additional checks.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>

Is the fact that this generates false positives by itself enough to
downgrade this check to W=1?

I do not have any hard numbers to back up my claims, but I'd like to
point out that it is possible that we never see many of the warnings
that GCC correctly issues because they might only occur during local
development. Ie, the developer involved cleans up the patch before
sending out.

My experience with the warnings that actually do make it into mainline
is that quite a few are correct while the false positives tend to be
generated by a pieces of code that are complicated (I think I've seen
the warning labeled as a "code smell").

For example, in my local builds this warning popped up three times in
the current development cycle:
    0) fs/direct-io.c:1011:12: warning: ‘from’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       fs/direct-io.c:1011:12: warning: ‘to’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    1) drivers/platform/x86/eeepc-laptop.c:279:10: warning: ‘value’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    2) drivers/usb/misc/usb3503.c:195:11: warning: ‘err’ may be used uninitialized in this function [-Wmaybe-uninitialized]

Warning 0) occurs in a 150 line function, that grows over 200 lines when
the inline functions it calls are included. And I do think it's not easy
to see hwat that code does. Anyhow, see
https://lkml.org/lkml/2014/7/1/496 for my attempt to silence this
warning by simplifying this function.

See https://lkml.org/lkml/2014/7/1/150 for my patch that silences 1) by,
again, simplifying the code.

And warning 2) is correct. See https://lkml.org/lkml/2014/6/2/511 for a
possible solution.

So, in summary, I believe that the signal/noise ratio actually isn't too
bad. Moreover I think that the noise isn't merely noise, as it points to
possibly problematic sections of code.

Paul Bolle

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Borislav Petkov July 8, 2014, 11:37 a.m. UTC | #2

On Tue, Jul 08, 2014 at 11:25:20AM +0200, Paul Bolle wrote:
> Is the fact that this generates false positives by itself enough to
> downgrade this check to W=1?
> 
> I do not have any hard numbers to back up my claims, but I'd like to
> point out that it is possible that we never see many of the warnings
> that GCC correctly issues because they might only occur during local
> development. Ie, the developer involved cleans up the patch before
> sending out.

You're assuming that a developer doesn't use W=<num> to test her/his
patches.

> My experience with the warnings that actually do make it into mainline

Which warnings? All warnings or only the maybe-uninitialized ones?

> is that quite a few are correct while the false positives tend to be
> generated by a pieces of code that are complicated (I think I've seen
> the warning labeled as a "code smell").
> 
> For example, in my local builds this warning popped up three times in
> the current development cycle:
>     0) fs/direct-io.c:1011:12: warning: ‘from’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>        fs/direct-io.c:1011:12: warning: ‘to’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>     1) drivers/platform/x86/eeepc-laptop.c:279:10: warning: ‘value’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>     2) drivers/usb/misc/usb3503.c:195:11: warning: ‘err’ may be used uninitialized in this function [-Wmaybe-uninitialized]
> 
> Warning 0) occurs in a 150 line function, that grows over 200 lines when
> the inline functions it calls are included. And I do think it's not easy
> to see hwat that code does.

Have you actually tried to understand what the code does and also to see
that gcc simply cannot see that the to and from will never be used in
the error case? It is not that hard actually.

> Anyhow, see
> https://lkml.org/lkml/2014/7/1/496 for my attempt to silence this
> warning by simplifying this function.

Terrible idea - misdesign the code just because gcc doesn't say that two
variables *might* not be initialized.

> See https://lkml.org/lkml/2014/7/1/150 for my patch that silences 1) by,
> again, simplifying the code.

Again, this is wrong on *every* level! This is perfectly sane code where
value is used *only* in the success case.

> And warning 2) is correct. See https://lkml.org/lkml/2014/6/2/511 for a
> possible solution.

That warning should actually be not "maybe" but really -Wuninitialized.

> So, in summary, I believe that the signal/noise ratio actually isn't
> too bad. Moreover I think that the noise isn't merely noise, as it
> points to possibly problematic sections of code.

It seems you're missing the point so let me explain to you what I mean:

Here's the relevant snippet from the gcc manpage:

       -Wmaybe-uninitialized
           For an automatic variable, if there exists a path from the function entry to
           a use of the variable that is initialized, but there exist some other paths
           for which the variable is not initialized, the compiler emits a warning if
           it cannot prove the uninitialized paths are not executed at run time. These
           warnings are made optional because GCC is not smart enough to see all the
           reasons why the code might be correct in spite of appearing to have an
           error.

Now read it again: "gcc cannot prove the uninitialized paths are not
executed at run time. These warnings are made optional because GCC is
not smart enough ..."

And this is exactly what I'm proposing: keep the warning but downgrade
it so that it doesn't generate false positives noise. I'm not arguing it
is a bad check - I'm just saying it is more of a heuristic and should
belong to the rest of the W= checks.

Oh, moving it to W=1 has another reason - to hide it from overeager
people who really want to *fix* the code and thus obfuscate it for no
sensible reason what*so*f*ckin*ever! Just so that they can please the
compiler which says itself it cannot be that smart!

You have given perfect examples in 0) and 1) above for exactly that.

Borislav Petkov July 10, 2014, 10:42 a.m. UTC | #3

And,

I haven't even wished for it but just to prove everybody my point: just
built rc4+ from Linus' repo with gcc 4.9.0, see below.

Now all of a sudden there's more noise, maybe because this is a
different .config. However, I doubt those are real bugs.

A cursory look through those shows that they're not really bugs -
gcc simply can't know with all the ifdeffery, partial usage based on
conditionals, etc, etc.

---
In file included from scripts/sortextable.c:194:0:
scripts/sortextable.c: In function ‘main’:
scripts/sortextable.h:176:3: warning: ‘relocs_size’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   memset(relocs, 0, relocs_size);
   ^
scripts/sortextable.h:106:6: note: ‘relocs_size’ was declared here
  int relocs_size;
      ^
In file included from scripts/sortextable.c:192:0:
scripts/sortextable.h:176:3: warning: ‘relocs_size’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   memset(relocs, 0, relocs_size);
   ^
scripts/sortextable.h:106:6: note: ‘relocs_size’ was declared here
  int relocs_size;
      ^
fs/namespace.c: In function ‘SyS_mount’:
fs/namespace.c:2647:8: warning: ‘kernel_dev’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  ret = do_mount(kernel_dev, kernel_dir->name, kernel_type, flags,
        ^
fs/namespace.c:2626:8: note: ‘kernel_dev’ was declared here
  char *kernel_dev;
        ^
fs/namespace.c:2647:8: warning: ‘kernel_type’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  ret = do_mount(kernel_dev, kernel_dir->name, kernel_type, flags,
        ^
fs/namespace.c:2624:8: note: ‘kernel_type’ was declared here
  char *kernel_type;
        ^
fs/direct-io.c: In function ‘do_blockdev_direct_IO’:
fs/direct-io.c:920:9: warning: ‘to’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   while (from < to) {
         ^
fs/direct-io.c:913:16: note: ‘to’ was declared here
   size_t from, to;
                ^
fs/direct-io.c:1034:9: warning: ‘from’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    from += this_chunk_bytes;
         ^
fs/direct-io.c:913:10: note: ‘from’ was declared here
   size_t from, to;
          ^
In file included from include/net/inetpeer.h:15:0,
                 from net/ipv4/tcp_metrics.c:16:
net/ipv4/tcp_metrics.c: In function ‘tcp_peer_is_proven’:
include/net/ipv6.h:422:38: warning: ‘*((void *)&daddr+8)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  return ((ul1[0] ^ ul2[0]) | (ul1[1] ^ ul2[1])) == 0UL;
                                      ^
net/ipv4/tcp_metrics.c:231:30: note: ‘*((void *)&daddr+8)’ was declared here
  struct inetpeer_addr saddr, daddr;
                              ^
In file included from include/net/inetpeer.h:15:0,
                 from net/ipv4/tcp_metrics.c:16:
include/net/ipv6.h:422:38: warning: ‘*((void *)&saddr+8)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  return ((ul1[0] ^ ul2[0]) | (ul1[1] ^ ul2[1])) == 0UL;
                                      ^
net/ipv4/tcp_metrics.c:231:23: note: ‘*((void *)&saddr+8)’ was declared here
  struct inetpeer_addr saddr, daddr;
                       ^

Paul Bolle July 10, 2014, 11:03 a.m. UTC | #4

On Thu, 2014-07-10 at 12:42 +0200, Borislav Petkov wrote:
> I haven't even wished for it but just to prove everybody my point: just
> built rc4+ from Linus' repo with gcc 4.9.0, see below.
> 
> Now all of a sudden there's more noise, maybe because this is a
> different .config. However, I doubt those are real bugs.

Would you mind sharing that .config?

> A cursory look through those shows that they're not really bugs -
> gcc simply can't know with all the ifdeffery, partial usage based on

Nit: I'd expect ifdeffery to make the code only difficult to understand
for the human reader. By the time gcc will be doing flow analysis the
preprocessor has long handled all ifdeffery.

> conditionals, etc, etc.

Thanks,


Paul Bolle

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Borislav Petkov July 28, 2016, 4:20 a.m. UTC | #5

Hi Linus,

about 6e8d666e9253 ("Disable "maybe-uninitialized" warning globally"):

Good! Finally!

This thing was really getting on my nerves.

Btw, I tried at the time to move it to W=1 insted of completely
disabling it, see below. That got nowhere though.

On Mon, Jul 07, 2014 at 12:53:39PM +0200, Borislav Petkov wrote:
> Ok, here's a v2 which goes ontop of
> 
> http://lkml.kernel.org/r/1404175346-12330-1-git-send-email-behanw@converseincode.com
> 
> After this, all warnings stuff should be back to normal.
> 
> ---
> From: Borislav Petkov <bp@suse.de>
> Subject: [PATCH -v2] Kbuild: Move -Wmaybe-uninitialized to W=1
> 
> This warning is enabled by -Wall or -Wextra, says the gcc manpage. It
> also says that gcc cannot always know whether the warning is issued
> correctly:
> 
> "These warnings are made optional because GCC is not smart enough to see
> all the reasons why the code might be correct in spite of appearing to
> have an error."
> 
> And, as expected, it fires for perfectly valid use cases, thus making it
> not really useful. Let's move it to the W=1 bunch in case people want to
> enable it with the additional checks.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  Makefile                   | 5 ++++-
>  scripts/Makefile           | 1 +
>  scripts/Makefile.extrawarn | 1 +
>  3 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index 4d75b4bceedd..b58c94261960 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -613,7 +613,7 @@ include $(srctree)/arch/$(SRCARCH)/Makefile
>  KBUILD_CFLAGS	+= $(call cc-option,-fno-delete-null-pointer-checks,)
>  
>  ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
> -KBUILD_CFLAGS	+= -Os $(call cc-disable-warning,maybe-uninitialized,)
> +KBUILD_CFLAGS	+= -Os
>  else
>  KBUILD_CFLAGS	+= -O2
>  endif
> @@ -742,6 +742,9 @@ KBUILD_CFLAGS   += $(call cc-option,-Werror=date-time)
>  # use the deterministic mode of AR if available
>  KBUILD_ARFLAGS := $(call ar-option,D)
>  
> +# disable -Wmaybe-uninitialized as too noisy, see Makefile.extrawarn instead
> +KBUILD_CFLAGS   += $(call cc-disable-warning, maybe-uninitialized)
> +
>  # check for 'asm goto'
>  ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC)), y)
>  	KBUILD_CFLAGS += -DCC_HAVE_ASM_GOTO
> diff --git a/scripts/Makefile b/scripts/Makefile
> index 890df5c6adfb..a483d9988a2e 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -19,6 +19,7 @@ hostprogs-$(CONFIG_BUILDTIME_EXTABLE_SORT) += sortextable
>  hostprogs-$(CONFIG_ASN1)	 += asn1_compiler
>  
>  HOSTCFLAGS_sortextable.o = -I$(srctree)/tools/include
> +HOSTCFLAGS_sortextable.o += $(call cc-disable-warning, maybe-uninitialized)
>  HOSTCFLAGS_asn1_compiler.o = -I$(srctree)/include
>  
>  always		:= $(hostprogs-y) $(hostprogs-m)
> diff --git a/scripts/Makefile.extrawarn b/scripts/Makefile.extrawarn
> index e3501272cd93..9657d5778e06 100644
> --- a/scripts/Makefile.extrawarn
> +++ b/scripts/Makefile.extrawarn
> @@ -25,6 +25,7 @@ warning-1 += -Wold-style-definition
>  warning-1 += $(call cc-option, -Wmissing-include-dirs)
>  warning-1 += $(call cc-option, -Wunused-but-set-variable)
>  warning-1 += $(call cc-disable-warning, missing-field-initializers)
> +warning-1 += $(call cc-option, -Wmaybe-uninitialized)
>  
>  ifeq ($(COMPILER),clang)
>  warning-1 += $(call cc-disable-warning, initializer-overrides)
> -- 
> 2.0.0
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> Sent from a fat crate under my desk. Formatting is fine.
> --
>

Ingo Molnar July 28, 2016, 8:29 a.m. UTC | #6

* Borislav Petkov <bp@alien8.de> wrote:

> Hi Linus,
> 
> about 6e8d666e9253 ("Disable "maybe-uninitialized" warning globally"):
> 
> Good! Finally!
> 
> This thing was really getting on my nerves.

So I'm worried about this description in the changelog:

| Looking at the warnings produced, every single one I looked at was a false 
| positive, and the warnings are frequent enough (and big enough) that they can 
| easily hide real problems that you don't notice in the noise generated by 
| -Wmaybe-uninitialized.

BUT, isn't this the natural state of things, that the 'final' warnings that don't 
get fixed are the obnoxious, false positive ones - because anyone who looks at 
them will say "oh crap, idiotic compiler!"?

But over the last couple of years I think we probably had hundreds of bugs avoided 
due to the warning (both at the development and at the integration stage) - and 
now we are upset about a dozen residual warnings and declare it's all crap?

I am happy to bash GCC's cavalier attitude to warnings any day of the week, but 
this argumentation for this particular option looks fishy to me.

Why cannot we say something like:

 "a false_positive/false_negative ratio above 10% is considered obnoxious and
  won't be tolerated."

?

BTW., one related GCC property I'm pretty upset about: obvious false negatives for 
clear unused-variable bugs, which caused pain and hours of debugging. An example 
is this recent fix:

  commit e01d8718de4170373cd7fbf5cf6f9cb61cebb1e9
  Author: Peter Zijlstra <peterz@infradead.org>
  Date:   Wed Jan 27 23:24:29 2016 +0100

    perf/x86: Fix uninitialized value usage

  ...

  Only took 6 hours of painful debugging to find this. Neither GCC nor
  Smatch warnings flagged this bug.

... and my worry here is that we are now telling GCC: "don't you dare generate a 
false positive warning!" - at which point GCC folks will add even MORE heuristics 
to avoid false positives that generate even more false negatives which are hurting 
us asymmetrically more than the couple of low intensity minutes spent per false 
positive could ever hurt us ... ??

So what's the logic here? What am I missing?

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Borislav Petkov July 28, 2016, 8:46 a.m. UTC | #7

On Thu, Jul 28, 2016 at 10:29:15AM +0200, Ingo Molnar wrote:
> BUT, isn't this the natural state of things, that the 'final' warnings
> that don't get fixed are the obnoxious, false positive ones - because
> anyone who looks at them will say "oh crap, idiotic compiler!"?

Hmm, so my experience is like Linus' - that -Wmaybe thing generates too
much noise and a lot of false positives. The thing is, as Micha (on CC)
explained it to me, that warning simply says that GCC sometimes *cannot*
know whether the variable will be used uninitialized or not and eagerly
issues the warning message, just in case.

> But over the last couple of years I think we probably had hundreds of
> bugs avoided due to the warning (both at the development and at the
> integration stage) - and

Really?

And I've yet to see an example where it actually helped :-\

>   commit e01d8718de4170373cd7fbf5cf6f9cb61cebb1e9
>   Author: Peter Zijlstra <peterz@infradead.org>
>   Date:   Wed Jan 27 23:24:29 2016 +0100
> 
>     perf/x86: Fix uninitialized value usage
> 
>   ...
> 
>   Only took 6 hours of painful debugging to find this. Neither GCC nor
>   Smatch warnings flagged this bug.

So that warning didn't help here either.

> ... and my worry here is that we are now telling GCC: "don't you dare
> generate a false positive warning!" - at which point GCC folks will
> add even MORE heuristics to avoid false positives that generate even
> more false negatives

Why?

I think we should enable only the real warnings and turn off the stuff
which generates a lot of false positives. Or, we could put them behind
the -W= switch, so that people can still build the kernel with it but
not have them enabled by default.

Ingo Molnar July 28, 2016, 4:49 p.m. UTC | #8

Firstly, I'd like to stress it that I know that the commit is upstream and there's 
a less than 1% chance for it to be reverted. I'm not arguing with the expectation 
to see it reverted, I'm arguing because I'm not convinced about the underlying 
technical arguments.

Secondly, I'd like to stress it that there are a number of truly crappy GCC 
warnings that not only are obnoxious but also actively create the *wrong* kind of 
'fixes': -Wsign-compare or -fno-strict-aliasing for example.

* Borislav Petkov <bp@alien8.de> wrote:

> On Thu, Jul 28, 2016 at 10:29:15AM +0200, Ingo Molnar wrote:
> > BUT, isn't this the natural state of things, that the 'final' warnings
> > that don't get fixed are the obnoxious, false positive ones - because
> > anyone who looks at them will say "oh crap, idiotic compiler!"?
> 
> Hmm, so my experience is like Linus' - that -Wmaybe thing generates too
> much noise and a lot of false positives. [...]

But it's only the *residual* warnings we get to see in a huge, 1000+ commits per 
week, 20+ MLOC code base, every 3 months.

We are only seeing a very small minority of warnings, and are seeing them again 
and again in the allmodconfig output because nobody is fixing them because 'GCC is 
obviously wrong'.

So I might be wrong about this, but by its very nature this warning, even if it's 
implemented well, has the natural property that 'crappy corner cases coalesce at 
the bottom'.

> > But over the last couple of years I think we probably had hundreds of
> > bugs avoided due to the warning (both at the development and at the
> > integration stage) - and
> 
> Really?

Yes, really - even many of the false positives were useful to me, because they 
flagged questionable coding practices and overly complex code.

> And I've yet to see an example where it actually helped :-\

I think partly because you are used to working on x86 architecture code where we 
take a look at every warning ASAP.

Check out:

 commit 8e8a6b23f115906678252190c8fcf4168cc60fd5
 Author: Hans Verkuil <hverkuil@xs4all.nl>
 Date:   Tue Jul 28 05:33:55 2015 -0300

    [media] mt9v032: fix uninitialized variable warning

    It can indeed be uninitialized in one corner case. Initialize to NULL.

... we now turn these warnings into false negatives.

This one's a real fix too I think:

  c30400fcffb7 drm/i915: set FDI translations to NULL on SKL

and these were literally the first two random examples I took with a bit of 
grepping.

Plus I'd say that in many cases even if it's a false positive, these warnings 
often flag borderline code complexity bug: code flow should essentially never be 
so complex as to confuse a compiler. If it confuses a compiler it will confuse 9 
coders out of 10.

These two commits:

  3354781a2184 sched/numa: Reflow task_numa_group() to avoid a compiler warning
  719038de98bc x86/intel/cacheinfo: Shut up last long-standing warning

show cases where it flagged actively confusing code.

But these are only those rare warnings that make it upstream - I'd expect more 
than 90% of them to be caught by developers.

> >   commit e01d8718de4170373cd7fbf5cf6f9cb61cebb1e9
> >   Author: Peter Zijlstra <peterz@infradead.org>
> >   Date:   Wed Jan 27 23:24:29 2016 +0100
> > 
> >     perf/x86: Fix uninitialized value usage
> > 
> >   ...
> > 
> >   Only took 6 hours of painful debugging to find this. Neither GCC nor
> >   Smatch warnings flagged this bug.
> 
> So that warning didn't help here either.

Exactly, and now we'll get more such false negatives!

> > ... and my worry here is that we are now telling GCC: "don't you dare
> > generate a false positive warning!" - at which point GCC folks will
> > add even MORE heuristics to avoid false positives that generate even
> > more false negatives
> 
> Why?

Because people react to incentives, and the incentives we are creating here is for 
compiler writers to choose between two options:

1) Solve a really difficult code flow analysis problem and implement the perfect 
   warning detector.

2) Slap even more heuristics on top of existing heuristics to make warnings go
   away.

Guess which one is more likely to be picked?

Now granted, GCC folks will probably not react to 'incentives' created by the 
kernel project (we are one project out of thousands), but still.

> I think we should enable only the real warnings and turn off the stuff which 
> generates a lot of false positives. Or, we could put them behind the -W= switch, 
> so that people can still build the kernel with it but not have them enabled by 
> default.

But that's my point, I believe the false positive rate is pretty low in fact, due 
to three factors:

 - 90% of the warnings get fixed by developers, we never see them upstream

 - I'd say a majority (say 70%) of the remaining warnings are flagging 'complexity 
   bugs'

 - only a residual 3% are obnoxious ones.

But these remaining 3% are the ones we are seeing again and again in various 
compiler output, so we tend to get a subjective impression that this warning 
produces countless false positives.

So I *think* the better option would be to do what we are doing in the perf 
tooling: force a build error for these warnings (by default, with an option 
available to make it build). That flushes them out and also makes it sure that 
those questionable sequences of code never get upstream to begin with.

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ingo Molnar July 28, 2016, 5:04 p.m. UTC | #9

* Ingo Molnar <mingo@kernel.org> wrote:

> But that's my point, I believe the false positive rate is pretty low in fact, due 
> to three factors:
> 
>  - 90% of the warnings get fixed by developers, we never see them upstream
> 
>  - I'd say a majority (say 70%) of the remaining warnings are flagging 'complexity 
>    bugs'
>  
>  - only a residual 3% are obnoxious ones.
> 
> But these remaining 3% are the ones we are seeing again and again in various 
> compiler output, so we tend to get a subjective impression that this warning 
> produces countless false positives.

And note that I am well aware of the real risk this poses: people will ignore real 
warnings if there are so many residual false positives.

I think this approach worked pretty well for perf:

> So I *think* the better option would be to do what we are doing in the perf 
> tooling: force a build error for these warnings (by default, with an option 
> available to make it build). That flushes them out and also makes it sure that 
> those questionable sequences of code never get upstream to begin with.

... but might not be appropriate for the kernel which is a 2 orders of magnitude 
larger code base.

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Markus Trippelsdorf July 28, 2016, 5:56 p.m. UTC | #10

On 2016.07.28 at 10:46 +0200, Borislav Petkov wrote:
> On Thu, Jul 28, 2016 at 10:29:15AM +0200, Ingo Molnar wrote:
> > BUT, isn't this the natural state of things, that the 'final' warnings
> > that don't get fixed are the obnoxious, false positive ones - because
> > anyone who looks at them will say "oh crap, idiotic compiler!"?
> 
> Hmm, so my experience is like Linus' - that -Wmaybe thing generates too
> much noise and a lot of false positives. The thing is, as Micha (on CC)
> explained it to me, that warning simply says that GCC sometimes *cannot*
> know whether the variable will be used uninitialized or not and eagerly
> issues the warning message, just in case.

Another issue is that the number of warnings you get depend on the
optimization level. So -Os may be different from -O2 and once you use
-O3 (I know it is not officially supported) you will drown in false
positives...

Linus Torvalds July 28, 2016, 7:03 p.m. UTC | #11

On Thu, Jul 28, 2016 at 1:29 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> So I'm worried about this description in the changelog:
>
> | Looking at the warnings produced, every single one I looked at was a false
> | positive, and the warnings are frequent enough (and big enough) that they can
> | easily hide real problems that you don't notice in the noise generated by
> | -Wmaybe-uninitialized.
>
> BUT, isn't this the natural state of things, that the 'final' warnings that don't
> get fixed are the obnoxious, false positive ones - because anyone who looks at
> them will say "oh crap, idiotic compiler!"?

No. The *natural* state of things is very simple:

   Zero warnings.

End of story. No excuses.

We were there at 4.6 for me with an "allmodconfig" build.

In 4.7, we had over a hundred lines of warnings. That is SIMPLY UNACCEPTABLE.

And the new warnings were actually not so much due to new code in 4.7,
as the fact that in between I did a user-space upgrade, and gcc 6.1.1
has regressed to the point of the warnings being an unusable mess.

> But over the last couple of years I think we probably had hundreds of bugs avoided
> due to the warning (both at the development and at the integration stage) - and
> now we are upset about a dozen residual warnings and declare it's all crap?

Nobody fixed the warnings, and looking at them, they were largely
unfixable without making the code worse.

Complain to the gcc people.

Remember: zero warnings.

> I am happy to bash GCC's cavalier attitude to warnings any day of the week, but
> this argumentation for this particular option looks fishy to me.

It has nothing to do with "that particular option".  It has everything
to do with bad warnings in general.

Also, if you actually look at the patch, you'll see that that option
has already been disabled for a _lot_ of configurations because it was
already bad for most cases.

> Why cannot we say something like:
>
>  "a false_positive/false_negative ratio above 10% is considered obnoxious and
>   won't be tolerated."

Sure. I'm happy to do that.

What part of "100% of all the warnings were false positives, and won't
be tolerated" did you not get?

Once we get to the point that the warning is no longer useful, and is
more pain than gain, it gets disabled.

If you open a bugzilla and the gcc people actually react to it, we can
revisit it. But as it is, that warning had largely been disabled for
other configurations anyway, and I just made it go away entirely.

Because false positives that we can't fix without making the code
worse are not acceptable.

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds July 28, 2016, 7:08 p.m. UTC | #12

On Thu, Jul 28, 2016 at 12:03 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Once we get to the point that the warning is no longer useful, and is
> more pain than gain, it gets disabled.

Btw, I have a suspicion that you didn't realize that
"-Wmaybe-uninitialized" is separate from "-Wuninitialized" (which is
*not* disabled).

The "maybe-uninitialized" warning is literally gcc saying "I haven't
really followed all the logic, but from my broken understanding it
isn't _obvious_ that it is initialized".

And the problem is that a lot of gcc optimization choices basically
move the pointer of "obvious". So the warning is a bit random to begin
with. And when the gcc people screw thigns up, things go to hell in a
handbasket.

             Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ingo Molnar July 28, 2016, 8:28 p.m. UTC | #13

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, Jul 28, 2016 at 12:03 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Once we get to the point that the warning is no longer useful, and is more 
> > pain than gain, it gets disabled.
> 
> Btw, I have a suspicion that you didn't realize that "-Wmaybe-uninitialized" is 
> separate from "-Wuninitialized" (which is *not* disabled).

I very much know the difference, as you can see from the commit IDs I cited.

> The "maybe-uninitialized" warning is literally gcc saying "I haven't really 
> followed all the logic, but from my broken understanding it isn't _obvious_ that 
> it is initialized".
> 
> And the problem is that a lot of gcc optimization choices basically move the 
> pointer of "obvious". So the warning is a bit random to begin with. And when the 
> gcc people screw thigns up, things go to hell in a handbasket.

Fair enough.

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds July 28, 2016, 9:22 p.m. UTC | #14

On Thu, Jul 28, 2016 at 12:03 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> And the new warnings were actually not so much due to new code in 4.7,
> as the fact that in between I did a user-space upgrade, and gcc 6.1.1
> has regressed to the point of the warnings being an unusable mess.

Actually, thinking more about this, I'm not convinced it's a gcc
regression, because older gcc's have defainitely had the same problem
with that warning causing tons of spurious issues.

So it might actually mostly be due commit 877417e6ffb9 ("Kbuild:
change CC_OPTIMIZE_FOR_SIZE definition"). As a result of that, we now
end up not using CC_OPTIMIZE_FOR_SIZE for allmodconfig builds.

And since for us, -Os always disabled that warning anyway (because gcc
has always made a bad job of it), the bogus warnings didn't use to be
so annoying and hide the real things.

Of course, a big part of the reasoning for that commit was apparently
because Arnd liked the warning. It back-fired. Now that warning is
gone for everybody, because it's so broken.

                   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Arnd Bergmann July 29, 2016, 10:08 a.m. UTC | #15

On Thursday, July 28, 2016 2:22:02 PM CEST Linus Torvalds wrote:
> On Thu, Jul 28, 2016 at 12:03 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > And the new warnings were actually not so much due to new code in 4.7,
> > as the fact that in between I did a user-space upgrade, and gcc 6.1.1
> > has regressed to the point of the warnings being an unusable mess.
> 
> Actually, thinking more about this, I'm not convinced it's a gcc
> regression, because older gcc's have defainitely had the same problem
> with that warning causing tons of spurious issues.

Let me try to get to the bottom of this, maybe we can get the warning
back in the future. It has found a number of actual bugs. The majority
of -Wmaybe-uninitialized warnings that I fixed in linux-next were
false positives (maybe four out of five) but I would think the reason
for this is that most developers that get the warning for an actual
bug fix it right away, but some may have seen it and ignored it
as an obvious false positive.

> So it might actually mostly be due commit 877417e6ffb9 ("Kbuild:
> change CC_OPTIMIZE_FOR_SIZE definition"). As a result of that, we now
> end up not using CC_OPTIMIZE_FOR_SIZE for allmodconfig builds.

Correct, that was the intention. I tried my best to ensure that
all existing 'randconfig' warnings on ARM as well as 'defconfig' and
'allmodconfig' on x86 and powerpc got fixed before that patch
was merged, but evidently something went wrong there.

I'm normally testing with gcc-6.1 on ARM and did not find it to
be any worse than 4.9 or 5.3 for this warning here. I also did
some tests with gcc-6 on x86, but that showed a lot of /other/
warnings so I didn't look too closely at that. I'm installing
the latest gcc-6-branch version now to do some more tests on x86,
to see if gcc itself regressed, or something else caused the warnings
you see.

> And since for us, -Os always disabled that warning anyway (because gcc
> has always made a bad job of it), the bogus warnings didn't use to be
> so annoying and hide the real things.
> 
> Of course, a big part of the reasoning for that commit was apparently
> because Arnd liked the warning. It back-fired. Now that warning is
> gone for everybody, because it's so broken.

Makes sense. Some more background on this, as I've spent some significant
time on figuring out individual false-positive warnings:

The warning has always been enabled traditionally, and it was only
/disabled/ on allmodconfig by accident after my patch to turn it
off for -Os got merged after gcc-4.9 came out.

With gcc-4.8 and earlier, we had a number of false positives. This got
a lot better with "gcc-4.9 -O2" but worse with "gcc-4.9 -Os". On ARM,
I've managed to address all those warnings for randconfig builds,
in most cases by making the code more readable at the same time
(which helps both humans and the compiler to figure out when things
are initialized or not).

I am aware of two issues that made things worse again:

- The new jump-label-for-dynamic-debug patch caused one new warning
  on ARM (in media/dvb)
- x86 uses CONFIG_OPTIMIZE_INLINING on both defconfig and allmodconfig,
  whereas ARM does not. As this changes the inlining decisions, the
  compiler sees different initializations and reports a slightly differnet
  set of -Wmaybe-uninitialized warnings, both correct and incorrect.

Testing with today's linux tree and today's gcc-6.1 snapshot, I see
exactly these gcc warnings on x86 allmodconfig:x

arch/x86/crypto/aesni-intel_glue.c: In function 'helper_rfc4106_decrypt':
include/linux/scatterlist.h:124:36: error: 'dst_sg_walk.sg' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/media/dvb-frontends/cxd2841er.c: In function 'cxd2841er_tune_tc':
drivers/media/dvb-frontends/cxd2841er.c:3408:40: error: 'carrier_offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/sfi/sfi_core.c:175:53: error: self-comparison always evaluates to true [-Werror=tautological-compare]
drivers/sfi/sfi_core.c:195:71: error: self-comparison always evaluates to true [-Werror=tautological-compare]

The cxd2841er.c warning was introduced by the jump-label change
I mentioned, and I submitted a patch for that on the day it appeared.

I can certainly do another patch for the aesni-intel_glue.c warning
(I think avoiding the warning will also make the code more efficient),
but I don't see the pile of other warnings you mentioned, so I wonder
what else differs between our configurations.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Borislav Petkov July 29, 2016, 10:19 a.m. UTC | #16

On Fri, Jul 29, 2016 at 12:08:51PM +0200, Arnd Bergmann wrote:
> Let me try to get to the bottom of this, maybe we can get the warning
> back in the future. It has found a number of actual bugs. The majority
> of -Wmaybe-uninitialized warnings that I fixed in linux-next were
> false positives (maybe four out of five) but I would think the reason

So this is exactly the problem: we should not fix perfectly fine code
just so that gcc remains quiet. So when you say "fixed false positives"
you actually mean, "changed it so that gcc -Wmaybe-u... doesn't fire"
right?

And we should not do that.

Arnd Bergmann July 29, 2016, 10:35 a.m. UTC | #17

On Friday, July 29, 2016 12:19:32 PM CEST Borislav Petkov wrote:
> On Fri, Jul 29, 2016 at 12:08:51PM +0200, Arnd Bergmann wrote:
> > Let me try to get to the bottom of this, maybe we can get the warning
> > back in the future. It has found a number of actual bugs. The majority
> > of -Wmaybe-uninitialized warnings that I fixed in linux-next were
> > false positives (maybe four out of five) but I would think the reason
> 
> So this is exactly the problem: we should not fix perfectly fine code
> just so that gcc remains quiet. So when you say "fixed false positives"
> you actually mean, "changed it so that gcc -Wmaybe-u... doesn't fire"
> right?
> 
> And we should not do that.

As I said elsewhere in the mail, in general the code becomes more
readable in the process and/or the compiler gets to optimize it better.

What typically happens here is that something prevents the compiler
from seeing that a condition is always true, so it has to evaluate
it at runtime when it should have noticed that it can never hit.

If the code is written in a way that the compiler can actually see
that the condition is known based on what happened earlier, we save
an extra branch, or in some cases duplication of object code.

There have been a small number of cases where this was not possible
and I actually ended up adding a fake initialization because rearranging
the code for the compiler would have made it less readable for humans
(e.g. b268c34e5ee92a [1]), but that has been the rare exception because
of the reasons that Rusty nicely described in [2].

	Arnd

[1] https://patchwork.kernel.org/patch/9212881/
[2] https://rusty.ozlabs.org/?p=232
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds July 29, 2016, 6:26 p.m. UTC | #18

On Fri, Jul 29, 2016 at 3:19 AM, Borislav Petkov <bp@alien8.de> wrote:
>
> So this is exactly the problem: we should not fix perfectly fine code
> just so that gcc remains quiet. So when you say "fixed false positives"
> you actually mean, "changed it so that gcc -Wmaybe-u... doesn't fire"
> right?
>
> And we should not do that.

It's perfectly fine to do that when it makes sense and doesn't make
the code worse. Adding a few unnecessary initializations to make the
compiler shut up is not a problem.

But in the cases I looked at, that *really* didn't make sense. The
pattern was along the lines of

  struct something var;

  if (initialize_var(&var) < 0)
     return error;

  .. use "var.xyz" .. - gcc complains that "var.xyz" may be uninitialized

and quite frankly,. the code made sense, and adding crazy
initializations for the fact that gcc has a shit-for-brains warning
didn't work well seemed to just make the code worse.

And there was no sane *pattern* to why some cases warned. We have
things like the above in many places. The issue seems to be that
"initialize_var()" needs to be inlined (automatically or explicitly
asked for), and then the error flow in the init function is just
complex enough.

At the point where it doesn't make sense when to initialize things
explicitly, and it changes randomly depending on compiler version and
compiler command line flags, there is *no* sane way to work around it.

We could do whack-a-mole with random code cases, but I really feel
that when the warning is that unreliable and the changes to the source
code to make the broken compiler warning shut up are completely
arbitrary and random, it's worse than useless.

                Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Kbuild: Move -Wmaybe-uninitialized to W=1

Commit Message

Comments

Patch