mbox series

[00/12] Clang -Wformat warning fixes

Message ID 20220609221702.347522-1-morbo@google.com (mailing list archive)
Headers show
Series Clang -Wformat warning fixes | expand

Message

Bill Wendling June 9, 2022, 10:16 p.m. UTC
This patch set fixes some clang warnings when -Wformat is enabled.

Bill Wendling (12):
  x86/mce: use correct format characters
  x86/CPU/AMD: use correct format characters
  x86/e820: use correct format characters
  blk-cgroup: use correct format characters
  fs: quota: use correct format characters
  PNP: use correct format characters
  driver/char: use correct format characters
  cdrom: use correct format characters
  ALSA: seq: use correct format characters
  ALSA: seq: use correct format characters
  ALSA: control: use correct format characters
  netfilter: conntrack: use correct format characters

 arch/x86/kernel/cpu/mce/amd.c       | 9 +++++----
 arch/x86/kernel/cpu/mce/core.c      | 2 +-
 arch/x86/kernel/e820.c              | 4 ++--
 drivers/cdrom/cdrom.c               | 2 +-
 drivers/char/mem.c                  | 2 +-
 drivers/pnp/interface.c             | 2 +-
 fs/quota/dquot.c                    | 2 +-
 mm/backing-dev.c                    | 2 +-
 net/netfilter/nf_conntrack_helper.c | 2 +-
 scripts/Makefile.extrawarn          | 4 ++--
 sound/core/control.c                | 2 +-
 sound/core/seq/seq_clientmgr.c      | 2 +-
 sound/core/sound.c                  | 2 +-
 13 files changed, 19 insertions(+), 18 deletions(-)

Comments

Andrew Morton June 9, 2022, 10:25 p.m. UTC | #1
On Thu,  9 Jun 2022 22:16:19 +0000 Bill Wendling <morbo@google.com> wrote:

> This patch set fixes some clang warnings when -Wformat is enabled.
> 

tldr:

-	printk(msg);
+	printk("%s", msg);

the only reason to make this change is where `msg' could contain a `%'.
Generally, it came from userspace.  Otherwise these changes are a
useless consumer of runtime resources.

I think it would be better to quieten clang in some fashion.
Bill Wendling June 9, 2022, 10:49 p.m. UTC | #2
On Thu, Jun 9, 2022 at 3:25 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu,  9 Jun 2022 22:16:19 +0000 Bill Wendling <morbo@google.com> wrote:
>
> > This patch set fixes some clang warnings when -Wformat is enabled.
> >
>
> tldr:
>
> -       printk(msg);
> +       printk("%s", msg);
>
> the only reason to make this change is where `msg' could contain a `%'.
> Generally, it came from userspace.

It helps kernel developers not accidentally to insert an unescaped '%'
in their messages, potentially exposing their code to an attack
vector.

> Otherwise these changes are a
> useless consumer of runtime resources.

Calling a "printf" style function is already insanely expensive. :-) I
understand that it's not okay blithely to increase runtime resources
simply because it's already slow, but in this case it's worthwhile.

> I think it would be better to quieten clang in some fashion.

The "printk" and similar functions all have the "__printf" attribute.
I don't know of a modification to that attribute which can turn off
this type of check.

-bw
Jan Engelhardt June 9, 2022, 11:03 p.m. UTC | #3
On Friday 2022-06-10 00:49, Bill Wendling wrote:
>On Thu, Jun 9, 2022 at 3:25 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>> On Thu,  9 Jun 2022 22:16:19 +0000 Bill Wendling <morbo@google.com> wrote:
>>
>> > This patch set fixes some clang warnings when -Wformat is enabled.
>>
>> tldr:
>>
>> -       printk(msg);
>> +       printk("%s", msg);
>>
>> Otherwise these changes are a
>> useless consumer of runtime resources.
>
>Calling a "printf" style function is already insanely expensive.
>[...]
>The "printk" and similar functions all have the "__printf" attribute.
>I don't know of a modification to that attribute which can turn off
>this type of check.

Perhaps you can split vprintk_store in the middle (after the call to
vsnprintf), and offer the second half as a function of its own (e.g.
"puts"). Then the tldr could be

- printk(msg);
+ puts(msg);
Bill Wendling June 9, 2022, 11:16 p.m. UTC | #4
On Thu, Jun 9, 2022 at 4:03 PM Jan Engelhardt <jengelh@inai.de> wrote:
> On Friday 2022-06-10 00:49, Bill Wendling wrote:
> >On Thu, Jun 9, 2022 at 3:25 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >> On Thu,  9 Jun 2022 22:16:19 +0000 Bill Wendling <morbo@google.com> wrote:
> >>
> >> > This patch set fixes some clang warnings when -Wformat is enabled.
> >>
> >> tldr:
> >>
> >> -       printk(msg);
> >> +       printk("%s", msg);
> >>
> >> Otherwise these changes are a
> >> useless consumer of runtime resources.
> >
> >Calling a "printf" style function is already insanely expensive.
> >[...]
> >The "printk" and similar functions all have the "__printf" attribute.
> >I don't know of a modification to that attribute which can turn off
> >this type of check.
>
> Perhaps you can split vprintk_store in the middle (after the call to
> vsnprintf), and offer the second half as a function of its own (e.g.
> "puts"). Then the tldr could be
>
> - printk(msg);
> + puts(msg);

That might be a nice compromise. Andrew, what do you think?

-bw
Nick Desaulniers June 10, 2022, 12:32 a.m. UTC | #5
On Thu, Jun 9, 2022 at 3:25 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu,  9 Jun 2022 22:16:19 +0000 Bill Wendling <morbo@google.com> wrote:
>
> > This patch set fixes some clang warnings when -Wformat is enabled.

It looks like this series fixes -Wformat-security, which while being a
member of the -Wformat group, is intentionally disabled in the kernel
and somewhat orthogonal to enabling -Wformat with Clang.

-Wformat is a group flag (like -Wall) that enables multiple other
flags implicitly.  Reading through
clang/include/clang/Basic/DiagnosticGroups.td in clang's sources, it
looks like:

1. -Wformat is a group flag.
2. -Wformat-security is a member of the -Wformat group; enabling
-Wformat will enable -Wformat-security.
3. -Wformat itself is a member of -Wmost (never heard of -Wmost, but
w/e). So -Wmost will enable -Wformat will enable -Wformat-security.
4. -Wmost is itself a member of -Wall. -Wall enables -Wmost enables
-Wformat enables -Wformat security.

Looking now at Kbuild:
1. Makefile:523 adds -Wall to KBUILD_CFLAGS.
2. The same assignment expression but on line 526 immediately disables
-Wformat-security via -Wno-format-security.
3. scripts/Makefile.extrawarn disables -Wformat via -Wno-format only
for clang (via guard of CONFIG_CC_IS_CLANG).

We _want_ -Wformat enabled for clang so that developers aren't sending
patches that trigger -Wformat with GCC (if they didn't happen to test
their code with both).  It's disabled for clang until we can build the
kernel cleanly with it enabled, which we'd like to do.

I don't think that we need to enable -Wformat-security to build with
-Wformat for clang.

I suspect based on Randy's comment on patch 1/12 that perhaps -Wformat
was _added_ to KBUILD_CFLAGS in scripts/Makefile.extrawarn rather than
-Wno-format being _removed_.  The former would re-enable
-Wformat-security due to the grouping logic described above.  The
latter is probably closer to our ultimate goal of enabling -Wformat
coverage for clang (or rather not disabling the coverage via
-Wno-format; a double negative).

I'm pretty sure the kernel doesn't support %n in format strings...see
the comment above vsnprintf in lib/vsprintf.c.  Are there other
attacks other than %n that -Wformat-security guards against? Maybe
there's some context on the commit that added -Wno-format-security to
the kernel?  Regardless, I don't think enabling -Wformat-security is a
blocker for enabling -Wformat (or...disabling -Wno-format...two sides
of the same coin) for clang.

> >
>
> tldr:
>
> -       printk(msg);
> +       printk("%s", msg);
>
> the only reason to make this change is where `msg' could contain a `%'.
> Generally, it came from userspace.  Otherwise these changes are a
> useless consumer of runtime resources.
>
> I think it would be better to quieten clang in some fashion.
Andrew Morton June 10, 2022, 1:19 a.m. UTC | #6
On Thu, 9 Jun 2022 16:16:16 -0700 Bill Wendling <morbo@google.com> wrote:

> On Thu, Jun 9, 2022 at 4:03 PM Jan Engelhardt <jengelh@inai.de> wrote:
> > On Friday 2022-06-10 00:49, Bill Wendling wrote:
> > >On Thu, Jun 9, 2022 at 3:25 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >> On Thu,  9 Jun 2022 22:16:19 +0000 Bill Wendling <morbo@google.com> wrote:
> > >>
> > >> > This patch set fixes some clang warnings when -Wformat is enabled.
> > >>
> > >> tldr:
> > >>
> > >> -       printk(msg);
> > >> +       printk("%s", msg);
> > >>
> > >> Otherwise these changes are a
> > >> useless consumer of runtime resources.
> > >
> > >Calling a "printf" style function is already insanely expensive.
> > >[...]
> > >The "printk" and similar functions all have the "__printf" attribute.
> > >I don't know of a modification to that attribute which can turn off
> > >this type of check.
> >
> > Perhaps you can split vprintk_store in the middle (after the call to
> > vsnprintf), and offer the second half as a function of its own (e.g.
> > "puts"). Then the tldr could be
> >
> > - printk(msg);
> > + puts(msg);
> 
> That might be a nice compromise. Andrew, what do you think?
> 

Sure.  It's surprising that we don't already have a puts().
Greg KH June 10, 2022, 5:20 a.m. UTC | #7
On Thu, Jun 09, 2022 at 04:16:16PM -0700, Bill Wendling wrote:
> On Thu, Jun 9, 2022 at 4:03 PM Jan Engelhardt <jengelh@inai.de> wrote:
> > On Friday 2022-06-10 00:49, Bill Wendling wrote:
> > >On Thu, Jun 9, 2022 at 3:25 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >> On Thu,  9 Jun 2022 22:16:19 +0000 Bill Wendling <morbo@google.com> wrote:
> > >>
> > >> > This patch set fixes some clang warnings when -Wformat is enabled.
> > >>
> > >> tldr:
> > >>
> > >> -       printk(msg);
> > >> +       printk("%s", msg);
> > >>
> > >> Otherwise these changes are a
> > >> useless consumer of runtime resources.
> > >
> > >Calling a "printf" style function is already insanely expensive.
> > >[...]
> > >The "printk" and similar functions all have the "__printf" attribute.
> > >I don't know of a modification to that attribute which can turn off
> > >this type of check.
> >
> > Perhaps you can split vprintk_store in the middle (after the call to
> > vsnprintf), and offer the second half as a function of its own (e.g.
> > "puts"). Then the tldr could be
> >
> > - printk(msg);
> > + puts(msg);
> 
> That might be a nice compromise. Andrew, what do you think?

You would need to do that for all of the dev_printk() variants, so I
doubt that would ever be all that useful as almost no one should be
using a "raw" printk() these days.

thanks,

greg k-h
David Laight June 10, 2022, 8:17 a.m. UTC | #8
From: Bill Wendling
> Sent: 09 June 2022 23:49
> 
> On Thu, Jun 9, 2022 at 3:25 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Thu,  9 Jun 2022 22:16:19 +0000 Bill Wendling <morbo@google.com> wrote:
> >
> > > This patch set fixes some clang warnings when -Wformat is enabled.
> > >
> >
> > tldr:
> >
> > -       printk(msg);
> > +       printk("%s", msg);
> >
> > the only reason to make this change is where `msg' could contain a `%'.
> > Generally, it came from userspace.
> 
> It helps kernel developers not accidentally to insert an unescaped '%'
> in their messages, potentially exposing their code to an attack
> vector.
> 
> > Otherwise these changes are a
> > useless consumer of runtime resources.
> 
> Calling a "printf" style function is already insanely expensive. :-) I
> understand that it's not okay blithely to increase runtime resources
> simply because it's already slow, but in this case it's worthwhile.

Yep, IMHO definitely should be fixed.
It is even possible that using "%s" is faster because the printf
code doesn't have to scan the string for format effectors.

> > I think it would be better to quieten clang in some fashion.
> 
> The "printk" and similar functions all have the "__printf" attribute.
> I don't know of a modification to that attribute which can turn off
> this type of check.

And you wouldn't want to for these cases.

The only problems arise when the format is calculated
(or passed in from a caller).
But that is likely to be dangerous - reading formats from
files (eg for language translation) isn't a good idea at all.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Jan Engelhardt June 10, 2022, 8:32 a.m. UTC | #9
On Friday 2022-06-10 10:17, David Laight wrote:
>> 
>> Calling a "printf" style function is already insanely expensive. :-) I
>> understand that it's not okay blithely to increase runtime resources
>> simply because it's already slow, but in this case it's worthwhile.
>
>Yep, IMHO definitely should be fixed.
>It is even possible that using "%s" is faster because the printf
>code doesn't have to scan the string for format effectors.

I see no special handling; the vsnprintf function just loops
over fmt as usual and I see no special casing of fmt by
e.g. strcmp(fmt, "%s") == 0 to take a shortcut.
David Laight June 10, 2022, 9:14 a.m. UTC | #10
From: Jan Engelhardt
> Sent: 10 June 2022 09:32
> 
> 
> On Friday 2022-06-10 10:17, David Laight wrote:
> >>
> >> Calling a "printf" style function is already insanely expensive. :-) I
> >> understand that it's not okay blithely to increase runtime resources
> >> simply because it's already slow, but in this case it's worthwhile.
> >
> >Yep, IMHO definitely should be fixed.
> >It is even possible that using "%s" is faster because the printf
> >code doesn't have to scan the string for format effectors.
> 
> I see no special handling; the vsnprintf function just loops
> over fmt as usual and I see no special casing of fmt by
> e.g. strcmp(fmt, "%s") == 0 to take a shortcut.

Consider the difference between:
	printf("fubar");
and
	printf("%s", "fubar");
In the former all of "fubar" is checked for '%'.
In the latter only the length of "fubar" has to be counted.

FWIW the patch description should probably by:
	use "%s" when formatting a single string.
(or something to that effect).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Jan Engelhardt June 10, 2022, 9:22 a.m. UTC | #11
On Friday 2022-06-10 11:14, David Laight wrote:
>> >Yep, IMHO definitely should be fixed.
>> >It is even possible that using "%s" is faster because the printf
>> >code doesn't have to scan the string for format effectors.
>> 
>> I see no special handling; the vsnprintf function just loops
>> over fmt as usual and I see no special casing of fmt by
>> e.g. strcmp(fmt, "%s") == 0 to take a shortcut.
>
>Consider the difference between:
>	printf("fubar");
>and
>	printf("%s", "fubar");
>In the former all of "fubar" is checked for '%'.
>In the latter only the length of "fubar" has to be counted.

To check the length of "fubar", printf first needs to know that there
even is an argument to be pulled from the stack, which it does by
evaluating the format string.

So, in fairness, it's more like:

 >> In the latter, all of "%s" is checked for '%'.
Joe Perches June 10, 2022, 12:44 p.m. UTC | #12
On Fri, 2022-06-10 at 07:20 +0200, Greg Kroah-Hartman wrote:
> On Thu, Jun 09, 2022 at 04:16:16PM -0700, Bill Wendling wrote:
> > On Thu, Jun 9, 2022 at 4:03 PM Jan Engelhardt <jengelh@inai.de> wrote:
> > > On Friday 2022-06-10 00:49, Bill Wendling wrote:
> > > > On Thu, Jun 9, 2022 at 3:25 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > > On Thu,  9 Jun 2022 22:16:19 +0000 Bill Wendling <morbo@google.com> wrote:
> > > > > 
> > > > > > This patch set fixes some clang warnings when -Wformat is enabled.
> > > > > 
> > > > > tldr:
> > > > > 
> > > > > -       printk(msg);
> > > > > +       printk("%s", msg);
> > > > > 
> > > > > Otherwise these changes are a
> > > > > useless consumer of runtime resources.

> > > > Calling a "printf" style function is already insanely expensive.

I expect the printk code itself dominates, not the % scan cost.

> > > Perhaps you can split vprintk_store in the middle (after the call to
> > > vsnprintf), and offer the second half as a function of its own (e.g.
> > > "puts"). Then the tldr could be
> > > 
> > > - printk(msg);
> > > + puts(msg);
> > 
> > That might be a nice compromise. Andrew, what do you think?
> 
> You would need to do that for all of the dev_printk() variants, so I
> doubt that would ever be all that useful as almost no one should be
> using a "raw" printk() these days.

True.  The kernel has ~20K variants like that.

$ git grep -P '\b(?:(?:\w+_){1,3}(?:alert|emerg|crit|err|warn|notice|info|cont|debug|dbg)|printk)\s*\(".*"\s*\)\s*;' | wc -l
21160

That doesn't include the ~3K uses like

#define foo "bar"
	printk(foo);

$ git grep -P '\b(?:(?:\w+_){1,3}(?:alert|emerg|crit|err|warn|info|notice|debug|dbg|cont)|printk)\s*\((?:\s*\w+){1,3}\s*\)\s*;'|wc -l
2922

There are apparently only a few hundred uses of variants like:

	printk("%s", foo)

$ git grep -P '\b(?:(?:\w+_){1,3}(?:alert|emerg|crit|err|warn|info|notice|debug|dbg|cont)|printk)\s*\(\s*"%s(?:\\n)?"\s*,\s*(?:".*"|\w+)\s*\)\s*;' | wc -l
305

unless I screwed up my greps (which of course is quite possible)