diff mbox series

objtool: prefer memory clobber & %= to volatile & __COUNTER__

Message ID 20220114010526.1776605-1-ndesaulniers@google.com (mailing list archive)
State Not Applicable, archived
Headers show
Series objtool: prefer memory clobber & %= to volatile & __COUNTER__ | expand

Commit Message

Nick Desaulniers Jan. 14, 2022, 1:05 a.m. UTC
commit dcce50e6cc4d ("compiler.h: Fix annotation macro misplacement with Clang")
mentions:

> 'volatile' is ignored for some reason and Clang feels free to move the
> reachable() annotation away from its intended location.

Indeed, volatile is not a compiler barrier. Particularly once `-march=`
flags are used under certain configs, LLVM's machine-scheduler can be
observed moving instructions across the asm statement meant to point to
known reachable or unreachable code, as reported by 0day bot.

Prefer a memory clobber which is a compiler barrier that prevents these
re-orderings and remove the volatile qualifier.

Looking closer, the use of __COUNTER__ seems to have been used to
prevent de-duplication of these asm statements. The GCC manual mentions:

> Under certain circumstances, GCC may duplicate (or remove duplicates
> of) your assembly code when optimizing. This can lead to unexpected
> duplicate symbol errors during compilation if your asm code defines
> symbols or labels. Using ‘%=’ (see AssemblerTemplate) may help resolve
> this problem.
>
> ‘%=’ Outputs a number that is unique to each instance of the asm
> statement in the entire compilation. This option is useful when
> creating local labels and referring to them multiple times in a single
> template that generates multiple assembler instructions.

commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")

Mentions that

> The inline asm ‘%=’ token could be used for that, but unfortunately
> older versions of GCC don't support it.

From testing all versions of GCC available on godbolt.org, GCC 4.1+
seems to support 4.1. Since the minimum supported version of GCC at the
moment is GCC 5.1, it sounds like this is no longer a concern.

Prefer the %= assembler template to having to stringify __COUNTER__.

This commit is effectively a revert of the following commits:
commit dcce50e6cc4d ("compiler.h: Fix annotation macro misplacement with Clang")
commit f1069a8756b9 ("compiler.h: Avoid using inline asm operand modifiers")
commit c199f64ff93c ("instrumentation.h: Avoid using inline asm operand modifiers")
commit d0c2e691d1cb ("objtool: Add a comment for the unreachable annotation macros")
commit ec1e1b610917 ("objtool: Prevent GCC from merging annotate_unreachable(), take 2")
commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")

Link: https://github.com/ClangBuiltLinux/linux/issues/1566
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#AssemblerTemplate
Link: https://lore.kernel.org/llvm/202112080834.XFYU8b5Q-lkp@intel.com/
Link: https://lore.kernel.org/llvm/202111300857.IyINAyJk-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Debugged-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 include/linux/compiler.h        | 31 +++++++++++--------------------
 include/linux/instrumentation.h | 24 ++++++++++--------------
 2 files changed, 21 insertions(+), 34 deletions(-)


base-commit: dcce50e6cc4d86a63dc0a9a6ee7d4f948ccd53a1

Comments

Nathan Chancellor Jan. 14, 2022, 9:58 p.m. UTC | #1
On Thu, Jan 13, 2022 at 05:05:26PM -0800, Nick Desaulniers wrote:
> commit dcce50e6cc4d ("compiler.h: Fix annotation macro misplacement with Clang")
> mentions:
> 
> > 'volatile' is ignored for some reason and Clang feels free to move the
> > reachable() annotation away from its intended location.

Perhaps it is worth mentioning in the next paragraph that this does not
contradict GCC's own documentation that you link to below, which
mentions that asm volatile statements can be reordered.

"Note that the compiler can move even volatile asm instructions relative
to other code, including across jump instructions."

No strong opinion though.

> Indeed, volatile is not a compiler barrier. Particularly once `-march=`
> flags are used under certain configs, LLVM's machine-scheduler can be
> observed moving instructions across the asm statement meant to point to
> known reachable or unreachable code, as reported by 0day bot.
> 
> Prefer a memory clobber which is a compiler barrier that prevents these
> re-orderings and remove the volatile qualifier.
> 
> Looking closer, the use of __COUNTER__ seems to have been used to
> prevent de-duplication of these asm statements. The GCC manual mentions:
> 
> > Under certain circumstances, GCC may duplicate (or remove duplicates
> > of) your assembly code when optimizing. This can lead to unexpected
> > duplicate symbol errors during compilation if your asm code defines
> > symbols or labels. Using ‘%=’ (see AssemblerTemplate) may help resolve
> > this problem.
> >
> > ‘%=’ Outputs a number that is unique to each instance of the asm
> > statement in the entire compilation. This option is useful when
> > creating local labels and referring to them multiple times in a single
> > template that generates multiple assembler instructions.
> 
> commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")
> 
> Mentions that
> 
> > The inline asm ‘%=’ token could be used for that, but unfortunately
> > older versions of GCC don't support it.
> 
> From testing all versions of GCC available on godbolt.org, GCC 4.1+
> seems to support 4.1. Since the minimum supported version of GCC at the
> moment is GCC 5.1, it sounds like this is no longer a concern.
> 
> Prefer the %= assembler template to having to stringify __COUNTER__.
> 
> This commit is effectively a revert of the following commits:
> commit dcce50e6cc4d ("compiler.h: Fix annotation macro misplacement with Clang")
> commit f1069a8756b9 ("compiler.h: Avoid using inline asm operand modifiers")
> commit c199f64ff93c ("instrumentation.h: Avoid using inline asm operand modifiers")
> commit d0c2e691d1cb ("objtool: Add a comment for the unreachable annotation macros")
> commit ec1e1b610917 ("objtool: Prevent GCC from merging annotate_unreachable(), take 2")
> commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1566
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#AssemblerTemplate
> Link: https://lore.kernel.org/llvm/202112080834.XFYU8b5Q-lkp@intel.com/
> Link: https://lore.kernel.org/llvm/202111300857.IyINAyJk-lkp@intel.com/
> Reported-by: kernel test robot <lkp@intel.com>
> Debugged-by: Nathan Chancellor <nathan@kernel.org>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

This resolves the original spew of warnings and does not regress a
couple of other configurations that I tested.

Tested-by: Nathan Chancellor <nathan@kernel.org>

Not sure I am qualified enough to give a reviewed-by.

> ---
>  include/linux/compiler.h        | 31 +++++++++++--------------------
>  include/linux/instrumentation.h | 24 ++++++++++--------------
>  2 files changed, 21 insertions(+), 34 deletions(-)
> 
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 429dcebe2b99..3ac21b888d20 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -108,30 +108,21 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>  # define barrier_before_unreachable() do { } while (0)
>  #endif
>  
> -/* Unreachable code */
> +/* These macros help objtool understand GCC code flow for unreachable code. */
>  #ifdef CONFIG_STACK_VALIDATION
> -/*
> - * These macros help objtool understand GCC code flow for unreachable code.
> - * The __COUNTER__ based labels are a hack to make each instance of the macros
> - * unique, to convince GCC not to merge duplicate inline asm statements.
> - */
> -#define __stringify_label(n) #n
> -
> -#define __annotate_reachable(c) ({					\
> -	asm volatile(__stringify_label(c) ":\n\t"			\
> -		     ".pushsection .discard.reachable\n\t"		\
> -		     ".long " __stringify_label(c) "b - .\n\t"		\
> -		     ".popsection\n\t" : : "i" (c));			\
> +#define annotate_reachable() ({			\
> +	asm (".Lreachable%=:\n\t"			\
> +	     ".pushsection .discard.reachable\n\t"	\
> +	     ".long .Lreachable%= - .\n\t"		\
> +	     ".popsection\n\t" ::: "memory");		\
>  })
> -#define annotate_reachable() __annotate_reachable(__COUNTER__)
>  
> -#define __annotate_unreachable(c) ({					\
> -	asm volatile(__stringify_label(c) ":\n\t"			\
> -		     ".pushsection .discard.unreachable\n\t"		\
> -		     ".long " __stringify_label(c) "b - .\n\t"		\
> -		     ".popsection\n\t" : : "i" (c));			\
> +#define annotate_unreachable() ({			\
> +	asm (".Lunreachable%=:\n\t"			\
> +	     ".pushsection .discard.unreachable\n\t"	\
> +	     ".long .Lunreachable%= - .\n\t"		\
> +	     ".popsection\n\t" ::: "memory");		\
>  })
> -#define annotate_unreachable() __annotate_unreachable(__COUNTER__)
>  
>  #define ASM_UNREACHABLE							\
>  	"999:\n\t"							\
> diff --git a/include/linux/instrumentation.h b/include/linux/instrumentation.h
> index 24359b4a9605..0dae9c08764f 100644
> --- a/include/linux/instrumentation.h
> +++ b/include/linux/instrumentation.h
> @@ -4,16 +4,13 @@
>  
>  #if defined(CONFIG_DEBUG_ENTRY) && defined(CONFIG_STACK_VALIDATION)
>  
> -#include <linux/stringify.h>
> -
>  /* Begin/end of an instrumentation safe region */
> -#define __instrumentation_begin(c) ({					\
> -	asm volatile(__stringify(c) ": nop\n\t"				\
> -		     ".pushsection .discard.instr_begin\n\t"		\
> -		     ".long " __stringify(c) "b - .\n\t"		\
> -		     ".popsection\n\t" : : "i" (c));			\
> +#define instrumentation_begin() ({			\
> +	asm (".Linstbegin%=: nop\n\t"			\
> +	     ".pushsection .discard.instr_begin\n\t"	\
> +	     ".long .Linstbegin%= - .\n\t"		\
> +	     ".popsection\n\t" ::: "memory");		\
>  })
> -#define instrumentation_begin() __instrumentation_begin(__COUNTER__)
>  
>  /*
>   * Because instrumentation_{begin,end}() can nest, objtool validation considers
> @@ -46,13 +43,12 @@
>   * To avoid this, have _end() be a NOP instruction, this ensures it will be
>   * part of the condition block and does not escape.
>   */
> -#define __instrumentation_end(c) ({					\
> -	asm volatile(__stringify(c) ": nop\n\t"				\
> -		     ".pushsection .discard.instr_end\n\t"		\
> -		     ".long " __stringify(c) "b - .\n\t"		\
> -		     ".popsection\n\t" : : "i" (c));			\
> +#define instrumentation_end() ({			\
> +	asm (".Linstend%=: nop\n\t"			\
> +	     ".pushsection .discard.instr_end\n\t"	\
> +	     ".long .Linstend%= - .\n\t"		\
> +	     ".popsection\n\t" ::: "memory");		\
>  })
> -#define instrumentation_end() __instrumentation_end(__COUNTER__)
>  #else
>  # define instrumentation_begin()	do { } while(0)
>  # define instrumentation_end()		do { } while(0)
> 
> base-commit: dcce50e6cc4d86a63dc0a9a6ee7d4f948ccd53a1
> -- 
> 2.34.1.703.g22d0c6ccf7-goog
>
Borislav Petkov Jan. 16, 2022, 1:32 p.m. UTC | #2
Hi Nick,

thanks for taking the time.

Here are a bunch of nitpicks:

On Thu, Jan 13, 2022 at 05:05:26PM -0800, Nick Desaulniers wrote:
> Subject: Re: [PATCH] objtool: prefer memory clobber & %= to volatile &  __COUNTER__

You probably wanna formulate that subject a little less cryptic and more
to the point what the patch does:

objtool: Use a memory clobber to prevent reachability annotations from getting merged

or so.

> commit dcce50e6cc4d ("compiler.h: Fix annotation macro misplacement with Clang")
> mentions:
> 
> > 'volatile' is ignored for some reason and Clang feels free to move the
> > reachable() annotation away from its intended location.
> 
> Indeed, volatile is not a compiler barrier. Particularly once `-march=`
> flags are used under certain configs, LLVM's machine-scheduler can be
> observed moving instructions across the asm statement meant to point to
> known reachable or unreachable code, as reported by 0day bot.
> 
> Prefer a memory clobber which is a compiler barrier that prevents these
> re-orderings and remove the volatile qualifier.

Yah, Nathan's note about talking about gcc reordering volatile inline
asms might make sense here as, apparently, that could have failed on gcc
too but it didn't for whatever reason and the most important thing is
that the memory clobber will prevent such reordering.

> Looking closer, the use of __COUNTER__ seems to have been used to

"... it seems, __COUNTER__ has been used... " or so. Simpler.

> prevent de-duplication of these asm statements. The GCC manual mentions:
> 
> > Under certain circumstances, GCC may duplicate (or remove duplicates
> > of) your assembly code when optimizing. This can lead to unexpected
> > duplicate symbol errors during compilation if your asm code defines
> > symbols or labels. Using ‘%=’ (see AssemblerTemplate) may help resolve
> > this problem.
> >
> > ‘%=’ Outputs a number that is unique to each instance of the asm
> > statement in the entire compilation. This option is useful when
> > creating local labels and referring to them multiple times in a single
> > template that generates multiple assembler instructions.
> 
> commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")
> 
> Mentions that
> 
> > The inline asm ‘%=’ token could be used for that, but unfortunately
> > older versions of GCC don't support it.
> 
> From testing all versions of GCC available on godbolt.org, GCC 4.1+
> seems to support 4.1. Since the minimum supported version of GCC at the

You mean "seems to support %=." here perhaps?

> moment is GCC 5.1, it sounds like this is no longer a concern.
> 
> Prefer the %= assembler template to having to stringify __COUNTER__.
> 
> This commit is effectively a revert of the following commits:
> commit dcce50e6cc4d ("compiler.h: Fix annotation macro misplacement with Clang")
> commit f1069a8756b9 ("compiler.h: Avoid using inline asm operand modifiers")
> commit c199f64ff93c ("instrumentation.h: Avoid using inline asm operand modifiers")
> commit d0c2e691d1cb ("objtool: Add a comment for the unreachable annotation macros")
> commit ec1e1b610917 ("objtool: Prevent GCC from merging annotate_unreachable(), take 2")
> commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1566
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#AssemblerTemplate
> Link: https://lore.kernel.org/llvm/202112080834.XFYU8b5Q-lkp@intel.com/
> Link: https://lore.kernel.org/llvm/202111300857.IyINAyJk-lkp@intel.com/
> Reported-by: kernel test robot <lkp@intel.com>
> Debugged-by: Nathan Chancellor <nathan@kernel.org>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

I very much definitely appreciate the extensive and comprehensive
writeup - thanks for taking the time!

> ---
>  include/linux/compiler.h        | 31 +++++++++++--------------------
>  include/linux/instrumentation.h | 24 ++++++++++--------------
>  2 files changed, 21 insertions(+), 34 deletions(-)
> 
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 429dcebe2b99..3ac21b888d20 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -108,30 +108,21 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>  # define barrier_before_unreachable() do { } while (0)
>  #endif
>  
> -/* Unreachable code */
> +/* These macros help objtool understand GCC code flow for unreachable code. */
>  #ifdef CONFIG_STACK_VALIDATION
> -/*
> - * These macros help objtool understand GCC code flow for unreachable code.
> - * The __COUNTER__ based labels are a hack to make each instance of the macros
> - * unique, to convince GCC not to merge duplicate inline asm statements.
> - */
> -#define __stringify_label(n) #n
> -
> -#define __annotate_reachable(c) ({					\
> -	asm volatile(__stringify_label(c) ":\n\t"			\
> -		     ".pushsection .discard.reachable\n\t"		\
> -		     ".long " __stringify_label(c) "b - .\n\t"		\
> -		     ".popsection\n\t" : : "i" (c));			\
> +#define annotate_reachable() ({			\
> +	asm (".Lreachable%=:\n\t"			\
	   ^

	   no need for that space.


> +	     ".pushsection .discard.reachable\n\t"	\
> +	     ".long .Lreachable%= - .\n\t"		\
> +	     ".popsection\n\t" ::: "memory");		\

Btw, you don't need the "\t" on the last lines of the macros here. The
asm output has a stray tab this way.

As to the change itself, before it would look like this:

# arch/x86/kernel/reboot.c:130: 	unreachable();
# 130 "arch/x86/kernel/reboot.c" 1
	333:
	.pushsection .discard.unreachable
	.long 333b - .
	.popsection
	


and after

# arch/x86/kernel/reboot.c:130: 	unreachable();
# 130 "arch/x86/kernel/reboot.c" 1
	.Lunreachable44:
	.pushsection .discard.unreachable
	.long .Lunreachable44 - .
	.popsection


so I like the local label and how it is more readable this way.

So, provided the memory clobber works (I wonder here if Josh has some
concrete failing cases which could be tested with your version) and
after the nitpicks have been addressed

Acked-by: Borislav Petkov <bp@suse.de>

Thx.
Josh Poimboeuf Jan. 18, 2022, 7:22 p.m. UTC | #3
On Sun, Jan 16, 2022 at 02:32:59PM +0100, Borislav Petkov wrote:
> so I like the local label and how it is more readable this way.
> 
> So, provided the memory clobber works (I wonder here if Josh has some
> concrete failing cases which could be tested with your version) and
> after the nitpicks have been addressed
> 
> Acked-by: Borislav Petkov <bp@suse.de>

I think Nick was already able to recreate the original issue.  I'll run
it through some more testing.

I wanted to make this change years ago, but couldn't because of legacy
toolchains.  Here's hoping this is the final solution for those @#$%^
macros.

Boris, thanks for looping Nick in, I should have done so to begin with.
Josh Poimboeuf Jan. 18, 2022, 11:01 p.m. UTC | #4
On Tue, Jan 18, 2022 at 11:22:59AM -0800, Josh Poimboeuf wrote:
> On Sun, Jan 16, 2022 at 02:32:59PM +0100, Borislav Petkov wrote:
> > so I like the local label and how it is more readable this way.
> > 
> > So, provided the memory clobber works (I wonder here if Josh has some
> > concrete failing cases which could be tested with your version) and
> > after the nitpicks have been addressed
> > 
> > Acked-by: Borislav Petkov <bp@suse.de>
> 
> I think Nick was already able to recreate the original issue.  I'll run
> it through some more testing.
> 
> I wanted to make this change years ago, but couldn't because of legacy
> toolchains.  Here's hoping this is the final solution for those @#$%^
> macros.
> 
> Boris, thanks for looping Nick in, I should have done so to begin with.

Apparently this patch isn't going to work after all :-(

  https://lkml.kernel.org/r/202201190632.lhlaiCBk-lkp@intel.com

With the two WARN_ONs in media_request_object_complete(), GCC apparently
considers the two reachable() asm statements as duplicates, and it
removes the second one.
Borislav Petkov Jan. 18, 2022, 11:33 p.m. UTC | #5
On Tue, Jan 18, 2022 at 03:01:20PM -0800, Josh Poimboeuf wrote:
> With the two WARN_ONs in media_request_object_complete(), GCC apparently
> considers the two reachable() asm statements as duplicates, and it
> removes the second one.

Could that be the same thing:

net/xfrm/xfrm_output.o: warning: objtool: xfrm_output_resume()+0x2e0: unreachable instruction

I see two WARN_ONs in asm output there too...
Josh Poimboeuf Jan. 19, 2022, 12:03 a.m. UTC | #6
On Wed, Jan 19, 2022 at 12:33:02AM +0100, Borislav Petkov wrote:
> On Tue, Jan 18, 2022 at 03:01:20PM -0800, Josh Poimboeuf wrote:
> > With the two WARN_ONs in media_request_object_complete(), GCC apparently
> > considers the two reachable() asm statements as duplicates, and it
> > removes the second one.
> 
> Could that be the same thing:
> 
> net/xfrm/xfrm_output.o: warning: objtool: xfrm_output_resume()+0x2e0: unreachable instruction
> 
> I see two WARN_ONs in asm output there too...

If one of the '__bug_table' asm snippets isn't immediately followed by
the .L[un]reachable asm, then yeah, it's the same issue.


For example it's supposed to look something like this:


# 472 "net/xfrm/xfrm_output.c" 1
	1:	.byte 0x0f, 0x0b
.pushsection __bug_table,"aw"
2:	.long 1b - 2b	# bug_entry::bug_addr
	.long .LC4 - 2b	# bug_entry::file	#
	.word 472	# bug_entry::line	#
	.word 2307	# bug_entry::flags	#
	.org 2b+12	#
.popsection
# 0 "" 2
# 472 "net/xfrm/xfrm_output.c" 1
	.Lreachable1666:
	.pushsection .discard.reachable
	.long .Lreachable1666 - .
	.popsection


NOT just this:


# 472 "net/xfrm/xfrm_output.c" 1
	1:	.byte 0x0f, 0x0b
.pushsection __bug_table,"aw"
2:	.long 1b - 2b	# bug_entry::bug_addr
	.long .LC4 - 2b	# bug_entry::file	#
	.word 472	# bug_entry::line	#
	.word 2307	# bug_entry::flags	#
	.org 2b+12	#
.popsection
# some other code here...


There's a bunch of those throughout the code base.  The current
annotate_[un]reachable() implementations are carefully written to avoid
that happening.
Borislav Petkov Jan. 19, 2022, 10:01 a.m. UTC | #7
On Tue, Jan 18, 2022 at 04:03:27PM -0800, Josh Poimboeuf wrote:
> If one of the '__bug_table' asm snippets isn't immediately followed by
> the .L[un]reachable asm, then yeah, it's the same issue.

Found one.

AFAICT, that's the WARN_ON_ONCE(1) catch-all in the default: label of
the switch-case in nf_hook(). That thing is followed by other gunk and
no *reachable label near it.

Damn - that was too good to be true. Gotta love those compilers. :-P

# ./include/linux/netfilter.h:252: 		WARN_ON_ONCE(1);
#APP
# 252 "./include/linux/netfilter.h" 1
	1:	.byte 0x0f, 0x0b
.pushsection __bug_table,"aw"
2:	.long 1b - 2b	# bug_entry::bug_addr
	.long .LC5 - 2b	# bug_entry::file	#
	.word 252	# bug_entry::line	#
	.word 2307	# bug_entry::flags	#
	.org 2b+12	#
.popsection
# 0 "" 2
#NO_APP
	.p2align 4,,3
	jmp	.L344	#
.L265:
# ./include/linux/netfilter.h:229: 		hook_head = rcu_dereference(net->nf.hooks_ipv4[hook]);
Nick Desaulniers Jan. 24, 2022, 11:26 p.m. UTC | #8
(sorry for the delay responding, took a week off last week; spent a
lot of it thinking about this case though)

On Tue, Jan 18, 2022 at 3:01 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Tue, Jan 18, 2022 at 11:22:59AM -0800, Josh Poimboeuf wrote:
> > On Sun, Jan 16, 2022 at 02:32:59PM +0100, Borislav Petkov wrote:
> > > so I like the local label and how it is more readable this way.
> > >
> > > So, provided the memory clobber works (I wonder here if Josh has some
> > > concrete failing cases which could be tested with your version) and
> > > after the nitpicks have been addressed
> > >
> > > Acked-by: Borislav Petkov <bp@suse.de>
> >
> > I think Nick was already able to recreate the original issue.  I'll run
> > it through some more testing.
> >
> > I wanted to make this change years ago, but couldn't because of legacy
> > toolchains.  Here's hoping this is the final solution for those @#$%^
> > macros.
> >
> > Boris, thanks for looping Nick in, I should have done so to begin with.

No worries, we'll get all this sorted.  There's also
linux-toolchains@vger.kernel.org that can help with spooky toolchain
behavior.

I suspect this is what's causing a flood of objtool warnings when we
specify different instruction schedules via -march= (such as via
CONFIG_MATOM), which are hiding some of the actual codegen bugs
objtool has found in LLVM that we still need to fix.

I was looking at the HAVE_ARCH_BUG implementation in
arch/x86/include/asm/bug.h 2 weeks ago, and got the sinking feeling
that we'll probably need labels in C code, then asm goto to ensure
that control flow doesn't get folded in an unexpected way and which
I'd imagine might help avoid the costs of a full compiler barrier
(i.e. reloading all pointed-to values across it).

I'll follow up more with Josh on IRC on some ideas for different
approaches.  Having 0day bot test Josh's branches is very much
appreciated.

> Apparently this patch isn't going to work after all :-(
>
>   https://lkml.kernel.org/r/202201190632.lhlaiCBk-lkp@intel.com

I noticed in that report and
https://lore.kernel.org/lkml/202201190702.XNSXrMTK-lkp@intel.com/
that gcc-9 was used.  I wonder if %= has been fixed in gcc-10+? Have
there been other reports with gcc-10+ for my patch?

Boris' case of xfrm_output_resume is yet a third case; Boris, what
version of gcc did you spot that with?

If this is fixed in gcc-10, then we can probably add a comment with a
FIXME link to the issue or commit to replace __COUNTER__ with %= one
day.  If not, then we can probably come up with a reduced test case
for the GCC devs to take a look at, then add the FIXME comment to
kernel sources.

I'm more confident that we can remove the `volatile` keyword (I was
thinking about adding a new diagnostic to clang to warn that volatile
is redundate+implied for asm goto or inline asm that doesn't have
outputs) though that's not the problem here and will probably generate
some kernel wide cleanup before we could enable such a flag.  Perhaps
there are known compiler versions that still require the keyword for
those cases for some reason.

>
> With the two WARN_ONs in media_request_object_complete(), GCC apparently
> considers the two reachable() asm statements as duplicates, and it
> removes the second one.

On Fri, Jan 14, 2022 at 1:58 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> Perhaps it is worth mentioning in the next paragraph that this does not
> contradict GCC's own documentation that you link to below, which
> mentions that asm volatile statements can be reordered.
>
> "Note that the compiler can move even volatile asm instructions relative
> to other code, including across jump instructions."

On Sun, Jan 16, 2022 at 5:33 AM Borislav Petkov <bp@alien8.de> wrote:
>
> Yah, Nathan's note about talking about gcc reordering volatile inline
> asms might make sense here as, apparently, that could have failed on gcc
> too but it didn't for whatever reason and the most important thing is
> that the memory clobber will prevent such reordering.

Very much YES to both points.  Will include those in the commit
message for whatever we come up with for the final solution.
Nick Desaulniers Jan. 24, 2022, 11:38 p.m. UTC | #9
On Mon, Jan 24, 2022 at 3:26 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Tue, Jan 18, 2022 at 3:01 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > Apparently this patch isn't going to work after all :-(
> >
> >   https://lkml.kernel.org/r/202201190632.lhlaiCBk-lkp@intel.com
>
> I noticed in that report and
> https://lore.kernel.org/lkml/202201190702.XNSXrMTK-lkp@intel.com/
> that gcc-9 was used.  I wonder if %= has been fixed in gcc-10+? Have
> there been other reports with gcc-10+ for my patch?
>
> Boris' case of xfrm_output_resume is yet a third case; Boris, what
> version of gcc did you spot that with?
>
> If this is fixed in gcc-10, then we can probably add a comment with a
> FIXME link to the issue or commit to replace __COUNTER__ with %= one
> day.  If not, then we can probably come up with a reduced test case
> for the GCC devs to take a look at, then add the FIXME comment to
> kernel sources.


$ wget https://download.01.org/0day-ci/archive/20220119/202201190702.XNSXrMTK-lkp@intel.com/config
-O .config
$ make -j72 -s olddefconfig drivers/net/wireless/mac80211_hwsim.o
drivers/net/wireless/mac80211_hwsim.o: warning: objtool:
mac80211_hwsim_tx()+0x9aa: unreachable instruction
$ gcc --version
gcc (Debian 11.2.0-12) 11.2.0

:(

Let me see if I can come up with a reduced test case that I will
report upstream to https://gcc.gnu.org/bugzilla/.
Borislav Petkov Jan. 25, 2022, 6:49 p.m. UTC | #10
On Mon, Jan 24, 2022 at 03:38:38PM -0800, Nick Desaulniers wrote:
> $ wget https://download.01.org/0day-ci/archive/20220119/202201190702.XNSXrMTK-lkp@intel.com/config
> -O .config
> $ make -j72 -s olddefconfig drivers/net/wireless/mac80211_hwsim.o
> drivers/net/wireless/mac80211_hwsim.o: warning: objtool:
> mac80211_hwsim_tx()+0x9aa: unreachable instruction
> $ gcc --version
> gcc (Debian 11.2.0-12) 11.2.0

Yap, I got mine with:

gcc (Debian 10.2.1-6) 10.2.1 20210110

but since you hit it with gcc11, I don't see why I won't hit it with
gcc10.

> :(

You said it.
 
> Let me see if I can come up with a reduced test case that I will
> report upstream to https://gcc.gnu.org/bugzilla/.

Sounds good.

Please send the bug # once you have that so that I can add myself to Cc.

Thx.
Segher Boessenkool Jan. 25, 2022, 11:31 p.m. UTC | #11
Hi!

On Mon, Jan 24, 2022 at 03:26:36PM -0800, Nick Desaulniers wrote:
> I noticed in that report and
> https://lore.kernel.org/lkml/202201190702.XNSXrMTK-lkp@intel.com/
> that gcc-9 was used.  I wonder if %= has been fixed in gcc-10+? Have
> there been other reports with gcc-10+ for my patch?

None of the %= code (which is trivial) has been changed since 1992.

> If this is fixed in gcc-10, then we can probably add a comment with a
> FIXME link to the issue or commit to replace __COUNTER__ with %= one
> day.  If not, then we can probably come up with a reduced test case
> for the GCC devs to take a look at, then add the FIXME comment to
> kernel sources.

Please open a PR?

> I'm more confident that we can remove the `volatile` keyword (I was
> thinking about adding a new diagnostic to clang to warn that volatile
> is redundate+implied for asm goto or inline asm that doesn't have
> outputs) though that's not the problem here and will probably generate
> some kernel wide cleanup before we could enable such a flag.

Its main value is that it would discourage users from thinking volatile
is magic.  Seriously worth some pain!

> Perhaps
> there are known compiler versions that still require the keyword for
> those cases for some reason.

It was removed from compiler-gcc.h in 3347acc6fcd4 (which changed the
minimum required GCC version to GCC 5).


Segher
Nick Desaulniers Jan. 26, 2022, 12:59 a.m. UTC | #12
On Tue, Jan 25, 2022 at 3:34 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Hi!
>
> On Mon, Jan 24, 2022 at 03:26:36PM -0800, Nick Desaulniers wrote:
>
> > If this is fixed in gcc-10, then we can probably add a comment with a
> > FIXME link to the issue or commit to replace __COUNTER__ with %= one
> > day.  If not, then we can probably come up with a reduced test case
> > for the GCC devs to take a look at, then add the FIXME comment to
> > kernel sources.
>
> Please open a PR?

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104236

> > I'm more confident that we can remove the `volatile` keyword (I was
> > thinking about adding a new diagnostic to clang to warn that volatile
> > is redundate+implied for asm goto or inline asm that doesn't have
> > outputs) though that's not the problem here and will probably generate
> > some kernel wide cleanup before we could enable such a flag.
>
> Its main value is that it would discourage users from thinking volatile
> is magic.  Seriously worth some pain!

Yeah, SGTM.

>
> > Perhaps
> > there are known compiler versions that still require the keyword for
> > those cases for some reason.
>
> It was removed from compiler-gcc.h in 3347acc6fcd4 (which changed the
> minimum required GCC version to GCC 5).

```
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e512f5505dad..b8fe0c23cfff 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -80,11 +80,25 @@ void ftrace_likely_update(struct
ftrace_likely_data *f, int val,

 /* Optimization barrier */
 #ifndef barrier
-# define barrier() __memory_barrier()
+/* The "volatile" is due to gcc bugs */
+# define barrier() __asm__ __volatile__("": : :"memory")
```

I definitely wish there was a comment with a link to what "gcc bugs"
they were referring to; otherwise who knows if it's been fixed...if
they have been...
Nick Desaulniers Jan. 26, 2022, 2:12 a.m. UTC | #13
On Tue, Jan 25, 2022 at 4:59 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Tue, Jan 25, 2022 at 3:34 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > Hi!
> >
> > On Mon, Jan 24, 2022 at 03:26:36PM -0800, Nick Desaulniers wrote:
> >
> > > If this is fixed in gcc-10, then we can probably add a comment with a
> > > FIXME link to the issue or commit to replace __COUNTER__ with %= one
> > > day.  If not, then we can probably come up with a reduced test case
> > > for the GCC devs to take a look at, then add the FIXME comment to
> > > kernel sources.
> >
> > Please open a PR?
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104236

Andrew clarified (thanks Andrew!) that %= can't be used as I imagined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104236#c4
and that I think was alluded to in
commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")
which is fine, so I'll just need to keep usage of __COUNTER__. I'll
try out a revised approach for this patch tomorrow.
Segher Boessenkool Jan. 26, 2022, 11:13 a.m. UTC | #14
Hi!

On Tue, Jan 25, 2022 at 06:12:20PM -0800, Nick Desaulniers wrote:
> Andrew clarified (thanks Andrew!) that %= can't be used as I imagined
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104236#c4
> and that I think was alluded to in
> commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")
> which is fine, so I'll just need to keep usage of __COUNTER__.

Aha.  Yes, %= *outputs* a unique number.  Before the assembler output
is written %= is just a string (like any other piece of output
template).  The output template is used for three things:

1) Inline assembler statements with different output templates are not
considered identical;
2) Newlines and assembler statement separators (semicolons for most
assembler dialects) are used to estimate the size of the machine code
generated.  This is a pessimistic estimate, but within reason: for
example you can write a million byte output with just a few characters
of input, if you want to sabotage yourself;
3) The output template is used at output time.

The mentioned commit's message says "unfortunately older versions of GCC
don't support it" which is mistaken.

If you want two identical inline asm statements to not be considered
identical by the compiler, you have to make them not identical.  Like by
using __COUNTER__ for example, yes :-)


Segher
Nick Desaulniers Jan. 31, 2022, 8:45 p.m. UTC | #15
On Tue, Jan 25, 2022 at 3:34 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Hi!
>
> On Mon, Jan 24, 2022 at 03:26:36PM -0800, Nick Desaulniers wrote:
> > I'm more confident that we can remove the `volatile` keyword (I was
> > thinking about adding a new diagnostic to clang to warn that volatile
> > is redundate+implied for asm goto or inline asm that doesn't have
> > outputs) though that's not the problem here and will probably generate
> > some kernel wide cleanup before we could enable such a flag.
>
> Its main value is that it would discourage users from thinking volatile
> is magic.  Seriously worth some pain!

https://reviews.llvm.org/D118297
PTAL
Segher Boessenkool Jan. 31, 2022, 10:13 p.m. UTC | #16
On Mon, Jan 31, 2022 at 12:45:20PM -0800, Nick Desaulniers wrote:
> On Tue, Jan 25, 2022 at 3:34 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > Hi!
> >
> > On Mon, Jan 24, 2022 at 03:26:36PM -0800, Nick Desaulniers wrote:
> > > I'm more confident that we can remove the `volatile` keyword (I was
> > > thinking about adding a new diagnostic to clang to warn that volatile
> > > is redundate+implied for asm goto or inline asm that doesn't have
> > > outputs) though that's not the problem here and will probably generate
> > > some kernel wide cleanup before we could enable such a flag.
> >
> > Its main value is that it would discourage users from thinking volatile
> > is magic.  Seriously worth some pain!
> 
> https://reviews.llvm.org/D118297
> PTAL

""
  Really the volatile asm-qualifier exists only to signal that an asm
  statement should not be DCE'd (when it has outputs but they are unused),
  CSE'd, or LICM'd. It is not a general compiler barrier.

It means that the asm has a side effect (one unknown to the compiler),
so it must be executed in the real machine just where it would be in the
abstract machine.  It *can* be CSEd, it *can* be DCEd, it can even be
optimised by LICM in certain cases: but it has to be executed as often
(and in the same order etc.) in the resulting machine code as it would
be if you single-stepped through the source code by hand.

Those are fine examples if you add "in most cases" (and that they are
just examples, it's not an exhaustive list).

Thanks,


Segher
diff mbox series

Patch

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 429dcebe2b99..3ac21b888d20 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -108,30 +108,21 @@  void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 # define barrier_before_unreachable() do { } while (0)
 #endif
 
-/* Unreachable code */
+/* These macros help objtool understand GCC code flow for unreachable code. */
 #ifdef CONFIG_STACK_VALIDATION
-/*
- * These macros help objtool understand GCC code flow for unreachable code.
- * The __COUNTER__ based labels are a hack to make each instance of the macros
- * unique, to convince GCC not to merge duplicate inline asm statements.
- */
-#define __stringify_label(n) #n
-
-#define __annotate_reachable(c) ({					\
-	asm volatile(__stringify_label(c) ":\n\t"			\
-		     ".pushsection .discard.reachable\n\t"		\
-		     ".long " __stringify_label(c) "b - .\n\t"		\
-		     ".popsection\n\t" : : "i" (c));			\
+#define annotate_reachable() ({			\
+	asm (".Lreachable%=:\n\t"			\
+	     ".pushsection .discard.reachable\n\t"	\
+	     ".long .Lreachable%= - .\n\t"		\
+	     ".popsection\n\t" ::: "memory");		\
 })
-#define annotate_reachable() __annotate_reachable(__COUNTER__)
 
-#define __annotate_unreachable(c) ({					\
-	asm volatile(__stringify_label(c) ":\n\t"			\
-		     ".pushsection .discard.unreachable\n\t"		\
-		     ".long " __stringify_label(c) "b - .\n\t"		\
-		     ".popsection\n\t" : : "i" (c));			\
+#define annotate_unreachable() ({			\
+	asm (".Lunreachable%=:\n\t"			\
+	     ".pushsection .discard.unreachable\n\t"	\
+	     ".long .Lunreachable%= - .\n\t"		\
+	     ".popsection\n\t" ::: "memory");		\
 })
-#define annotate_unreachable() __annotate_unreachable(__COUNTER__)
 
 #define ASM_UNREACHABLE							\
 	"999:\n\t"							\
diff --git a/include/linux/instrumentation.h b/include/linux/instrumentation.h
index 24359b4a9605..0dae9c08764f 100644
--- a/include/linux/instrumentation.h
+++ b/include/linux/instrumentation.h
@@ -4,16 +4,13 @@ 
 
 #if defined(CONFIG_DEBUG_ENTRY) && defined(CONFIG_STACK_VALIDATION)
 
-#include <linux/stringify.h>
-
 /* Begin/end of an instrumentation safe region */
-#define __instrumentation_begin(c) ({					\
-	asm volatile(__stringify(c) ": nop\n\t"				\
-		     ".pushsection .discard.instr_begin\n\t"		\
-		     ".long " __stringify(c) "b - .\n\t"		\
-		     ".popsection\n\t" : : "i" (c));			\
+#define instrumentation_begin() ({			\
+	asm (".Linstbegin%=: nop\n\t"			\
+	     ".pushsection .discard.instr_begin\n\t"	\
+	     ".long .Linstbegin%= - .\n\t"		\
+	     ".popsection\n\t" ::: "memory");		\
 })
-#define instrumentation_begin() __instrumentation_begin(__COUNTER__)
 
 /*
  * Because instrumentation_{begin,end}() can nest, objtool validation considers
@@ -46,13 +43,12 @@ 
  * To avoid this, have _end() be a NOP instruction, this ensures it will be
  * part of the condition block and does not escape.
  */
-#define __instrumentation_end(c) ({					\
-	asm volatile(__stringify(c) ": nop\n\t"				\
-		     ".pushsection .discard.instr_end\n\t"		\
-		     ".long " __stringify(c) "b - .\n\t"		\
-		     ".popsection\n\t" : : "i" (c));			\
+#define instrumentation_end() ({			\
+	asm (".Linstend%=: nop\n\t"			\
+	     ".pushsection .discard.instr_end\n\t"	\
+	     ".long .Linstend%= - .\n\t"		\
+	     ".popsection\n\t" ::: "memory");		\
 })
-#define instrumentation_end() __instrumentation_end(__COUNTER__)
 #else
 # define instrumentation_begin()	do { } while(0)
 # define instrumentation_end()		do { } while(0)