diff mbox series

ARM: Ensure that NEON code always compiles with Clang

Message ID 20181215212304.19390-1-natechancellor@gmail.com (mailing list archive)
State Mainlined, archived
Commit de9c0d49d85dc563549972edc5589d195cd5e859
Headers show
Series ARM: Ensure that NEON code always compiles with Clang | expand

Commit Message

Nathan Chancellor Dec. 15, 2018, 9:23 p.m. UTC
While building arm32 allyesconfig, I ran into the following errors:

  arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
  '-mfloat-abi=softfp -mfpu=neon'

  In file included from lib/raid6/neon1.c:27:
  /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
  error: "NEON support not enabled"

Building V=1 showed NEON_FLAGS getting passed along to Clang but
__ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
which is the '-march' value for allyesconfig.

From lib/Basic/Targets/ARM.cpp in the Clang source:

  // This only gets set when Neon instructions are actually available, unlike
  // the VFP define, hence the soft float and arch check. This is subtly
  // different from gcc, we follow the intent which was that it should be set
  // when Neon instructions are actually available.
  if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
    Builder.defineMacro("__ARM_NEON", "1");
    Builder.defineMacro("__ARM_NEON__");
    // current AArch32 NEON implementations do not support double-precision
    // floating-point even when it is present in VFP.
    Builder.defineMacro("__ARM_NEON_FP",
                        "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
  }

Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
definined by Clang. This doesn't functionally change anything because
that code will only run where NEON is supported, which is implicitly
armv7.

Link: https://github.com/ClangBuiltLinux/linux/issues/287
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
---
 Documentation/arm/kernel_mode_neon.txt | 4 ++--
 arch/arm/lib/Makefile                  | 2 +-
 arch/arm/lib/xor-neon.c                | 2 +-
 lib/raid6/Makefile                     | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

Comments

Nicolas Pitre Dec. 17, 2018, 6:23 p.m. UTC | #1
On Sat, 15 Dec 2018, Nathan Chancellor wrote:

> While building arm32 allyesconfig, I ran into the following errors:
> 
>   arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
>   '-mfloat-abi=softfp -mfpu=neon'
> 
>   In file included from lib/raid6/neon1.c:27:
>   /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
>   error: "NEON support not enabled"
> 
> Building V=1 showed NEON_FLAGS getting passed along to Clang but
> __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
> only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
> which is the '-march' value for allyesconfig.
> 
> From lib/Basic/Targets/ARM.cpp in the Clang source:
> 
>   // This only gets set when Neon instructions are actually available, unlike
>   // the VFP define, hence the soft float and arch check. This is subtly
>   // different from gcc, we follow the intent which was that it should be set
>   // when Neon instructions are actually available.
>   if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
>     Builder.defineMacro("__ARM_NEON", "1");
>     Builder.defineMacro("__ARM_NEON__");
>     // current AArch32 NEON implementations do not support double-precision
>     // floating-point even when it is present in VFP.
>     Builder.defineMacro("__ARM_NEON_FP",
>                         "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
>   }
> 
> Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
> beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
> definined by Clang. This doesn't functionally change anything because
> that code will only run where NEON is supported, which is implicitly
> armv7.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/287
> Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

Did you test that this doesn't create issues with gcc e.g. complaints 
from the linker that objects have incompatible architecture 
specifications or similar annoyance? This already happened in the past 
but I forget the exact scenario. If you already did, or after you do 
validate with gcc as well, then you may add:

Acked-by: Nicolas Pitre <nico@linaro.org>



> ---
>  Documentation/arm/kernel_mode_neon.txt | 4 ++--
>  arch/arm/lib/Makefile                  | 2 +-
>  arch/arm/lib/xor-neon.c                | 2 +-
>  lib/raid6/Makefile                     | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
> index 525452726d31..b9e060c5b61e 100644
> --- a/Documentation/arm/kernel_mode_neon.txt
> +++ b/Documentation/arm/kernel_mode_neon.txt
> @@ -6,7 +6,7 @@ TL;DR summary
>  * Use only NEON instructions, or VFP instructions that don't rely on support
>    code
>  * Isolate your NEON code in a separate compilation unit, and compile it with
> -  '-mfpu=neon -mfloat-abi=softfp'
> +  '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
>  * Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
>    NEON code
>  * Don't sleep in your NEON code, and be aware that it will be executed with
> @@ -87,7 +87,7 @@ instructions appearing in unexpected places if no special care is taken.
>  Therefore, the recommended and only supported way of using NEON/VFP in the
>  kernel is by adhering to the following rules:
>  * isolate the NEON code in a separate compilation unit and compile it with
> -  '-mfpu=neon -mfloat-abi=softfp';
> +  '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
>  * issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
>    into the unit containing the NEON code from a compilation unit which is *not*
>    built with the GCC flag '-mfpu=neon' set.
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index ad25fd1872c7..0bff0176db2c 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o:	$(obj)/csumpartialcopygeneric.S
>  $(obj)/csumpartialcopyuser.o:	$(obj)/csumpartialcopygeneric.S
>  
>  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> -  NEON_FLAGS			:= -mfloat-abi=softfp -mfpu=neon
> +  NEON_FLAGS			:= -march=armv7-a -mfloat-abi=softfp -mfpu=neon
>    CFLAGS_xor-neon.o		+= $(NEON_FLAGS)
>    obj-$(CONFIG_XOR_BLOCKS)	+= xor-neon.o
>  endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index a6741a895189..4600b62d845f 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,7 +14,7 @@
>  MODULE_LICENSE("GPL");
>  
>  #ifndef __ARM_NEON__
> -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
>  #endif
>  
>  /*
> diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
> index 2f8b61dfd9b0..bfec7c87c61e 100644
> --- a/lib/raid6/Makefile
> +++ b/lib/raid6/Makefile
> @@ -25,7 +25,7 @@ endif
>  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
>  NEON_FLAGS := -ffreestanding
>  ifeq ($(ARCH),arm)
> -NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
> +NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
>  endif
>  CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
>  ifeq ($(ARCH),arm64)
> -- 
> 2.20.1
> 
>
Nathan Chancellor Dec. 17, 2018, 7:34 p.m. UTC | #2
On Mon, Dec 17, 2018 at 01:23:52PM -0500, Nicolas Pitre wrote:
> On Sat, 15 Dec 2018, Nathan Chancellor wrote:
> 
> > While building arm32 allyesconfig, I ran into the following errors:
> > 
> >   arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
> >   '-mfloat-abi=softfp -mfpu=neon'
> > 
> >   In file included from lib/raid6/neon1.c:27:
> >   /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
> >   error: "NEON support not enabled"
> > 
> > Building V=1 showed NEON_FLAGS getting passed along to Clang but
> > __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
> > only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
> > which is the '-march' value for allyesconfig.
> > 
> > From lib/Basic/Targets/ARM.cpp in the Clang source:
> > 
> >   // This only gets set when Neon instructions are actually available, unlike
> >   // the VFP define, hence the soft float and arch check. This is subtly
> >   // different from gcc, we follow the intent which was that it should be set
> >   // when Neon instructions are actually available.
> >   if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
> >     Builder.defineMacro("__ARM_NEON", "1");
> >     Builder.defineMacro("__ARM_NEON__");
> >     // current AArch32 NEON implementations do not support double-precision
> >     // floating-point even when it is present in VFP.
> >     Builder.defineMacro("__ARM_NEON_FP",
> >                         "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
> >   }
> > 
> > Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
> > beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
> > definined by Clang. This doesn't functionally change anything because
> > that code will only run where NEON is supported, which is implicitly
> > armv7.
> > 
> > Link: https://github.com/ClangBuiltLinux/linux/issues/287
> > Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> 
> Did you test that this doesn't create issues with gcc e.g. complaints 
> from the linker that objects have incompatible architecture 
> specifications or similar annoyance? This already happened in the past 
> but I forget the exact scenario. If you already did, or after you do 
> validate with gcc as well, then you may add:
> 
> Acked-by: Nicolas Pitre <nico@linaro.org>
> 
> 

Hi Nicolas,

I was 99% sure that I checked GCC before sending this but I just did
another run to confirm and everything links successfully. We still use
binutils for assembling/linking the kernel so I assume that I would have
seen a warning from ld.bfd even with Clang.

Thank you for the review!
Nathan

> 
> > ---
> >  Documentation/arm/kernel_mode_neon.txt | 4 ++--
> >  arch/arm/lib/Makefile                  | 2 +-
> >  arch/arm/lib/xor-neon.c                | 2 +-
> >  lib/raid6/Makefile                     | 2 +-
> >  4 files changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
> > index 525452726d31..b9e060c5b61e 100644
> > --- a/Documentation/arm/kernel_mode_neon.txt
> > +++ b/Documentation/arm/kernel_mode_neon.txt
> > @@ -6,7 +6,7 @@ TL;DR summary
> >  * Use only NEON instructions, or VFP instructions that don't rely on support
> >    code
> >  * Isolate your NEON code in a separate compilation unit, and compile it with
> > -  '-mfpu=neon -mfloat-abi=softfp'
> > +  '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
> >  * Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
> >    NEON code
> >  * Don't sleep in your NEON code, and be aware that it will be executed with
> > @@ -87,7 +87,7 @@ instructions appearing in unexpected places if no special care is taken.
> >  Therefore, the recommended and only supported way of using NEON/VFP in the
> >  kernel is by adhering to the following rules:
> >  * isolate the NEON code in a separate compilation unit and compile it with
> > -  '-mfpu=neon -mfloat-abi=softfp';
> > +  '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
> >  * issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
> >    into the unit containing the NEON code from a compilation unit which is *not*
> >    built with the GCC flag '-mfpu=neon' set.
> > diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> > index ad25fd1872c7..0bff0176db2c 100644
> > --- a/arch/arm/lib/Makefile
> > +++ b/arch/arm/lib/Makefile
> > @@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o:	$(obj)/csumpartialcopygeneric.S
> >  $(obj)/csumpartialcopyuser.o:	$(obj)/csumpartialcopygeneric.S
> >  
> >  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> > -  NEON_FLAGS			:= -mfloat-abi=softfp -mfpu=neon
> > +  NEON_FLAGS			:= -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> >    CFLAGS_xor-neon.o		+= $(NEON_FLAGS)
> >    obj-$(CONFIG_XOR_BLOCKS)	+= xor-neon.o
> >  endif
> > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> > index a6741a895189..4600b62d845f 100644
> > --- a/arch/arm/lib/xor-neon.c
> > +++ b/arch/arm/lib/xor-neon.c
> > @@ -14,7 +14,7 @@
> >  MODULE_LICENSE("GPL");
> >  
> >  #ifndef __ARM_NEON__
> > -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> > +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
> >  #endif
> >  
> >  /*
> > diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
> > index 2f8b61dfd9b0..bfec7c87c61e 100644
> > --- a/lib/raid6/Makefile
> > +++ b/lib/raid6/Makefile
> > @@ -25,7 +25,7 @@ endif
> >  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> >  NEON_FLAGS := -ffreestanding
> >  ifeq ($(ARCH),arm)
> > -NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
> > +NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
> >  endif
> >  CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
> >  ifeq ($(ARCH),arm64)
> > -- 
> > 2.20.1
> > 
> >
Nick Desaulniers Dec. 21, 2018, 6:11 p.m. UTC | #3
On Sat, Dec 15, 2018 at 1:23 PM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> While building arm32 allyesconfig, I ran into the following errors:
>
>   arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
>   '-mfloat-abi=softfp -mfpu=neon'
>
>   In file included from lib/raid6/neon1.c:27:
>   /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
>   error: "NEON support not enabled"
>
> Building V=1 showed NEON_FLAGS getting passed along to Clang but
> __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
> only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
> which is the '-march' value for allyesconfig.
>
> From lib/Basic/Targets/ARM.cpp in the Clang source:
>
>   // This only gets set when Neon instructions are actually available, unlike
>   // the VFP define, hence the soft float and arch check. This is subtly
>   // different from gcc, we follow the intent which was that it should be set
>   // when Neon instructions are actually available.
>   if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
>     Builder.defineMacro("__ARM_NEON", "1");
>     Builder.defineMacro("__ARM_NEON__");
>     // current AArch32 NEON implementations do not support double-precision
>     // floating-point even when it is present in VFP.
>     Builder.defineMacro("__ARM_NEON_FP",
>                         "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
>   }
>
> Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
> beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
> definined by Clang. This doesn't functionally change anything because
> that code will only run where NEON is supported, which is implicitly
> armv7.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/287
> Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

Nathan,
Thanks for sending the patch, and thanks to Ard for the suggestion.
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

> ---
>  Documentation/arm/kernel_mode_neon.txt | 4 ++--
>  arch/arm/lib/Makefile                  | 2 +-
>  arch/arm/lib/xor-neon.c                | 2 +-
>  lib/raid6/Makefile                     | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
> index 525452726d31..b9e060c5b61e 100644
> --- a/Documentation/arm/kernel_mode_neon.txt
> +++ b/Documentation/arm/kernel_mode_neon.txt
> @@ -6,7 +6,7 @@ TL;DR summary
>  * Use only NEON instructions, or VFP instructions that don't rely on support
>    code
>  * Isolate your NEON code in a separate compilation unit, and compile it with
> -  '-mfpu=neon -mfloat-abi=softfp'
> +  '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
>  * Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
>    NEON code
>  * Don't sleep in your NEON code, and be aware that it will be executed with
> @@ -87,7 +87,7 @@ instructions appearing in unexpected places if no special care is taken.
>  Therefore, the recommended and only supported way of using NEON/VFP in the
>  kernel is by adhering to the following rules:
>  * isolate the NEON code in a separate compilation unit and compile it with
> -  '-mfpu=neon -mfloat-abi=softfp';
> +  '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
>  * issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
>    into the unit containing the NEON code from a compilation unit which is *not*
>    built with the GCC flag '-mfpu=neon' set.
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index ad25fd1872c7..0bff0176db2c 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o:     $(obj)/csumpartialcopygeneric.S
>  $(obj)/csumpartialcopyuser.o:  $(obj)/csumpartialcopygeneric.S
>
>  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> -  NEON_FLAGS                   := -mfloat-abi=softfp -mfpu=neon
> +  NEON_FLAGS                   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
>    CFLAGS_xor-neon.o            += $(NEON_FLAGS)
>    obj-$(CONFIG_XOR_BLOCKS)     += xor-neon.o
>  endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index a6741a895189..4600b62d845f 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,7 +14,7 @@
>  MODULE_LICENSE("GPL");
>
>  #ifndef __ARM_NEON__
> -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
>  #endif
>
>  /*
> diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
> index 2f8b61dfd9b0..bfec7c87c61e 100644
> --- a/lib/raid6/Makefile
> +++ b/lib/raid6/Makefile
> @@ -25,7 +25,7 @@ endif
>  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
>  NEON_FLAGS := -ffreestanding
>  ifeq ($(ARCH),arm)
> -NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
> +NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
>  endif
>  CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
>  ifeq ($(ARCH),arm64)
> --
> 2.20.1
>
Arnd Bergmann March 11, 2019, 4:21 p.m. UTC | #4
On Sat, Dec 15, 2018 at 10:24 PM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>  endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index a6741a895189..4600b62d845f 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -14,7 +14,7 @@
>  MODULE_LICENSE("GPL");
>
>  #ifndef __ARM_NEON__
> -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
>  #endif
>

I see this patch has made it in now, but I also see two other problems with the
same file that prevent it from working right with clang:

- it triggers #warning This code requires at least version 4.6 of GCC
- As I reported in https://bugs.llvm.org/show_bug.cgi?id=40976, even
  when it builds cleanly, it does not get vectorized.

Has anyone actually managed to get this to do the right thing?

       Arnd
Ard Biesheuvel March 11, 2019, 4:49 p.m. UTC | #5
On Mon, 11 Mar 2019 at 17:22, Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Dec 15, 2018 at 10:24 PM Nathan Chancellor
> <natechancellor@gmail.com> wrote:
> >  endif
> > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> > index a6741a895189..4600b62d845f 100644
> > --- a/arch/arm/lib/xor-neon.c
> > +++ b/arch/arm/lib/xor-neon.c
> > @@ -14,7 +14,7 @@
> >  MODULE_LICENSE("GPL");
> >
> >  #ifndef __ARM_NEON__
> > -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> > +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
> >  #endif
> >
>
> I see this patch has made it in now, but I also see two other problems with the
> same file that prevent it from working right with clang:
>
> - it triggers #warning This code requires at least version 4.6 of GCC

What is currently the oldest GCC we support for ARM?

> - As I reported in https://bugs.llvm.org/show_bug.cgi?id=40976, even
>   when it builds cleanly, it does not get vectorized.
>
> Has anyone actually managed to get this to do the right thing?
>

On my Cortex-A57 under KVM, I get this at boot

[    0.002287] xor: measuring software checksum speed
[    0.100054]    arm4regs  :  5212.800 MB/sec
[    0.200131]    8regs     :  3472.800 MB/sec
[    0.300205]    32regs    :  3282.000 MB/sec
[    0.400281]    neon      :  7011.600 MB/sec

So that means that
a) the cost model is inaccurate, at least for some cores,
b) given that the code is only selected if it is faster, it would be
nice if we could override the cost model based decisions made by the
vectorizer.
Arnd Bergmann March 11, 2019, 9:36 p.m. UTC | #6
On Mon, Mar 11, 2019 at 5:49 PM Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
>
> On Mon, 11 Mar 2019 at 17:22, Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Sat, Dec 15, 2018 at 10:24 PM Nathan Chancellor
> > <natechancellor@gmail.com> wrote:
> > >  endif
> > > diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> > > index a6741a895189..4600b62d845f 100644
> > > --- a/arch/arm/lib/xor-neon.c
> > > +++ b/arch/arm/lib/xor-neon.c
> > > @@ -14,7 +14,7 @@
> > >  MODULE_LICENSE("GPL");
> > >
> > >  #ifndef __ARM_NEON__
> > > -#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
> > > +#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
> > >  #endif
> > >
> >
> > I see this patch has made it in now, but I also see two other problems with the
> > same file that prevent it from working right with clang:
> >
> > - it triggers #warning This code requires at least version 4.6 of GCC
>
> What is currently the oldest GCC we support for ARM?

Linux overall requires gcc-4.6, so we could just as well drop this
check, good point.

       Arnd
diff mbox series

Patch

diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
index 525452726d31..b9e060c5b61e 100644
--- a/Documentation/arm/kernel_mode_neon.txt
+++ b/Documentation/arm/kernel_mode_neon.txt
@@ -6,7 +6,7 @@  TL;DR summary
 * Use only NEON instructions, or VFP instructions that don't rely on support
   code
 * Isolate your NEON code in a separate compilation unit, and compile it with
-  '-mfpu=neon -mfloat-abi=softfp'
+  '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
 * Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
   NEON code
 * Don't sleep in your NEON code, and be aware that it will be executed with
@@ -87,7 +87,7 @@  instructions appearing in unexpected places if no special care is taken.
 Therefore, the recommended and only supported way of using NEON/VFP in the
 kernel is by adhering to the following rules:
 * isolate the NEON code in a separate compilation unit and compile it with
-  '-mfpu=neon -mfloat-abi=softfp';
+  '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
 * issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
   into the unit containing the NEON code from a compilation unit which is *not*
   built with the GCC flag '-mfpu=neon' set.
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index ad25fd1872c7..0bff0176db2c 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -39,7 +39,7 @@  $(obj)/csumpartialcopy.o:	$(obj)/csumpartialcopygeneric.S
 $(obj)/csumpartialcopyuser.o:	$(obj)/csumpartialcopygeneric.S
 
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
-  NEON_FLAGS			:= -mfloat-abi=softfp -mfpu=neon
+  NEON_FLAGS			:= -march=armv7-a -mfloat-abi=softfp -mfpu=neon
   CFLAGS_xor-neon.o		+= $(NEON_FLAGS)
   obj-$(CONFIG_XOR_BLOCKS)	+= xor-neon.o
 endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index a6741a895189..4600b62d845f 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,7 +14,7 @@ 
 MODULE_LICENSE("GPL");
 
 #ifndef __ARM_NEON__
-#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
+#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
 #endif
 
 /*
diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
index 2f8b61dfd9b0..bfec7c87c61e 100644
--- a/lib/raid6/Makefile
+++ b/lib/raid6/Makefile
@@ -25,7 +25,7 @@  endif
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
 NEON_FLAGS := -ffreestanding
 ifeq ($(ARCH),arm)
-NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
+NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
 endif
 CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
 ifeq ($(ARCH),arm64)