
[v2] RISC-V: build: Allow LTO to be selected

Message ID 20220512205545.992288-1-twd2.me@gmail.com (mailing list archive)
State New, archived

Commit Message

twd2 May 12, 2022, 8:55 p.m. UTC
Allow LTO to be selected for RISC-V, but only when LLD >= 14, since there is
an issue [1] in earlier LLD versions that prevents LLD from generating proper
machine code for RISC-V when writing `nop`s.

I have tested enabling LTO for `defconfig`. LLD took ~2m21s and ~3GiB
on our Intel Xeon Gold 6140 server and produced an 18MiB Image. The image
boots to a shell using an archriscv rootfs on QEMU.

I have also tested it for `allyesconfig` without COMPILE_TEST, FTRACE,
KASAN, and GCOV. LLD took ~7h03m and ~335GiB on the server,
successfully producing a 1.7GiB Image. Unfortunately, we cannot boot this
image because the `create_kernel_page_table()` -> `alloc_pmd_early()` ->
`BUG_ON()` logic limits the image to < 1GiB. Maybe we can fix that in a
separate follow-up patch.

[1] https://github.com/llvm/llvm-project/issues/50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")

Tested-by: Wende Tan <twd2.me@gmail.com>
Signed-off-by: Wende Tan <twd2.me@gmail.com>
---
v2:
- Some textual changes suggested by Nick.
- Drop the changes to `arch/riscv/Makefile`, since the LLVM issue is filed
  and resolved.
- Drop the unnecessary changes to `arch/riscv/kernel/vdso/Makefile`.

v1: https://lore.kernel.org/linux-riscv/20210719205208.1023221-1-twd2.me@gmail.com/
---
 arch/riscv/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)
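
The < 1GiB limit mentioned in the commit message can be illustrated with
some back-of-the-envelope arithmetic. This is only a sketch: the page size,
entry size, and the idea that the early kernel mapping is served by a single
PMD table are assumptions made here for illustration, and
`fits_in_one_early_pmd` is a hypothetical helper, not a kernel function.

```python
# Assumptions (not stated in the thread): 4 KiB pages and 8-byte page-table
# entries, so one table holds 512 entries and each PMD entry maps 2 MiB.
PAGE_SIZE = 4096
PTE_SIZE = 8
ENTRIES_PER_TABLE = PAGE_SIZE // PTE_SIZE              # 512 entries
PMD_ENTRY_SPAN = ENTRIES_PER_TABLE * PAGE_SIZE         # 2 MiB per PMD entry
EARLY_PMD_SPAN = ENTRIES_PER_TABLE * PMD_ENTRY_SPAN    # 1 GiB per PMD table

MIB = 1 << 20
GIB = 1 << 30


def fits_in_one_early_pmd(image_bytes: int) -> bool:
    """Hypothetical check mirroring the '< 1GiB' limit from the commit message."""
    return image_bytes < EARLY_PMD_SPAN


# Image sizes quoted in the commit message.
defconfig_image = 18 * MIB
allyesconfig_image = int(1.7 * GIB)

print(fits_in_one_early_pmd(defconfig_image))
print(fits_in_one_early_pmd(allyesconfig_image))
```

Under these assumptions the 18MiB defconfig Image fits comfortably, while the
1.7GiB allyesconfig Image exceeds what one such table can map.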

Comments

Nick Desaulniers May 12, 2022, 9:34 p.m. UTC | #1
On Thu, May 12, 2022 at 1:56 PM Wende Tan <twd2.me@gmail.com> wrote:
>
> Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is an
> issue [1] in prior LLD versions that prevents LLD to generate proper
> machine code for RISC-V when writing `nop`s.
>
> I have tested enabling LTO for `defconfig`. The LLD took ~2m21s and ~3GiB
> on our Intel Xeon Gold 6140 server and produced an 18MiB Image. The image
> can boot to shell using an archriscv rootfs on QEMU.
>
> I have also tested it for `allyesconfig` without COMPILE_TEST, FTRACE,
> KASAN, and GCOV. The LLD took ~7h03m and ~335GiB on the server,

Haha! That's ok, allyesconfig is not expected to boot for any
architectures AFAIK. For CI, we simply verify we can build them; we
boot test everything but allyesconfig and allmodconfig (and
architectures which don't yet have qemu ports).  It helps detect when
assembler sources don't use enough encoding space when referenced
external symbols are too far away for larger images, IME.

That's a long time; first time I've seen a number from someone trying
to LTO allyesconfig!

> successfully producing a 1.7GiB Image. Unfortunately, we cannot boot this
> image because the `create_kernel_page_table()` -> `alloc_pmd_early()` ->
> `BUG_ON()` logic limits the image to be < 1GiB. Maybe we can fix it in a
> separate patch further.
>
> [1] https://github.com/llvm/llvm-project/issues/50505, resolved by LLVM
>     commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")

So looking at that change, it doesn't look like it touches LLD at all.
To me it touches both the assembler and the object streamer, which is
used by the compiler (and assembler) to stream instructions directly
into an object file (without the need for an external assembler).

That makes me think that
CLANG_VERSION >= 140000
would be more appropriate than
LLD_VERSION >= 140000

WDYT?

Also, I'm curious if the LLVM patch you had me commit for you recently
is at all related or necessary for this? If so, then the version check
should probably be against clang-15, not clang-14.
https://github.com/llvm/llvm-project/commit/6baaad740a5abb4bfcff022a8114abb4eea66a2d

Anyways, I just did a build+boot (in qemu) test of defconfig+thinlto
and defconfig+lto. LGTM

Tested-by: Nick Desaulniers <ndesaulniers@google.com>

twd2 May 13, 2022, 6:28 p.m. UTC | #2
On 2022/5/13 5:34, Nick Desaulniers wrote:
> On Thu, May 12, 2022 at 1:56 PM Wende Tan <twd2.me@gmail.com> wrote:
>> Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is an
>> issue [1] in prior LLD versions that prevents LLD to generate proper
>> machine code for RISC-V when writing `nop`s.
>>
>> I have tested enabling LTO for `defconfig`. The LLD took ~2m21s and ~3GiB
>> on our Intel Xeon Gold 6140 server and produced an 18MiB Image. The image
>> can boot to shell using an archriscv rootfs on QEMU.
>>
>> I have also tested it for `allyesconfig` without COMPILE_TEST, FTRACE,
>> KASAN, and GCOV. The LLD took ~7h03m and ~335GiB on the server,
> Haha! That's ok, allyesconfig is not expected to boot for any
> architectures AFAIK. For CI, we simply verify we can build them; we
> boot test everything but allyesconfig and allmodconfig (and
> architectures which don't yet have qemu ports).  It helps detect when
> assembler sources don't use enough encoding space when referenced
> external symbols are too far away for larger images, IME.
>
> That's a long time; first time I've seen a number from someone trying
> to LTO allyesconfig!
>
>> successfully producing a 1.7GiB Image. Unfortunately, we cannot boot this
>> image because the `create_kernel_page_table()` -> `alloc_pmd_early()` ->
>> `BUG_ON()` logic limits the image to be < 1GiB. Maybe we can fix it in a
>> separate patch further.
>>
>> [1] https://github.com/llvm/llvm-project/issues/50505, resolved by LLVM
>>     commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
> So looking at that change, it doesn't look like it touches LLD at all.
> To me it touches both the assembler and the object streamer, which is
> used by the compiler (and assembler) to stream instructions directly
> into an object file (without the need for an external assembler).
>
> That makes me think that
> CLANG_VERSION >= 140000
> would be more appropriate than
> LLD_VERSION >= 140000
>
> WDYT?


As I understand it, when LTO is enabled the compiler (clang) only emits
bitcode; LLD then collects all the bitcode, performs LTO, and generates the
machine code and ELF object files. The error message in #50505 was in fact
originally reported by LLD, so I'd like to check LLD_VERSION here.


>
> Also, I'm curious if the LLVM patch you had me commit for you recently
> is at all related or necessary for this? If so, then the version check
> should probably be against clang-15, not clang-14.
> https://github.com/llvm/llvm-project/commit/6baaad740a5abb4bfcff022a8114abb4eea66a2d


Aha, that patch is needed for LTO with allyesconfig (in particular with xfs
and overlayfs). I don't think it is required for enabling LTO for RISC-V in
general.


> Anyways, I just did a build+boot (in qemu) test of defconfig+thinlto
> and defconfig+lto. LGTM
>
> Tested-by: Nick Desaulniers <ndesaulniers@google.com>


Thanks,
Wende


Fangrui Song May 13, 2022, 6:58 p.m. UTC | #3
On Fri, May 13, 2022 at 11:28 AM twd2 <twd2.me@gmail.com> wrote:
>
> On 2022/5/13 5:34, Nick Desaulniers wrote:
> > On Thu, May 12, 2022 at 1:56 PM Wende Tan <twd2.me@gmail.com> wrote:
> >> Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is an
> >> issue [1] in prior LLD versions that prevents LLD to generate proper
> >> machine code for RISC-V when writing `nop`s.
> >>
> >> I have tested enabling LTO for `defconfig`. The LLD took ~2m21s and ~3GiB
> >> on our Intel Xeon Gold 6140 server and produced an 18MiB Image. The image
> >> can boot to shell using an archriscv rootfs on QEMU.
> >>
> >> I have also tested it for `allyesconfig` without COMPILE_TEST, FTRACE,
> >> KASAN, and GCOV. The LLD took ~7h03m and ~335GiB on the server,
> > Haha! That's ok, allyesconfig is not expected to boot for any
> > architectures AFAIK. For CI, we simply verify we can build them; we
> > boot test everything but allyesconfig and allmodconfig (and
> > architectures which don't yet have qemu ports).  It helps detect when
> > assembler sources don't use enough encoding space when referenced
> > external symbols are too far away for larger images, IME.
> >
> > That's a long time; first time I've seen a number from someone trying
> > to LTO allyesconfig!
> >
> >> successfully producing a 1.7GiB Image. Unfortunately, we cannot boot this
> >> image because the `create_kernel_page_table()` -> `alloc_pmd_early()` ->
> >> `BUG_ON()` logic limits the image to be < 1GiB. Maybe we can fix it in a
> >> separate patch further.
> >>
> >> [1] https://github.com/llvm/llvm-project/issues/50505, resolved by LLVM
> >>     commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
> > So looking at that change, it doesn't look like it touches LLD at all.
> > To me it touches both the assembler and the object streamer, which is
> > used by the compiler (and assembler) to stream instructions directly
> > into an object file (without the need for an external assembler).
> >
> > That makes me think that
> > CLANG_VERSION >= 140000
> > would be more appropriate than
> > LLD_VERSION >= 140000
> >
> > WDYT?
>
>
> It seems that when LTO is enabled, the compiler (clang) only generates bitcode, and LLD collects all the bitcode, does LTO, and generates machine code (assembly) and ELF object files? Actually, the error message in #50505 was originally reported by LLD. So I'd like to check LLD_VERSION here.

lld calls into llvm/lib/LTO, which in turn uses llvm/lib/MC to write a
relocatable object file.
In this sense, it's the lld version that matters, but I think checking
either the clang or the lld version is OK.

Technically, one can use clang 13 and do LTO with lld 14 thanks to
bitcode compatibility. In this sense, the lib/MC-specific change only
depends on the lld version.

Nick Desaulniers May 13, 2022, 7:30 p.m. UTC | #4
On Fri, May 13, 2022 at 11:28 AM twd2 <twd2.me@gmail.com> wrote:
>
> On 2022/5/13 5:34, Nick Desaulniers wrote:
> > On Thu, May 12, 2022 at 1:56 PM Wende Tan <twd2.me@gmail.com> wrote:
> >> Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is an
> >> issue [1] in prior LLD versions that prevents LLD to generate proper
> >> machine code for RISC-V when writing `nop`s.
> >>
> >> I have tested enabling LTO for `defconfig`. The LLD took ~2m21s and ~3GiB
> >> on our Intel Xeon Gold 6140 server and produced an 18MiB Image. The image
> >> can boot to shell using an archriscv rootfs on QEMU.
> >>
> >> I have also tested it for `allyesconfig` without COMPILE_TEST, FTRACE,
> >> KASAN, and GCOV. The LLD took ~7h03m and ~335GiB on the server,
> > Haha! That's ok, allyesconfig is not expected to boot for any
> > architectures AFAIK. For CI, we simply verify we can build them; we
> > boot test everything but allyesconfig and allmodconfig (and
> > architectures which don't yet have qemu ports).  It helps detect when
> > assembler sources don't use enough encoding space when referenced
> > external symbols are too far away for larger images, IME.
> >
> > That's a long time; first time I've seen a number from someone trying
> > to LTO allyesconfig!
> >
> >> successfully producing a 1.7GiB Image. Unfortunately, we cannot boot this
> >> image because the `create_kernel_page_table()` -> `alloc_pmd_early()` ->
> >> `BUG_ON()` logic limits the image to be < 1GiB. Maybe we can fix it in a
> >> separate patch further.
> >>
> >> [1] https://github.com/llvm/llvm-project/issues/50505, resolved by LLVM
> >>     commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
> > So looking at that change, it doesn't look like it touches LLD at all.
> > To me it touches both the assembler and the object streamer, which is
> > used by the compiler (and assembler) to stream instructions directly
> > into an object file (without the need for an external assembler).
> >
> > That makes me think that
> > CLANG_VERSION >= 140000
> > would be more appropriate than
> > LLD_VERSION >= 140000
> >
> > WDYT?
>
>
> It seems that when LTO is enabled, the compiler (clang) only generates bitcode, and LLD collects all the bitcode, does LTO, and generates machine code (assembly) and ELF object files? Actually, the error message in #50505 was originally reported by LLD. So I'd like to check LLD_VERSION here.

Ah, that's right. Ok, LGTM.

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

Fangrui Song May 13, 2022, 7:31 p.m. UTC | #5
On Fri, May 13, 2022 at 12:30 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Fri, May 13, 2022 at 11:28 AM twd2 <twd2.me@gmail.com> wrote:
> >
> > On 2022/5/13 5:34, Nick Desaulniers wrote:
> > > On Thu, May 12, 2022 at 1:56 PM Wende Tan <twd2.me@gmail.com> wrote:
> > >> Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is an
> > >> issue [1] in prior LLD versions that prevents LLD to generate proper
> > >> machine code for RISC-V when writing `nop`s.
> > >>
> > >> I have tested enabling LTO for `defconfig`. The LLD took ~2m21s and ~3GiB
> > >> on our Intel Xeon Gold 6140 server and produced an 18MiB Image. The image
> > >> can boot to shell using an archriscv rootfs on QEMU.
> > >>
> > >> I have also tested it for `allyesconfig` without COMPILE_TEST, FTRACE,
> > >> KASAN, and GCOV. The LLD took ~7h03m and ~335GiB on the server,
> > > Haha! That's ok, allyesconfig is not expected to boot for any
> > > architectures AFAIK. For CI, we simply verify we can build them; we
> > > boot test everything but allyesconfig and allmodconfig (and
> > > architectures which don't yet have qemu ports).  It helps detect when
> > > assembler sources don't use enough encoding space when referenced
> > > external symbols are too far away for larger images, IME.
> > >
> > > That's a long time; first time I've seen a number from someone trying
> > > to LTO allyesconfig!
> > >
> > >> successfully producing a 1.7GiB Image. Unfortunately, we cannot boot this
> > >> image because the `create_kernel_page_table()` -> `alloc_pmd_early()` ->
> > >> `BUG_ON()` logic limits the image to be < 1GiB. Maybe we can fix it in a
> > >> separate patch further.
> > >>
> > >> [1] https://github.com/llvm/llvm-project/issues/50505, resolved by LLVM
> > >>     commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
> > > So looking at that change, it doesn't look like it touches LLD at all.
> > > To me it touches both the assembler and the object streamer, which is
> > > used by the compiler (and assembler) to stream instructions directly
> > > into an object file (without the need for an external assembler).
> > >
> > > That makes me think that
> > > CLANG_VERSION >= 140000
> > > would be more appropriate than
> > > LLD_VERSION >= 140000
> > >
> > > WDYT?
> >
> >
> > It seems that when LTO is enabled, the compiler (clang) only generates bitcode, and LLD collects all the bitcode, does LTO, and generates machine code (assembly) and ELF object files? Actually, the error message in #50505 was originally reported by LLD. So I'd like to check LLD_VERSION here.
>
> Ah, that's right. Ok, LGTM.
>
> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed-by: Fangrui Song <maskray@google.com>

twd2 July 1, 2022, 4:42 a.m. UTC | #6
Hi Palmer,

ping?

Thanks
Wende

Palmer Dabbelt Oct. 5, 2022, 1:57 a.m. UTC | #7
> Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is an
> issue [1] in prior LLD versions that prevents LLD to generate proper
> machine code for RISC-V when writing `nop`s.
> 
> I have tested enabling LTO for `defconfig`. The LLD took ~2m21s and ~3GiB
> on our Intel Xeon Gold 6140 server and produced an 18MiB Image. The image
> can boot to shell using an archriscv rootfs on QEMU.
> 
> I have also tested it for `allyesconfig` without COMPILE_TEST, FTRACE,
> KASAN, and GCOV. The LLD took ~7h03m and ~335GiB on the server,
> successfully producing a 1.7GiB Image. Unfortunately, we cannot boot this
> image because the `create_kernel_page_table()` -> `alloc_pmd_early()` ->
> `BUG_ON()` logic limits the image to be < 1GiB. Maybe we can fix it in a
> separate patch further.
> 
> [1] https://github.com/llvm/llvm-project/issues/50505, resolved by LLVM
>     commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
> 
> Tested-by: Wende Tan <twd2.me@gmail.com>
> Signed-off-by: Wende Tan <twd2.me@gmail.com>

Sorry for missing this; the v2 never made it to my inbox.  I'm not sure
exactly what happened, but an off-list ping made it through.  I've put
this on for-next.  I don't have any way to test it because I don't have
clang set up yet.
twd2 Oct. 5, 2022, 2:19 a.m. UTC | #8
Never mind. Thanks for your reply!

However, it looks like Nick's Tested-by tag is missing:
https://lore.kernel.org/all/CAKwvOdmJ26h4L=3+42xeWG0_mZmZXkye=+mNWgFOHbo59ZVz1w@mail.gmail.com/

Nathan Chancellor Oct. 28, 2022, 10:57 p.m. UTC | #9
On Tue, Oct 04, 2022 at 06:57:10PM -0700, Palmer Dabbelt wrote:
> > Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is an
> > issue [1] in prior LLD versions that prevents LLD to generate proper
> > machine code for RISC-V when writing `nop`s.
> > 
> > I have tested enabling LTO for `defconfig`. The LLD took ~2m21s and ~3GiB
> > on our Intel Xeon Gold 6140 server and produced an 18MiB Image. The image
> > can boot to shell using an archriscv rootfs on QEMU.
> > 
> > I have also tested it for `allyesconfig` without COMPILE_TEST, FTRACE,
> > KASAN, and GCOV. The LLD took ~7h03m and ~335GiB on the server,
> > successfully producing a 1.7GiB Image. Unfortunately, we cannot boot this
> > image because the `create_kernel_page_table()` -> `alloc_pmd_early()` ->
> > `BUG_ON()` logic limits the image to be < 1GiB. Maybe we can fix it in a
> > separate patch further.
> > 
> > [1] https://github.com/llvm/llvm-project/issues/50505, resolved by LLVM
> >     commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
> > 
> > Tested-by: Wende Tan <twd2.me@gmail.com>
> > Signed-off-by: Wende Tan <twd2.me@gmail.com>
> 
> Sorry for missing this, the v2 never made it to my inbox.  Not sure exactly
> what happened, but an off-list ping made it through.  I've put this on
> for-next, I don't have any way to test it because I don't have clang setup
> yet.
> 

Did this get dropped by accident? I do not see it in your tree.

Cheers,
Nathan

Patch

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 00fd9c548f26..c55f6b95e5af 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -38,6 +38,10 @@  config RISCV
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU
 	select ARCH_SUPPORTS_HUGETLBFS if MMU
+	# LLD >= 14:
+	# - https://github.com/llvm/llvm-project/issues/50505
+	select ARCH_SUPPORTS_LTO_CLANG if LLD_VERSION >= 140000
+	select ARCH_SUPPORTS_LTO_CLANG_THIN if LLD_VERSION >= 140000
 	select ARCH_USE_MEMTEST
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
 	select ARCH_WANT_FRAME_POINTERS
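
The `LLD_VERSION >= 140000` condition in the patch above can be read as a
comparison against an integer-encoded linker version. The following is a
minimal sketch, assuming the version is encoded as
10000 * major + 100 * minor + patch (so 14.0.0 becomes 140000); the function
names are hypothetical and exist only for illustration.

```python
# Sketch of the integer version encoding assumed for LLD_VERSION:
# 10000 * major + 100 * minor + patch, so "14.0.0" encodes to 140000.


def canonical_version(version: str) -> int:
    """Encode a 'major.minor.patch' string as a single integer."""
    major, minor, patch = (int(part) for part in version.split("."))
    return 10000 * major + 100 * minor + patch


# Threshold taken from the patch: `if LLD_VERSION >= 140000`.
LTO_MIN_LLD = 140000


def riscv_lto_selectable(lld_version: str) -> bool:
    """Hypothetical helper mirroring the Kconfig condition in the patch."""
    return canonical_version(lld_version) >= LTO_MIN_LLD


for version in ("13.0.1", "14.0.0", "15.0.2"):
    print(version, canonical_version(version), riscv_lto_selectable(version))
```

Note that only the linker version is checked, which matches the conclusion of
the discussion: with LTO the machine code is generated at link time, so a
clang 13 plus lld 14 combination would still pass this gate.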