Message ID | 20240501-kbuild-llvm-drop-thinlto-cache-v1-1-c117cc50a24b@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | kbuild: Remove support for Clang's ThinLTO caching |
On Thu, May 2, 2024 at 7:55 AM Nathan Chancellor <nathan@kernel.org> wrote:
>
> There is an issue in clang's ThinLTO caching (enabled for the kernel via
> '--thinlto-cache-dir') with .incbin, which the kernel occasionally uses
> to include data within the kernel, such as the .config file for
> /proc/config.gz. For example, when changing the .config and rebuilding
> vmlinux, the copy of .config in vmlinux does not match the copy of
> .config in the build folder:
>
>   $ echo 'CONFIG_LTO_NONE=n
>   CONFIG_LTO_CLANG_THIN=y
>   CONFIG_IKCONFIG=y
>   CONFIG_HEADERS_INSTALL=y' >kernel/configs/repro.config
>
>   $ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 clean defconfig repro.config vmlinux
>   ...
>
>   $ grep CONFIG_HEADERS_INSTALL .config
>   CONFIG_HEADERS_INSTALL=y
>
>   $ scripts/extract-ikconfig vmlinux | grep CONFIG_HEADERS_INSTALL
>   CONFIG_HEADERS_INSTALL=y
>
>   $ scripts/config -d HEADERS_INSTALL
>
>   $ make -kj"$(nproc)" ARCH=x86_64 LLVM=1 vmlinux
>   ...
>     UPD     kernel/config_data
>     GZIP    kernel/config_data.gz
>     CC      kernel/configs.o
>   ...
>     LD      vmlinux
>   ...
>
>   $ grep CONFIG_HEADERS_INSTALL .config
>   # CONFIG_HEADERS_INSTALL is not set
>
>   $ scripts/extract-ikconfig vmlinux | grep CONFIG_HEADERS_INSTALL
>   CONFIG_HEADERS_INSTALL=y
>
> Without '--thinlto-cache-dir' or when using full LTO, this issue does
> not occur.
>
> Benchmarking incremental builds on a few different machines with and
> without the cache shows a 20% increase in incremental build time without
> the cache when measured by touching init/main.c and running 'make all'.
>
> ARCH=arm64 defconfig + CONFIG_LTO_CLANG_THIN=y on an arm64 host:
>
>   Benchmark 1: With ThinLTO cache
>     Time (mean ± σ):     56.347 s ±  0.163 s    [User: 83.768 s, System: 24.661 s]
>     Range (min … max):   56.109 s … 56.594 s    10 runs
>
>   Benchmark 2: Without ThinLTO cache
>     Time (mean ± σ):     67.740 s ±  0.479 s    [User: 718.458 s, System: 31.797 s]
>     Range (min … max):   67.059 s … 68.556 s    10 runs
>
>   Summary
>     With ThinLTO cache ran
>       1.20 ± 0.01 times faster than Without ThinLTO cache
>
> ARCH=x86_64 defconfig + CONFIG_LTO_CLANG_THIN=y on an x86_64 host:
>
>   Benchmark 1: With ThinLTO cache
>     Time (mean ± σ):     85.772 s ±  0.252 s    [User: 91.505 s, System: 8.408 s]
>     Range (min … max):   85.447 s … 86.244 s    10 runs
>
>   Benchmark 2: Without ThinLTO cache
>     Time (mean ± σ):     103.833 s ±  0.288 s    [User: 232.058 s, System: 8.569 s]
>     Range (min … max):   103.286 s … 104.124 s    10 runs
>
>   Summary
>     With ThinLTO cache ran
>       1.21 ± 0.00 times faster than Without ThinLTO cache
>
> While it is unfortunate to take this performance improvement off the
> table, correctness is more important. If/when this is fixed in LLVM, it
> can potentially be brought back in a conditional manner. Alternatively,
> a developer can just disable LTO if doing incremental compiles quickly
> is important, as a full compile cycle can still take over a minute even
> with the cache and it is unlikely that LTO will result in functional
> differences for a kernel change.
>
> Cc: stable@vger.kernel.org
> Fixes: dc5723b02e52 ("kbuild: add support for Clang LTO")
> Reported-by: Yifan Hong <elsk@google.com>
> Closes: https://github.com/ClangBuiltLinux/linux/issues/2021
> Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
> Closes: https://lore.kernel.org/r/20220327115526.cc4b0ff55fc53c97683c3e4d@kernel.org/
> Signed-off-by: Nathan Chancellor <nathan@kernel.org>

Applied to linux-kbuild. Thanks.

> ---
> This is an alternative (and arguably more robust) fix to Yifan's patch
> at https://lore.kernel.org/20240429220756.979347-2-elsk@google.com/.
> ---
>  Makefile | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 40fb2ca6fe4c..6b6a4a757062 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -942,7 +942,6 @@ endif
>  ifdef CONFIG_LTO_CLANG
>  ifdef CONFIG_LTO_CLANG_THIN
>  CC_FLAGS_LTO	:= -flto=thin -fsplit-lto-unit
> -KBUILD_LDFLAGS	+= --thinlto-cache-dir=$(extmod_prefix).thinlto-cache
>  else
>  CC_FLAGS_LTO	:= -flto
>  endif
> @@ -1477,7 +1476,7 @@ endif # CONFIG_MODULES
>  # Directories & files removed with 'make clean'
>  CLEAN_FILES += vmlinux.symvers modules-only.symvers \
>  	       modules.builtin modules.builtin.modinfo modules.nsdeps \
> -	       compile_commands.json .thinlto-cache rust/test \
> +	       compile_commands.json rust/test \
>  	       rust-project.json .vmlinux.objs .vmlinux.export.c
>
>  # Directories & files removed with 'make mrproper'
> @@ -1783,7 +1782,7 @@ PHONY += compile_commands.json
>
>  clean-dirs := $(KBUILD_EXTMOD)
>  clean: rm-files := $(KBUILD_EXTMOD)/Module.symvers $(KBUILD_EXTMOD)/modules.nsdeps \
> -	$(KBUILD_EXTMOD)/compile_commands.json $(KBUILD_EXTMOD)/.thinlto-cache
> +	$(KBUILD_EXTMOD)/compile_commands.json
>
>  PHONY += prepare
>  # now expand this into a simple variable to reduce the cost of shell evaluations
>
> ---
> base-commit: e67572cd2204894179d89bd7b984072f19313b03
> change-id: 20240501-kbuild-llvm-drop-thinlto-cache-f497770692f8
>
> Best regards,
> --
> Nathan Chancellor <nathan@kernel.org>
>

-- 
Best Regards
Masahiro Yamada
diff --git a/Makefile b/Makefile
index 40fb2ca6fe4c..6b6a4a757062 100644
--- a/Makefile
+++ b/Makefile
@@ -942,7 +942,6 @@ endif
 ifdef CONFIG_LTO_CLANG
 ifdef CONFIG_LTO_CLANG_THIN
 CC_FLAGS_LTO	:= -flto=thin -fsplit-lto-unit
-KBUILD_LDFLAGS	+= --thinlto-cache-dir=$(extmod_prefix).thinlto-cache
 else
 CC_FLAGS_LTO	:= -flto
 endif
@@ -1477,7 +1476,7 @@ endif # CONFIG_MODULES
 # Directories & files removed with 'make clean'
 CLEAN_FILES += vmlinux.symvers modules-only.symvers \
 	       modules.builtin modules.builtin.modinfo modules.nsdeps \
-	       compile_commands.json .thinlto-cache rust/test \
+	       compile_commands.json rust/test \
 	       rust-project.json .vmlinux.objs .vmlinux.export.c
 
 # Directories & files removed with 'make mrproper'
@@ -1783,7 +1782,7 @@ PHONY += compile_commands.json
 
 clean-dirs := $(KBUILD_EXTMOD)
 clean: rm-files := $(KBUILD_EXTMOD)/Module.symvers $(KBUILD_EXTMOD)/modules.nsdeps \
-	$(KBUILD_EXTMOD)/compile_commands.json $(KBUILD_EXTMOD)/.thinlto-cache
+	$(KBUILD_EXTMOD)/compile_commands.json
 
 PHONY += prepare
 # now expand this into a simple variable to reduce the cost of shell evaluations
There is an issue in clang's ThinLTO caching (enabled for the kernel via
'--thinlto-cache-dir') with .incbin, which the kernel occasionally uses
to include data within the kernel, such as the .config file for
/proc/config.gz. For example, when changing the .config and rebuilding
vmlinux, the copy of .config in vmlinux does not match the copy of
.config in the build folder:

  $ echo 'CONFIG_LTO_NONE=n
  CONFIG_LTO_CLANG_THIN=y
  CONFIG_IKCONFIG=y
  CONFIG_HEADERS_INSTALL=y' >kernel/configs/repro.config

  $ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 clean defconfig repro.config vmlinux
  ...

  $ grep CONFIG_HEADERS_INSTALL .config
  CONFIG_HEADERS_INSTALL=y

  $ scripts/extract-ikconfig vmlinux | grep CONFIG_HEADERS_INSTALL
  CONFIG_HEADERS_INSTALL=y

  $ scripts/config -d HEADERS_INSTALL

  $ make -kj"$(nproc)" ARCH=x86_64 LLVM=1 vmlinux
  ...
    UPD     kernel/config_data
    GZIP    kernel/config_data.gz
    CC      kernel/configs.o
  ...
    LD      vmlinux
  ...

  $ grep CONFIG_HEADERS_INSTALL .config
  # CONFIG_HEADERS_INSTALL is not set

  $ scripts/extract-ikconfig vmlinux | grep CONFIG_HEADERS_INSTALL
  CONFIG_HEADERS_INSTALL=y

Without '--thinlto-cache-dir' or when using full LTO, this issue does
not occur.

Benchmarking incremental builds on a few different machines with and
without the cache shows a 20% increase in incremental build time without
the cache when measured by touching init/main.c and running 'make all'.

ARCH=arm64 defconfig + CONFIG_LTO_CLANG_THIN=y on an arm64 host:

  Benchmark 1: With ThinLTO cache
    Time (mean ± σ):     56.347 s ±  0.163 s    [User: 83.768 s, System: 24.661 s]
    Range (min … max):   56.109 s … 56.594 s    10 runs

  Benchmark 2: Without ThinLTO cache
    Time (mean ± σ):     67.740 s ±  0.479 s    [User: 718.458 s, System: 31.797 s]
    Range (min … max):   67.059 s … 68.556 s    10 runs

  Summary
    With ThinLTO cache ran
      1.20 ± 0.01 times faster than Without ThinLTO cache

ARCH=x86_64 defconfig + CONFIG_LTO_CLANG_THIN=y on an x86_64 host:

  Benchmark 1: With ThinLTO cache
    Time (mean ± σ):     85.772 s ±  0.252 s    [User: 91.505 s, System: 8.408 s]
    Range (min … max):   85.447 s … 86.244 s    10 runs

  Benchmark 2: Without ThinLTO cache
    Time (mean ± σ):     103.833 s ±  0.288 s    [User: 232.058 s, System: 8.569 s]
    Range (min … max):   103.286 s … 104.124 s    10 runs

  Summary
    With ThinLTO cache ran
      1.21 ± 0.00 times faster than Without ThinLTO cache

While it is unfortunate to take this performance improvement off the
table, correctness is more important. If/when this is fixed in LLVM, it
can potentially be brought back in a conditional manner. Alternatively,
a developer can just disable LTO if doing incremental compiles quickly
is important, as a full compile cycle can still take over a minute even
with the cache and it is unlikely that LTO will result in functional
differences for a kernel change.

Cc: stable@vger.kernel.org
Fixes: dc5723b02e52 ("kbuild: add support for Clang LTO")
Reported-by: Yifan Hong <elsk@google.com>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2021
Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Closes: https://lore.kernel.org/r/20220327115526.cc4b0ff55fc53c97683c3e4d@kernel.org/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
This is an alternative (and arguably more robust) fix to Yifan's patch
at https://lore.kernel.org/20240429220756.979347-2-elsk@google.com/.
---
 Makefile | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

---
base-commit: e67572cd2204894179d89bd7b984072f19313b03
change-id: 20240501-kbuild-llvm-drop-thinlto-cache-f497770692f8

Best regards,
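
The manual grep comparison in the reproducer can be wrapped in a small check, which may be handy when confirming that a rebuilt vmlinux no longer carries a stale embedded config. This is only a sketch and not part of the patch: the helper names config_line and check_option and the path /tmp/embedded.config are hypothetical, and it assumes the embedded copy has already been dumped to a file with scripts/extract-ikconfig as in the commit message.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the patch: compare the state of one
# Kconfig option between two config files.

# Print the line that sets or unsets option $1 in file $2, e.g.
# "CONFIG_FOO=y" or "# CONFIG_FOO is not set". Prints nothing if absent.
config_line() {
    grep -E "^(# )?$1(=| is not set)" "$2"
}

# Print "match" if option $1 has the same state in files $2 and $3,
# "STALE" otherwise.
check_option() {
    opt=$1
    built=$2
    embedded=$3
    if [ "$(config_line "$opt" "$built")" = "$(config_line "$opt" "$embedded")" ]; then
        echo "match"
    else
        echo "STALE"
    fi
}

# Assumed usage from the top of a kernel build tree:
#   scripts/extract-ikconfig vmlinux > /tmp/embedded.config
#   check_option CONFIG_HEADERS_INSTALL .config /tmp/embedded.config
```

With the ThinLTO cache problem described above, the second build in the reproducer would be expected to report STALE for CONFIG_HEADERS_INSTALL; with this patch applied it should report match.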