Message ID | 20200521202716.193316-1-samitolvanen@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | kbuild: reuse vmlinux.o in vmlinux_link |
On Thu, May 21, 2020 at 01:27:16PM -0700, Sami Tolvanen wrote:
> Instead of linking all compilation units again each time vmlinux_link is
> called, reuse vmlinux.o from modpost_link.
>
> With x86_64 allyesconfig, vmlinux_link is called three times and reusing
> vmlinux.o reduces the build time ~38 seconds on my system (59% reduction
> in the time spent in vmlinux_link).

Nice! Any time savings at final link is a big cumulative win.

> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  scripts/link-vmlinux.sh | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index d09ab4afbda4..c6cc4305950c 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -77,11 +77,8 @@ vmlinux_link()
>
>      if [ "${SRCARCH}" != "um" ]; then
>          objects="--whole-archive \
> -            ${KBUILD_VMLINUX_OBJS} \
> +            vmlinux.o \
>              --no-whole-archive \
> -            --start-group \
> -            ${KBUILD_VMLINUX_LIBS} \
> -            --end-group \
>              ${@}"
>
>          ${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \

I think the "um" case can be updated as well, yes?

Also, I think the comment above modpost_link() needs to be updated now
to reflect the nature of how vmlinux.o gets used after this patch.
On Thu, May 21, 2020 at 01:27:16PM -0700, Sami Tolvanen wrote:
> Instead of linking all compilation units again each time vmlinux_link is
> called, reuse vmlinux.o from modpost_link.
>
> With x86_64 allyesconfig, vmlinux_link is called three times and reusing
> vmlinux.o reduces the build time ~38 seconds on my system (59% reduction
> in the time spent in vmlinux_link).

BTW, I'll see this most in that it knocks about 6% off my "I changed 1 .c
file and now I'm rebuilding" workflow time (which is obviously dominated
by linking), from 25 seconds to 23.5 seconds. And since most of those
seconds are spent staring at the build, it feels like a lot more. ;)
On Fri, May 22, 2020 at 5:27 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> Instead of linking all compilation units again each time vmlinux_link is
> called, reuse vmlinux.o from modpost_link.
>
> With x86_64 allyesconfig, vmlinux_link is called three times and reusing
> vmlinux.o reduces the build time ~38 seconds on my system (59% reduction
> in the time spent in vmlinux_link).
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  scripts/link-vmlinux.sh | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index d09ab4afbda4..c6cc4305950c 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -77,11 +77,8 @@ vmlinux_link()
>
>      if [ "${SRCARCH}" != "um" ]; then
>          objects="--whole-archive \
> -            ${KBUILD_VMLINUX_OBJS} \
> +            vmlinux.o \
>              --no-whole-archive \
> -            --start-group \
> -            ${KBUILD_VMLINUX_LIBS} \
> -            --end-group \
>              ${@}"
>
>          ${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \
>
> base-commit: b85051e755b0e9d6dd8f17ef1da083851b83287d
> --
> 2.27.0.rc0.183.gde8f92d652-goog
>

I like this patch irrespective of CLANG_LTO, but
unfortunately, my build test failed.

ARCH=powerpc failed to build as follows:

  MODPOST vmlinux.o
  MODINFO modules.builtin.modinfo
  GEN     modules.builtin
  LD      .tmp_vmlinux.kallsyms1
vmlinux.o:(__ftr_alt_97+0x20): relocation truncated to fit: R_PPC64_REL14 against `.text'+4b1c
vmlinux.o:(__ftr_alt_97+0x164): relocation truncated to fit: R_PPC64_REL14 against `.text'+1cf78
vmlinux.o:(__ftr_alt_97+0x288): relocation truncated to fit: R_PPC64_REL14 against `.text'+1dac4
vmlinux.o:(__ftr_alt_97+0x2f0): relocation truncated to fit: R_PPC64_REL14 against `.text'+1e254
make: *** [Makefile:1125: vmlinux] Error 1

I used powerpc-linux-gcc available at
https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/9.2.0/

Build command:

  make -j24 ARCH=powerpc CROSS_COMPILE=powerpc-linux- defconfig all

Could you check it please?

I will apply it to my test branch.
Perhaps, 0-day bot may find more failure cases.

--
Best Regards
Masahiro Yamada
+ Michael, and PPC ML. They may know something about the reason of failure. On Sat, May 23, 2020 at 2:41 AM Masahiro Yamada <masahiroy@kernel.org> wrote: > > On Fri, May 22, 2020 at 5:27 AM Sami Tolvanen <samitolvanen@google.com> wrote: > > > > Instead of linking all compilation units again each time vmlinux_link is > > called, reuse vmlinux.o from modpost_link. > > > > With x86_64 allyesconfig, vmlinux_link is called three times and reusing > > vmlinux.o reduces the build time ~38 seconds on my system (59% reduction > > in the time spent in vmlinux_link). > > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > --- > > scripts/link-vmlinux.sh | 5 +---- > > 1 file changed, 1 insertion(+), 4 deletions(-) > > > > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh > > index d09ab4afbda4..c6cc4305950c 100755 > > --- a/scripts/link-vmlinux.sh > > +++ b/scripts/link-vmlinux.sh > > @@ -77,11 +77,8 @@ vmlinux_link() > > > > if [ "${SRCARCH}" != "um" ]; then > > objects="--whole-archive \ > > - ${KBUILD_VMLINUX_OBJS} \ > > + vmlinux.o \ > > --no-whole-archive \ > > - --start-group \ > > - ${KBUILD_VMLINUX_LIBS} \ > > - --end-group \ > > ${@}" > > > > ${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \ > > > > base-commit: b85051e755b0e9d6dd8f17ef1da083851b83287d > > -- > > 2.27.0.rc0.183.gde8f92d652-goog > > > > > I like this patch irrespective of CLANG_LTO, but > unfortunately, my build test failed. 
> > > ARCH=powerpc failed to build as follows: > > > > MODPOST vmlinux.o > MODINFO modules.builtin.modinfo > GEN modules.builtin > LD .tmp_vmlinux.kallsyms1 > vmlinux.o:(__ftr_alt_97+0x20): relocation truncated to fit: > R_PPC64_REL14 against `.text'+4b1c > vmlinux.o:(__ftr_alt_97+0x164): relocation truncated to fit: > R_PPC64_REL14 against `.text'+1cf78 > vmlinux.o:(__ftr_alt_97+0x288): relocation truncated to fit: > R_PPC64_REL14 against `.text'+1dac4 > vmlinux.o:(__ftr_alt_97+0x2f0): relocation truncated to fit: > R_PPC64_REL14 against `.text'+1e254 > make: *** [Makefile:1125: vmlinux] Error 1 > > > > I used powerpc-linux-gcc > available at > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/9.2.0/ > > > Build command: > > make -j24 ARCH=powerpc CROSS_COMPILE=powerpc-linux- defconfig all > > > Could you check it please? > > > > I will apply it to my test branch. > Perhaps, 0-day bot may find more failure cases. > > > -- > Best Regards > Masahiro Yamada
On Fri, May 22, 2020 at 7:08 AM Kees Cook <keescook@chromium.org> wrote: > > On Thu, May 21, 2020 at 01:27:16PM -0700, Sami Tolvanen wrote: > > Instead of linking all compilation units again each time vmlinux_link is > > called, reuse vmlinux.o from modpost_link. > > > > With x86_64 allyesconfig, vmlinux_link is called three times and reusing > > vmlinux.o reduces the build time ~38 seconds on my system (59% reduction > > in the time spent in vmlinux_link). > > Nice! Any time savings at final link is a big cumulative win. > > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > > --- > > scripts/link-vmlinux.sh | 5 +---- > > 1 file changed, 1 insertion(+), 4 deletions(-) > > > > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh > > index d09ab4afbda4..c6cc4305950c 100755 > > --- a/scripts/link-vmlinux.sh > > +++ b/scripts/link-vmlinux.sh > > @@ -77,11 +77,8 @@ vmlinux_link() > > > > if [ "${SRCARCH}" != "um" ]; then > > objects="--whole-archive \ > > - ${KBUILD_VMLINUX_OBJS} \ > > + vmlinux.o \ > > --no-whole-archive \ > > - --start-group \ > > - ${KBUILD_VMLINUX_LIBS} \ > > - --end-group \ > > ${@}" > > > > ${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \ > > I think the "um" case can be updated as well too, yes? I agree. I changed the um part, then ARCH=um build is successful. > Also, I think the comment above modpost_link() needs to be updated now > to reflect the nature of how vmlinux.o gets used after this patch. > > -- > Kees Cook
Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am: > + Michael, and PPC ML. > > They may know something about the reason of failure. Because the linker can't put branch stubs within object code sections, so when you incrementally link them too large, the linker can't resolve branches into other object files. This is why we added incremental linking in the first place. I suppose it could be made conditional for platforms that can use this optimization. What'd be really nice is if we could somehow build and link kallsyms without relinking everything twice, and if we could do section mismatch analysis without making that vmlinux.o as well. I had a few ideas but not enough time to do much work on it. Thanks, Nick > > > On Sat, May 23, 2020 at 2:41 AM Masahiro Yamada <masahiroy@kernel.org> wrote: >> >> On Fri, May 22, 2020 at 5:27 AM Sami Tolvanen <samitolvanen@google.com> wrote: >> > >> > Instead of linking all compilation units again each time vmlinux_link is >> > called, reuse vmlinux.o from modpost_link. >> > >> > With x86_64 allyesconfig, vmlinux_link is called three times and reusing >> > vmlinux.o reduces the build time ~38 seconds on my system (59% reduction >> > in the time spent in vmlinux_link). 
>> > >> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> >> > --- >> > scripts/link-vmlinux.sh | 5 +---- >> > 1 file changed, 1 insertion(+), 4 deletions(-) >> > >> > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh >> > index d09ab4afbda4..c6cc4305950c 100755 >> > --- a/scripts/link-vmlinux.sh >> > +++ b/scripts/link-vmlinux.sh >> > @@ -77,11 +77,8 @@ vmlinux_link() >> > >> > if [ "${SRCARCH}" != "um" ]; then >> > objects="--whole-archive \ >> > - ${KBUILD_VMLINUX_OBJS} \ >> > + vmlinux.o \ >> > --no-whole-archive \ >> > - --start-group \ >> > - ${KBUILD_VMLINUX_LIBS} \ >> > - --end-group \ >> > ${@}" >> > >> > ${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \ >> > >> > base-commit: b85051e755b0e9d6dd8f17ef1da083851b83287d >> > -- >> > 2.27.0.rc0.183.gde8f92d652-goog >> > >> >> >> I like this patch irrespective of CLANG_LTO, but >> unfortunately, my build test failed. >> >> >> ARCH=powerpc failed to build as follows: >> >> >> >> MODPOST vmlinux.o >> MODINFO modules.builtin.modinfo >> GEN modules.builtin >> LD .tmp_vmlinux.kallsyms1 >> vmlinux.o:(__ftr_alt_97+0x20): relocation truncated to fit: >> R_PPC64_REL14 against `.text'+4b1c >> vmlinux.o:(__ftr_alt_97+0x164): relocation truncated to fit: >> R_PPC64_REL14 against `.text'+1cf78 >> vmlinux.o:(__ftr_alt_97+0x288): relocation truncated to fit: >> R_PPC64_REL14 against `.text'+1dac4 >> vmlinux.o:(__ftr_alt_97+0x2f0): relocation truncated to fit: >> R_PPC64_REL14 against `.text'+1e254 >> make: *** [Makefile:1125: vmlinux] Error 1 >> >> >> >> I used powerpc-linux-gcc >> available at >> https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/9.2.0/ >> >> >> Build command: >> >> make -j24 ARCH=powerpc CROSS_COMPILE=powerpc-linux- defconfig all >> >> >> Could you check it please? >> >> >> >> I will apply it to my test branch. >> Perhaps, 0-day bot may find more failure cases. >> >> >> -- >> Best Regards >> Masahiro Yamada > > > > -- > Best Regards > Masahiro Yamada >
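For readers unfamiliar with the "relocation truncated to fit" error: R_PPC64_REL14 is the relocation used for PowerPC conditional branches, whose 14-bit displacement field counts 4-byte words, so the branch can reach only about 32 KiB in either direction. Once one large vmlinux.o section sits between the branch and its target, the linker has no room to place a stub in between. A minimal sketch of that range check follows; `fits_rel14` is a made-up helper for illustration, not anything in kbuild or the linker.

```shell
#!/bin/sh
# Illustrative only: fits_rel14 is a hypothetical helper, not kbuild code.
# A PowerPC conditional branch stores a 14-bit word displacement, i.e. a
# signed 16-bit byte offset that must be a multiple of 4.
fits_rel14() {
	disp=$1
	if [ $((disp % 4)) -ne 0 ]; then
		echo no		# branch targets are word aligned
	elif [ "$disp" -ge -32768 ] && [ "$disp" -le 32764 ]; then
		echo yes	# fits in the 14-bit field
	else
		echo no		# would be "truncated to fit"
	fi
}

fits_rel14 32764	# prints "yes"
fits_rel14 32768	# prints "no"
```

The displacement values here are arbitrary; the offsets printed in the error message are target offsets within `.text`, not displacements.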
Hi Nicholas, (+CC: Sam Ravnborg) On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am: > > + Michael, and PPC ML. > > > > They may know something about the reason of failure. > > Because the linker can't put branch stubs within object code sections, > so when you incrementally link them too large, the linker can't resolve > branches into other object files. Ah, you are right. So, this is a problem not only for PPC but also for ARM (both 32 and 64 bit), etc. ARM needs to insert a veneer to jump far. Prior to thin archive, we could not compile ARCH=arm allyesconfig because drivers/built-in.o was too large. This patch gets us back to the too large incremental object situation. With my quick compile-testing, ARCH=arm allyesconfig and ARCH=arm64 allyesconfig are broken. > This is why we added incremental linking in the first place. I suppose > it could be made conditional for platforms that can use this > optimization. > > What'd be really nice is if we could somehow build and link kallsyms > without relinking everything twice, and if we could do section mismatch > analysis without making that vmlinux.o as well. I had a few ideas but > not enough time to do much work on it. Right, kallsyms links 3 times. (not twice) Hmm, I think Sami's main motivation is Clang LTO. LTO is very time-consuming. So, the android common kernel implements Clang LTO in the pre modpost stage: 1) LTO against vmlinux.o 2) modpost against vmlinux.o 3) Link vmlinux.o + kallsyms into vmlinux (this requires linking 3 times) If we move LTO to 3), we need to do LTO 3 times. And, this was how GCC LTO was implemented in 2014, (then rejected by Linus). How to do modpost without making vmlinux.o ? In old days, the section mismatch analysis was done against the final vmlinux. 85bd2fddd68e757da8e1af98f857f61a3c9ce647 changed it to run modpost for individual .o files. 
Then, 741f98fe298a73c9d47ed53703c1279a29718581 introduced vmlinux.o to use it for modpost. The following two commits. I did not fully understand the background, though. I CC'ed Sam in case he may add some comments. commit 85bd2fddd68e757da8e1af98f857f61a3c9ce647 Author: Sam Ravnborg <sam@ravnborg.org> Date: Mon Feb 26 15:33:52 2007 +0100 kbuild: fix section mismatch check for vmlinux vmlinux does not contain relocation entries which is used by the section mismatch checks. Reported by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Use the individual objects as inputs to overcome this limitation. In modpost check the .o files and skip non-ELF files. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> commit 741f98fe298a73c9d47ed53703c1279a29718581 Author: Sam Ravnborg <sam@ravnborg.org> Date: Tue Jul 17 10:54:06 2007 +0200 kbuild: do section mismatch check on full vmlinux Previously we did do the check on the .o files used to link vmlinux but that failed to find questionable references across the .o files. Create a dedicated vmlinux.o file used only for section mismatch checks that uses the defualt linker script so section does not get renamed. The vmlinux.o may later be used as part of the the final link of vmlinux but for now it is used fo section mismatch only. For a defconfig build this is instant but for an allyesconfig this add two minutes to a full build (that anyways takes ~2 hours). Signed-off-by: Sam Ravnborg <sam@ravnborg.org> > > Thanks, > Nick > > > > > > > On Sat, May 23, 2020 at 2:41 AM Masahiro Yamada <masahiroy@kernel.org> wrote: > >> > >> On Fri, May 22, 2020 at 5:27 AM Sami Tolvanen <samitolvanen@google.com> wrote: > >> > > >> > Instead of linking all compilation units again each time vmlinux_link is > >> > called, reuse vmlinux.o from modpost_link. > >> > > >> > With x86_64 allyesconfig, vmlinux_link is called three times and reusing > >> > vmlinux.o reduces the build time ~38 seconds on my system (59% reduction > >> > in the time spent in vmlinux_link). 
> >> > > >> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > >> > --- > >> > scripts/link-vmlinux.sh | 5 +---- > >> > 1 file changed, 1 insertion(+), 4 deletions(-) > >> > > >> > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh > >> > index d09ab4afbda4..c6cc4305950c 100755 > >> > --- a/scripts/link-vmlinux.sh > >> > +++ b/scripts/link-vmlinux.sh > >> > @@ -77,11 +77,8 @@ vmlinux_link() > >> > > >> > if [ "${SRCARCH}" != "um" ]; then > >> > objects="--whole-archive \ > >> > - ${KBUILD_VMLINUX_OBJS} \ > >> > + vmlinux.o \ > >> > --no-whole-archive \ > >> > - --start-group \ > >> > - ${KBUILD_VMLINUX_LIBS} \ > >> > - --end-group \ > >> > ${@}" > >> > > >> > ${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \ > >> > > >> > base-commit: b85051e755b0e9d6dd8f17ef1da083851b83287d > >> > -- > >> > 2.27.0.rc0.183.gde8f92d652-goog > >> > > >> > >> > >> I like this patch irrespective of CLANG_LTO, but > >> unfortunately, my build test failed. > >> > >> > >> ARCH=powerpc failed to build as follows: > >> > >> > >> > >> MODPOST vmlinux.o > >> MODINFO modules.builtin.modinfo > >> GEN modules.builtin > >> LD .tmp_vmlinux.kallsyms1 > >> vmlinux.o:(__ftr_alt_97+0x20): relocation truncated to fit: > >> R_PPC64_REL14 against `.text'+4b1c > >> vmlinux.o:(__ftr_alt_97+0x164): relocation truncated to fit: > >> R_PPC64_REL14 against `.text'+1cf78 > >> vmlinux.o:(__ftr_alt_97+0x288): relocation truncated to fit: > >> R_PPC64_REL14 against `.text'+1dac4 > >> vmlinux.o:(__ftr_alt_97+0x2f0): relocation truncated to fit: > >> R_PPC64_REL14 against `.text'+1e254 > >> make: *** [Makefile:1125: vmlinux] Error 1 > >> > >> > >> > >> I used powerpc-linux-gcc > >> available at > >> https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/9.2.0/ > >> > >> > >> Build command: > >> > >> make -j24 ARCH=powerpc CROSS_COMPILE=powerpc-linux- defconfig all > >> > >> > >> Could you check it please? > >> > >> > >> > >> I will apply it to my test branch. 
> >> Perhaps, 0-day bot may find more failure cases. > >> > >> > >> -- > >> Best Regards > >> Masahiro Yamada > > > > > > > > -- > > Best Regards > > Masahiro Yamada > > -- Best Regards Masahiro Yamada
Hi Masahiro. On Sun, May 24, 2020 at 12:12:35AM +0900, Masahiro Yamada wrote: > Hi Nicholas, > (+CC: Sam Ravnborg) > > > On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > > > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am: > > > + Michael, and PPC ML. > > > > > > They may know something about the reason of failure. > > > > Because the linker can't put branch stubs within object code sections, > > so when you incrementally link them too large, the linker can't resolve > > branches into other object files. > > > Ah, you are right. > > So, this is a problem not only for PPC > but also for ARM (both 32 and 64 bit), etc. > > ARM needs to insert a veneer to jump far. > > Prior to thin archive, we could not compile > ARCH=arm allyesconfig because > drivers/built-in.o was too large. > > This patch gets us back to the too large > incremental object situation. > > With my quick compile-testing, > ARCH=arm allyesconfig > and ARCH=arm64 allyesconfig are broken. > > > > This is why we added incremental linking in the first place. I suppose > > it could be made conditional for platforms that can use this > > optimization. > > > > What'd be really nice is if we could somehow build and link kallsyms > > without relinking everything twice, and if we could do section mismatch > > analysis without making that vmlinux.o as well. I had a few ideas but > > not enough time to do much work on it. > > > Right, kallsyms links 3 times. (not twice) > > > Hmm, I think Sami's main motivation is Clang LTO. > > LTO is very time-consuming. > So, the android common kernel implements Clang LTO > in the pre modpost stage: > > > 1) LTO against vmlinux.o > > 2) modpost against vmlinux.o > > 3) Link vmlinux.o + kallsyms into vmlinux > (this requires linking 3 times) We have kallsyms we had to link three times because the linking increased the object a little in size so symbols did not match. 
The last time was added more or less only to check that we did
have stable symbol addresses.

All this predates LTO stuff which we only introduced later.

The reason for doing modpost on vmlinux.o was that we had cases
where everything in drivers/ was fine but there were section mismatch
references from arch/* to drivers/*.
This is back when there were many more drivers in arch/ than what we
have today.
And back then we also had much more to check as we had CPU hotplug
that could really cause section mismatches - this is no longer the case,
which is a good thing.

...
>
> The following two commits.
> I did not fully understand the background, though.
>
> I CC'ed Sam in case he may add some comments.
>
> commit 85bd2fddd68e757da8e1af98f857f61a3c9ce647
> Author: Sam Ravnborg <sam@ravnborg.org>
> Date: Mon Feb 26 15:33:52 2007 +0100
>
>     kbuild: fix section mismatch check for vmlinux
>
>     vmlinux does not contain relocation entries which is
>     used by the section mismatch checks.
>     Reported by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
>
>     Use the individual objects as inputs to overcome
>     this limitation.
>     In modpost check the .o files and skip non-ELF files.
>
>     Signed-off-by: Sam Ravnborg <sam@ravnborg.org>

So we checked vmlinux - but vmlinux had too much stripped away,
so in reality nothing was checked.
To allow the warnings to be as precise as possible, move the checks
out to the individual .o files.
Sometimes the names were mangled a little, so if warnings were only
reported at the vmlinux level it could be difficult to track down the
offender.
This would then also do the check on .o files that had all the
relocation symbols required.

>
> commit 741f98fe298a73c9d47ed53703c1279a29718581
> Author: Sam Ravnborg <sam@ravnborg.org>
> Date: Tue Jul 17 10:54:06 2007 +0200
>
>     kbuild: do section mismatch check on full vmlinux
>
>     Previously we did do the check on the .o files used to link
>     vmlinux but that failed to find questionable references across
>     the .o files.
>     Create a dedicated vmlinux.o file used only for section mismatch checks
>     that uses the defualt linker script so section does not get renamed.
>
>     The vmlinux.o may later be used as part of the the final link of vmlinux
>     but for now it is used fo section mismatch only.
>     For a defconfig build this is instant but for an allyesconfig this
>     add two minutes to a full build (that anyways takes ~2 hours).
>
>     Signed-off-by: Sam Ravnborg <sam@ravnborg.org>

But when we introduced the check of the individual .o files we missed
the cases where references spanned outside the .o files, as explained
previously.
So we included a link of vmlinux.o that did NOT drop the relocations
so we could use it to check for the remaining section mismatch warnings.

Remember - back when we started this we had many hundreds of warnings
and it was a fight to keep that number low.
But we also wanted to report as much as possible.

There were several discussions back then about whether this was really
worth the effort. How much was gained from discarding the memory where the
section mismatch warnings were triggered?
In other words - how about just keeping the init code in memory so there
were no illegal references anymore.
That is maybe worth considering again, as the memory we save by throwing
away the init code is even less significant now.
But I think this is a topic for another mail thread.

	Sam
Hi Sam, Thanks for the comments. On Sun, May 24, 2020 at 1:54 AM Sam Ravnborg <sam@ravnborg.org> wrote: > > Hi Masahiro. > > On Sun, May 24, 2020 at 12:12:35AM +0900, Masahiro Yamada wrote: > > Hi Nicholas, > > (+CC: Sam Ravnborg) > > > > > > On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > > > > > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am: > > > > + Michael, and PPC ML. > > > > > > > > They may know something about the reason of failure. > > > > > > Because the linker can't put branch stubs within object code sections, > > > so when you incrementally link them too large, the linker can't resolve > > > branches into other object files. > > > > > > Ah, you are right. > > > > So, this is a problem not only for PPC > > but also for ARM (both 32 and 64 bit), etc. > > > > ARM needs to insert a veneer to jump far. > > > > Prior to thin archive, we could not compile > > ARCH=arm allyesconfig because > > drivers/built-in.o was too large. > > > > This patch gets us back to the too large > > incremental object situation. > > > > With my quick compile-testing, > > ARCH=arm allyesconfig > > and ARCH=arm64 allyesconfig are broken. > > > > > > > This is why we added incremental linking in the first place. I suppose > > > it could be made conditional for platforms that can use this > > > optimization. > > > > > > What'd be really nice is if we could somehow build and link kallsyms > > > without relinking everything twice, and if we could do section mismatch > > > analysis without making that vmlinux.o as well. I had a few ideas but > > > not enough time to do much work on it. > > > > > > Right, kallsyms links 3 times. (not twice) > > > > > > Hmm, I think Sami's main motivation is Clang LTO. > > > > LTO is very time-consuming. 
> > So, the android common kernel implements Clang LTO > > in the pre modpost stage: > > > > > > 1) LTO against vmlinux.o > > > > 2) modpost against vmlinux.o > > > > 3) Link vmlinux.o + kallsyms into vmlinux > > (this requires linking 3 times) > > We have kallsyms we had to link three times because the linking > increased the object a little in size so symbols did not match. > The last time was added more or less only to check that we did > have stable symbol addresses. Usually vmlinux_link is invoked 3 times if CONFIG_KALLSYMS=y. (kallsyms_step 1, kallsyms_step 2, and final vmlinux_link) If the elf size does not match after kallsyms_step 2, kallsyms_step 3 is invoked. So, 4 times including the extra check pass. If CONFIG_DEBUG_INFO_BTF=y, vmlinux_link is invoked one more time. So, linked 5 times at most. > > All this predates LTO stuff which we only introduced later. > > The reason for doing modpost on vmlinux.o was that we had cases > where everything in drivers/ was fine but there was section mismatch > references from arch/* to drivers/* > This is back when there were much more drivers in arch/ than what we > have today. > And back then we also had much more to check ad we had cPU hotplug > that could really cause section mismatches - this is no longer the case > which is a good thing. > > > > ... > > > > The following two commits. > > I did not fully understand the background, though. > > > > I CC'ed Sam in case he may add some comments. > > > > commit 85bd2fddd68e757da8e1af98f857f61a3c9ce647 > > Author: Sam Ravnborg <sam@ravnborg.org> > > Date: Mon Feb 26 15:33:52 2007 +0100 > > > > kbuild: fix section mismatch check for vmlinux > > > > vmlinux does not contain relocation entries which is > > used by the section mismatch checks. > > Reported by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> > > > > Use the individual objects as inputs to overcome > > this limitation. > > In modpost check the .o files and skip non-ELF files. 
> > > > Signed-off-by: Sam Ravnborg <sam@ravnborg.org> > > > So we checked vmlinx - but vmlinx did have too much stripped away. > so in reality nothing was checked. > To allow the warnings to be as precise as possible move the checks > out to the indovidual .o files. > Sometimes the names was mangled a little so if warnigns was only > reported on vmlinx level in could be difficult to track down the > offender. > This would then also do the check on .o files that had all the > relocation symbols rtequired. > > > > > commit 741f98fe298a73c9d47ed53703c1279a29718581 > > Author: Sam Ravnborg <sam@ravnborg.org> > > Date: Tue Jul 17 10:54:06 2007 +0200 > > > > kbuild: do section mismatch check on full vmlinux > > > > Previously we did do the check on the .o files used to link > > vmlinux but that failed to find questionable references across > > the .o files. > > Create a dedicated vmlinux.o file used only for section mismatch checks > > that uses the defualt linker script so section does not get renamed. > > > > The vmlinux.o may later be used as part of the the final link of vmlinux > > but for now it is used fo section mismatch only. > > For a defconfig build this is instant but for an allyesconfig this > > add two minutes to a full build (that anyways takes ~2 hours). > > > > Signed-off-by: Sam Ravnborg <sam@ravnborg.org> > > But when we introduced check of the individual .o fiules we missed when > the references spanned outside the .o files as explained previously. > So included a link of vmlinx.o that did NOT drop the relocations > so we could use it to check for the remaining section mismatch warnings. > > Remember - back when we started this we had many hundred warnings > and it was a fight to keep that number low. > But we also wanted to report as much as possible. > > There was back then several discussions if this was really worth the > effort. How much was gained from discarding the memory where the > section mismatch warnigns was triggered. 
> In other words - how about just keeping the init code in memory so there > were no illegal references anymore. > That is something that is maybe worth to consiuder again as we have even > less memory we save by throwing away the init code. > But I think this is a topic for another mail thread. I am not sure if we want to go as far as stop doing __init. I want to reuse memory after initialization. Anyway, the section mismatch checks highly rely on REL or RELA. The REL(A) sections do not exist in the final vmlinux, or is useless at least. So, it does not work for most of architectures. If we use individual .o files, modpost cannot check function calls to a different object file. So, the conclusion is we definitely need vmlinux.o for section mismatch checks. -- Best Regards Masahiro Yamada
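The link counts given in the message above (3 with CONFIG_KALLSYMS=y, 4 when the extra kallsyms pass is needed, 5 at most with CONFIG_DEBUG_INFO_BTF=y) can be restated as a small sketch. `count_vmlinux_links` is a hypothetical helper written for this note; the single link without kallsyms is an assumption that only the final vmlinux_link runs in that case.

```shell
#!/bin/sh
# Illustrative only: count_vmlinux_links is a hypothetical helper that
# restates the counting from the message above; it is not kbuild code.
# Arguments are "y" or "n":
#   $1  CONFIG_KALLSYMS
#   $2  whether the extra kallsyms_step 3 was needed (ELF size mismatch)
#   $3  CONFIG_DEBUG_INFO_BTF
count_vmlinux_links() {
	links=1					# the final vmlinux_link
	if [ "$1" = y ]; then
		links=$((links + 2))		# kallsyms_step 1 and 2
		if [ "$2" = y ]; then
			links=$((links + 1))	# kallsyms_step 3
		fi
	fi
	if [ "$3" = y ]; then
		links=$((links + 1))		# extra link for BTF
	fi
	echo "$links"
}

count_vmlinux_links y n n	# prints 3
count_vmlinux_links y y y	# prints 5
```

Every one of those invocations relinks the full object list in the unpatched script, which is why the patch's saving scales with the number of passes.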
On Sat, May 23, 2020 at 8:13 AM Masahiro Yamada <masahiroy@kernel.org> wrote: > > Hi Nicholas, > (+CC: Sam Ravnborg) > > > On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > > > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am: > > > + Michael, and PPC ML. > > > > > > They may know something about the reason of failure. > > > > Because the linker can't put branch stubs within object code sections, > > so when you incrementally link them too large, the linker can't resolve > > branches into other object files. > > > Ah, you are right. > > So, this is a problem not only for PPC > but also for ARM (both 32 and 64 bit), etc. > > ARM needs to insert a veneer to jump far. > > Prior to thin archive, we could not compile > ARCH=arm allyesconfig because > drivers/built-in.o was too large. > > This patch gets us back to the too large > incremental object situation. > > With my quick compile-testing, > ARCH=arm allyesconfig > and ARCH=arm64 allyesconfig are broken. Thanks for looking into this! Clang doesn't appear to have this issue with LTO because it always enables both -ffunction-sections and -fdata-sections. I confirmed that -ffunction-sections also fixes arm64 allyesconfig with this patch. While I'm fine with reusing vmlinux.o only with LTO, how would you feel about enabling -ffunction-sections in the kernel by default? Sami
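Both suggestions in this thread (making the optimization conditional per platform, or reusing vmlinux.o only with LTO) amount to choosing the object list at link time. A rough sketch of that choice is below. `REUSE_VMLINUX_O` and `vmlinux_objects` are names invented for this example and do not exist in kbuild; the two object lists are the ones from the patch.

```shell
#!/bin/sh
# Sketch only. REUSE_VMLINUX_O is a hypothetical switch invented for this
# example; how it would be set (per arch, or only with LTO) is left open.
vmlinux_objects() {
	if [ "${REUSE_VMLINUX_O}" = "y" ]; then
		# Patched behaviour: link the already-built vmlinux.o.
		echo "--whole-archive vmlinux.o --no-whole-archive $*"
	else
		# Original behaviour: link all objects and libraries again.
		echo "--whole-archive ${KBUILD_VMLINUX_OBJS} --no-whole-archive" \
		     "--start-group ${KBUILD_VMLINUX_LIBS} --end-group $*"
	fi
}
```

Keeping the original list as the fallback means architectures that rely on the linker inserting stubs or veneers between input objects would be unaffected.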
On Tue, Jun 16, 2020 at 6:47 AM Sami Tolvanen <samitolvanen@google.com> wrote: > > On Sat, May 23, 2020 at 8:13 AM Masahiro Yamada <masahiroy@kernel.org> wrote: > > > > Hi Nicholas, > > (+CC: Sam Ravnborg) > > > > > > On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > > > > > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am: > > > > + Michael, and PPC ML. > > > > > > > > They may know something about the reason of failure. > > > > > > Because the linker can't put branch stubs within object code sections, > > > so when you incrementally link them too large, the linker can't resolve > > > branches into other object files. > > > > > > Ah, you are right. > > > > So, this is a problem not only for PPC > > but also for ARM (both 32 and 64 bit), etc. > > > > ARM needs to insert a veneer to jump far. > > > > Prior to thin archive, we could not compile > > ARCH=arm allyesconfig because > > drivers/built-in.o was too large. > > > > This patch gets us back to the too large > > incremental object situation. > > > > With my quick compile-testing, > > ARCH=arm allyesconfig > > and ARCH=arm64 allyesconfig are broken. > > Thanks for looking into this! Clang doesn't appear to have this issue > with LTO because it always enables both -ffunction-sections and > -fdata-sections. I confirmed that -ffunction-sections also fixes arm64 > allyesconfig with this patch. While I'm fine with reusing vmlinux.o > only with LTO, how would you feel about enabling -ffunction-sections > in the kernel by default? I am OK if it works. Please do compile tests for some architectures. (especially, ARCH=powerpc defconfig, and ARCH=arm(64) allyesconfig) Thank you.
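The requested compile tests could be scripted roughly as follows. This is a dry-run sketch that only prints the commands. The powerpc line is based on the build command from the failure report; the arm and arm64 CROSS_COMPILE prefixes are assumptions that depend on the installed toolchain, and passing the flag through KCFLAGS is just one way to try -ffunction-sections without editing the Makefile.

```shell
#!/bin/sh
# Dry-run sketch: prints the build commands instead of running them.
# The arm/arm64 CROSS_COMPILE prefixes are assumptions; adjust them to
# match the toolchain actually installed.
print_test_builds() {
	flags="KCFLAGS=-ffunction-sections"
	echo "make -j24 ARCH=powerpc CROSS_COMPILE=powerpc-linux- ${flags} defconfig all"
	echo "make -j24 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- ${flags} allyesconfig all"
	echo "make -j24 ARCH=arm64 CROSS_COMPILE=aarch64-linux- ${flags} allyesconfig all"
}

print_test_builds
```

Piping the output to `sh` from a kernel tree would run the builds in sequence.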
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index d09ab4afbda4..c6cc4305950c 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -77,11 +77,8 @@ vmlinux_link()

     if [ "${SRCARCH}" != "um" ]; then
         objects="--whole-archive \
-            ${KBUILD_VMLINUX_OBJS} \
+            vmlinux.o \
             --no-whole-archive \
-            --start-group \
-            ${KBUILD_VMLINUX_LIBS} \
-            --end-group \
             ${@}"

         ${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux} \
Instead of linking all compilation units again each time vmlinux_link is
called, reuse vmlinux.o from modpost_link.

With x86_64 allyesconfig, vmlinux_link is called three times and reusing
vmlinux.o reduces the build time ~38 seconds on my system (59% reduction
in the time spent in vmlinux_link).

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 scripts/link-vmlinux.sh | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

base-commit: b85051e755b0e9d6dd8f17ef1da083851b83287d