Message ID | 20200210131925.145463-1-samitolvanen@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | kbuild: remove duplicate dependencies from .mod files |
Hi Sami,

On Mon, Feb 10, 2020 at 10:19 PM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> With CONFIG_TRIM_UNUSED_KSYMS, if a module has enough dependencies to
> exceed the default xargs command line size limit, the output is split
> into multiple lines, which can result in used symbols getting trimmed.
>
> This change removes duplicate dependencies, which will reduce the
> probability of this happening and makes .mod files smaller and easier
> to read.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  scripts/Makefile.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index a1730d42e5f3..a083bcec19d3 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -257,7 +257,7 @@ endef
>
>  # List module undefined symbols (or empty line if not enabled)
>  ifdef CONFIG_TRIM_UNUSED_KSYMS
> -cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | xargs echo
> +cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | sort -u | xargs echo

In which case are undefined symbols duplicated?
Do you have a .config to reproduce it?

>  else
>  cmd_undef_syms = echo
>  endif
>
> base-commit: bb6d3fb354c5ee8d6bde2d576eb7220ea09862b9
> --
> 2.25.0.341.g760bfbb309-goog
>
On Wed, Feb 12, 2020 at 5:23 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> In which case are undefined symbols duplicated?

When a module consists of multiple compilation units that depend on
the same external symbols. In Android, we ran into this when adding
hardening features that all depend on an external error handler
function with a rather long name. When CONFIG_TRIM_UNUSED_KSYMS was
later enabled, we ran into this:

$ llvm-nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' |
xargs echo | wc
      2    9136  168660

xargs defaults to a 128 KiB limit for command line size, so the output
was split into two lines, which means some of the dependencies were
dropped and we ran into modpost errors. One method of fixing this is
to increase the limit:

$ llvm-nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' |
xargs -s 262144 echo | wc
      1    9136  168660

But it seems removing duplicates is a better solution as the length of
the dependency list is reduced significantly:

$ llvm-nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' |
sort -u | xargs echo | wc
      1    2716   50461

> Do you have a .config to reproduce it?

I can currently reproduce this on an Android kernel that has
Control-Flow Integrity (CFI) enabled. While this feature is not
upstreamed yet, there's nothing that would prevent us from hitting the
command line limit with sufficiently large modules otherwise as well.

Sami
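
The splitting behaviour described above can be reproduced without a kernel tree. The following is a minimal sketch, not part of the patch: the helper `gen_syms` and its symbol list are hypothetical stand-ins for real nm output, and `-s 2048` is an artificially small limit chosen only so the split triggers with a small input.

```shell
#!/bin/sh
# Hypothetical stand-in for nm output: three symbol names, each repeated
# 200 times, mimicking the same undefined symbol being listed many times.
gen_syms() {
    i=0
    while [ "$i" -lt 200 ]; do
        printf '%s\n' __ubsan_handle_cfi_check_fail nvif_object_init nvif_object_fini
        i=$((i + 1))
    done
}

# Assumption: -s 2048 replaces the much larger default limit, so that
# xargs has to run echo several times and therefore emits several lines.
lines_raw=$(gen_syms | xargs -s 2048 echo | wc -l | tr -d ' ')

# With duplicates removed, the list fits into a single echo invocation.
lines_dedup=$(gen_syms | sort -u | xargs -s 2048 echo | wc -l | tr -d ' ')

echo "without sort -u: $lines_raw line(s)"
echo "with sort -u:    $lines_dedup line(s)"
```

Each extra line corresponds to one additional `echo` invocation by xargs, which is the condition under which dependencies were reported as dropped.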
Hi Sami,

On Thu, Feb 13, 2020 at 2:13 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> On Wed, Feb 12, 2020 at 5:23 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> > In which case are undefined symbols duplicated?
>
> When a module consists of multiple compilation units that depend on
> the same external symbols. In Android, we ran into this when adding
> hardening features that all depend on an external error handler
> function with a rather long name. When CONFIG_TRIM_UNUSED_KSYMS was
> later enabled, we ran into this:
>
> $ llvm-nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' |
> xargs echo | wc
>       2    9136  168660
>
> xargs defaults to a 128 KiB limit for command line size, so the output
> was split into two lines, which means some of the dependencies were
> dropped and we ran into modpost errors. One method of fixing this is
> to increase the limit:
>
> $ llvm-nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' |
> xargs -s 262144 echo | wc
>       1    9136  168660
>
> But it seems removing duplicates is a better solution as the length of
> the dependency list is reduced significantly:
>
> $ llvm-nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' |
> sort -u | xargs echo | wc
>       1    2716   50461

At least, I am unable to reproduce this in upstream.

This is my result for x86 allmodconfig builds.

masahiro@grover:~/workspace/linux-kbuild$ nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' | wc
    572     572   11478
masahiro@grover:~/workspace/linux-kbuild$ nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' | sort -u | wc
    572     572   11478

I see no difference with/without 'sort -u'.

I also tried llvm-nm instead of GNU nm, but the result is the same.

masahiro@grover:~/workspace/linux-kbuild$ llvm-nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' | wc
    572     572   11478
masahiro@grover:~/workspace/linux-kbuild$ llvm-nm drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' | sort -u | wc
    572     572   11478

> > Do you have a .config to reproduce it?
>
> I can currently reproduce this on an Android kernel that has
> Control-Flow Integrity (CFI) enabled. While this feature is not
> upstreamed yet, there's nothing that would prevent us from hitting the
> command line limit with sufficiently large modules otherwise as well.

Does ACK do this differently?

I think it would be strange if $(NM) duplicated undefined symbols.
Hi Masahiro,

On Sat, Feb 15, 2020 at 8:28 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
> masahiro@grover:~/workspace/linux-kbuild$ llvm-nm
> drivers/gpu/drm/nouveau/nouveau.o | sed -n 's/^ *U //p' | sort -u |
> wc
>     572     572   11478

Thank you for testing this!

> Does ACK do this differently?

Yes, the difference is that we use LTO and after looking into this a
bit more, it turns out that when the individual files in the thin
archive are LLVM bitcode, llvm-nm behaves differently and prints out
the symbols for each file separately.

$ llvm-nm drivers/gpu/drm/nouveau/nouveau.o
...
nvif/client.o:
---------------- W __cfi_check
---------------- W __cfi_check_fail
                 U __cfi_slowpath_diag
                 U __ubsan_handle_cfi_check_fail
---------------- T nvif_client_fini
---------------- T nvif_client_init
---------------- T nvif_client_ioctl
---------------- T nvif_client_resume
---------------- T nvif_client_suspend
                 U nvif_object_fini
                 U nvif_object_init
                 U strncpy

nvif/device.o:
---------------- W __cfi_check
---------------- W __cfi_check_fail
                 U __ubsan_handle_cfi_check_fail
                 U kfree
---------------- T nvif_device_fini
---------------- T nvif_device_init
---------------- T nvif_device_time
                 U nvif_object_fini
                 U nvif_object_init
                 U nvif_object_mthd
                 U nvif_user_fini
...

While this output format still works for us, it does generate a lot
of duplicates. Anyway, I think we can come back to this when LTO is
closer to getting upstreamed.

Sami
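
To illustrate how the per-member output above turns into duplicate entries, here is a small sketch that runs the same sed filter as cmd_undef_syms over an abridged copy of the listing. The function name `sample_nm_output` is hypothetical, and the sample keeps only a few lines from each of the two members shown in the thread; it is not real llvm-nm output for a full module.

```shell
#!/bin/sh
# Abridged sample of the per-member llvm-nm listing quoted in this thread.
# sample_nm_output is a hypothetical helper used only for illustration.
sample_nm_output() {
cat <<'EOF'
nvif/client.o:
---------------- W __cfi_check
                 U __ubsan_handle_cfi_check_fail
                 U nvif_object_fini
                 U nvif_object_init
                 U strncpy
nvif/device.o:
---------------- W __cfi_check
                 U __ubsan_handle_cfi_check_fail
                 U kfree
                 U nvif_object_fini
                 U nvif_object_init
EOF
}

# Same filter as cmd_undef_syms: keep only undefined (U) symbol names.
total=$(sample_nm_output | sed -n 's/^ *U //p' | wc -l | tr -d ' ')

# Adding sort -u collapses names that appear under more than one member.
unique=$(sample_nm_output | sed -n 's/^ *U //p' | sort -u | wc -l | tr -d ' ')

echo "undefined symbols: $total listed, $unique unique"
```

Symbols such as `__ubsan_handle_cfi_check_fail` are referenced by both members, so they appear once per member unless the list is deduplicated.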
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index a1730d42e5f3..a083bcec19d3 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -257,7 +257,7 @@ endef
 
 # List module undefined symbols (or empty line if not enabled)
 ifdef CONFIG_TRIM_UNUSED_KSYMS
-cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | xargs echo
+cmd_undef_syms = $(NM) $< | sed -n 's/^ *U //p' | sort -u | xargs echo
 else
 cmd_undef_syms = echo
 endif
With CONFIG_TRIM_UNUSED_KSYMS, if a module has enough dependencies to
exceed the default xargs command line size limit, the output is split
into multiple lines, which can result in used symbols getting trimmed.

This change removes duplicate dependencies, which will reduce the
probability of this happening and makes .mod files smaller and easier
to read.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 scripts/Makefile.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

base-commit: bb6d3fb354c5ee8d6bde2d576eb7220ea09862b9