Message ID | 20230914131225.13415-4-will@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | Fix 'faddr2line' for LLVM arm64 builds |
On Thu, Sep 14, 2023 at 6:12 AM Will Deacon <will@kernel.org> wrote:
>
> Mapping symbols emitted in the readelf output can confuse the
> 'faddr2line' symbol size calculation, resulting in the erroneous
> rejection of valid offsets. This is especially prevalent when building
> an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
> prefixed with a 32-bit data value in a '$d.n' section. For example:
>
> 447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
>    104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
>    106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
>    111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
>    112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
>     36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process
>
> Adding a warning to do_one_initcall() results in:
>
>   | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
>
> Which 'faddr2line' refuses to accept:
>
> $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> no match for do_one_initcall+0xf4/0x260
>
> Filter out these entries from readelf using a shell reimplementation of
> is_mapping_symbol(), so that the size of a symbol is calculated as a
> delta to the next symbol present in ksymtab.
>
> Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> Cc: John Stultz <jstultz@google.com>
> Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  scripts/faddr2line | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/scripts/faddr2line b/scripts/faddr2line
> index 6b8206802157..20d9b3d37843 100755
> --- a/scripts/faddr2line
> +++ b/scripts/faddr2line
> @@ -179,6 +179,11 @@ __faddr2line() {
>  			local cur_sym_elf_size=${fields[2]}
>  			local cur_sym_name=${fields[7]:-}
>
> +			# is_mapping_symbol(cur_sym_name)
> +			if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then

Thanks for the patch!

I'm curious about the `|$` in the final part of the regex. IIUC that
will match something like
$a
Do we have any such symbols without `.<n>` suffixes?

With aarch64 defconfig + cfi:

$ llvm-readelf -s vmlinux | grep '\$' | rev | cut -d ' ' -f 1 | rev | sort -u

I only see $d.<n> and $x.<n> where the initial value of <n> is zero
(as opposed to no `.<n>` suffix).

Can we tighten up that last part of the regex to be `\$[adtx]\.[0-9]+$` ?

Or perhaps you've observed mapping symbols use another convention than
what clang is doing?

https://sourceware.org/binutils/docs/as/AArch64-Mapping-Symbols.html
also only mentions $d and $x.

Ah,
https://developer.arm.com/documentation/dui0803/a/Accessing-and-managing-symbols-with-armlink/About-mapping-symbols
mentions $a for A32 and $t for T32.

Consider adding a link to the ARM documentation on mapping symbols in
the commit message?

(Curiously, `llvm-nm` does not print these symbols, but `llvm-readelf
-s` does).

> +				continue
> +			fi
> +
>  			if [[ $cur_sym_addr = $sym_addr ]] &&
>  			   [[ $cur_sym_elf_size = $sym_elf_size ]] &&
>  			   [[ $cur_sym_name = $sym_name ]]; then
> --
> 2.42.0.283.g2d96d420d3-goog
>
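For reference, the difference between the pattern in the patch and the tighter one suggested in this reply can be checked directly in bash. This is only a minimal sketch: the helper name `matches` is hypothetical, the stricter pattern assumes that only the `$` branch of the regex changes, and every sample name other than `$x.73` and `$d.78` is made up for illustration.

```shell
#!/usr/bin/env bash
# Regex from the patch under review.
patch_re='^((\.L)|(L0)|(\$[adtx](\.|$)))'
# Tighter '$' branch from the review (assumption: the other branches stay as-is).
strict_re='^((\.L)|(L0)|(\$[adtx]\.[0-9]+$))'

# matches REGEX NAME: print "yes" if NAME matches REGEX, otherwise "no".
matches() {
    if [[ $2 =~ $1 ]]; then
        echo yes
    else
        echo no
    fi
}

# '$a', '$x.foo' and '$dummy' are hypothetical names chosen to show where
# the two patterns disagree.
for name in '$x.73' '$d.78' '$a' '$x.foo' '$dummy' 'do_one_initcall'; do
    printf '%-16s patch=%-3s strict=%s\n' "$name" \
        "$(matches "$patch_re" "$name")" "$(matches "$strict_re" "$name")"
done
```

Both patterns accept the numbered `$x.<n>` and `$d.<n>` names; they differ only on a bare `$a` and on a non-numeric suffix, which the patch regex accepts and the stricter one rejects.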
On Thu, Sep 14, 2023 at 10:12 PM Will Deacon <will@kernel.org> wrote:
>
> Mapping symbols emitted in the readelf output can confuse the
> 'faddr2line' symbol size calculation, resulting in the erroneous
> rejection of valid offsets. This is especially prevalent when building
> an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
> prefixed with a 32-bit data value in a '$d.n' section. For example:
>
> 447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
>    104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
>    106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
>    111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
>    112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
>     36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process
>
> Adding a warning to do_one_initcall() results in:
>
>   | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
>
> Which 'faddr2line' refuses to accept:
>
> $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> no match for do_one_initcall+0xf4/0x260
>
> Filter out these entries from readelf using a shell reimplementation of
> is_mapping_symbol(), so that the size of a symbol is calculated as a
> delta to the next symbol present in ksymtab.
>
> Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> Cc: John Stultz <jstultz@google.com>
> Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  scripts/faddr2line | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/scripts/faddr2line b/scripts/faddr2line
> index 6b8206802157..20d9b3d37843 100755
> --- a/scripts/faddr2line
> +++ b/scripts/faddr2line
> @@ -179,6 +179,11 @@ __faddr2line() {
>  			local cur_sym_elf_size=${fields[2]}
>  			local cur_sym_name=${fields[7]:-}
>
> +			# is_mapping_symbol(cur_sym_name)
> +			if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then
> +				continue
> +			fi
> +

Too many parentheses.

The latest include/linux/module_symbol.h looks like this.

static inline int is_mapping_symbol(const char *str)
{
	if (str[0] == '.' && str[1] == 'L')
		return true;
	if (str[0] == 'L' && str[1] == '0')
		return true;
	return str[0] == '$';
}

Does this work?

	if [[ ${cur_sym_name} =~ ^(\.L|L0|\$) ]]; then
		continue
	fi

>  			if [[ $cur_sym_addr = $sym_addr ]] &&
>  			   [[ $cur_sym_elf_size = $sym_elf_size ]] &&
>  			   [[ $cur_sym_name = $sym_name ]]; then
> --
> 2.42.0.283.g2d96d420d3-goog
>
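As a quick illustration, the suggested condition can be wrapped in a small bash function and run against the names from the readelf sample. This is a sketch rather than the code that was eventually merged: wrapping the test in a function is an assumption made here for readability, and the `.Ltmp0` and `L0` inputs are hypothetical names added to exercise the other two branches.

```shell
#!/usr/bin/env bash
# Shell counterpart of the simplified is_mapping_symbol() quoted above:
# a name counts as a mapping symbol if it starts with ".L", "L0" or "$".
is_mapping_symbol() {
    # Keeping the pattern in a variable avoids quoting surprises in [[ =~ ]].
    local re='^(\.L|L0|\$)'
    [[ $1 =~ $re ]]
}

# '$x.73', '$d.78' and the two function names come from the readelf sample;
# '.Ltmp0' and 'L0' are hypothetical inputs.
for name in '$x.73' '$d.78' '.Ltmp0' 'L0' 'do_one_initcall' 'run_init_process'; do
    if is_mapping_symbol "$name"; then
        echo "skip $name"
    else
        echo "keep $name"
    fi
done
```

Because any leading `$` now counts, the `[adtx]` class and the `.<n>` suffix no longer need to be spelled out in the pattern.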
On Tue, Sep 26, 2023 at 01:50:20AM +0900, Masahiro Yamada wrote:
> On Thu, Sep 14, 2023 at 10:12 PM Will Deacon <will@kernel.org> wrote:
> >
> > Mapping symbols emitted in the readelf output can confuse the
> > 'faddr2line' symbol size calculation, resulting in the erroneous
> > rejection of valid offsets. This is especially prevalent when building
> > an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
> > prefixed with a 32-bit data value in a '$d.n' section. For example:
> >
> > 447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
> >    104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
> >    106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
> >    111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
> >    112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
> >     36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process
> >
> > Adding a warning to do_one_initcall() results in:
> >
> >   | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
> >
> > Which 'faddr2line' refuses to accept:
> >
> > $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> > skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> > no match for do_one_initcall+0xf4/0x260
> >
> > Filter out these entries from readelf using a shell reimplementation of
> > is_mapping_symbol(), so that the size of a symbol is calculated as a
> > delta to the next symbol present in ksymtab.
> >
> > Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> > Cc: John Stultz <jstultz@google.com>
> > Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> >  scripts/faddr2line | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/scripts/faddr2line b/scripts/faddr2line
> > index 6b8206802157..20d9b3d37843 100755
> > --- a/scripts/faddr2line
> > +++ b/scripts/faddr2line
> > @@ -179,6 +179,11 @@ __faddr2line() {
> >  			local cur_sym_elf_size=${fields[2]}
> >  			local cur_sym_name=${fields[7]:-}
> >
> > +			# is_mapping_symbol(cur_sym_name)
> > +			if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then
> > +				continue
> > +			fi
> > +
>
>
> Too many parentheses.

Ha, well _that_ is subjective! I really think they help when it comes to
regex syntax. However...

> The latest include/linux/module_symbol.h looks like this.
>
> static inline int is_mapping_symbol(const char *str)
> {
> 	if (str[0] == '.' && str[1] == 'L')
> 		return true;
> 	if (str[0] == 'L' && str[1] == '0')
> 		return true;
> 	return str[0] == '$';
> }

...oh, nice, that got simplified a whole lot by ff09f6fd2972 ("modpost,
kallsyms: Treat add '$'-prefixed symbols as mapping symbols") in the recent
merge window, so I can definitely simplify the regex.

> Does this work?
>
> 	if [[ ${cur_sym_name} =~ ^(\.L|L0|\$) ]]; then
> 		continue
> 	fi

Looks about right.

Will
On Mon, Sep 18, 2023 at 08:46:22AM -0700, Nick Desaulniers wrote:
> On Thu, Sep 14, 2023 at 6:12 AM Will Deacon <will@kernel.org> wrote:
> >
> > Mapping symbols emitted in the readelf output can confuse the
> > 'faddr2line' symbol size calculation, resulting in the erroneous
> > rejection of valid offsets. This is especially prevalent when building
> > an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
> > prefixed with a 32-bit data value in a '$d.n' section. For example:
> >
> > 447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
> >    104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
> >    106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
> >    111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
> >    112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
> >     36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process
> >
> > Adding a warning to do_one_initcall() results in:
> >
> >   | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
> >
> > Which 'faddr2line' refuses to accept:
> >
> > $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> > skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> > no match for do_one_initcall+0xf4/0x260
> >
> > Filter out these entries from readelf using a shell reimplementation of
> > is_mapping_symbol(), so that the size of a symbol is calculated as a
> > delta to the next symbol present in ksymtab.
> >
> > Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> > Cc: John Stultz <jstultz@google.com>
> > Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> >  scripts/faddr2line | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/scripts/faddr2line b/scripts/faddr2line
> > index 6b8206802157..20d9b3d37843 100755
> > --- a/scripts/faddr2line
> > +++ b/scripts/faddr2line
> > @@ -179,6 +179,11 @@ __faddr2line() {
> >  			local cur_sym_elf_size=${fields[2]}
> >  			local cur_sym_name=${fields[7]:-}
> >
> > +			# is_mapping_symbol(cur_sym_name)
> > +			if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then
>
> Thanks for the patch!
>
> I'm curious about the `|$` in the final part of the regex. IIUC that
> will match something like
> $a
> Do we have any such symbols without `.<n>` suffixes?

tbh, I just blindly followed the implementation of is_mapping_symbol() at
the time, but Masahiro has since pointed out that it's been significantly
simplified so this regex should get much more manageable in the next
version.

Will
diff --git a/scripts/faddr2line b/scripts/faddr2line
index 6b8206802157..20d9b3d37843 100755
--- a/scripts/faddr2line
+++ b/scripts/faddr2line
@@ -179,6 +179,11 @@ __faddr2line() {
 			local cur_sym_elf_size=${fields[2]}
 			local cur_sym_name=${fields[7]:-}
 
+			# is_mapping_symbol(cur_sym_name)
+			if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then
+				continue
+			fi
+
 			if [[ $cur_sym_addr = $sym_addr ]] &&
 			   [[ $cur_sym_elf_size = $sym_elf_size ]] &&
 			   [[ $cur_sym_name = $sym_name ]]; then
Mapping symbols emitted in the readelf output can confuse the
'faddr2line' symbol size calculation, resulting in the erroneous
rejection of valid offsets. This is especially prevalent when building
an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
prefixed with a 32-bit data value in a '$d.n' section. For example:

447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
   104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
   106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
   111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
   112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
    36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process

Adding a warning to do_one_initcall() results in:

  | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260

Which 'faddr2line' refuses to accept:

$ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
no match for do_one_initcall+0xf4/0x260

Filter out these entries from readelf using a shell reimplementation of
is_mapping_symbol(), so that the size of a symbol is calculated as a
delta to the next symbol present in ksymtab.

Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: John Stultz <jstultz@google.com>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 scripts/faddr2line | 5 +++++
 1 file changed, 5 insertions(+)
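The two sizes in the error message can be reproduced from the six sample rows alone. The following is a simplified sketch of the size calculation described in the commit message, not the script itself: the function name `sym_size`, its arguments and the trimmed three-column row format are hypothetical. It also assumes that candidates closer than the ELF size are ignored, which is what the 0x224 in the error message suggests, and it uses the simplified pattern from the review as the filter.

```shell
#!/usr/bin/env bash
# Sample rows from the commit message, reduced to "address size name" and
# already sorted by address.
rows='ffff800080014b80 548 do_one_initcall
ffff800080014c74 0 $x.73
ffff800080014d30 0 $x.75
ffff800080014da4 0 $d.78
ffff800080014da8 0 $x.79
ffff800080014de0 200 run_init_process'

# sym_size TARGET ELF_SIZE FILTER
# Print the distance (hex) from TARGET to the next accepted symbol.
# FILTER=1 skips mapping symbols first, as the patch does.
sym_size() {
    local target=$1 elf_size=$2 filter=$3
    local re='^(\.L|L0|\$)'
    local addr size name base='' delta
    while read -r addr size name; do
        if [[ $filter = 1 && $name =~ $re ]]; then
            continue
        fi
        if [[ -z $base ]]; then
            # Only the low 32 bits are used, to stay well inside bash's
            # signed 64-bit arithmetic.
            [[ $name = "$target" ]] && base=$(( 16#${addr:8} ))
            continue
        fi
        delta=$(( 16#${addr:8} - base ))
        # Assumption: a candidate closer than the ELF size is ignored.
        (( delta < elf_size )) && continue
        printf '0x%x\n' "$delta"
        return 0
    done <<< "$rows"
    return 1
}

sym_size do_one_initcall 548 0   # mapping symbols kept
sym_size do_one_initcall 548 1   # mapping symbols filtered out
```

With the mapping symbols kept, the first candidate that is far enough away is `$d.78`, giving 0xda4 - 0xb80 = 0x224. With them filtered out, the next symbol is run_init_process, giving 0xde0 - 0xb80 = 0x260, the size reported in the warning.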