Message ID | 20240613133711.2867745-3-zhengyejian1@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | kallsyms: Emit symbol for holes in text and fix weak function issue | expand |
On Thu, Jun 13, 2024 at 10:36 PM Zheng Yejian <zhengyejian1@huawei.com> wrote: > > When a weak type function is overridden, its symbol will be removed > from the symbol table, but its code will not be removed. Besides, > due to lacking of size for kallsyms, kernel compute function size by > substracting its symbol address from its next symbol address (see > kallsyms_lookup_size_offset()). These will cause that size of some > function is computed to be larger than it actually is, just because > symbol of its following weak function is removed. > > This issue also causes multiple __fentry__ locations to be counted in > the some function scope, and eventually causes ftrace_location() to find > wrong __fentry__ location. It was reported in > Link: https://lore.kernel.org/all/20240607115211.734845-1-zhengyejian1@huawei.com/ > > Peter suggested to change scipts/kallsyms.c to emit readily > identifiable symbol names for all the weak junk, eg: > > __weak_junk_NNNNN > > The name of this kind symbol needs some discussion, but it's temporarily > called "__hole_symbol_XXXXX" in this patch: > 1. Pass size info to scripts/kallsyms (see mksysmap()); > 2. Traverse sorted function symbols, if one function address plus its > size less than next function address, it means there's a hole, then > emit a symbol "__hole_symbol_XXXXX" there which type is 't'. > > After this patch, the effect is as follows: > > $ cat /proc/kallsyms | grep -A 3 do_one_initcall > ffffffff810021e0 T do_one_initcall > ffffffff8100245e t __hole_symbol_XXXXX > ffffffff810024a0 t __pfx_rootfs_init_fs_context > > Suggested-by: Peter Zijlstra <peterz@infradead.org> > Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> With my quick test, "t__hole_symbol_XXXXX" was encoded into the following 10-byte stream. .byte 0x09, 0x94, 0xbf, 0x18, 0xf3, 0x3d, 0xce, 0xd1, 0xd1, 0x58 Now "t__hole_symbol_XXXXX" is the most common symbol name. However, 10 byte is consumed for every instance of "t__hole_symbol_XXXXX". This is much less efficient thanI had expected, although I did not analyze the logic of this inefficiency. > --- > scripts/kallsyms.c | 101 +++++++++++++++++++++++++++++++++++++++- > scripts/link-vmlinux.sh | 4 +- > scripts/mksysmap | 2 +- > 3 files changed, 102 insertions(+), 5 deletions(-) > > diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c > index 6559a9802f6e..5c4cde864a04 100644 > --- a/scripts/kallsyms.c > +++ b/scripts/kallsyms.c > @@ -35,6 +35,7 @@ > struct sym_entry { > struct sym_entry *next; > unsigned long long addr; > + unsigned long long size; > unsigned int len; > unsigned int seq; > unsigned int start_pos; > @@ -74,6 +75,7 @@ static int token_profit[0x10000]; > static unsigned char best_table[256][2]; > static unsigned char best_table_len[256]; > > +static const char hole_symbol[] = "__hole_symbol_XXXXX"; > > static void usage(void) > { > @@ -130,8 +132,16 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) > size_t len; > ssize_t readlen; > struct sym_entry *sym; > + unsigned long long size = 0; > > errno = 0; > + /* > + * Example of expected symbol format: > + * 1. symbol with size info: > + * ffffffff81000070 00000000000001d7 T __startup_64 > + * 2. symbol without size info: > + * 0000000002a00000 A text_size > + */ > readlen = getline(buf, buf_len, in); > if (readlen < 0) { > if (errno) { > @@ -145,9 +155,24 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) > (*buf)[readlen - 1] = 0; > > addr = strtoull(*buf, &p, 16); > + if (*buf == p || *p++ != ' ') { > + fprintf(stderr, "line format error: unable to parse address\n"); > + exit(EXIT_FAILURE); > + } > + > + if (*p == '0') { > + char *str = p; > > - if (*buf == p || *p++ != ' ' || !isascii((type = *p++)) || *p++ != ' ') { > - fprintf(stderr, "line format error\n"); > + size = strtoull(str, &p, 16); > + if (str == p || *p++ != ' ') { > + fprintf(stderr, "line format error: unable to parse size\n"); > + exit(EXIT_FAILURE); > + } > + } > + > + type = *p++; > + if (!isascii(type) || *p++ != ' ') { > + fprintf(stderr, "line format error: unable to parse type\n"); > exit(EXIT_FAILURE); > } > > @@ -182,6 +207,7 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) > exit(EXIT_FAILURE); > } > sym->addr = addr; > + sym->size = size; > sym->len = len; > sym->sym[0] = type; > strcpy(sym_name(sym), name); > @@ -795,6 +821,76 @@ static void sort_symbols(void) > qsort(table, table_cnt, sizeof(table[0]), compare_symbols); > } > > +static int may_exist_hole_after_symbol(const struct sym_entry *se) The return type should be bool. > +{ > + char type = se->sym[0]; > + > + /* Only check text symbol or weak symbol */ > + if (type != 't' && type != 'T' && > + type != 'w' && type != 'W') > + return 0; > + /* Symbol without size has no hole */ > + return se->size != 0; > +} > + > +static struct sym_entry *gen_hole_symbol(unsigned long long addr) > +{ > + struct sym_entry *sym; > + static size_t len = sizeof(hole_symbol); > + > + /* include type field */ > + sym = malloc(sizeof(*sym) + len + 1); > + if (!sym) { > + fprintf(stderr, "unable to allocate memory for hole symbol\n"); > + exit(EXIT_FAILURE); > + } > + sym->addr = addr; > + sym->size = 0; > + sym->len = len; > + sym->sym[0] = 't'; > + strcpy(sym_name(sym), hole_symbol); > + sym->percpu_absolute = 0; > + return sym; > +} > + > +static void emit_hole_symbols(void) > +{ > + unsigned int i, pos, nr_emit; > + struct sym_entry **new_table; > + unsigned int new_cnt; > + > + nr_emit = 0; > + for (i = 0; i < table_cnt - 1; i++) { > + if (may_exist_hole_after_symbol(table[i]) && > + table[i]->addr + table[i]->size < table[i+1]->addr) > + nr_emit++; > + } > + if (!nr_emit) > + return; > + > + new_cnt = table_cnt + nr_emit; > + new_table = malloc(sizeof(*new_table) * new_cnt); Do you need to allocate another huge table? You can use realloc() to append the room for nr_emit if you iterate the table in the reverse order. > + if (!new_table) { > + fprintf(stderr, "unable to allocate memory for new table\n"); > + exit(EXIT_FAILURE); > + } > + > + pos = 0; > + for (i = 0; i < table_cnt; i++) { > + unsigned long long addr; > + > + new_table[pos++] = table[i]; > + if ((i == table_cnt - 1) || !may_exist_hole_after_symbol(table[i])) > + continue; > + addr = table[i]->addr + table[i]->size; > + if (addr < table[i+1]->addr) > + new_table[pos++] = gen_hole_symbol(addr); > + } > + free(table); > + table = new_table; > + table_cnt = new_cnt; > +} > + > static void make_percpus_absolute(void) > { > unsigned int i; > @@ -854,6 +950,7 @@ int main(int argc, char **argv) > if (absolute_percpu) > make_percpus_absolute(); > sort_symbols(); > + emit_hole_symbols(); > if (base_relative) > record_relative_base(); > optimize_token_table(); > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh > index 518c70b8db50..8e1373902bfe 100755 > --- a/scripts/link-vmlinux.sh > +++ b/scripts/link-vmlinux.sh > @@ -189,11 +189,11 @@ kallsyms_step() > } > > # Create map file with all symbols from ${1} > -# See mksymap for additional details > +# See mksysmap for additional details > mksysmap() > { > info NM ${2} > - ${NM} -n "${1}" | sed -f "${srctree}/scripts/mksysmap" > "${2}" > + ${NM} -nS "${1}" | sed -f "${srctree}/scripts/mksysmap" > "${2}" > } > > sorttable() > diff --git a/scripts/mksysmap b/scripts/mksysmap > index c12723a04655..7a4415f21143 100755 > --- a/scripts/mksysmap > +++ b/scripts/mksysmap > @@ -2,7 +2,7 @@ > # SPDX-License-Identifier: GPL-2.0-only > # > # sed script to filter out symbols that are not needed for System.map, > -# or not suitable for kallsyms. The input should be 'nm -n <file>'. > +# or not suitable for kallsyms. The input should be 'nm -nS <file>'. > # > # System.map is used by module-init tools and some debugging > # tools to retrieve the actual addresses of symbols in the kernel. > -- > 2.25.1 > >
On 2024/7/16 16:33, Masahiro Yamada wrote: > On Thu, Jun 13, 2024 at 10:36 PM Zheng Yejian <zhengyejian1@huawei.com> wrote: >> >> When a weak type function is overridden, its symbol will be removed >> from the symbol table, but its code will not be removed. Besides, >> due to lacking of size for kallsyms, kernel compute function size by >> substracting its symbol address from its next symbol address (see >> kallsyms_lookup_size_offset()). These will cause that size of some >> function is computed to be larger than it actually is, just because >> symbol of its following weak function is removed. >> >> This issue also causes multiple __fentry__ locations to be counted in >> the some function scope, and eventually causes ftrace_location() to find >> wrong __fentry__ location. It was reported in >> Link: https://lore.kernel.org/all/20240607115211.734845-1-zhengyejian1@huawei.com/ >> >> Peter suggested to change scipts/kallsyms.c to emit readily >> identifiable symbol names for all the weak junk, eg: >> >> __weak_junk_NNNNN >> >> The name of this kind symbol needs some discussion, but it's temporarily >> called "__hole_symbol_XXXXX" in this patch: >> 1. Pass size info to scripts/kallsyms (see mksysmap()); >> 2. Traverse sorted function symbols, if one function address plus its >> size less than next function address, it means there's a hole, then >> emit a symbol "__hole_symbol_XXXXX" there which type is 't'. >> >> After this patch, the effect is as follows: >> >> $ cat /proc/kallsyms | grep -A 3 do_one_initcall >> ffffffff810021e0 T do_one_initcall >> ffffffff8100245e t __hole_symbol_XXXXX >> ffffffff810024a0 t __pfx_rootfs_init_fs_context >> >> Suggested-by: Peter Zijlstra <peterz@infradead.org> >> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> > > > > With my quick test, "t__hole_symbol_XXXXX" was encoded > into the following 10-byte stream. > > .byte 0x09, 0x94, 0xbf, 0x18, 0xf3, 0x3d, 0xce, 0xd1, 0xd1, 0x58 > > > > Now "t__hole_symbol_XXXXX" is the most common symbol name. > However, 10 byte is consumed for every instance of > "t__hole_symbol_XXXXX". > > This is much less efficient thanI had expected, > although I did not analyze the logic of this inefficiency. > Hi, Masahiro! In my local test, "t__hole_symbol_XXXXX" was finally encoded into just one byte. See "kallsyms_token_table" in the .tmp_vmlinux.kallsyms2.S: kallsyms_token_table: [...] .asciz "t__hole_symbol_XXXXX" .asciz "hole_symbol_XXXXX" .asciz "e_symbol_XXXXX" .asciz "XXXXX" .asciz "XXX" .asciz "e_symbol_" .asciz "ymbol_" .asciz "ymb" .asciz "hol" .asciz "ol_" .asciz "pfx" .asciz "pf" .asciz "e_s" .asciz "ym" .asciz "t__" .asciz "_s" .asciz "ol" .asciz "__" .asciz "XX" But it would still takes up several tokens due to substrings of "t__hole_symbol_XXXXX" would also become the most common ones. After this patch, the number of "t__hole_symbol_XXXXX" will be ~30% of the total. > > > > > > >> --- >> scripts/kallsyms.c | 101 +++++++++++++++++++++++++++++++++++++++- >> scripts/link-vmlinux.sh | 4 +- >> scripts/mksysmap | 2 +- >> 3 files changed, 102 insertions(+), 5 deletions(-) >> >> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c >> index 6559a9802f6e..5c4cde864a04 100644 >> --- a/scripts/kallsyms.c >> +++ b/scripts/kallsyms.c >> @@ -35,6 +35,7 @@ >> struct sym_entry { >> struct sym_entry *next; >> unsigned long long addr; >> + unsigned long long size; >> unsigned int len; >> unsigned int seq; >> unsigned int start_pos; >> @@ -74,6 +75,7 @@ static int token_profit[0x10000]; >> static unsigned char best_table[256][2]; >> static unsigned char best_table_len[256]; >> >> +static const char hole_symbol[] = "__hole_symbol_XXXXX"; >> >> static void usage(void) >> { >> @@ -130,8 +132,16 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) >> size_t len; >> ssize_t readlen; >> struct sym_entry *sym; >> + unsigned long long size = 0; >> >> errno = 0; >> + /* >> + * Example of expected symbol format: >> + * 1. symbol with size info: >> + * ffffffff81000070 00000000000001d7 T __startup_64 >> + * 2. symbol without size info: >> + * 0000000002a00000 A text_size >> + */ >> readlen = getline(buf, buf_len, in); >> if (readlen < 0) { >> if (errno) { >> @@ -145,9 +155,24 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) >> (*buf)[readlen - 1] = 0; >> >> addr = strtoull(*buf, &p, 16); >> + if (*buf == p || *p++ != ' ') { >> + fprintf(stderr, "line format error: unable to parse address\n"); >> + exit(EXIT_FAILURE); >> + } >> + >> + if (*p == '0') { >> + char *str = p; >> >> - if (*buf == p || *p++ != ' ' || !isascii((type = *p++)) || *p++ != ' ') { >> - fprintf(stderr, "line format error\n"); >> + size = strtoull(str, &p, 16); >> + if (str == p || *p++ != ' ') { >> + fprintf(stderr, "line format error: unable to parse size\n"); >> + exit(EXIT_FAILURE); >> + } >> + } >> + >> + type = *p++; >> + if (!isascii(type) || *p++ != ' ') { >> + fprintf(stderr, "line format error: unable to parse type\n"); >> exit(EXIT_FAILURE); >> } >> >> @@ -182,6 +207,7 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) >> exit(EXIT_FAILURE); >> } >> sym->addr = addr; >> + sym->size = size; >> sym->len = len; >> sym->sym[0] = type; >> strcpy(sym_name(sym), name); >> @@ -795,6 +821,76 @@ static void sort_symbols(void) >> qsort(table, table_cnt, sizeof(table[0]), compare_symbols); >> } >> >> +static int may_exist_hole_after_symbol(const struct sym_entry *se) > > > The return type should be bool. > Yes! > > >> +{ >> + char type = se->sym[0]; >> + >> + /* Only check text symbol or weak symbol */ >> + if (type != 't' && type != 'T' && >> + type != 'w' && type != 'W') >> + return 0; >> + /* Symbol without size has no hole */ >> + return se->size != 0; >> +} >> + >> +static struct sym_entry *gen_hole_symbol(unsigned long long addr) >> +{ >> + struct sym_entry *sym; >> + static size_t len = sizeof(hole_symbol); >> + >> + /* include type field */ >> + sym = malloc(sizeof(*sym) + len + 1); >> + if (!sym) { >> + fprintf(stderr, "unable to allocate memory for hole symbol\n"); >> + exit(EXIT_FAILURE); >> + } >> + sym->addr = addr; >> + sym->size = 0; >> + sym->len = len; >> + sym->sym[0] = 't'; >> + strcpy(sym_name(sym), hole_symbol); >> + sym->percpu_absolute = 0; >> + return sym; >> +} >> + >> +static void emit_hole_symbols(void) >> +{ >> + unsigned int i, pos, nr_emit; >> + struct sym_entry **new_table; >> + unsigned int new_cnt; >> + >> + nr_emit = 0; >> + for (i = 0; i < table_cnt - 1; i++) { >> + if (may_exist_hole_after_symbol(table[i]) && >> + table[i]->addr + table[i]->size < table[i+1]->addr) >> + nr_emit++; >> + } >> + if (!nr_emit) >> + return; >> + >> + new_cnt = table_cnt + nr_emit; >> + new_table = malloc(sizeof(*new_table) * new_cnt); > > > Do you need to allocate another huge table? > > You can use realloc() to append the room for nr_emit > if you iterate the table in the reverse order. > Yes, it would be much better. If it turns out to be the "emit hole symbol" solution, I'll change it to that in the next version, actually, I forgot to mark this series as "RFC". > > > > > > >> + if (!new_table) { >> + fprintf(stderr, "unable to allocate memory for new table\n"); >> + exit(EXIT_FAILURE); >> + } >> + >> + pos = 0; >> + for (i = 0; i < table_cnt; i++) { >> + unsigned long long addr; >> + >> + new_table[pos++] = table[i]; >> + if ((i == table_cnt - 1) || !may_exist_hole_after_symbol(table[i])) >> + continue; >> + addr = table[i]->addr + table[i]->size; >> + if (addr < table[i+1]->addr) >> + new_table[pos++] = gen_hole_symbol(addr); >> + } >> + free(table); >> + table = new_table; >> + table_cnt = new_cnt; >> +} >> + >> static void make_percpus_absolute(void) >> { >> unsigned int i; >> @@ -854,6 +950,7 @@ int main(int argc, char **argv) >> if (absolute_percpu) >> make_percpus_absolute(); >> sort_symbols(); >> + emit_hole_symbols(); >> if (base_relative) >> record_relative_base(); >> optimize_token_table(); >> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh >> index 518c70b8db50..8e1373902bfe 100755 >> --- a/scripts/link-vmlinux.sh >> +++ b/scripts/link-vmlinux.sh >> @@ -189,11 +189,11 @@ kallsyms_step() >> } >> >> # Create map file with all symbols from ${1} >> -# See mksymap for additional details >> +# See mksysmap for additional details >> mksysmap() >> { >> info NM ${2} >> - ${NM} -n "${1}" | sed -f "${srctree}/scripts/mksysmap" > "${2}" >> + ${NM} -nS "${1}" | sed -f "${srctree}/scripts/mksysmap" > "${2}" >> } >> >> sorttable() >> diff --git a/scripts/mksysmap b/scripts/mksysmap >> index c12723a04655..7a4415f21143 100755 >> --- a/scripts/mksysmap >> +++ b/scripts/mksysmap >> @@ -2,7 +2,7 @@ >> # SPDX-License-Identifier: GPL-2.0-only >> # >> # sed script to filter out symbols that are not needed for System.map, >> -# or not suitable for kallsyms. The input should be 'nm -n <file>'. >> +# or not suitable for kallsyms. The input should be 'nm -nS <file>'. >> # >> # System.map is used by module-init tools and some debugging >> # tools to retrieve the actual addresses of symbols in the kernel. >> -- >> 2.25.1 >> >> > >
On Thu, Jul 18, 2024 at 12:45 PM Zheng Yejian <zhengyejian1@huawei.com> wrote: > > On 2024/7/16 16:33, Masahiro Yamada wrote: > > On Thu, Jun 13, 2024 at 10:36 PM Zheng Yejian <zhengyejian1@huawei.com> wrote: > >> > >> When a weak type function is overridden, its symbol will be removed > >> from the symbol table, but its code will not be removed. Besides, > >> due to lacking of size for kallsyms, kernel compute function size by > >> substracting its symbol address from its next symbol address (see > >> kallsyms_lookup_size_offset()). These will cause that size of some > >> function is computed to be larger than it actually is, just because > >> symbol of its following weak function is removed. > >> > >> This issue also causes multiple __fentry__ locations to be counted in > >> the some function scope, and eventually causes ftrace_location() to find > >> wrong __fentry__ location. It was reported in > >> Link: https://lore.kernel.org/all/20240607115211.734845-1-zhengyejian1@huawei.com/ > >> > >> Peter suggested to change scipts/kallsyms.c to emit readily > >> identifiable symbol names for all the weak junk, eg: > >> > >> __weak_junk_NNNNN > >> > >> The name of this kind symbol needs some discussion, but it's temporarily > >> called "__hole_symbol_XXXXX" in this patch: > >> 1. Pass size info to scripts/kallsyms (see mksysmap()); > >> 2. Traverse sorted function symbols, if one function address plus its > >> size less than next function address, it means there's a hole, then > >> emit a symbol "__hole_symbol_XXXXX" there which type is 't'. > >> > >> After this patch, the effect is as follows: > >> > >> $ cat /proc/kallsyms | grep -A 3 do_one_initcall > >> ffffffff810021e0 T do_one_initcall > >> ffffffff8100245e t __hole_symbol_XXXXX > >> ffffffff810024a0 t __pfx_rootfs_init_fs_context > >> > >> Suggested-by: Peter Zijlstra <peterz@infradead.org> > >> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> > > > > > > > > With my quick test, "t__hole_symbol_XXXXX" was encoded > > into the following 10-byte stream. > > > > .byte 0x09, 0x94, 0xbf, 0x18, 0xf3, 0x3d, 0xce, 0xd1, 0xd1, 0x58 > > > > > > > > Now "t__hole_symbol_XXXXX" is the most common symbol name. > > However, 10 byte is consumed for every instance of > > "t__hole_symbol_XXXXX". > > > > This is much less efficient thanI had expected, > > although I did not analyze the logic of this inefficiency. > > > Hi, Masahiro! > > In my local test, "t__hole_symbol_XXXXX" was finally encoded > into just one byte. See "kallsyms_token_table" in the .tmp_vmlinux.kallsyms2.S: > > kallsyms_token_table: > [...] > .asciz "t__hole_symbol_XXXXX" > .asciz "hole_symbol_XXXXX" > .asciz "e_symbol_XXXXX" > .asciz "XXXXX" > .asciz "XXX" > .asciz "e_symbol_" > .asciz "ymbol_" > .asciz "ymb" > .asciz "hol" > .asciz "ol_" > .asciz "pfx" > .asciz "pf" > .asciz "e_s" > .asciz "ym" > .asciz "t__" > .asciz "_s" > .asciz "ol" > .asciz "__" > .asciz "XX" > > But it would still takes up several tokens due to substrings of > "t__hole_symbol_XXXXX" would also become the most common ones. > After this patch, the number of "t__hole_symbol_XXXXX" will be ~30% of the total. > > > > > > > > > > > > > > >> --- > >> scripts/kallsyms.c | 101 +++++++++++++++++++++++++++++++++++++++- > >> scripts/link-vmlinux.sh | 4 +- > >> scripts/mksysmap | 2 +- > >> 3 files changed, 102 insertions(+), 5 deletions(-) > >> > >> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c > >> index 6559a9802f6e..5c4cde864a04 100644 > >> --- a/scripts/kallsyms.c > >> +++ b/scripts/kallsyms.c > >> @@ -35,6 +35,7 @@ > >> struct sym_entry { > >> struct sym_entry *next; > >> unsigned long long addr; > >> + unsigned long long size; > >> unsigned int len; > >> unsigned int seq; > >> unsigned int start_pos; > >> @@ -74,6 +75,7 @@ static int token_profit[0x10000]; > >> static unsigned char best_table[256][2]; > >> static unsigned char best_table_len[256]; > >> > >> +static const char hole_symbol[] = "__hole_symbol_XXXXX"; > >> > >> static void usage(void) > >> { > >> @@ -130,8 +132,16 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) > >> size_t len; > >> ssize_t readlen; > >> struct sym_entry *sym; > >> + unsigned long long size = 0; > >> > >> errno = 0; > >> + /* > >> + * Example of expected symbol format: > >> + * 1. symbol with size info: > >> + * ffffffff81000070 00000000000001d7 T __startup_64 > >> + * 2. symbol without size info: > >> + * 0000000002a00000 A text_size > >> + */ > >> readlen = getline(buf, buf_len, in); > >> if (readlen < 0) { > >> if (errno) { > >> @@ -145,9 +155,24 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) > >> (*buf)[readlen - 1] = 0; > >> > >> addr = strtoull(*buf, &p, 16); > >> + if (*buf == p || *p++ != ' ') { > >> + fprintf(stderr, "line format error: unable to parse address\n"); > >> + exit(EXIT_FAILURE); > >> + } > >> + > >> + if (*p == '0') { > >> + char *str = p; > >> > >> - if (*buf == p || *p++ != ' ' || !isascii((type = *p++)) || *p++ != ' ') { > >> - fprintf(stderr, "line format error\n"); > >> + size = strtoull(str, &p, 16); > >> + if (str == p || *p++ != ' ') { > >> + fprintf(stderr, "line format error: unable to parse size\n"); > >> + exit(EXIT_FAILURE); > >> + } > >> + } > >> + > >> + type = *p++; > >> + if (!isascii(type) || *p++ != ' ') { > >> + fprintf(stderr, "line format error: unable to parse type\n"); > >> exit(EXIT_FAILURE); > >> } > >> > >> @@ -182,6 +207,7 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) > >> exit(EXIT_FAILURE); > >> } > >> sym->addr = addr; > >> + sym->size = size; > >> sym->len = len; > >> sym->sym[0] = type; > >> strcpy(sym_name(sym), name); > >> @@ -795,6 +821,76 @@ static void sort_symbols(void) > >> qsort(table, table_cnt, sizeof(table[0]), compare_symbols); > >> } > >> > >> +static int may_exist_hole_after_symbol(const struct sym_entry *se) > > > > > > The return type should be bool. > > > > Yes! > > > > > > >> +{ > >> + char type = se->sym[0]; > >> + > >> + /* Only check text symbol or weak symbol */ > >> + if (type != 't' && type != 'T' && > >> + type != 'w' && type != 'W') > >> + return 0; > >> + /* Symbol without size has no hole */ > >> + return se->size != 0; > >> +} > >> + > >> +static struct sym_entry *gen_hole_symbol(unsigned long long addr) > >> +{ > >> + struct sym_entry *sym; > >> + static size_t len = sizeof(hole_symbol); > >> + > >> + /* include type field */ > >> + sym = malloc(sizeof(*sym) + len + 1); > >> + if (!sym) { > >> + fprintf(stderr, "unable to allocate memory for hole symbol\n"); > >> + exit(EXIT_FAILURE); > >> + } > >> + sym->addr = addr; > >> + sym->size = 0; > >> + sym->len = len; > >> + sym->sym[0] = 't'; > >> + strcpy(sym_name(sym), hole_symbol); > >> + sym->percpu_absolute = 0; > >> + return sym; > >> +} > >> + > >> +static void emit_hole_symbols(void) > >> +{ > >> + unsigned int i, pos, nr_emit; > >> + struct sym_entry **new_table; > >> + unsigned int new_cnt; > >> + > >> + nr_emit = 0; > >> + for (i = 0; i < table_cnt - 1; i++) { > >> + if (may_exist_hole_after_symbol(table[i]) && > >> + table[i]->addr + table[i]->size < table[i+1]->addr) > >> + nr_emit++; > >> + } > >> + if (!nr_emit) > >> + return; > >> + > >> + new_cnt = table_cnt + nr_emit; > >> + new_table = malloc(sizeof(*new_table) * new_cnt); > > > > > > Do you need to allocate another huge table? > > > > You can use realloc() to append the room for nr_emit > > if you iterate the table in the reverse order. > > > > Yes, it would be much better. If it turns out to be the > "emit hole symbol" solution, I'll change it to that in the next version, > actually, I forgot to mark this series as "RFC". "__hole_symbol_XXXXX" is too much. You can use the empty symbol type/name as a special case to represent the hole.
On 2024/7/20 22:14, Masahiro Yamada wrote: > On Thu, Jul 18, 2024 at 12:45 PM Zheng Yejian <zhengyejian1@huawei.com> wrote: >> >> On 2024/7/16 16:33, Masahiro Yamada wrote: >>> On Thu, Jun 13, 2024 at 10:36 PM Zheng Yejian <zhengyejian1@huawei.com> wrote: >>>> >>>> When a weak type function is overridden, its symbol will be removed >>>> from the symbol table, but its code will not be removed. Besides, >>>> due to lacking of size for kallsyms, kernel compute function size by >>>> substracting its symbol address from its next symbol address (see >>>> kallsyms_lookup_size_offset()). These will cause that size of some >>>> function is computed to be larger than it actually is, just because >>>> symbol of its following weak function is removed. >>>> >>>> This issue also causes multiple __fentry__ locations to be counted in >>>> the some function scope, and eventually causes ftrace_location() to find >>>> wrong __fentry__ location. It was reported in >>>> Link: https://lore.kernel.org/all/20240607115211.734845-1-zhengyejian1@huawei.com/ >>>> >>>> Peter suggested to change scipts/kallsyms.c to emit readily >>>> identifiable symbol names for all the weak junk, eg: >>>> >>>> __weak_junk_NNNNN >>>> >>>> The name of this kind symbol needs some discussion, but it's temporarily >>>> called "__hole_symbol_XXXXX" in this patch: >>>> 1. Pass size info to scripts/kallsyms (see mksysmap()); >>>> 2. Traverse sorted function symbols, if one function address plus its >>>> size less than next function address, it means there's a hole, then >>>> emit a symbol "__hole_symbol_XXXXX" there which type is 't'. >>>> >>>> After this patch, the effect is as follows: >>>> >>>> $ cat /proc/kallsyms | grep -A 3 do_one_initcall >>>> ffffffff810021e0 T do_one_initcall >>>> ffffffff8100245e t __hole_symbol_XXXXX >>>> ffffffff810024a0 t __pfx_rootfs_init_fs_context >>>> >>>> Suggested-by: Peter Zijlstra <peterz@infradead.org> >>>> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> >>> >>> >>> >>> With my quick test, "t__hole_symbol_XXXXX" was encoded >>> into the following 10-byte stream. >>> >>> .byte 0x09, 0x94, 0xbf, 0x18, 0xf3, 0x3d, 0xce, 0xd1, 0xd1, 0x58 >>> >>> >>> >>> Now "t__hole_symbol_XXXXX" is the most common symbol name. >>> However, 10 byte is consumed for every instance of >>> "t__hole_symbol_XXXXX". >>> >>> This is much less efficient thanI had expected, >>> although I did not analyze the logic of this inefficiency. >>> >> Hi, Masahiro! >> >> In my local test, "t__hole_symbol_XXXXX" was finally encoded >> into just one byte. See "kallsyms_token_table" in the .tmp_vmlinux.kallsyms2.S: >> >> kallsyms_token_table: >> [...] >> .asciz "t__hole_symbol_XXXXX" >> .asciz "hole_symbol_XXXXX" >> .asciz "e_symbol_XXXXX" >> .asciz "XXXXX" >> .asciz "XXX" >> .asciz "e_symbol_" >> .asciz "ymbol_" >> .asciz "ymb" >> .asciz "hol" >> .asciz "ol_" >> .asciz "pfx" >> .asciz "pf" >> .asciz "e_s" >> .asciz "ym" >> .asciz "t__" >> .asciz "_s" >> .asciz "ol" >> .asciz "__" >> .asciz "XX" >> >> But it would still takes up several tokens due to substrings of >> "t__hole_symbol_XXXXX" would also become the most common ones. >> After this patch, the number of "t__hole_symbol_XXXXX" will be ~30% of the total. >> >>> >>> >>> >>> >>> >>> >>>> --- >>>> scripts/kallsyms.c | 101 +++++++++++++++++++++++++++++++++++++++- >>>> scripts/link-vmlinux.sh | 4 +- >>>> scripts/mksysmap | 2 +- >>>> 3 files changed, 102 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c >>>> index 6559a9802f6e..5c4cde864a04 100644 >>>> --- a/scripts/kallsyms.c >>>> +++ b/scripts/kallsyms.c >>>> @@ -35,6 +35,7 @@ >>>> struct sym_entry { >>>> struct sym_entry *next; >>>> unsigned long long addr; >>>> + unsigned long long size; >>>> unsigned int len; >>>> unsigned int seq; >>>> unsigned int start_pos; >>>> @@ -74,6 +75,7 @@ static int token_profit[0x10000]; >>>> static unsigned char best_table[256][2]; >>>> static unsigned char best_table_len[256]; >>>> >>>> +static const char hole_symbol[] = "__hole_symbol_XXXXX"; >>>> >>>> static void usage(void) >>>> { >>>> @@ -130,8 +132,16 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) >>>> size_t len; >>>> ssize_t readlen; >>>> struct sym_entry *sym; >>>> + unsigned long long size = 0; >>>> >>>> errno = 0; >>>> + /* >>>> + * Example of expected symbol format: >>>> + * 1. symbol with size info: >>>> + * ffffffff81000070 00000000000001d7 T __startup_64 >>>> + * 2. symbol without size info: >>>> + * 0000000002a00000 A text_size >>>> + */ >>>> readlen = getline(buf, buf_len, in); >>>> if (readlen < 0) { >>>> if (errno) { >>>> @@ -145,9 +155,24 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) >>>> (*buf)[readlen - 1] = 0; >>>> >>>> addr = strtoull(*buf, &p, 16); >>>> + if (*buf == p || *p++ != ' ') { >>>> + fprintf(stderr, "line format error: unable to parse address\n"); >>>> + exit(EXIT_FAILURE); >>>> + } >>>> + >>>> + if (*p == '0') { >>>> + char *str = p; >>>> >>>> - if (*buf == p || *p++ != ' ' || !isascii((type = *p++)) || *p++ != ' ') { >>>> - fprintf(stderr, "line format error\n"); >>>> + size = strtoull(str, &p, 16); >>>> + if (str == p || *p++ != ' ') { >>>> + fprintf(stderr, "line format error: unable to parse size\n"); >>>> + exit(EXIT_FAILURE); >>>> + } >>>> + } >>>> + >>>> + type = *p++; >>>> + if (!isascii(type) || *p++ != ' ') { >>>> + fprintf(stderr, "line format error: unable to parse type\n"); >>>> exit(EXIT_FAILURE); >>>> } >>>> >>>> @@ -182,6 +207,7 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) >>>> exit(EXIT_FAILURE); >>>> } >>>> sym->addr = addr; >>>> + sym->size = size; >>>> sym->len = len; >>>> sym->sym[0] = type; >>>> strcpy(sym_name(sym), name); >>>> @@ -795,6 +821,76 @@ static void sort_symbols(void) >>>> qsort(table, table_cnt, sizeof(table[0]), compare_symbols); >>>> } >>>> >>>> +static int may_exist_hole_after_symbol(const struct sym_entry *se) >>> >>> >>> The return type should be bool. >>> >> >> Yes! >> >>> >>> >>>> +{ >>>> + char type = se->sym[0]; >>>> + >>>> + /* Only check text symbol or weak symbol */ >>>> + if (type != 't' && type != 'T' && >>>> + type != 'w' && type != 'W') >>>> + return 0; >>>> + /* Symbol without size has no hole */ >>>> + return se->size != 0; >>>> +} >>>> + >>>> +static struct sym_entry *gen_hole_symbol(unsigned long long addr) >>>> +{ >>>> + struct sym_entry *sym; >>>> + static size_t len = sizeof(hole_symbol); >>>> + >>>> + /* include type field */ >>>> + sym = malloc(sizeof(*sym) + len + 1); >>>> + if (!sym) { >>>> + fprintf(stderr, "unable to allocate memory for hole symbol\n"); >>>> + exit(EXIT_FAILURE); >>>> + } >>>> + sym->addr = addr; >>>> + sym->size = 0; >>>> + sym->len = len; >>>> + sym->sym[0] = 't'; >>>> + strcpy(sym_name(sym), hole_symbol); >>>> + sym->percpu_absolute = 0; >>>> + return sym; >>>> +} >>>> + >>>> +static void emit_hole_symbols(void) >>>> +{ >>>> + unsigned int i, pos, nr_emit; >>>> + struct sym_entry **new_table; >>>> + unsigned int new_cnt; >>>> + >>>> + nr_emit = 0; >>>> + for (i = 0; i < table_cnt - 1; i++) { >>>> + if (may_exist_hole_after_symbol(table[i]) && >>>> + table[i]->addr + table[i]->size < table[i+1]->addr) >>>> + nr_emit++; >>>> + } >>>> + if (!nr_emit) >>>> + return; >>>> + >>>> + new_cnt = table_cnt + nr_emit; >>>> + new_table = malloc(sizeof(*new_table) * new_cnt); >>> >>> >>> Do you need to allocate another huge table? >>> >>> You can use realloc() to append the room for nr_emit >>> if you iterate the table in the reverse order. >>> >> >> Yes, it would be much better. If it turns out to be the >> "emit hole symbol" solution, I'll change it to that in the next version, >> actually, I forgot to mark this series as "RFC". > > > "__hole_symbol_XXXXX" is too much. > > You can use the empty symbol type/name as a special case > to represent the hole. > Ok, I'll try it in v2. >
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c index 6559a9802f6e..5c4cde864a04 100644 --- a/scripts/kallsyms.c +++ b/scripts/kallsyms.c @@ -35,6 +35,7 @@ struct sym_entry { struct sym_entry *next; unsigned long long addr; + unsigned long long size; unsigned int len; unsigned int seq; unsigned int start_pos; @@ -74,6 +75,7 @@ static int token_profit[0x10000]; static unsigned char best_table[256][2]; static unsigned char best_table_len[256]; +static const char hole_symbol[] = "__hole_symbol_XXXXX"; static void usage(void) { @@ -130,8 +132,16 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) size_t len; ssize_t readlen; struct sym_entry *sym; + unsigned long long size = 0; errno = 0; + /* + * Example of expected symbol format: + * 1. symbol with size info: + * ffffffff81000070 00000000000001d7 T __startup_64 + * 2. symbol without size info: + * 0000000002a00000 A text_size + */ readlen = getline(buf, buf_len, in); if (readlen < 0) { if (errno) { @@ -145,9 +155,24 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) (*buf)[readlen - 1] = 0; addr = strtoull(*buf, &p, 16); + if (*buf == p || *p++ != ' ') { + fprintf(stderr, "line format error: unable to parse address\n"); + exit(EXIT_FAILURE); + } + + if (*p == '0') { + char *str = p; - if (*buf == p || *p++ != ' ' || !isascii((type = *p++)) || *p++ != ' ') { - fprintf(stderr, "line format error\n"); + size = strtoull(str, &p, 16); + if (str == p || *p++ != ' ') { + fprintf(stderr, "line format error: unable to parse size\n"); + exit(EXIT_FAILURE); + } + } + + type = *p++; + if (!isascii(type) || *p++ != ' ') { + fprintf(stderr, "line format error: unable to parse type\n"); exit(EXIT_FAILURE); } @@ -182,6 +207,7 @@ static struct sym_entry *read_symbol(FILE *in, char **buf, size_t *buf_len) exit(EXIT_FAILURE); } sym->addr = addr; + sym->size = size; sym->len = len; sym->sym[0] = type; strcpy(sym_name(sym), name); @@ -795,6 +821,76 @@ static void sort_symbols(void) qsort(table, table_cnt, sizeof(table[0]), compare_symbols); } +static int may_exist_hole_after_symbol(const struct sym_entry *se) +{ + char type = se->sym[0]; + + /* Only check text symbol or weak symbol */ + if (type != 't' && type != 'T' && + type != 'w' && type != 'W') + return 0; + /* Symbol without size has no hole */ + return se->size != 0; +} + +static struct sym_entry *gen_hole_symbol(unsigned long long addr) +{ + struct sym_entry *sym; + static size_t len = sizeof(hole_symbol); + + /* include type field */ + sym = malloc(sizeof(*sym) + len + 1); + if (!sym) { + fprintf(stderr, "unable to allocate memory for hole symbol\n"); + exit(EXIT_FAILURE); + } + sym->addr = addr; + sym->size = 0; + sym->len = len; + sym->sym[0] = 't'; + strcpy(sym_name(sym), hole_symbol); + sym->percpu_absolute = 0; + return sym; +} + +static void emit_hole_symbols(void) +{ + unsigned int i, pos, nr_emit; + struct sym_entry **new_table; + unsigned int new_cnt; + + nr_emit = 0; + for (i = 0; i < table_cnt - 1; i++) { + if (may_exist_hole_after_symbol(table[i]) && + table[i]->addr + table[i]->size < table[i+1]->addr) + nr_emit++; + } + if (!nr_emit) + return; + + new_cnt = table_cnt + nr_emit; + new_table = malloc(sizeof(*new_table) * new_cnt); + if (!new_table) { + fprintf(stderr, "unable to allocate memory for new table\n"); + exit(EXIT_FAILURE); + } + + pos = 0; + for (i = 0; i < table_cnt; i++) { + unsigned long long addr; + + new_table[pos++] = table[i]; + if ((i == table_cnt - 1) || !may_exist_hole_after_symbol(table[i])) + continue; + addr = table[i]->addr + table[i]->size; + if (addr < table[i+1]->addr) + new_table[pos++] = gen_hole_symbol(addr); + } + free(table); + table = new_table; + table_cnt = new_cnt; +} + static void make_percpus_absolute(void) { unsigned int i; @@ -854,6 +950,7 @@ int main(int argc, char **argv) if (absolute_percpu) make_percpus_absolute(); sort_symbols(); + emit_hole_symbols(); if (base_relative) record_relative_base(); optimize_token_table(); diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index 518c70b8db50..8e1373902bfe 100755 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -189,11 +189,11 @@ kallsyms_step() } # Create map file with all symbols from ${1} -# See mksymap for additional details +# See mksysmap for additional details mksysmap() { info NM ${2} - ${NM} -n "${1}" | sed -f "${srctree}/scripts/mksysmap" > "${2}" + ${NM} -nS "${1}" | sed -f "${srctree}/scripts/mksysmap" > "${2}" } sorttable() diff --git a/scripts/mksysmap b/scripts/mksysmap index c12723a04655..7a4415f21143 100755 --- a/scripts/mksysmap +++ b/scripts/mksysmap @@ -2,7 +2,7 @@ # SPDX-License-Identifier: GPL-2.0-only # # sed script to filter out symbols that are not needed for System.map, -# or not suitable for kallsyms. The input should be 'nm -n <file>'. +# or not suitable for kallsyms. The input should be 'nm -nS <file>'. # # System.map is used by module-init tools and some debugging # tools to retrieve the actual addresses of symbols in the kernel.
When a weak type function is overridden, its symbol will be removed from the symbol table, but its code will not be removed. Besides, due to lacking of size for kallsyms, kernel compute function size by substracting its symbol address from its next symbol address (see kallsyms_lookup_size_offset()). These will cause that size of some function is computed to be larger than it actually is, just because symbol of its following weak function is removed. This issue also causes multiple __fentry__ locations to be counted in the some function scope, and eventually causes ftrace_location() to find wrong __fentry__ location. It was reported in Link: https://lore.kernel.org/all/20240607115211.734845-1-zhengyejian1@huawei.com/ Peter suggested to change scipts/kallsyms.c to emit readily identifiable symbol names for all the weak junk, eg: __weak_junk_NNNNN The name of this kind symbol needs some discussion, but it's temporarily called "__hole_symbol_XXXXX" in this patch: 1. Pass size info to scripts/kallsyms (see mksysmap()); 2. Traverse sorted function symbols, if one function address plus its size less than next function address, it means there's a hole, then emit a symbol "__hole_symbol_XXXXX" there which type is 't'. After this patch, the effect is as follows: $ cat /proc/kallsyms | grep -A 3 do_one_initcall ffffffff810021e0 T do_one_initcall ffffffff8100245e t __hole_symbol_XXXXX ffffffff810024a0 t __pfx_rootfs_init_fs_context Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> --- scripts/kallsyms.c | 101 +++++++++++++++++++++++++++++++++++++++- scripts/link-vmlinux.sh | 4 +- scripts/mksysmap | 2 +- 3 files changed, 102 insertions(+), 5 deletions(-)