Message ID | 20220920071317.1787-3-thunder.leizhen@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | kallsyms: Optimizes the performance of lookup symbols |
On Tue 2022-09-20 15:13:11, Zhen Lei wrote:
> For a symbol, there may be more than one place that can be merged. For
> example: nfs_fs_proc_net_init, there are two "f"+"s_" combinations.
> And we're only compressing the first combination at the moment.

Really?

> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
> index 8caccc8f4a23703..3319d9f38d7a5f2 100644
> --- a/scripts/kallsyms.c
> +++ b/scripts/kallsyms.c
> @@ -553,7 +553,7 @@ static void compress_symbols(const unsigned char *str, int idx)
>  	unsigned char *p1, *p2;
>
>  	for (i = 0; i < table_cnt; i++) {
> -
> +retry:
>  		len = table[i]->len;
>  		p1 = table[i]->sym;
>
> @@ -585,6 +585,9 @@ static void compress_symbols(const unsigned char *str, int idx)
>
>  		/* increase the counts for this symbol's new tokens */
>  		learn_symbol(table[i]->sym, len);
> +
> +		/* May be more than one place that can be merged, try again */
> +		goto retry;
>  	}
>  }

My understanding is that the code already tries to find the same
token several times. Here are the important parts of the existing
code:

static void compress_symbols(const unsigned char *str, int idx)
{

	p2 = find_token(p1, len, str);

	do {
		/* replace the found token with idx */
		*p2 = idx;
[...]

		/* find the token on the symbol */
		p2 = find_token(p1, size, str);

	} while (p2);

Best Regards,
Petr
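The retry behaviour described in the reply above can be illustrated with a minimal, self-contained sketch. It is not the code from scripts/kallsyms.c: `compress_all` is a hypothetical helper, the token is a plain two-character string ("fs") rather than the real "f"+"s_" pair, and the replacement byte '#' is chosen only so the result is printable.

```c
#include <assert.h>
#include <string.h>

/* Return a pointer to the first occurrence of the two-byte token, or NULL. */
static unsigned char *find_token(unsigned char *str, int len,
				 const unsigned char *token)
{
	int i;

	for (i = 0; i < len - 1; i++) {
		if (str[i] == token[0] && str[i + 1] == token[1])
			return &str[i];
	}
	return NULL;
}

/*
 * Hypothetical helper: replace every occurrence of the two-byte token in
 * sym with the single byte idx, searching again after each replacement.
 * Returns the new length of the symbol.
 */
static int compress_all(unsigned char *sym, int len,
			const unsigned char *token, unsigned char idx)
{
	unsigned char *end = sym + len;
	unsigned char *p = sym;

	assert(sym && token && len >= 0);

	while ((p = find_token(p, (int)(end - p), token)) != NULL) {
		*p = idx;	/* replace the found token with idx */
		/* close the one-byte gap left by the second token byte */
		memmove(p + 1, p + 2, (size_t)(end - (p + 2)));
		end--;
		p++;		/* continue searching after the replaced byte */
	}
	return (int)(end - sym);
}
```

Calling `compress_all` on the 20-byte symbol "nfs_fs_proc_net_init" with token "fs" and idx '#' replaces both occurrences in one call, which is the property the existing do/while loop is expected to have.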
On 2022/9/21 16:00, Petr Mladek wrote:
> On Tue 2022-09-20 15:13:11, Zhen Lei wrote:
>> For a symbol, there may be more than one place that can be merged. For
>> example: nfs_fs_proc_net_init, there are two "f"+"s_" combinations.
>> And we're only compressing the first combination at the moment.
>
> Really?

Yes, there are about 200 such functions.

>
>> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
>> index 8caccc8f4a23703..3319d9f38d7a5f2 100644
>> --- a/scripts/kallsyms.c
>> +++ b/scripts/kallsyms.c
>> @@ -553,7 +553,7 @@ static void compress_symbols(const unsigned char *str, int idx)
>>  	unsigned char *p1, *p2;
>>
>>  	for (i = 0; i < table_cnt; i++) {
>> -
>> +retry:
>>  		len = table[i]->len;
>>  		p1 = table[i]->sym;
>>
>> @@ -585,6 +585,9 @@ static void compress_symbols(const unsigned char *str, int idx)
>>
>>  		/* increase the counts for this symbol's new tokens */
>>  		learn_symbol(table[i]->sym, len);
>> +
>> +		/* May be more than one place that can be merged, try again */
>> +		goto retry;
>>  	}
>>  }
>
> My understanding is that the code already tries to find the same
> token several times. Here are the important parts of the existing
> code:
>
> static void compress_symbols(const unsigned char *str, int idx)
> {
>
> 	p2 = find_token(p1, len, str);
>
> 	do {
> 		/* replace the found token with idx */
> 		*p2 = idx;
> [...]
>
> 		/* find the token on the symbol */
> 		p2 = find_token(p1, size, str);

Oh, yes, it retries. Let me reanalyze it. However, the problem is
real, and there may be a problem somewhere in the loop.

>
> 	} while (p2);
>
> Best Regards,
> Petr
> .
>
On 2022/9/21 16:31, Leizhen (ThunderTown) wrote:
>
>
> On 2022/9/21 16:00, Petr Mladek wrote:
>> On Tue 2022-09-20 15:13:11, Zhen Lei wrote:
>>> For a symbol, there may be more than one place that can be merged. For
>>> example: nfs_fs_proc_net_init, there are two "f"+"s_" combinations.
>>> And we're only compressing the first combination at the moment.
>>
>> Really?
>
> Yes, there are about 200 such functions.
>
>>
>>> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
>>> index 8caccc8f4a23703..3319d9f38d7a5f2 100644
>>> --- a/scripts/kallsyms.c
>>> +++ b/scripts/kallsyms.c
>>> @@ -553,7 +553,7 @@ static void compress_symbols(const unsigned char *str, int idx)
>>>  	unsigned char *p1, *p2;
>>>
>>>  	for (i = 0; i < table_cnt; i++) {
>>> -
>>> +retry:
>>>  		len = table[i]->len;
>>>  		p1 = table[i]->sym;
>>>
>>> @@ -585,6 +585,9 @@ static void compress_symbols(const unsigned char *str, int idx)
>>>
>>>  		/* increase the counts for this symbol's new tokens */
>>>  		learn_symbol(table[i]->sym, len);
>>> +
>>> +		/* May be more than one place that can be merged, try again */
>>> +		goto retry;
>>>  	}
>>>  }
>>
>> My understanding is that the code already tries to find the same
>> token several times. Here are the important parts of the existing
>> code:
>>
>> static void compress_symbols(const unsigned char *str, int idx)
>> {
>>
>> 	p2 = find_token(p1, len, str);
>>
>> 	do {
>> 		/* replace the found token with idx */
>> 		*p2 = idx;
>> [...]
>>
>> 		/* find the token on the symbol */
>> 		p2 = find_token(p1, size, str);
>
> Oh, yes, it retries. Let me reanalyze it. However, the problem is
> real, and there may be a problem somewhere in the loop.

Hi, Petr:
  Thanks. I found that it's my fault. The first round skips the type
character. But the next round will incorrectly skip one more character,
so for nfs_fs_proc_net_init, the next round starts from s,
            ^
and using the proposed "unsigned char type" in your next reply should
solve the problem. Thank you very much.

-	for (i = 0; i < len - 1; i++)
+	for (i = sym_start_idx; i < len - 1; i++)

>
>>
>> 	} while (p2);
>>
>> Best Regards,
>> Petr
>> .
>>
>
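The failure mode described above can be reproduced with a small sketch. Everything here is an assumption made for illustration: `find_token_from` and `compress_typed` are hypothetical helpers, the leading 'T' stands for the symbol's type character, '$' stands for the already-compressed token "s_", and '#' is the index byte that replaces "f"+"s_". The `skip_every_round` flag models a search that skips one leading byte on every round instead of only on the first.

```c
#include <assert.h>
#include <string.h>

/* Find the two-byte token in str[start..len), or return NULL. */
static unsigned char *find_token_from(unsigned char *str, int len,
				      const unsigned char *token, int start)
{
	int i;

	for (i = start; i < len - 1; i++) {
		if (str[i] == token[0] && str[i + 1] == token[1])
			return &str[i];
	}
	return NULL;
}

/*
 * Hypothetical helper: replace the token with idx throughout sym and
 * return the new length. sym[0] is assumed to be the type character.
 * skip_every_round != 0 models the bug: the one-byte skip meant for the
 * type character is applied again on every later search.
 */
static int compress_typed(unsigned char *sym, int len,
			  const unsigned char *token, unsigned char idx,
			  int skip_every_round)
{
	unsigned char *end = sym + len;
	unsigned char *p = sym;
	int start = 1;	/* skip the type character on the first round */

	while ((p = find_token_from(p, (int)(end - p), token, start)) != NULL) {
		*p = idx;
		/* close the one-byte gap left by the second token byte */
		memmove(p + 1, p + 2, (size_t)(end - (p + 2)));
		end--;
		p++;
		if (!skip_every_round)
			start = 0;	/* later rounds must not skip anything */
	}
	return (int)(end - sym);
}
```

With the 19-byte input "Tnf$f$proc_net_init" and token "f$", the buggy mode stops after the first replacement because the second search begins at '$' rather than at the second 'f'; with the skip limited to the first round, both occurrences are replaced.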
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 8caccc8f4a23703..3319d9f38d7a5f2 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -553,7 +553,7 @@ static void compress_symbols(const unsigned char *str, int idx)
 	unsigned char *p1, *p2;
 
 	for (i = 0; i < table_cnt; i++) {
-
+retry:
 		len = table[i]->len;
 		p1 = table[i]->sym;
 
@@ -585,6 +585,9 @@ static void compress_symbols(const unsigned char *str, int idx)
 
 		/* increase the counts for this symbol's new tokens */
 		learn_symbol(table[i]->sym, len);
+
+		/* May be more than one place that can be merged, try again */
+		goto retry;
 	}
 }
For a symbol, there may be more than one place that can be merged. For
example: nfs_fs_proc_net_init, there are two "f"+"s_" combinations.
And we're only compressing the first combination at the moment. Let's
compress all possible combinations.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 scripts/kallsyms.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)