Message ID | 20230111161155.1349375-1-gary@garyguo.net (mailing list archive) |
---|---|
State | New, archived |
Series | modpost: support arbitrary symbol length in modversion |
On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote:
>Currently modversion uses a fixed size array of size (64 - sizeof(long))
>to store symbol names, thus placing a hard limit on length of symbols.
>Rust symbols (which encodes crate and module names) can be quite a bit
>longer. The length limit in kallsyms is increased to 512 for this reason.
>
>It's a waste of space to simply expand the fixed array size to 512 in
>modversion info entries. I therefore make it variably sized, with offset
>to the next entry indicated by the initial "next" field.
>
>In addition to supporting longer-than-56/60 byte symbols, this patch also
>reduce the size for short symbols by getting rid of excessive 0 paddings.
>There are still some zero paddings to ensure "next" and "crc" fields are
>properly aligned.
>
>This patch does have a tiny drawback that it makes ".mod.c" files generated
>a bit less easy to read, as code like
>
>	"\x08\x00\x00\x00\x78\x56\x34\x12"
>	"symbol\0\0"
>
>is generated as opposed to
>
>	{ 0x12345678, "symbol" },
>
>because the structure is now variable-length. But hopefully nobody reads
>the generated file :)
>
>Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512")
>Link: https://github.com/Rust-for-Linux/linux/pull/379
>
>Signed-off-by: Gary Guo <gary@garyguo.net>
>---
> arch/powerpc/kernel/module_64.c |  3 ++-
> include/linux/module.h          |  6 ++++--
> kernel/module/version.c         | 21 +++++++++------------
> scripts/export_report.pl        |  9 +++++----
> scripts/mod/modpost.c           | 33 +++++++++++++++++++++++----------
> 5 files changed, 43 insertions(+), 29 deletions(-)
>
>diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
>index ff045644f13f..eac23c11d579 100644
>--- a/arch/powerpc/kernel/module_64.c
>+++ b/arch/powerpc/kernel/module_64.c
>@@ -236,10 +236,11 @@ static void dedotify_versions(struct modversion_info *vers,
> {
> 	struct modversion_info *end;
>
>-	for (end = (void *)vers + size; vers < end; vers++)
>+	for (end = (void *)vers + size; vers < end; vers = (void *)vers + vers->next) {
> 		if (vers->name[0] == '.') {
> 			memmove(vers->name, vers->name+1, strlen(vers->name));
> 		}
>+	}
> }
>
> /*
>diff --git a/include/linux/module.h b/include/linux/module.h
>index 8c5909c0076c..37cb25af9099 100644
>--- a/include/linux/module.h
>+++ b/include/linux/module.h
>@@ -34,8 +34,10 @@
> #define MODULE_NAME_LEN MAX_PARAM_PREFIX_LEN
>
> struct modversion_info {
>-	unsigned long crc;
>-	char name[MODULE_NAME_LEN];
>+	/* Offset of the next modversion entry in relation to this one. */
>+	u32 next;
>+	u32 crc;
>+	char name[0];

although not really exported as uapi, this will break userspace as this is
used in the elf file generated for the modules. I think
this change must be made in a backward compatible way and kmod updated
to deal with the variable name length:

kmod $ git grep "\[64"
libkmod/libkmod-elf.c:  char name[64 - sizeof(uint32_t)];
libkmod/libkmod-elf.c:  char name[64 - sizeof(uint64_t)];

in kmod we have both 32 and 64 because a 64-bit kmod can read both 32
and 64 bit module, and vice versa.

Lucas De Marchi

> };
>
> struct module;
>diff --git a/kernel/module/version.c b/kernel/module/version.c
>index 53f43ac5a73e..af7478dcc158 100644
>--- a/kernel/module/version.c
>+++ b/kernel/module/version.c
>@@ -17,32 +17,29 @@ int check_version(const struct load_info *info,
> {
> 	Elf_Shdr *sechdrs = info->sechdrs;
> 	unsigned int versindex = info->index.vers;
>-	unsigned int i, num_versions;
>-	struct modversion_info *versions;
>+	struct modversion_info *versions, *end;
>+	u32 crcval;
>
> 	/* Exporting module didn't supply crcs? OK, we're already tainted. */
> 	if (!crc)
> 		return 1;
>+	crcval = *crc;
>
> 	/* No versions at all? modprobe --force does this. */
> 	if (versindex == 0)
> 		return try_to_force_load(mod, symname) == 0;
>
> 	versions = (void *)sechdrs[versindex].sh_addr;
>-	num_versions = sechdrs[versindex].sh_size
>-		/ sizeof(struct modversion_info);
>+	end = (void *)versions + sechdrs[versindex].sh_size;
>
>-	for (i = 0; i < num_versions; i++) {
>-		u32 crcval;
>-
>-		if (strcmp(versions[i].name, symname) != 0)
>+	for (; versions < end; versions = (void *)versions + versions->next) {
>+		if (strcmp(versions->name, symname) != 0)
> 			continue;
>
>-		crcval = *crc;
>-		if (versions[i].crc == crcval)
>+		if (versions->crc == crcval)
> 			return 1;
>-		pr_debug("Found checksum %X vs module %lX\n",
>-			 crcval, versions[i].crc);
>+		pr_debug("Found checksum %X vs module %X\n",
>+			 crcval, versions->crc);
> 		goto bad_version;
> 	}
>
>diff --git a/scripts/export_report.pl b/scripts/export_report.pl
>index feb3d5542a62..1117646f3141 100755
>--- a/scripts/export_report.pl
>+++ b/scripts/export_report.pl
>@@ -116,18 +116,19 @@ foreach my $thismod (@allcfiles) {
> 	while ( <$module> ) {
> 		chomp;
> 		if ($state == 0) {
>-			$state = 1 if ($_ =~ /static const struct modversion_info/);
>+			$state = 1 if ($_ =~ /static const char ____versions/);
> 			next;
> 		}
> 		if ($state == 1) {
>-			$state = 2 if ($_ =~ /__attribute__\(\(section\("__versions"\)\)\)/);
>+			$state = 2 if ($_ =~ /__used __section\("__versions"\)/);
> 			next;
> 		}
> 		if ($state == 2) {
>-			if ( $_ !~ /0x[0-9a-f]+,/ ) {
>+			if ( $_ !~ /\\0"/ ) {
>+				last if ($_ =~ /;/);
> 				next;
> 			}
>-			my $sym = (split /([,"])/,)[4];
>+			my $sym = (split /(["\\])/,)[2];
> 			my ($module, $value, $symbol, $gpl) = @{$SYMBOL{$sym}};
> 			$SYMBOL{ $sym } = [ $module, $value+1, $symbol, $gpl];
> 			push(@{$MODULE{$thismod}} , $sym);
>diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
>index efff8078e395..334d170de31f 100644
>--- a/scripts/mod/modpost.c
>+++ b/scripts/mod/modpost.c
>@@ -2046,13 +2046,17 @@ static void add_exported_symbols(struct buffer *buf, struct module *mod)
> static void add_versions(struct buffer *b, struct module *mod)
> {
> 	struct symbol *s;
>+	unsigned int name_len;
>+	unsigned int name_len_padded;
>+	unsigned int tmp;
>+	unsigned char *tmp_view = (unsigned char *)&tmp;
>
> 	if (!modversions)
> 		return;
>
> 	buf_printf(b, "\n");
>-	buf_printf(b, "static const struct modversion_info ____versions[]\n");
>-	buf_printf(b, "__used __section(\"__versions\") = {\n");
>+	buf_printf(b, "static const char ____versions[]\n");
>+	buf_printf(b, "__used __section(\"__versions\") =\n");
>
> 	list_for_each_entry(s, &mod->unresolved_symbols, list) {
> 		if (!s->module)
>@@ -2062,16 +2066,25 @@ static void add_versions(struct buffer *b, struct module *mod)
> 				s->name, mod->name);
> 			continue;
> 		}
>-		if (strlen(s->name) >= MODULE_NAME_LEN) {
>-			error("too long symbol \"%s\" [%s.ko]\n",
>-			      s->name, mod->name);
>-			break;
>-		}
>-		buf_printf(b, "\t{ %#8x, \"%s\" },\n",
>-			   s->crc, s->name);
>+		name_len = strlen(s->name);
>+		name_len_padded = (name_len + 1 + 3) & ~3;
>+
>+		/* Offset to next entry */
>+		tmp = TO_NATIVE(8 + name_len_padded);
>+		buf_printf(b, "\t\"\\x%02x\\x%02x\\x%02x\\x%02x",
>+			   tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]);
>+
>+		tmp = TO_NATIVE(s->crc);
>+		buf_printf(b, "\\x%02x\\x%02x\\x%02x\\x%02x\"\n",
>+			   tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]);
>+
>+		buf_printf(b, "\t\"%s", s->name);
>+		for (; name_len < name_len_padded; name_len++)
>+			buf_printf(b, "\\0");
>+		buf_printf(b, "\"\n");
> 	}
>
>-	buf_printf(b, "};\n");
>+	buf_printf(b, ";\n");
> }
>
> static void add_depends(struct buffer *b, struct module *mod)
>--
>2.34.1
>
On Thu, 12 Jan 2023 14:40:59 -0700
Lucas De Marchi <lucas.demarchi@intel.com> wrote:

> On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote:
> >
> > struct modversion_info {
> >-	unsigned long crc;
> >-	char name[MODULE_NAME_LEN];
> >+	/* Offset of the next modversion entry in relation to this one. */
> >+	u32 next;
> >+	u32 crc;
> >+	char name[0];
>
> although not really exported as uapi, this will break userspace as this is
> used in the elf file generated for the modules. I think
> this change must be made in a backward compatible way and kmod updated
> to deal with the variable name length:
>
> kmod $ git grep "\[64"
> libkmod/libkmod-elf.c:  char name[64 - sizeof(uint32_t)];
> libkmod/libkmod-elf.c:  char name[64 - sizeof(uint64_t)];
>
> in kmod we have both 32 and 64 because a 64-bit kmod can read both 32
> and 64 bit module, and vice versa.
>

Hi Lucas,

Thanks for the information.

The change can't be "truly" backward compatible, in the sense that
regardless of the new format we choose, kmod would not be able to decode
symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves
is going to be incomplete, isn't it?

What kind of backward compatibility should be expected? It could be:
* short symbols can still be found by old versions of kmod, but not
  long symbols;
* or, no symbols are found by old versions of kmod, but it does not
  fail;
* or, old versions of kmod fail gracefully because they cannot
  recognise the format of the __versions section, without doing
  anything crazy (e.g. decoding it as the old format).

Also, do you think the current modversion format should stick forever,
or would we be able to migrate away from it eventually and fail old
versions of modprobe given enough time?

Best,
Gary
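For readers following the format discussion, the variable-length layout from the patch (a u32 "next" offset, a u32 crc, then the NUL-terminated name padded to 4 bytes) can be sketched in plain userspace C as below. This is only an illustration under stated assumptions: `append_entry()` and `find_crc()` are hypothetical helpers that exist neither in the patch, in modpost, nor in kmod, and the sketch uses host byte order throughout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Entry header is two u32 fields: "next" and "crc". */
#define ENTRY_HDR 8u

/* Same rounding rule as the modpost hunk: name + NUL, rounded up to 4. */
static size_t padded_name_len(const char *name)
{
	return (strlen(name) + 1 + 3) & ~(size_t)3;
}

/* Hypothetical helper: append one entry at offset off, return the new offset. */
static size_t append_entry(unsigned char *buf, size_t off, uint32_t crc,
			   const char *name)
{
	uint32_t next;

	assert(buf != NULL && name != NULL);
	next = (uint32_t)(ENTRY_HDR + padded_name_len(name));
	memset(buf + off, 0, next);              /* zero padding after the name */
	memcpy(buf + off, &next, sizeof(next));  /* offset to the next entry */
	memcpy(buf + off + 4, &crc, sizeof(crc));
	memcpy(buf + off + ENTRY_HDR, name, strlen(name));
	return off + next;
}

/*
 * Hypothetical helper: walk the entries by following "next".
 * Returns 1 and stores the CRC if sym is found, 0 otherwise
 * (including when an entry looks malformed).
 */
static int find_crc(const unsigned char *buf, size_t size, const char *sym,
		    uint32_t *crc_out)
{
	size_t off = 0;

	while (size - off >= ENTRY_HDR) {
		uint32_t next, crc;

		memcpy(&next, buf + off, sizeof(next));
		memcpy(&crc, buf + off + 4, sizeof(crc));

		/* Reject offsets that do not advance or that overrun the section. */
		if (next <= ENTRY_HDR || next > size - off)
			return 0;
		/* The name must be NUL-terminated inside its own entry. */
		if (memchr(buf + off + ENTRY_HDR, '\0', next - ENTRY_HDR) == NULL)
			return 0;

		if (strcmp((const char *)buf + off + ENTRY_HDR, sym) == 0) {
			*crc_out = crc;
			return 1;
		}
		off += next;
	}
	return 0;
}
```

Unlike the kernel loop in the patch, this reader validates "next" before following it, which is the kind of check a userspace tool would probably want when parsing a section of unknown origin.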
Hello, On Fri, Jan 13, 2023 at 06:18:41PM +0000, Gary Guo wrote: > On Thu, 12 Jan 2023 14:40:59 -0700 > Lucas De Marchi <lucas.demarchi@intel.com> wrote: > > > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > > > > > > struct modversion_info { > > >- unsigned long crc; > > >- char name[MODULE_NAME_LEN]; > > >+ /* Offset of the next modversion entry in relation to this one. */ > > >+ u32 next; > > >+ u32 crc; > > >+ char name[0]; > > > > although not really exported as uapi, this will break userspace as this is > > used in the elf file generated for the modules. I think > > this change must be made in a backward compatible way and kmod updated > > to deal with the variable name length: > > > > kmod $ git grep "\[64" > > libkmod/libkmod-elf.c: char name[64 - sizeof(uint32_t)]; > > libkmod/libkmod-elf.c: char name[64 - sizeof(uint64_t)]; > > > > in kmod we have both 32 and 64 because a 64-bit kmod can read both 32 > > and 64 bit module, and vice versa. > > > > Hi Lucas, > > Thanks for the information. > > The change can't be "truly" backward compatible, in a sense that > regardless of the new format we choose, kmod would not be able to decode > symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves > is going to be incomplete, isn't it? > > What kind of backward compatibility should be expected? It could be: > * short symbols can still be found by old versions of kmod, but not > long symbols; That sounds good. Not everyone is using rust, and with this option people who do will need to upgrade tooling, and people who don't care don't need to do anything. Thanks Michal
On Tue, Jan 17, 2023 at 06:51:44PM +0100, Michal Suchánek wrote: >Hello, > >On Fri, Jan 13, 2023 at 06:18:41PM +0000, Gary Guo wrote: >> On Thu, 12 Jan 2023 14:40:59 -0700 >> Lucas De Marchi <lucas.demarchi@intel.com> wrote: >> >> > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: >> > > >> > > struct modversion_info { >> > >- unsigned long crc; >> > >- char name[MODULE_NAME_LEN]; >> > >+ /* Offset of the next modversion entry in relation to this one. */ >> > >+ u32 next; >> > >+ u32 crc; >> > >+ char name[0]; >> > >> > although not really exported as uapi, this will break userspace as this is >> > used in the elf file generated for the modules. I think >> > this change must be made in a backward compatible way and kmod updated >> > to deal with the variable name length: >> > >> > kmod $ git grep "\[64" >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint32_t)]; >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint64_t)]; >> > >> > in kmod we have both 32 and 64 because a 64-bit kmod can read both 32 >> > and 64 bit module, and vice versa. >> > >> >> Hi Lucas, >> >> Thanks for the information. >> >> The change can't be "truly" backward compatible, in a sense that >> regardless of the new format we choose, kmod would not be able to decode >> symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves >> is going to be incomplete, isn't it? >> >> What kind of backward compatibility should be expected? It could be: >> * short symbols can still be found by old versions of kmod, but not >> long symbols; > >That sounds good. Not everyone is using rust, and with this option >people who do will need to upgrade tooling, and people who don't care >don't need to do anything. that could be it indeed. My main worry here is: "After the support is added in kmod, kmod needs to be able to output the correct information regardless if the module is from before/after the change in the kernel and also without relying on kernel version." 
Just changing the struct modversion_info doesn't make that possible.

Maybe adding the long symbols in another section? Or maybe
just increase to 512 and add the size to a
"__versions_hdr" section. If we then output a max size per module,
this would offset a little bit the additional size gained for the
modules using rust. And the additional 0's should compress well
so I'm not sure the additional size is that much relevant here.

Lucas De Marchi

>
>Thanks
>
>Michal
On Wed, Jan 18, 2023 at 4:23 AM Lucas De Marchi <lucas.demarchi@intel.com> wrote: > > On Tue, Jan 17, 2023 at 06:51:44PM +0100, Michal Suchánek wrote: > >Hello, > > > >On Fri, Jan 13, 2023 at 06:18:41PM +0000, Gary Guo wrote: > >> On Thu, 12 Jan 2023 14:40:59 -0700 > >> Lucas De Marchi <lucas.demarchi@intel.com> wrote: > >> > >> > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > >> > > > >> > > struct modversion_info { > >> > >- unsigned long crc; > >> > >- char name[MODULE_NAME_LEN]; > >> > >+ /* Offset of the next modversion entry in relation to this one. */ > >> > >+ u32 next; > >> > >+ u32 crc; > >> > >+ char name[0]; > >> > > >> > although not really exported as uapi, this will break userspace as this is > >> > used in the elf file generated for the modules. I think > >> > this change must be made in a backward compatible way and kmod updated > >> > to deal with the variable name length: > >> > > >> > kmod $ git grep "\[64" > >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint32_t)]; > >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint64_t)]; > >> > > >> > in kmod we have both 32 and 64 because a 64-bit kmod can read both 32 > >> > and 64 bit module, and vice versa. > >> > > >> > >> Hi Lucas, > >> > >> Thanks for the information. > >> > >> The change can't be "truly" backward compatible, in a sense that > >> regardless of the new format we choose, kmod would not be able to decode > >> symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves > >> is going to be incomplete, isn't it? > >> > >> What kind of backward compatibility should be expected? It could be: > >> * short symbols can still be found by old versions of kmod, but not > >> long symbols; > > > >That sounds good. Not everyone is using rust, and with this option > >people who do will need to upgrade tooling, and people who don't care > >don't need to do anything. > > that could be it indeed. 
My main worry here is:
>
> "After the support is added in kmod, kmod needs to be able to output the
> correct information regardless if the module is from before/after the
> change in the kernel and also without relying on kernel version."
> Just changing the struct modversion_info doesn't make that possible.
>
> Maybe adding the long symbols in another section? Or maybe
> just increase to 512 and add the size to a
> "__versions_hdr" section. If we then output a max size per module,
> this would offset a little bit the additional size gained for the
> modules using rust. And the additional 0's should compress well
> so I'm not sure the additional size is that much relevant here.

I also thought of new section(s) for long symbols.

One idea is to have separate sections for CRCs and symbol names.

section __version_crc:
   0x12345678
   0x23456789
   0x34567890

section __version_sym:
   "very_very_very_very_long_symbol"
   "another_very_very_very_very_very_long_symbol"
   "yet_another_very_very_very_very_very_long_symbol"

You can iterate in each section with this:

   crc += sizeof(u32);
   name += strlen(name) + 1;

Benefits:
 - No next pointer
 - No padding
 - *.mod.c is kept human readable.

BTW, the following is impossible
because the pointer reference to .rodata
is not available at this point?

struct modversion_info {
    u32 crc;
    const char *name;
};
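The two-section idea above can be sketched as a lookup that walks both sections in lockstep. This is a hedged illustration only: the function name `find_crc_split()` and its parameters are invented for this example, and the section contents are passed in as plain buffers rather than read from an ELF file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: crcs points at the CRC section (an array of u32),
 * names at the symbol section (NUL-terminated strings laid end to end).
 * The i-th name pairs with the i-th CRC. Returns 1 and stores the CRC
 * if sym is found, 0 otherwise.
 */
static int find_crc_split(const uint32_t *crcs, size_t crc_bytes,
			  const char *names, size_t name_bytes,
			  const char *sym, uint32_t *crc_out)
{
	size_t count = crc_bytes / sizeof(uint32_t);
	const char *name = names;
	const char *end = names + name_bytes;
	size_t i;

	assert(sym != NULL && crc_out != NULL);

	for (i = 0; i < count && name < end; i++) {
		/* Refuse to read past the section if a terminator is missing. */
		const char *nul = memchr(name, '\0', (size_t)(end - name));

		if (nul == NULL)
			return 0;
		if (strcmp(name, sym) == 0) {
			*crc_out = crcs[i];
			return 1;
		}
		name = nul + 1;	/* same step as "name += strlen(name) + 1" */
	}
	return 0;
}
```

Because the CRC pointer here is a `const uint32_t *`, advancing by one element corresponds to the `crc += sizeof(u32)` step on a byte pointer in the description above.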
On Tue, 17 Jan 2023 11:22:45 -0800 Lucas De Marchi <lucas.demarchi@intel.com> wrote: > On Tue, Jan 17, 2023 at 06:51:44PM +0100, Michal Suchánek wrote: > >Hello, > > > >On Fri, Jan 13, 2023 at 06:18:41PM +0000, Gary Guo wrote: > >> On Thu, 12 Jan 2023 14:40:59 -0700 > >> Lucas De Marchi <lucas.demarchi@intel.com> wrote: > >> > >> > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > >> > > > >> > > struct modversion_info { > >> > >- unsigned long crc; > >> > >- char name[MODULE_NAME_LEN]; > >> > >+ /* Offset of the next modversion entry in relation to this one. */ > >> > >+ u32 next; > >> > >+ u32 crc; > >> > >+ char name[0]; > >> > > >> > although not really exported as uapi, this will break userspace as this is > >> > used in the elf file generated for the modules. I think > >> > this change must be made in a backward compatible way and kmod updated > >> > to deal with the variable name length: > >> > > >> > kmod $ git grep "\[64" > >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint32_t)]; > >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint64_t)]; > >> > > >> > in kmod we have both 32 and 64 because a 64-bit kmod can read both 32 > >> > and 64 bit module, and vice versa. > >> > > >> > >> Hi Lucas, > >> > >> Thanks for the information. > >> > >> The change can't be "truly" backward compatible, in a sense that > >> regardless of the new format we choose, kmod would not be able to decode > >> symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves > >> is going to be incomplete, isn't it? > >> > >> What kind of backward compatibility should be expected? It could be: > >> * short symbols can still be found by old versions of kmod, but not > >> long symbols; > > > >That sounds good. Not everyone is using rust, and with this option > >people who do will need to upgrade tooling, and people who don't care > >don't need to do anything. > > that could be it indeed. 
My main worry here is:
>
> "After the support is added in kmod, kmod needs to be able to output the
> correct information regardless if the module is from before/after the
> change in the kernel and also without relying on kernel version."
> Just changing the struct modversion_info doesn't make that possible.
>
> Maybe adding the long symbols in another section?

Yeah, that's how I imagined it could be implemented when I said
"short symbols can still be found by old versions of kmod, but not long
symbols".

> Or maybe just increase to 512 and add the size to a
> "__versions_hdr" section. If we then output a max size per module,
> this would offset a little bit the additional size gained for the
> modules using rust.

That format isn't really elegant IMO. And symbol length can vary a lot;
having the size of all symbols dictated by the longest symbol doesn't
sound like a good approach.

> And the additional 0's should compress well
> so I'm not sure the additional size is that much relevant here.

I am not sure why compression is mentioned here. I don't think sections
in .ko files are compressed.

(Sorry, I forgot to reply-all; re-sending this email to the list.)

Best,
Gary
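To put rough numbers on the size argument, here is a small back-of-the-envelope sketch. Everything in it is an assumption made for illustration: the per-module-maximum layout is modelled as a u32 crc followed by a name field sized to the longest padded name in the module (the "__versions_hdr" section itself is ignored), and the variable-length layout follows the patch (8-byte header plus the name padded to 4 bytes). Neither helper comes from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Round up to a multiple of 4, as the patch does for name padding. */
static size_t pad4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Variable-length layout from the patch: 8-byte header + padded name. */
static size_t size_variable(const size_t *lens, size_t n)
{
	size_t total = 0, i;

	assert(lens != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += 8 + pad4(lens[i] + 1);
	return total;
}

/*
 * Assumed model of the "max size per module" idea: every entry is a
 * u32 crc plus a name field as wide as the longest padded name.
 */
static size_t size_per_module_max(const size_t *lens, size_t n)
{
	size_t widest = 0, i;

	assert(lens != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (pad4(lens[i] + 1) > widest)
			widest = pad4(lens[i] + 1);
	return n * (4 + widest);
}
```

Under these assumptions, three symbols of length 6, 10 and 200 take 248 bytes in the variable layout and 624 bytes with a per-module maximum, which is the effect of one long symbol dictating the size of all entries. With only the two short symbols the per-module maximum is marginally smaller (32 versus 36 bytes), since it carries no "next" field.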
On Thu, Jan 19, 2023 at 03:09:36PM +0000, Gary Guo wrote: > On Tue, 17 Jan 2023 11:22:45 -0800 > Lucas De Marchi <lucas.demarchi@intel.com> wrote: > > > On Tue, Jan 17, 2023 at 06:51:44PM +0100, Michal Suchánek wrote: > > >Hello, > > > > > >On Fri, Jan 13, 2023 at 06:18:41PM +0000, Gary Guo wrote: > > >> On Thu, 12 Jan 2023 14:40:59 -0700 > > >> Lucas De Marchi <lucas.demarchi@intel.com> wrote: > > >> > > >> > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > > >> > > > > >> > > struct modversion_info { > > >> > >- unsigned long crc; > > >> > >- char name[MODULE_NAME_LEN]; > > >> > >+ /* Offset of the next modversion entry in relation to this one. */ > > >> > >+ u32 next; > > >> > >+ u32 crc; > > >> > >+ char name[0]; > > >> > > > >> > although not really exported as uapi, this will break userspace as this is > > >> > used in the elf file generated for the modules. I think > > >> > this change must be made in a backward compatible way and kmod updated > > >> > to deal with the variable name length: > > >> > > > >> > kmod $ git grep "\[64" > > >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint32_t)]; > > >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint64_t)]; > > >> > > > >> > in kmod we have both 32 and 64 because a 64-bit kmod can read both 32 > > >> > and 64 bit module, and vice versa. > > >> > > > >> > > >> Hi Lucas, > > >> > > >> Thanks for the information. > > >> > > >> The change can't be "truly" backward compatible, in a sense that > > >> regardless of the new format we choose, kmod would not be able to decode > > >> symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves > > >> is going to be incomplete, isn't it? > > >> > > >> What kind of backward compatibility should be expected? It could be: > > >> * short symbols can still be found by old versions of kmod, but not > > >> long symbols; > > > > > >That sounds good. 
Not everyone is using rust, and with this option > > >people who do will need to upgrade tooling, and people who don't care > > >don't need to do anything. > > > > that could be it indeed. My main worry here is: > > > > "After the support is added in kmod, kmod needs to be able to output the > > correct information regardless if the module is from before/after the > > change in the kernel and also without relying on kernel version." > > Just changing the struct modversion_info doesn't make that possible. > > > > Maybe adding the long symbols in another section? > > Yeah, that's what I imagined how it could be implemented when I said > "short symbols can still be found by old versions of kmod, but not long > symbols". > > > Or ble just increase to 512 and add the size to a > > "__versions_hdr" section. If we then output a max size per module, > > this would offset a little bit the additional size gained for the > > modules using rust. > > That format isn't really elegant IMO. And symbol length can vary a lot, > having all symbols dictated by the longest symbol doesn't sound a good > approach. > > > And the additional 0's should compress well > > so I'm not sure the additional size is that much relevant here. > > I am not sure why compression is mentioned here. I don't think section > in .ko files are compressed. There is the option to compress the whole .ko files, and it's commonly used. Thanks Michal
On Thu, 19 Jan 2023 16:18:57 +0100
Michal Suchánek <msuchanek@suse.de> wrote:

> On Thu, Jan 19, 2023 at 03:09:36PM +0000, Gary Guo wrote:
> > On Tue, 17 Jan 2023 11:22:45 -0800
> > Lucas De Marchi <lucas.demarchi@intel.com> wrote:
> >
> > > And the additional 0's should compress well
> > > so I'm not sure the additional size is that much relevant here.
> >
> > I am not sure why compression is mentioned here. I don't think section
> > in .ko files are compressed.
>
> There is the option to compress the whole .ko files, and it's commonly
> used.

Hi Michal,

I am aware that there is an option but I am surprised to hear that it's
commonly used. I don't think that's enabled by default, and certainly
Debian/Ubuntu does not have it enabled.

Best,
Gary
On Wed, Jan 18, 2023 at 04:01:29PM +0900, Masahiro Yamada wrote: >On Wed, Jan 18, 2023 at 4:23 AM Lucas De Marchi ><lucas.demarchi@intel.com> wrote: >> >> On Tue, Jan 17, 2023 at 06:51:44PM +0100, Michal Suchánek wrote: >> >Hello, >> > >> >On Fri, Jan 13, 2023 at 06:18:41PM +0000, Gary Guo wrote: >> >> On Thu, 12 Jan 2023 14:40:59 -0700 >> >> Lucas De Marchi <lucas.demarchi@intel.com> wrote: >> >> >> >> > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: >> >> > > >> >> > > struct modversion_info { >> >> > >- unsigned long crc; >> >> > >- char name[MODULE_NAME_LEN]; >> >> > >+ /* Offset of the next modversion entry in relation to this one. */ >> >> > >+ u32 next; >> >> > >+ u32 crc; >> >> > >+ char name[0]; >> >> > >> >> > although not really exported as uapi, this will break userspace as this is >> >> > used in the elf file generated for the modules. I think >> >> > this change must be made in a backward compatible way and kmod updated >> >> > to deal with the variable name length: >> >> > >> >> > kmod $ git grep "\[64" >> >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint32_t)]; >> >> > libkmod/libkmod-elf.c: char name[64 - sizeof(uint64_t)]; >> >> > >> >> > in kmod we have both 32 and 64 because a 64-bit kmod can read both 32 >> >> > and 64 bit module, and vice versa. >> >> > >> >> >> >> Hi Lucas, >> >> >> >> Thanks for the information. >> >> >> >> The change can't be "truly" backward compatible, in a sense that >> >> regardless of the new format we choose, kmod would not be able to decode >> >> symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves >> >> is going to be incomplete, isn't it? >> >> >> >> What kind of backward compatibility should be expected? It could be: >> >> * short symbols can still be found by old versions of kmod, but not >> >> long symbols; >> > >> >That sounds good. 
Not everyone is using rust, and with this option
>> >people who do will need to upgrade tooling, and people who don't care
>> >don't need to do anything.
>>
>> that could be it indeed. My main worry here is:
>>
>> "After the support is added in kmod, kmod needs to be able to output the
>> correct information regardless if the module is from before/after the
>> change in the kernel and also without relying on kernel version."
>> Just changing the struct modversion_info doesn't make that possible.
>>
>> Maybe adding the long symbols in another section? Or maybe
>> just increase to 512 and add the size to a
>> "__versions_hdr" section. If we then output a max size per module,
>> this would offset a little bit the additional size gained for the
>> modules using rust. And the additional 0's should compress well
>> so I'm not sure the additional size is that much relevant here.
>
>I also thought of new section(s) for long symbols.
>
>One idea is to have separate sections for CRCs and symbol names.
>
>section __version_crc:
>   0x12345678
>   0x23456789
>   0x34567890
>
>section __version_sym:
>   "very_very_very_very_long_symbol"
>   "another_very_very_very_very_very_long_symbol"
>   "yet_another_very_very_very_very_very_long_symbol"
>
>You can iterate in each section with this:
>
>   crc += sizeof(u32);
>   name += strlen(name) + 1;
>
>Benefits:
> - No next pointer
> - No padding
> - *.mod.c is kept human readable.

I like this option. It would be better than the current one.

Lucas De Marchi

>
>BTW, the following is impossible
>because the pointer reference to .rodata
>is not available at this point?
>
>struct modversion_info {
>    u32 crc;
>    const char *name;
>};
>
>--
>Best Regards
>Masahiro Yamada
On Wed, Jan 18, 2023 at 8:02 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> - *.mod.c is kept human readable.

On the topic of `.mod.c` readability: for approaches that may be less
readable, we could improve that by adding some extra comments or
rearranging things in a different way (it is a generated file, after all!).

For instance, for the original approach:

    https://godbolt.org/z/6oh45axnc

Cheers,
Miguel
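One way the "extra comments" idea could look is sketched below. This is not the content of the godbolt link and not modpost code; `format_entry()` is a hypothetical helper that renders a single entry of the original variable-length format as C source text, preceded by a comment with the decoded values. It assumes a little-endian target so that the output is deterministic, and it does not escape the name, on the assumption that symbol names contain no quotes or backslashes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write one entry as C source text into out
 * (capacity cap). Returns the number of characters written, or -1 if
 * the text does not fit.
 */
static int format_entry(char *out, size_t cap, uint32_t crc, const char *name)
{
	size_t len, padded, used;
	unsigned int next, c;
	int n;

	assert(out != NULL && name != NULL);
	len = strlen(name);
	padded = (len + 1 + 3) & ~(size_t)3;	/* same rule as the patch */
	next = (unsigned int)(8 + padded);
	c = (unsigned int)crc;

	/* Comment line, then "next" and "crc" as little-endian byte escapes. */
	n = snprintf(out, cap,
		     "\t/* next=%u crc=0x%08x */\n"
		     "\t\"\\x%02x\\x%02x\\x%02x\\x%02x"
		     "\\x%02x\\x%02x\\x%02x\\x%02x\"\n"
		     "\t\"%s",
		     next, c,
		     next & 0xff, (next >> 8) & 0xff,
		     (next >> 16) & 0xff, (next >> 24) & 0xff,
		     c & 0xff, (c >> 8) & 0xff,
		     (c >> 16) & 0xff, (c >> 24) & 0xff,
		     name);
	if (n < 0 || (size_t)n >= cap)
		return -1;
	used = (size_t)n;

	/* Explicit \0 escapes for the terminator and the alignment padding. */
	for (; len < padded; len++) {
		if (cap - used < 3)
			return -1;
		memcpy(out + used, "\\0", 2);
		used += 2;
	}
	if (cap - used < 3)
		return -1;
	memcpy(out + used, "\"\n", 3);	/* also copies the terminating NUL */
	used += 2;
	return (int)used;
}
```

The byte strings stay exactly as the compiler needs them, while the comment line carries the human-readable form of the entry.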
On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > Currently modversion uses a fixed size array of size (64 - sizeof(long)) > to store symbol names, thus placing a hard limit on length of symbols. > Rust symbols (which encodes crate and module names) can be quite a bit > longer. The length limit in kallsyms is increased to 512 for this reason. > > It's a waste of space to simply expand the fixed array size to 512 in > modversion info entries. I therefore make it variably sized, with offset > to the next entry indicated by the initial "next" field. > > In addition to supporting longer-than-56/60 byte symbols, this patch also > reduce the size for short symbols by getting rid of excessive 0 paddings. > There are still some zero paddings to ensure "next" and "crc" fields are > properly aligned. > > This patch does have a tiny drawback that it makes ".mod.c" files generated > a bit less easy to read, as code like > > "\x08\x00\x00\x00\x78\x56\x34\x12" > "symbol\0\0" > > is generated as opposed to > > { 0x12345678, "symbol" }, > > because the structure is now variable-length. But hopefully nobody reads > the generated file :) > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512") > Link: https://github.com/Rust-for-Linux/linux/pull/379 > > Signed-off-by: Gary Guo <gary@garyguo.net> Is there any newer version of this patch? I'm doing some tests with it, but I'm getting boot failures on ppc64 with this applied (at boot kernel is spitting out lots of oops'es and unfortunately it's really hard to copy paste or just read them from the console). Thanks, -Andrea
Hello, On Mon, Mar 13, 2023 at 09:32:16PM +0100, Andrea Righi wrote: > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > > Currently modversion uses a fixed size array of size (64 - sizeof(long)) > > to store symbol names, thus placing a hard limit on length of symbols. > > Rust symbols (which encodes crate and module names) can be quite a bit > > longer. The length limit in kallsyms is increased to 512 for this reason. > > > > It's a waste of space to simply expand the fixed array size to 512 in > > modversion info entries. I therefore make it variably sized, with offset > > to the next entry indicated by the initial "next" field. > > > > In addition to supporting longer-than-56/60 byte symbols, this patch also > > reduce the size for short symbols by getting rid of excessive 0 paddings. > > There are still some zero paddings to ensure "next" and "crc" fields are > > properly aligned. > > > > This patch does have a tiny drawback that it makes ".mod.c" files generated > > a bit less easy to read, as code like > > > > "\x08\x00\x00\x00\x78\x56\x34\x12" > > "symbol\0\0" > > > > is generated as opposed to > > > > { 0x12345678, "symbol" }, > > > > because the structure is now variable-length. But hopefully nobody reads > > the generated file :) > > > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512") > > Link: https://github.com/Rust-for-Linux/linux/pull/379 > > > > Signed-off-by: Gary Guo <gary@garyguo.net> > > Is there any newer version of this patch? > > I'm doing some tests with it, but I'm getting boot failures on ppc64 > with this applied (at boot kernel is spitting out lots of oops'es and > unfortunately it's really hard to copy paste or just read them from the > console). Are you using the ELF ABI v1 or v2? v1 may have some additional issues when it comes to these symbol tables. Thanks Michal
On Mon, Mar 13, 2023 at 10:48:53PM +0100, Michal Suchánek wrote:
> Hello,
>
> On Mon, Mar 13, 2023 at 09:32:16PM +0100, Andrea Righi wrote:
> > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote:
> > > Currently modversion uses a fixed size array of size (64 - sizeof(long))
> > > to store symbol names, thus placing a hard limit on length of symbols.
> > > Rust symbols (which encodes crate and module names) can be quite a bit
> > > longer. The length limit in kallsyms is increased to 512 for this reason.
> > >
> > > It's a waste of space to simply expand the fixed array size to 512 in
> > > modversion info entries. I therefore make it variably sized, with offset
> > > to the next entry indicated by the initial "next" field.
> > >
> > > In addition to supporting longer-than-56/60 byte symbols, this patch also
> > > reduce the size for short symbols by getting rid of excessive 0 paddings.
> > > There are still some zero paddings to ensure "next" and "crc" fields are
> > > properly aligned.
> > >
> > > This patch does have a tiny drawback that it makes ".mod.c" files generated
> > > a bit less easy to read, as code like
> > >
> > >	"\x08\x00\x00\x00\x78\x56\x34\x12"
> > >	"symbol\0\0"
> > >
> > > is generated as opposed to
> > >
> > >	{ 0x12345678, "symbol" },
> > >
> > > because the structure is now variable-length. But hopefully nobody reads
> > > the generated file :)
> > >
> > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512")
> > > Link: https://github.com/Rust-for-Linux/linux/pull/379
> > >
> > > Signed-off-by: Gary Guo <gary@garyguo.net>
> >
> > Is there any newer version of this patch?
> >
> > I'm doing some tests with it, but I'm getting boot failures on ppc64
> > with this applied (at boot kernel is spitting out lots of oops'es and
> > unfortunately it's really hard to copy paste or just read them from the
> > console).
>
> Are you using the ELF ABI v1 or v2?
>
> v1 may have some additional issues when it comes to these symbol tables.
>
> Thanks
>
> Michal

I have CONFIG_PPC64_ELF_ABI_V2=y in my .config, so I guess I'm using v2.

BTW, the issue seems to be in dedotify_versions(): as a silly test I
commented out the body of this function so that it is a no-op, and now my
system boots fine (but I guess I'm probably breaking something else).

Thanks,
-Andrea
On Mon, Mar 13, 2023 at 10:53:34PM +0100, Andrea Righi wrote: > On Mon, Mar 13, 2023 at 10:48:53PM +0100, Michal Suchánek wrote: > > Hello, > > > > On Mon, Mar 13, 2023 at 09:32:16PM +0100, Andrea Righi wrote: > > > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > > > > Currently modversion uses a fixed size array of size (64 - sizeof(long)) > > > > to store symbol names, thus placing a hard limit on length of symbols. > > > > Rust symbols (which encodes crate and module names) can be quite a bit > > > > longer. The length limit in kallsyms is increased to 512 for this reason. > > > > > > > > It's a waste of space to simply expand the fixed array size to 512 in > > > > modversion info entries. I therefore make it variably sized, with offset > > > > to the next entry indicated by the initial "next" field. > > > > > > > > In addition to supporting longer-than-56/60 byte symbols, this patch also > > > > reduce the size for short symbols by getting rid of excessive 0 paddings. > > > > There are still some zero paddings to ensure "next" and "crc" fields are > > > > properly aligned. > > > > > > > > This patch does have a tiny drawback that it makes ".mod.c" files generated > > > > a bit less easy to read, as code like > > > > > > > > "\x08\x00\x00\x00\x78\x56\x34\x12" > > > > "symbol\0\0" > > > > > > > > is generated as opposed to > > > > > > > > { 0x12345678, "symbol" }, > > > > > > > > because the structure is now variable-length. But hopefully nobody reads > > > > the generated file :) > > > > > > > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512") > > > > Link: https://github.com/Rust-for-Linux/linux/pull/379 > > > > > > > > Signed-off-by: Gary Guo <gary@garyguo.net> > > > > > > Is there any newer version of this patch? 
> > > > > > I'm doing some tests with it, but I'm getting boot failures on ppc64 > > > with this applied (at boot kernel is spitting out lots of oops'es and > > > unfortunately it's really hard to copy paste or just read them from the > > > console). > > > > Are you using the ELF ABI v1 or v2? > > > > v1 may have some additional issues when it comes to these symbol tables. > > > > Thanks > > > > Michal > > I have CONFIG_PPC64_ELF_ABI_V2=y in my .config, so I guess I'm using v2. > > BTW, the issue seems to be in dedotify_versions(), as a silly test I > tried to comment out this function completely to be a no-op and now my > system boots fine (but I guess I'm probably breaking something else). Probably not. You should not have the extra leading dot on ABI v2. So if dedotify does something that means something generates and then expects back symbols with a leading dot, and this workaround for ABI v1 breaks that. Or maybe it is called when it shouldn't. Thanks Michal
On Mon, Mar 13, 2023 at 11:02:34PM +0100, Michal Suchánek wrote: > On Mon, Mar 13, 2023 at 10:53:34PM +0100, Andrea Righi wrote: > > On Mon, Mar 13, 2023 at 10:48:53PM +0100, Michal Suchánek wrote: > > > Hello, > > > > > > On Mon, Mar 13, 2023 at 09:32:16PM +0100, Andrea Righi wrote: > > > > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > > > > > Currently modversion uses a fixed size array of size (64 - sizeof(long)) > > > > > to store symbol names, thus placing a hard limit on length of symbols. > > > > > Rust symbols (which encodes crate and module names) can be quite a bit > > > > > longer. The length limit in kallsyms is increased to 512 for this reason. > > > > > > > > > > It's a waste of space to simply expand the fixed array size to 512 in > > > > > modversion info entries. I therefore make it variably sized, with offset > > > > > to the next entry indicated by the initial "next" field. > > > > > > > > > > In addition to supporting longer-than-56/60 byte symbols, this patch also > > > > > reduce the size for short symbols by getting rid of excessive 0 paddings. > > > > > There are still some zero paddings to ensure "next" and "crc" fields are > > > > > properly aligned. > > > > > > > > > > This patch does have a tiny drawback that it makes ".mod.c" files generated > > > > > a bit less easy to read, as code like > > > > > > > > > > "\x08\x00\x00\x00\x78\x56\x34\x12" > > > > > "symbol\0\0" > > > > > > > > > > is generated as opposed to > > > > > > > > > > { 0x12345678, "symbol" }, > > > > > > > > > > because the structure is now variable-length. But hopefully nobody reads > > > > > the generated file :) > > > > > > > > > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512") > > > > > Link: https://github.com/Rust-for-Linux/linux/pull/379 > > > > > > > > > > Signed-off-by: Gary Guo <gary@garyguo.net> > > > > > > > > Is there any newer version of this patch? 
> > > > > > > > I'm doing some tests with it, but I'm getting boot failures on ppc64 > > > > with this applied (at boot kernel is spitting out lots of oops'es and > > > > unfortunately it's really hard to copy paste or just read them from the > > > > console). > > > > > > Are you using the ELF ABI v1 or v2? > > > > > > v1 may have some additional issues when it comes to these symbol tables. > > > > > > Thanks > > > > > > Michal > > > > I have CONFIG_PPC64_ELF_ABI_V2=y in my .config, so I guess I'm using v2. > > > > BTW, the issue seems to be in dedotify_versions(), as a silly test I > > tried to comment out this function completely to be a no-op and now my > > system boots fine (but I guess I'm probably breaking something else). > > Probably not. You should not have the extra leading dot on ABI v2. So if > dedotify does something that means something generates and then expects > back symbols with a leading dot, and this workaround for ABI v1 breaks > that. Or maybe it is called when it shouldn't. Hm.. I'll add some debugging to this function to see what happens exactly. -Andrea
On Mon, Mar 13, 2023 at 11:09:31PM +0100, Andrea Righi wrote: > On Mon, Mar 13, 2023 at 11:02:34PM +0100, Michal Suchánek wrote: > > On Mon, Mar 13, 2023 at 10:53:34PM +0100, Andrea Righi wrote: > > > On Mon, Mar 13, 2023 at 10:48:53PM +0100, Michal Suchánek wrote: > > > > Hello, > > > > > > > > On Mon, Mar 13, 2023 at 09:32:16PM +0100, Andrea Righi wrote: > > > > > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > > > > > > Currently modversion uses a fixed size array of size (64 - sizeof(long)) > > > > > > to store symbol names, thus placing a hard limit on length of symbols. > > > > > > Rust symbols (which encodes crate and module names) can be quite a bit > > > > > > longer. The length limit in kallsyms is increased to 512 for this reason. > > > > > > > > > > > > It's a waste of space to simply expand the fixed array size to 512 in > > > > > > modversion info entries. I therefore make it variably sized, with offset > > > > > > to the next entry indicated by the initial "next" field. > > > > > > > > > > > > In addition to supporting longer-than-56/60 byte symbols, this patch also > > > > > > reduce the size for short symbols by getting rid of excessive 0 paddings. > > > > > > There are still some zero paddings to ensure "next" and "crc" fields are > > > > > > properly aligned. > > > > > > > > > > > > This patch does have a tiny drawback that it makes ".mod.c" files generated > > > > > > a bit less easy to read, as code like > > > > > > > > > > > > "\x08\x00\x00\x00\x78\x56\x34\x12" > > > > > > "symbol\0\0" > > > > > > > > > > > > is generated as opposed to > > > > > > > > > > > > { 0x12345678, "symbol" }, > > > > > > > > > > > > because the structure is now variable-length. 
But hopefully nobody reads > > > > > > the generated file :) > > > > > > > > > > > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512") > > > > > > Link: https://github.com/Rust-for-Linux/linux/pull/379 > > > > > > > > > > > > Signed-off-by: Gary Guo <gary@garyguo.net> > > > > > > > > > > Is there any newer version of this patch? > > > > > > > > > > I'm doing some tests with it, but I'm getting boot failures on ppc64 > > > > > with this applied (at boot kernel is spitting out lots of oops'es and > > > > > unfortunately it's really hard to copy paste or just read them from the > > > > > console). > > > > > > > > Are you using the ELF ABI v1 or v2? > > > > > > > > v1 may have some additional issues when it comes to these symbol tables. > > > > > > > > Thanks > > > > > > > > Michal > > > > > > I have CONFIG_PPC64_ELF_ABI_V2=y in my .config, so I guess I'm using v2. > > > > > > BTW, the issue seems to be in dedotify_versions(), as a silly test I > > > tried to comment out this function completely to be a no-op and now my > > > system boots fine (but I guess I'm probably breaking something else). > > > > Probably not. You should not have the extra leading dot on ABI v2. So if > > dedotify does something that means something generates and then expects > > back symbols with a leading dot, and this workaround for ABI v1 breaks > > that. Or maybe it is called when it shouldn't. > > Hm.. I'll add some debugging to this function to see what happens exactly. Alright I've done more tests across different architectures. My problem with ppc64 is that this architecture is evaluating sechdrs[i].sh_size using get_stubs_size(), that apparently can add some extra padding, so doing (vers + vers->next < end) isn't a reliable check to determine the end of the variable array, because sometimes "end" can be greater than the last "vers + vers->next" entry. In general I think it'd be more reliable to add a dummy NULL entry at the end of the modversion array. 
Moreover, I think we also need to enforce struct modversion_info to be __packed, just to make sure that no extra padding is added (otherwise it may break our logic to determine the offset of the next entry).

> @@ -2062,16 +2066,25 @@ static void add_versions(struct buffer *b, struct module *mod)
>  				s->name, mod->name);
>  			continue;
>  		}
> -		if (strlen(s->name) >= MODULE_NAME_LEN) {
> -			error("too long symbol \"%s\" [%s.ko]\n",
> -			      s->name, mod->name);
> -			break;
> -		}
> -		buf_printf(b, "\t{ %#8x, \"%s\" },\n",
> -			   s->crc, s->name);
> +		name_len = strlen(s->name);
> +		name_len_padded = (name_len + 1 + 3) & ~3;
> +
> +		/* Offset to next entry */
> +		tmp = TO_NATIVE(8 + name_len_padded);

Here's another issue that I found, in the last quoted line: you can't use TO_NATIVE() on an expression like this, some compilers complain (on s390x, for example, this doesn't build).

So we need to do something like:

	/* Offset to next entry */
	tmp = 8 + name_len_padded;
	tmp = TO_NATIVE(tmp);

I'll do some additional tests with these changes and send an updated patch (for those that are interested).

-Andrea
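[Editorial sketch, not part of the thread] The end-of-array problem described above can be illustrated in userspace. This is a minimal sketch under stated assumptions: `struct ver_entry`, `put_entry()` and `find_crc()` are hypothetical names, the layout only mirrors the `next`/`crc`/`name` fields proposed in the patch, and the trailing zero bytes stand in for whatever padding the section size may include. The walker stops at a zero `next` field instead of trusting the section size alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the proposed layout; not the kernel definition. */
struct ver_entry {
	uint32_t next;	/* byte offset to the following entry, 0 terminates */
	uint32_t crc;
	char name[];	/* NUL-terminated, padded to a multiple of 4 bytes */
};

/* Append one entry at buf + off and return the offset of the next slot.
 * The caller must provide a buffer large enough for the entry. */
static size_t put_entry(unsigned char *buf, size_t off, uint32_t crc,
			const char *name)
{
	size_t name_len = strlen(name);
	uint32_t padded = (uint32_t)((name_len + 1 + 3) & ~(size_t)3);
	uint32_t next = (uint32_t)offsetof(struct ver_entry, name) + padded;

	assert(off % 4 == 0);		/* keeps "next" and "crc" aligned */
	memset(buf + off, 0, next);	/* zero padding after the name */
	memcpy(buf + off, &next, sizeof(next));
	memcpy(buf + off + 4, &crc, sizeof(crc));
	memcpy(buf + off + 8, name, name_len);
	return off + next;
}

/* Look up a symbol; stop at the section end OR at a zero "next" field. */
static int find_crc(const unsigned char *buf, size_t size, const char *name,
		    uint32_t *crc)
{
	size_t off = 0;

	while (off + 8 <= size) {
		const char *ename = (const char *)buf + off + 8;
		uint32_t next;

		memcpy(&next, buf + off, sizeof(next));
		if (next == 0)				/* dummy terminator or padding */
			break;
		if (next < 8 || next > size - off)	/* malformed entry */
			break;
		if (memchr(ename, '\0', next - 8) == NULL)
			break;				/* name not terminated */
		if (strcmp(ename, name) == 0) {
			memcpy(crc, buf + off + 4, sizeof(*crc));
			return 1;
		}
		off += next;
	}
	return 0;
}
```

With two entries written into a zero-filled 64-byte buffer, the lookup ends cleanly at the first all-zero entry even though the buffer is larger than the entries it holds.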
On Tue, Mar 14, 2023 at 03:38:24PM +0100, Andrea Righi wrote: > On Mon, Mar 13, 2023 at 11:09:31PM +0100, Andrea Righi wrote: > > On Mon, Mar 13, 2023 at 11:02:34PM +0100, Michal Suchánek wrote: > > > On Mon, Mar 13, 2023 at 10:53:34PM +0100, Andrea Righi wrote: > > > > On Mon, Mar 13, 2023 at 10:48:53PM +0100, Michal Suchánek wrote: > > > > > Hello, > > > > > > > > > > On Mon, Mar 13, 2023 at 09:32:16PM +0100, Andrea Righi wrote: > > > > > > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > > > > > > > Currently modversion uses a fixed size array of size (64 - sizeof(long)) > > > > > > > to store symbol names, thus placing a hard limit on length of symbols. > > > > > > > Rust symbols (which encodes crate and module names) can be quite a bit > > > > > > > longer. The length limit in kallsyms is increased to 512 for this reason. > > > > > > > > > > > > > > It's a waste of space to simply expand the fixed array size to 512 in > > > > > > > modversion info entries. I therefore make it variably sized, with offset > > > > > > > to the next entry indicated by the initial "next" field. > > > > > > > > > > > > > > In addition to supporting longer-than-56/60 byte symbols, this patch also > > > > > > > reduce the size for short symbols by getting rid of excessive 0 paddings. > > > > > > > There are still some zero paddings to ensure "next" and "crc" fields are > > > > > > > properly aligned. > > > > > > > > > > > > > > This patch does have a tiny drawback that it makes ".mod.c" files generated > > > > > > > a bit less easy to read, as code like > > > > > > > > > > > > > > "\x08\x00\x00\x00\x78\x56\x34\x12" > > > > > > > "symbol\0\0" > > > > > > > > > > > > > > is generated as opposed to > > > > > > > > > > > > > > { 0x12345678, "symbol" }, > > > > > > > > > > > > > > because the structure is now variable-length. 
But hopefully nobody reads > > > > > > > the generated file :) > > > > > > > > > > > > > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512") > > > > > > > Link: https://github.com/Rust-for-Linux/linux/pull/379 > > > > > > > > > > > > > > Signed-off-by: Gary Guo <gary@garyguo.net> > > > > > > > > > > > > Is there any newer version of this patch? > > > > > > > > > > > > I'm doing some tests with it, but I'm getting boot failures on ppc64 > > > > > > with this applied (at boot kernel is spitting out lots of oops'es and > > > > > > unfortunately it's really hard to copy paste or just read them from the > > > > > > console). > > > > > > > > > > Are you using the ELF ABI v1 or v2? > > > > > > > > > > v1 may have some additional issues when it comes to these symbol tables. > > > > > > > > > > Thanks > > > > > > > > > > Michal > > > > > > > > I have CONFIG_PPC64_ELF_ABI_V2=y in my .config, so I guess I'm using v2. > > > > > > > > BTW, the issue seems to be in dedotify_versions(), as a silly test I > > > > tried to comment out this function completely to be a no-op and now my > > > > system boots fine (but I guess I'm probably breaking something else). > > > > > > Probably not. You should not have the extra leading dot on ABI v2. So if > > > dedotify does something that means something generates and then expects > > > back symbols with a leading dot, and this workaround for ABI v1 breaks > > > that. Or maybe it is called when it shouldn't. > > > > Hm.. I'll add some debugging to this function to see what happens exactly. > > Alright I've done more tests across different architectures. My problem > with ppc64 is that this architecture is evaluating sechdrs[i].sh_size > using get_stubs_size(), that apparently can add some extra padding, so > doing (vers + vers->next < end) isn't a reliable check to determine the > end of the variable array, because sometimes "end" can be greater than > the last "vers + vers->next" entry. 
> > In general I think it'd be more reliable to add a dummy NULL entry at > the end of the modversion array. > > Moreover, I think we also need to enforce struct modversion_info to be > __packed, just to make sure that no extra padding is added (otherwise it > may break our logic to determine the offset of the next entry). > > > @@ -2062,16 +2066,25 @@ static void add_versions(struct buffer *b, struct module *mod) > > s->name, mod->name); > > continue; > > } > > - if (strlen(s->name) >= MODULE_NAME_LEN) { > > - error("too long symbol \"%s\" [%s.ko]\n", > > - s->name, mod->name); > > - break; > > - } > > - buf_printf(b, "\t{ %#8x, \"%s\" },\n", > > - s->crc, s->name); > > + name_len = strlen(s->name); > > + name_len_padded = (name_len + 1 + 3) & ~3; > > + > > + /* Offset to next entry */ > > + tmp = TO_NATIVE(8 + name_len_padded); > > ^ Here's another issue that I found, you can't use TO_NATIVE() in this > way, some compilers are complaining (like on s390x this doesn't build). > > So we need to do something like: > > /* Offset to next entry */ > tmp = 8 + name_len_padded > tmp = TO_NATIVE(tmp); > > I'll do some additional tests with these changes and send an updated > patch (for those that are interested). 
> > -Andrea In practice, this is what I'm testing at the moment: --- arch/powerpc/kernel/module_64.c | 5 +++-- include/linux/module.h | 8 +++++--- kernel/module/version.c | 22 ++++++++++----------- scripts/export_report.pl | 9 +++++---- scripts/mod/modpost.c | 35 +++++++++++++++++++++++---------- 5 files changed, 48 insertions(+), 31 deletions(-) diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c index ff045644f13f..ea6c830ed1e7 100644 --- a/arch/powerpc/kernel/module_64.c +++ b/arch/powerpc/kernel/module_64.c @@ -234,12 +234,13 @@ static unsigned long get_stubs_size(const Elf64_Ehdr *hdr, static void dedotify_versions(struct modversion_info *vers, unsigned long size) { - struct modversion_info *end; + struct modversion_info *end = (void *)vers + size; - for (end = (void *)vers + size; vers < end; vers++) + for (; vers < end && vers->next; vers = (void *)vers + vers->next) { if (vers->name[0] == '.') { memmove(vers->name, vers->name+1, strlen(vers->name)); } + } } /* diff --git a/include/linux/module.h b/include/linux/module.h index 8c5909c0076c..4744901bdf63 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -34,9 +34,11 @@ #define MODULE_NAME_LEN MAX_PARAM_PREFIX_LEN struct modversion_info { - unsigned long crc; - char name[MODULE_NAME_LEN]; -}; + /* Offset of the next modversion entry in relation to this one. */ + u32 next; + u32 crc; + char name[0]; +} __packed; struct module; struct exception_table_entry; diff --git a/kernel/module/version.c b/kernel/module/version.c index 53f43ac5a73e..5528f98c42dc 100644 --- a/kernel/module/version.c +++ b/kernel/module/version.c @@ -17,32 +17,30 @@ int check_version(const struct load_info *info, { Elf_Shdr *sechdrs = info->sechdrs; unsigned int versindex = info->index.vers; - unsigned int i, num_versions; - struct modversion_info *versions; + struct modversion_info *versions, *end; + u32 crcval; /* Exporting module didn't supply crcs? OK, we're already tainted. 
*/ if (!crc) return 1; + crcval = *crc; /* No versions at all? modprobe --force does this. */ if (versindex == 0) return try_to_force_load(mod, symname) == 0; versions = (void *)sechdrs[versindex].sh_addr; - num_versions = sechdrs[versindex].sh_size - / sizeof(struct modversion_info); + end = (void *)versions + sechdrs[versindex].sh_size; - for (i = 0; i < num_versions; i++) { - u32 crcval; - - if (strcmp(versions[i].name, symname) != 0) + for (; versions < end && versions->next; + versions = (void *)versions + versions->next) { + if (strcmp(versions->name, symname) != 0) continue; - crcval = *crc; - if (versions[i].crc == crcval) + if (versions->crc == crcval) return 1; - pr_debug("Found checksum %X vs module %lX\n", - crcval, versions[i].crc); + pr_debug("Found checksum %X vs module %X\n", + crcval, versions->crc); goto bad_version; } diff --git a/scripts/export_report.pl b/scripts/export_report.pl index feb3d5542a62..1117646f3141 100755 --- a/scripts/export_report.pl +++ b/scripts/export_report.pl @@ -116,18 +116,19 @@ foreach my $thismod (@allcfiles) { while ( <$module> ) { chomp; if ($state == 0) { - $state = 1 if ($_ =~ /static const struct modversion_info/); + $state = 1 if ($_ =~ /static const char ____versions/); next; } if ($state == 1) { - $state = 2 if ($_ =~ /__attribute__\(\(section\("__versions"\)\)\)/); + $state = 2 if ($_ =~ /__used __section\("__versions"\)/); next; } if ($state == 2) { - if ( $_ !~ /0x[0-9a-f]+,/ ) { + if ( $_ !~ /\\0"/ ) { + last if ($_ =~ /;/); next; } - my $sym = (split /([,"])/,)[4]; + my $sym = (split /(["\\])/,)[2]; my ($module, $value, $symbol, $gpl) = @{$SYMBOL{$sym}}; $SYMBOL{ $sym } = [ $module, $value+1, $symbol, $gpl]; push(@{$MODULE{$thismod}} , $sym); diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c index efff8078e395..55335ae98f4f 100644 --- a/scripts/mod/modpost.c +++ b/scripts/mod/modpost.c @@ -2046,13 +2046,17 @@ static void add_exported_symbols(struct buffer *buf, struct module *mod) static void 
add_versions(struct buffer *b, struct module *mod) { struct symbol *s; + unsigned int name_len; + unsigned int name_len_padded; + unsigned int tmp; + unsigned char *tmp_view = (unsigned char *)&tmp; if (!modversions) return; buf_printf(b, "\n"); - buf_printf(b, "static const struct modversion_info ____versions[]\n"); - buf_printf(b, "__used __section(\"__versions\") = {\n"); + buf_printf(b, "static const char ____versions[]\n"); + buf_printf(b, "__used __section(\"__versions\") =\n"); list_for_each_entry(s, &mod->unresolved_symbols, list) { if (!s->module) @@ -2062,16 +2066,27 @@ static void add_versions(struct buffer *b, struct module *mod) s->name, mod->name); continue; } - if (strlen(s->name) >= MODULE_NAME_LEN) { - error("too long symbol \"%s\" [%s.ko]\n", - s->name, mod->name); - break; - } - buf_printf(b, "\t{ %#8x, \"%s\" },\n", - s->crc, s->name); + name_len = strlen(s->name); + name_len_padded = (name_len + 1 + 3) & ~3; + + /* Offset to next entry */ + tmp = 8 + name_len_padded; + tmp = TO_NATIVE(tmp); + buf_printf(b, "\t\"\\x%02x\\x%02x\\x%02x\\x%02x", + tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]); + + tmp = TO_NATIVE(s->crc); + buf_printf(b, "\\x%02x\\x%02x\\x%02x\\x%02x\"\n", + tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]); + + buf_printf(b, "\t\"%s", s->name); + for (; name_len < name_len_padded; name_len++) + buf_printf(b, "\\0"); + buf_printf(b, "\"\n"); } - buf_printf(b, "};\n"); + /* Always end with a dummy NULL entry */ + buf_printf(b, "\t\"\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\";\n"); } static void add_depends(struct buffer *b, struct module *mod)
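[Editorial sketch, not part of the thread] To make the generated ".mod.c" text easier to picture, here is a small userspace approximation of what the patched add_versions() prints for one symbol. It is only a sketch: `emit_u32()` and `format_entry()` are hypothetical helpers, and the byte order is chosen with explicit shifts rather than with modpost's TO_NATIVE() and a byte view of `tmp`, which is a deliberate substitution to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print the 4 bytes of v as \xNN escapes in the
 * target's byte order. The caller guarantees enough room in out. */
static int emit_u32(char *out, size_t len, uint32_t v, int target_big_endian)
{
	unsigned char b[4];
	int i;

	for (i = 0; i < 4; i++) {
		int shift = target_big_endian ? 24 - 8 * i : 8 * i;

		b[i] = (unsigned char)(v >> shift);
	}
	return snprintf(out, len, "\\x%02x\\x%02x\\x%02x\\x%02x",
			b[0], b[1], b[2], b[3]);
}

/* Format one entry as two C string literals: header bytes, then the
 * padded name. Returns the number of characters written, or -1 if the
 * output buffer is too small. */
static int format_entry(char *out, size_t len, uint32_t crc,
			const char *name, int target_big_endian)
{
	size_t name_len = strlen(name);
	size_t padded = (name_len + 1 + 3) & ~(size_t)3;
	/* 2 + 32 + 4 fixed characters, the name, "\0" pads, 2 closing, NUL */
	size_t needed = 2 + 32 + 4 + name_len + 2 * (padded - name_len) + 2 + 1;
	size_t n = 0;

	if (len < needed)
		return -1;

	n += (size_t)snprintf(out + n, len - n, "\t\"");
	n += (size_t)emit_u32(out + n, len - n, (uint32_t)(8 + padded),
			      target_big_endian);
	n += (size_t)emit_u32(out + n, len - n, crc, target_big_endian);
	n += (size_t)snprintf(out + n, len - n, "\"\n\t\"%s", name);
	for (; name_len < padded; name_len++)
		n += (size_t)snprintf(out + n, len - n, "\\0");
	n += (size_t)snprintf(out + n, len - n, "\"\n");
	return (int)n;
}
```

Keeping the header bytes and the name in separate string literals, as the patch does, also means a hex escape such as `\x12` can never absorb the first characters of a symbol name.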
> In practice, this is what I'm testing at the moment: > > --- > diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c > index ff045644f13f..ea6c830ed1e7 100644 > --- a/arch/powerpc/kernel/module_64.c > +++ b/arch/powerpc/kernel/module_64.c > @@ -234,12 +234,13 @@ static unsigned long get_stubs_size(const Elf64_Ehdr *hdr, > static void dedotify_versions(struct modversion_info *vers, > unsigned long size) > { > - struct modversion_info *end; > + struct modversion_info *end = (void *)vers + size; > > - for (end = (void *)vers + size; vers < end; vers++) > + for (; vers < end && vers->next; vers = (void *)vers + vers->next) { > if (vers->name[0] == '.') { > memmove(vers->name, vers->name+1, strlen(vers->name)); > } > + } > } > > /* > diff --git a/include/linux/module.h b/include/linux/module.h > index 8c5909c0076c..4744901bdf63 100644 > --- a/include/linux/module.h > +++ b/include/linux/module.h > @@ -34,9 +34,11 @@ > #define MODULE_NAME_LEN MAX_PARAM_PREFIX_LEN > > struct modversion_info { > - unsigned long crc; > - char name[MODULE_NAME_LEN]; > -}; > + /* Offset of the next modversion entry in relation to this one. */ > + u32 next; > + u32 crc; > + char name[0]; > +} __packed; > > struct module; > struct exception_table_entry; > diff --git a/kernel/module/version.c b/kernel/module/version.c > index 53f43ac5a73e..5528f98c42dc 100644 > --- a/kernel/module/version.c > +++ b/kernel/module/version.c > @@ -17,32 +17,30 @@ int check_version(const struct load_info *info, > { > Elf_Shdr *sechdrs = info->sechdrs; > unsigned int versindex = info->index.vers; > - unsigned int i, num_versions; > - struct modversion_info *versions; > + struct modversion_info *versions, *end; > + u32 crcval; > > /* Exporting module didn't supply crcs? OK, we're already tainted. */ > if (!crc) > return 1; > + crcval = *crc; > > /* No versions at all? modprobe --force does this. 
*/ > if (versindex == 0) > return try_to_force_load(mod, symname) == 0; > > versions = (void *)sechdrs[versindex].sh_addr; > - num_versions = sechdrs[versindex].sh_size > - / sizeof(struct modversion_info); > + end = (void *)versions + sechdrs[versindex].sh_size; > > - for (i = 0; i < num_versions; i++) { > - u32 crcval; > - > - if (strcmp(versions[i].name, symname) != 0) > + for (; versions < end && versions->next; > + versions = (void *)versions + versions->next) { > + if (strcmp(versions->name, symname) != 0) > continue; > > - crcval = *crc; > - if (versions[i].crc == crcval) > + if (versions->crc == crcval) > return 1; > - pr_debug("Found checksum %X vs module %lX\n", > - crcval, versions[i].crc); > + pr_debug("Found checksum %X vs module %X\n", > + crcval, versions->crc); > goto bad_version; > } > > diff --git a/scripts/export_report.pl b/scripts/export_report.pl > index feb3d5542a62..1117646f3141 100755 > --- a/scripts/export_report.pl > +++ b/scripts/export_report.pl > @@ -116,18 +116,19 @@ foreach my $thismod (@allcfiles) { > while ( <$module> ) { > chomp; > if ($state == 0) { > - $state = 1 if ($_ =~ /static const struct modversion_info/); > + $state = 1 if ($_ =~ /static const char ____versions/); > next; > } > if ($state == 1) { > - $state = 2 if ($_ =~ /__attribute__\(\(section\("__versions"\)\)\)/); > + $state = 2 if ($_ =~ /__used __section\("__versions"\)/); > next; > } > if ($state == 2) { > - if ( $_ !~ /0x[0-9a-f]+,/ ) { > + if ( $_ !~ /\\0"/ ) { > + last if ($_ =~ /;/); > next; > } > - my $sym = (split /([,"])/,)[4]; > + my $sym = (split /(["\\])/,)[2]; > my ($module, $value, $symbol, $gpl) = @{$SYMBOL{$sym}}; > $SYMBOL{ $sym } = [ $module, $value+1, $symbol, $gpl]; > push(@{$MODULE{$thismod}} , $sym); > diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c > index efff8078e395..55335ae98f4f 100644 > --- a/scripts/mod/modpost.c > +++ b/scripts/mod/modpost.c > @@ -2046,13 +2046,17 @@ static void add_exported_symbols(struct buffer *buf, struct 
module *mod) > static void add_versions(struct buffer *b, struct module *mod) > { > struct symbol *s; > + unsigned int name_len; > + unsigned int name_len_padded; > + unsigned int tmp; > + unsigned char *tmp_view = (unsigned char *)&tmp; > > if (!modversions) > return; > > buf_printf(b, "\n"); > - buf_printf(b, "static const struct modversion_info ____versions[]\n"); > - buf_printf(b, "__used __section(\"__versions\") = {\n"); > + buf_printf(b, "static const char ____versions[]\n"); > + buf_printf(b, "__used __section(\"__versions\") =\n"); > > list_for_each_entry(s, &mod->unresolved_symbols, list) { > if (!s->module) > @@ -2062,16 +2066,27 @@ static void add_versions(struct buffer *b, struct module *mod) > s->name, mod->name); > continue; > } > - if (strlen(s->name) >= MODULE_NAME_LEN) { > - error("too long symbol \"%s\" [%s.ko]\n", > - s->name, mod->name); > - break; > - } > - buf_printf(b, "\t{ %#8x, \"%s\" },\n", > - s->crc, s->name); > + name_len = strlen(s->name); > + name_len_padded = (name_len + 1 + 3) & ~3; > + > + /* Offset to next entry */ > + tmp = 8 + name_len_padded; > + tmp = TO_NATIVE(tmp); > + buf_printf(b, "\t\"\\x%02x\\x%02x\\x%02x\\x%02x", > + tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]); > + > + tmp = TO_NATIVE(s->crc); > + buf_printf(b, "\\x%02x\\x%02x\\x%02x\\x%02x\"\n", > + tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]); > + > + buf_printf(b, "\t\"%s", s->name); > + for (; name_len < name_len_padded; name_len++) > + buf_printf(b, "\\0"); > + buf_printf(b, "\"\n"); > } > > - buf_printf(b, "};\n"); > + /* Always end with a dummy NULL entry */ > + buf_printf(b, "\t\"\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\";\n"); > } > > static void add_depends(struct buffer *b, struct module *mod)

This looks promising. Do you have a small reproducer for this, just to make the code a little clearer to the reader?

Ciao,

Vincent.
On Wed, Mar 15, 2023 at 01:15:03AM +0100, Vincenzo Palazzo wrote: > > In practice, this is what I'm testing at the moment: > > > > --- > > diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c > > index ff045644f13f..ea6c830ed1e7 100644 > > --- a/arch/powerpc/kernel/module_64.c > > +++ b/arch/powerpc/kernel/module_64.c > > @@ -234,12 +234,13 @@ static unsigned long get_stubs_size(const Elf64_Ehdr *hdr, > > static void dedotify_versions(struct modversion_info *vers, > > unsigned long size) > > { > > - struct modversion_info *end; > > + struct modversion_info *end = (void *)vers + size; > > > > - for (end = (void *)vers + size; vers < end; vers++) > > + for (; vers < end && vers->next; vers = (void *)vers + vers->next) { > > if (vers->name[0] == '.') { > > memmove(vers->name, vers->name+1, strlen(vers->name)); > > } > > + } > > } > > > > /* > > diff --git a/include/linux/module.h b/include/linux/module.h > > index 8c5909c0076c..4744901bdf63 100644 > > --- a/include/linux/module.h > > +++ b/include/linux/module.h > > @@ -34,9 +34,11 @@ > > #define MODULE_NAME_LEN MAX_PARAM_PREFIX_LEN > > > > struct modversion_info { > > - unsigned long crc; > > - char name[MODULE_NAME_LEN]; > > -}; > > + /* Offset of the next modversion entry in relation to this one. */ > > + u32 next; > > + u32 crc; > > + char name[0]; > > +} __packed; > > > > struct module; > > struct exception_table_entry; > > diff --git a/kernel/module/version.c b/kernel/module/version.c > > index 53f43ac5a73e..5528f98c42dc 100644 > > --- a/kernel/module/version.c > > +++ b/kernel/module/version.c > > @@ -17,32 +17,30 @@ int check_version(const struct load_info *info, > > { > > Elf_Shdr *sechdrs = info->sechdrs; > > unsigned int versindex = info->index.vers; > > - unsigned int i, num_versions; > > - struct modversion_info *versions; > > + struct modversion_info *versions, *end; > > + u32 crcval; > > > > /* Exporting module didn't supply crcs? OK, we're already tainted. 
*/ > > if (!crc) > > return 1; > > + crcval = *crc; > > > > /* No versions at all? modprobe --force does this. */ > > if (versindex == 0) > > return try_to_force_load(mod, symname) == 0; > > > > versions = (void *)sechdrs[versindex].sh_addr; > > - num_versions = sechdrs[versindex].sh_size > > - / sizeof(struct modversion_info); > > + end = (void *)versions + sechdrs[versindex].sh_size; > > > > - for (i = 0; i < num_versions; i++) { > > - u32 crcval; > > - > > - if (strcmp(versions[i].name, symname) != 0) > > + for (; versions < end && versions->next; > > + versions = (void *)versions + versions->next) { > > + if (strcmp(versions->name, symname) != 0) > > continue; > > > > - crcval = *crc; > > - if (versions[i].crc == crcval) > > + if (versions->crc == crcval) > > return 1; > > - pr_debug("Found checksum %X vs module %lX\n", > > - crcval, versions[i].crc); > > + pr_debug("Found checksum %X vs module %X\n", > > + crcval, versions->crc); > > goto bad_version; > > } > > > > diff --git a/scripts/export_report.pl b/scripts/export_report.pl > > index feb3d5542a62..1117646f3141 100755 > > --- a/scripts/export_report.pl > > +++ b/scripts/export_report.pl > > @@ -116,18 +116,19 @@ foreach my $thismod (@allcfiles) { > > while ( <$module> ) { > > chomp; > > if ($state == 0) { > > - $state = 1 if ($_ =~ /static const struct modversion_info/); > > + $state = 1 if ($_ =~ /static const char ____versions/); > > next; > > } > > if ($state == 1) { > > - $state = 2 if ($_ =~ /__attribute__\(\(section\("__versions"\)\)\)/); > > + $state = 2 if ($_ =~ /__used __section\("__versions"\)/); > > next; > > } > > if ($state == 2) { > > - if ( $_ !~ /0x[0-9a-f]+,/ ) { > > + if ( $_ !~ /\\0"/ ) { > > + last if ($_ =~ /;/); > > next; > > } > > - my $sym = (split /([,"])/,)[4]; > > + my $sym = (split /(["\\])/,)[2]; > > my ($module, $value, $symbol, $gpl) = @{$SYMBOL{$sym}}; > > $SYMBOL{ $sym } = [ $module, $value+1, $symbol, $gpl]; > > push(@{$MODULE{$thismod}} , $sym); > > diff --git 
a/scripts/mod/modpost.c b/scripts/mod/modpost.c > > index efff8078e395..55335ae98f4f 100644 > > --- a/scripts/mod/modpost.c > > +++ b/scripts/mod/modpost.c > > @@ -2046,13 +2046,17 @@ static void add_exported_symbols(struct buffer *buf, struct module *mod) > > static void add_versions(struct buffer *b, struct module *mod) > > { > > struct symbol *s; > > + unsigned int name_len; > > + unsigned int name_len_padded; > > + unsigned int tmp; > > + unsigned char *tmp_view = (unsigned char *)&tmp; > > > > if (!modversions) > > return; > > > > buf_printf(b, "\n"); > > - buf_printf(b, "static const struct modversion_info ____versions[]\n"); > > - buf_printf(b, "__used __section(\"__versions\") = {\n"); > > + buf_printf(b, "static const char ____versions[]\n"); > > + buf_printf(b, "__used __section(\"__versions\") =\n"); > > > > list_for_each_entry(s, &mod->unresolved_symbols, list) { > > if (!s->module) > > @@ -2062,16 +2066,27 @@ static void add_versions(struct buffer *b, struct module *mod) > > s->name, mod->name); > > continue; > > } > > - if (strlen(s->name) >= MODULE_NAME_LEN) { > > - error("too long symbol \"%s\" [%s.ko]\n", > > - s->name, mod->name); > > - break; > > - } > > - buf_printf(b, "\t{ %#8x, \"%s\" },\n", > > - s->crc, s->name); > > + name_len = strlen(s->name); > > + name_len_padded = (name_len + 1 + 3) & ~3; > > + > > + /* Offset to next entry */ > > + tmp = 8 + name_len_padded; > > + tmp = TO_NATIVE(tmp); > > + buf_printf(b, "\t\"\\x%02x\\x%02x\\x%02x\\x%02x", > > + tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]); > > + > > + tmp = TO_NATIVE(s->crc); > > + buf_printf(b, "\\x%02x\\x%02x\\x%02x\\x%02x\"\n", > > + tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]); > > + > > + buf_printf(b, "\t\"%s", s->name); > > + for (; name_len < name_len_padded; name_len++) > > + buf_printf(b, "\\0"); > > + buf_printf(b, "\"\n"); > > } > > > > - buf_printf(b, "};\n"); > > + /* Always end with a dummy NULL entry */ > > + buf_printf(b, 
"\t\"\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\";\n"); > > } > > > > static void add_depends(struct buffer *b, struct module *mod)

> This looks promising, do you have a small reproducer for this, just to
> make the code a little bit clearer to the reader?

My reproducer was to simply boot the latest 6.2 kernel with this patch
applied on top, on a ppc64 server (without the additional changes the
system just fails to boot as soon as the first module is loaded, spitting
random errors/oopses to the console, due to random memory accesses).

I don't have a smaller reproducer than this. Maybe I can clarify the
purpose of the last dummy entry a bit more in the comment, to help the
reader understand why we need this empty entry (if I understand your
request correctly).

For the TO_NATIVE(tmp) change, instead, it's enough to build the kernel
with a recent gcc (I'm using gcc 12.2.0-17ubuntu1): the build just fails.

-Andrea
On Tue, Mar 14, 2023 at 11:38 PM Andrea Righi <andrea.righi@canonical.com> wrote: > > On Mon, Mar 13, 2023 at 11:09:31PM +0100, Andrea Righi wrote: > > On Mon, Mar 13, 2023 at 11:02:34PM +0100, Michal Suchánek wrote: > > > On Mon, Mar 13, 2023 at 10:53:34PM +0100, Andrea Righi wrote: > > > > On Mon, Mar 13, 2023 at 10:48:53PM +0100, Michal Suchánek wrote: > > > > > Hello, > > > > > > > > > > On Mon, Mar 13, 2023 at 09:32:16PM +0100, Andrea Righi wrote: > > > > > > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > > > > > > > Currently modversion uses a fixed size array of size (64 - sizeof(long)) > > > > > > > to store symbol names, thus placing a hard limit on length of symbols. > > > > > > > Rust symbols (which encodes crate and module names) can be quite a bit > > > > > > > longer. The length limit in kallsyms is increased to 512 for this reason. > > > > > > > > > > > > > > It's a waste of space to simply expand the fixed array size to 512 in > > > > > > > modversion info entries. I therefore make it variably sized, with offset > > > > > > > to the next entry indicated by the initial "next" field. > > > > > > > > > > > > > > In addition to supporting longer-than-56/60 byte symbols, this patch also > > > > > > > reduce the size for short symbols by getting rid of excessive 0 paddings. > > > > > > > There are still some zero paddings to ensure "next" and "crc" fields are > > > > > > > properly aligned. > > > > > > > > > > > > > > This patch does have a tiny drawback that it makes ".mod.c" files generated > > > > > > > a bit less easy to read, as code like > > > > > > > > > > > > > > "\x08\x00\x00\x00\x78\x56\x34\x12" > > > > > > > "symbol\0\0" > > > > > > > > > > > > > > is generated as opposed to > > > > > > > > > > > > > > { 0x12345678, "symbol" }, > > > > > > > > > > > > > > because the structure is now variable-length. 
But hopefully nobody reads > > > > > > > the generated file :) > > > > > > > > > > > > > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512") > > > > > > > Link: https://github.com/Rust-for-Linux/linux/pull/379 > > > > > > > > > > > > > > Signed-off-by: Gary Guo <gary@garyguo.net> > > > > > > > > > > > > Is there any newer version of this patch? > > > > > > > > > > > > I'm doing some tests with it, but I'm getting boot failures on ppc64 > > > > > > with this applied (at boot kernel is spitting out lots of oops'es and > > > > > > unfortunately it's really hard to copy paste or just read them from the > > > > > > console). > > > > > > > > > > Are you using the ELF ABI v1 or v2? > > > > > > > > > > v1 may have some additional issues when it comes to these symbol tables. > > > > > > > > > > Thanks > > > > > > > > > > Michal > > > > > > > > I have CONFIG_PPC64_ELF_ABI_V2=y in my .config, so I guess I'm using v2. > > > > > > > > BTW, the issue seems to be in dedotify_versions(), as a silly test I > > > > tried to comment out this function completely to be a no-op and now my > > > > system boots fine (but I guess I'm probably breaking something else). > > > > > > Probably not. You should not have the extra leading dot on ABI v2. So if > > > dedotify does something that means something generates and then expects > > > back symbols with a leading dot, and this workaround for ABI v1 breaks > > > that. Or maybe it is called when it shouldn't. > > > > Hm.. I'll add some debugging to this function to see what happens exactly. > > Alright I've done more tests across different architectures. My problem > with ppc64 is that this architecture is evaluating sechdrs[i].sh_size > using get_stubs_size(), that apparently can add some extra padding, so > doing (vers + vers->next < end) isn't a reliable check to determine the > end of the variable array, because sometimes "end" can be greater than > the last "vers + vers->next" entry. 
I am not familiar enough with ppc, so I may be misunderstanding.

Checking the for-loop in module_frob_arch_sections(),
they seem to be orthogonal to me.

dedotify_versions() is only called for the "__versions" section.
get_stubs_size() only affects the ".stubs" section.
I do not see how they are related to each other.

BTW, we decided not to go in this direction in the earlier discussion.
I am not sure how much effort is needed to track down the issue
in this version.

If we add new sections to keep the backward compatibility
for the current "__versions", this issue may not exist.

> In general I think it'd be more reliable to add a dummy NULL entry at
> the end of the modversion array.
>
> Moreover, I think we also need to enforce struct modversion_info to be
> __packed, just to make sure that no extra padding is added (otherwise it
> may break our logic to determine the offset of the next entry).
>
> > @@ -2062,16 +2066,25 @@ static void add_versions(struct buffer *b, struct module *mod)
> > s->name, mod->name);
> > continue;
> > }
> > - if (strlen(s->name) >= MODULE_NAME_LEN) {
> > - error("too long symbol \"%s\" [%s.ko]\n",
> > - s->name, mod->name);
> > - break;
> > - }
> > - buf_printf(b, "\t{ %#8x, \"%s\" },\n",
> > - s->crc, s->name);
> > + name_len = strlen(s->name);
> > + name_len_padded = (name_len + 1 + 3) & ~3;
> > +
> > + /* Offset to next entry */
> > + tmp = TO_NATIVE(8 + name_len_padded);
>
> ^ Here's another issue that I found, you can't use TO_NATIVE() in this
> way, some compilers are complaining (like on s390x this doesn't build).
>
> So we need to do something like:
>
> /* Offset to next entry */
> tmp = 8 + name_len_padded;
> tmp = TO_NATIVE(tmp);
>
> I'll do some additional tests with these changes and send an updated
> patch (for those that are interested).
>
> -Andrea

--
Best Regards
Masahiro Yamada
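One way to read the TO_NATIVE() complaint quoted above: if the cross-endian variant of the macro takes the address of its argument (an assumption here, not verified against the tree), then an expression such as 8 + name_len_padded has no address and the build fails, which is why the temporary is needed. A by-value helper would sidestep that. The sketch below is illustrative only; to_target32() and next_offset() are hypothetical names, not modpost functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical by-value alternative to an address-taking swap macro.
 * `swap` is nonzero when host and target byte order differ.
 */
static uint32_t to_target32(uint32_t v, int swap)
{
	if (!swap)
		return v;
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/* Offset to the next entry, following the arithmetic in the patch. */
static uint32_t next_offset(uint32_t name_len, int swap)
{
	uint32_t padded;

	/* Precondition: the sums below must not wrap around. */
	assert(name_len <= UINT32_MAX - 12);

	/* Name plus its NUL, rounded up to a multiple of 4. */
	padded = (name_len + 1 + 3) & ~3u;

	/* An arbitrary expression is fine here: it is passed by value. */
	return to_target32(8 + padded, swap);
}
```

Because the argument is passed by value, the caller does not need the two-step "assign, then convert" form that the macro requires.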
On Thu, Mar 16, 2023 at 01:18:23AM +0900, Masahiro Yamada wrote: > On Tue, Mar 14, 2023 at 11:38 PM Andrea Righi > <andrea.righi@canonical.com> wrote: > > > > On Mon, Mar 13, 2023 at 11:09:31PM +0100, Andrea Righi wrote: > > > On Mon, Mar 13, 2023 at 11:02:34PM +0100, Michal Suchánek wrote: > > > > On Mon, Mar 13, 2023 at 10:53:34PM +0100, Andrea Righi wrote: > > > > > On Mon, Mar 13, 2023 at 10:48:53PM +0100, Michal Suchánek wrote: > > > > > > Hello, > > > > > > > > > > > > On Mon, Mar 13, 2023 at 09:32:16PM +0100, Andrea Righi wrote: > > > > > > > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote: > > > > > > > > Currently modversion uses a fixed size array of size (64 - sizeof(long)) > > > > > > > > to store symbol names, thus placing a hard limit on length of symbols. > > > > > > > > Rust symbols (which encodes crate and module names) can be quite a bit > > > > > > > > longer. The length limit in kallsyms is increased to 512 for this reason. > > > > > > > > > > > > > > > > It's a waste of space to simply expand the fixed array size to 512 in > > > > > > > > modversion info entries. I therefore make it variably sized, with offset > > > > > > > > to the next entry indicated by the initial "next" field. > > > > > > > > > > > > > > > > In addition to supporting longer-than-56/60 byte symbols, this patch also > > > > > > > > reduce the size for short symbols by getting rid of excessive 0 paddings. > > > > > > > > There are still some zero paddings to ensure "next" and "crc" fields are > > > > > > > > properly aligned. 
> > > > > > > > > > > > > > > > This patch does have a tiny drawback that it makes ".mod.c" files generated > > > > > > > > a bit less easy to read, as code like > > > > > > > > > > > > > > > > "\x08\x00\x00\x00\x78\x56\x34\x12" > > > > > > > > "symbol\0\0" > > > > > > > > > > > > > > > > is generated as opposed to > > > > > > > > > > > > > > > > { 0x12345678, "symbol" }, > > > > > > > > > > > > > > > > because the structure is now variable-length. But hopefully nobody reads > > > > > > > > the generated file :) > > > > > > > > > > > > > > > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512") > > > > > > > > Link: https://github.com/Rust-for-Linux/linux/pull/379 > > > > > > > > > > > > > > > > Signed-off-by: Gary Guo <gary@garyguo.net> > > > > > > > > > > > > > > Is there any newer version of this patch? > > > > > > > > > > > > > > I'm doing some tests with it, but I'm getting boot failures on ppc64 > > > > > > > with this applied (at boot kernel is spitting out lots of oops'es and > > > > > > > unfortunately it's really hard to copy paste or just read them from the > > > > > > > console). > > > > > > > > > > > > Are you using the ELF ABI v1 or v2? > > > > > > > > > > > > v1 may have some additional issues when it comes to these symbol tables. > > > > > > > > > > > > Thanks > > > > > > > > > > > > Michal > > > > > > > > > > I have CONFIG_PPC64_ELF_ABI_V2=y in my .config, so I guess I'm using v2. > > > > > > > > > > BTW, the issue seems to be in dedotify_versions(), as a silly test I > > > > > tried to comment out this function completely to be a no-op and now my > > > > > system boots fine (but I guess I'm probably breaking something else). > > > > > > > > Probably not. You should not have the extra leading dot on ABI v2. So if > > > > dedotify does something that means something generates and then expects > > > > back symbols with a leading dot, and this workaround for ABI v1 breaks > > > > that. Or maybe it is called when it shouldn't. 
> > >
> > > Hm.. I'll add some debugging to this function to see what happens exactly.
> >
> > Alright I've done more tests across different architectures. My problem
> > with ppc64 is that this architecture is evaluating sechdrs[i].sh_size
> > using get_stubs_size(), that apparently can add some extra padding, so
> > doing (vers + vers->next < end) isn't a reliable check to determine the
> > end of the variable array, because sometimes "end" can be greater than
> > the last "vers + vers->next" entry.
>
>
> I am not familiar enough with ppc, so I may be misunderstanding.
>
> Checking the for-loop in module_frob_arch_sections(),
> they seem to be orthogonal to me.
>
> dedotify_versions() is only called for the "__versions" section.
> get_stubs_size() only affects the ".stubs" section.
> I did not get how they are related to each other.
>
>
> BTW, we decided to not go in this direction in the former discussion.
> I am not sure how much effort is needed to track down the issue
> in this version.
>
> If we add new sections to keep the backward compatibility
> for the current "__versions", this issue may not exist.

I think I need to investigate this more. But according to the results
(and some extra printk()'s that I added) it seems that sechdrs[].sh_size
is bigger than expected, so using it to determine the end of the array is
not reliable, and it systematically breaks the boot on the ppc64 system
that I'm currently using.

Anyway, I know that we decided to drop this approach. I'm actually using
this patch to see if we can enable CONFIG_MODVERSIONS with CONFIG_RUST
enabled; that's why I'm exploring the possibility of a variable length
for module symbols. Hopefully we'll come up with a proper
approach/solution, because it'd be really useful to have MODVERSIONS and
RUST both enabled.

Thanks,
-Andrea
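The failure described above, where the section size overshoots the last real entry, suggests a walk that treats a zero "next" as the terminator and checks every step against the bytes that remain. What follows is a user-space sketch of that idea under stated assumptions: it is not kernel code, find_version_crc() and VERS_HDR_SIZE are hypothetical names, and the 8-byte header (u32 next, u32 crc) is taken from the struct in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed header in front of each name: u32 next, u32 crc (per the patch). */
#define VERS_HDR_SIZE 8u

/*
 * Hypothetical helper: look up `symname` in a variable-length version table.
 * Returns 1 and stores the CRC in *crc_out when found, 0 when not found,
 * and -1 when the table is malformed. Fields are read in host byte order.
 */
static int find_version_crc(const unsigned char *table, size_t size,
			    const char *symname, uint32_t *crc_out)
{
	size_t off = 0;

	assert(table != NULL && symname != NULL && crc_out != NULL);

	/* Invariant: off <= size, so the subtraction cannot wrap. */
	while (size - off >= VERS_HDR_SIZE) {
		uint32_t next, crc;
		const char *name;

		memcpy(&next, table + off, sizeof(next));
		memcpy(&crc, table + off + 4, sizeof(crc));

		/* Terminating dummy entry, or zero padding after the table. */
		if (next == 0)
			return 0;

		/* An entry needs room for at least a NUL and must fit. */
		if (next <= VERS_HDR_SIZE || next > size - off)
			return -1;

		/* The name must be terminated inside its own entry. */
		name = (const char *)(table + off + VERS_HDR_SIZE);
		if (memchr(name, '\0', next - VERS_HDR_SIZE) == NULL)
			return -1;

		if (strcmp(name, symname) == 0) {
			*crc_out = crc;
			return 1;
		}
		off += next;
	}
	return 0;
}
```

With this shape, trailing zero bytes after the last entry end the walk whether or not an explicit dummy entry was emitted, and a corrupt "next" is reported instead of being followed.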
diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c index ff045644f13f..eac23c11d579 100644 --- a/arch/powerpc/kernel/module_64.c +++ b/arch/powerpc/kernel/module_64.c @@ -236,10 +236,11 @@ static void dedotify_versions(struct modversion_info *vers, { struct modversion_info *end; - for (end = (void *)vers + size; vers < end; vers++) + for (end = (void *)vers + size; vers < end; vers = (void *)vers + vers->next) { if (vers->name[0] == '.') { memmove(vers->name, vers->name+1, strlen(vers->name)); } + } } /* diff --git a/include/linux/module.h b/include/linux/module.h index 8c5909c0076c..37cb25af9099 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -34,8 +34,10 @@ #define MODULE_NAME_LEN MAX_PARAM_PREFIX_LEN struct modversion_info { - unsigned long crc; - char name[MODULE_NAME_LEN]; + /* Offset of the next modversion entry in relation to this one. */ + u32 next; + u32 crc; + char name[0]; }; struct module; diff --git a/kernel/module/version.c b/kernel/module/version.c index 53f43ac5a73e..af7478dcc158 100644 --- a/kernel/module/version.c +++ b/kernel/module/version.c @@ -17,32 +17,29 @@ int check_version(const struct load_info *info, { Elf_Shdr *sechdrs = info->sechdrs; unsigned int versindex = info->index.vers; - unsigned int i, num_versions; - struct modversion_info *versions; + struct modversion_info *versions, *end; + u32 crcval; /* Exporting module didn't supply crcs? OK, we're already tainted. */ if (!crc) return 1; + crcval = *crc; /* No versions at all? modprobe --force does this. 
*/ if (versindex == 0) return try_to_force_load(mod, symname) == 0; versions = (void *)sechdrs[versindex].sh_addr; - num_versions = sechdrs[versindex].sh_size - / sizeof(struct modversion_info); + end = (void *)versions + sechdrs[versindex].sh_size; - for (i = 0; i < num_versions; i++) { - u32 crcval; - - if (strcmp(versions[i].name, symname) != 0) + for (; versions < end; versions = (void *)versions + versions->next) { + if (strcmp(versions->name, symname) != 0) continue; - crcval = *crc; - if (versions[i].crc == crcval) + if (versions->crc == crcval) return 1; - pr_debug("Found checksum %X vs module %lX\n", - crcval, versions[i].crc); + pr_debug("Found checksum %X vs module %X\n", + crcval, versions->crc); goto bad_version; } diff --git a/scripts/export_report.pl b/scripts/export_report.pl index feb3d5542a62..1117646f3141 100755 --- a/scripts/export_report.pl +++ b/scripts/export_report.pl @@ -116,18 +116,19 @@ foreach my $thismod (@allcfiles) { while ( <$module> ) { chomp; if ($state == 0) { - $state = 1 if ($_ =~ /static const struct modversion_info/); + $state = 1 if ($_ =~ /static const char ____versions/); next; } if ($state == 1) { - $state = 2 if ($_ =~ /__attribute__\(\(section\("__versions"\)\)\)/); + $state = 2 if ($_ =~ /__used __section\("__versions"\)/); next; } if ($state == 2) { - if ( $_ !~ /0x[0-9a-f]+,/ ) { + if ( $_ !~ /\\0"/ ) { + last if ($_ =~ /;/); next; } - my $sym = (split /([,"])/,)[4]; + my $sym = (split /(["\\])/,)[2]; my ($module, $value, $symbol, $gpl) = @{$SYMBOL{$sym}}; $SYMBOL{ $sym } = [ $module, $value+1, $symbol, $gpl]; push(@{$MODULE{$thismod}} , $sym); diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c index efff8078e395..334d170de31f 100644 --- a/scripts/mod/modpost.c +++ b/scripts/mod/modpost.c @@ -2046,13 +2046,17 @@ static void add_exported_symbols(struct buffer *buf, struct module *mod) static void add_versions(struct buffer *b, struct module *mod) { struct symbol *s; + unsigned int name_len; + unsigned int 
name_len_padded; + unsigned int tmp; + unsigned char *tmp_view = (unsigned char *)&tmp; if (!modversions) return; buf_printf(b, "\n"); - buf_printf(b, "static const struct modversion_info ____versions[]\n"); - buf_printf(b, "__used __section(\"__versions\") = {\n"); + buf_printf(b, "static const char ____versions[]\n"); + buf_printf(b, "__used __section(\"__versions\") =\n"); list_for_each_entry(s, &mod->unresolved_symbols, list) { if (!s->module) @@ -2062,16 +2066,25 @@ static void add_versions(struct buffer *b, struct module *mod) s->name, mod->name); continue; } - if (strlen(s->name) >= MODULE_NAME_LEN) { - error("too long symbol \"%s\" [%s.ko]\n", - s->name, mod->name); - break; - } - buf_printf(b, "\t{ %#8x, \"%s\" },\n", - s->crc, s->name); + name_len = strlen(s->name); + name_len_padded = (name_len + 1 + 3) & ~3; + + /* Offset to next entry */ + tmp = TO_NATIVE(8 + name_len_padded); + buf_printf(b, "\t\"\\x%02x\\x%02x\\x%02x\\x%02x", + tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]); + + tmp = TO_NATIVE(s->crc); + buf_printf(b, "\\x%02x\\x%02x\\x%02x\\x%02x\"\n", + tmp_view[0], tmp_view[1], tmp_view[2], tmp_view[3]); + + buf_printf(b, "\t\"%s", s->name); + for (; name_len < name_len_padded; name_len++) + buf_printf(b, "\\0"); + buf_printf(b, "\"\n"); } - buf_printf(b, "};\n"); + buf_printf(b, ";\n"); } static void add_depends(struct buffer *b, struct module *mod)
Currently modversion uses a fixed size array of size (64 - sizeof(long))
to store symbol names, thus placing a hard limit on length of symbols.
Rust symbols (which encode crate and module names) can be quite a bit
longer. The length limit in kallsyms is increased to 512 for this reason.

It's a waste of space to simply expand the fixed array size to 512 in
modversion info entries. I therefore make it variably sized, with offset
to the next entry indicated by the initial "next" field.

In addition to supporting longer-than-56/60 byte symbols, this patch also
reduces the size for short symbols by getting rid of excessive 0 paddings.
There are still some zero paddings to ensure "next" and "crc" fields are
properly aligned.

This patch does have a tiny drawback that it makes the generated ".mod.c"
files a bit less easy to read, as code like

    "\x08\x00\x00\x00\x78\x56\x34\x12"
    "symbol\0\0"

is generated as opposed to

    { 0x12345678, "symbol" },

because the structure is now variable-length. But hopefully nobody reads
the generated file :)

Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512")
Link: https://github.com/Rust-for-Linux/linux/pull/379

Signed-off-by: Gary Guo <gary@garyguo.net>
---
 arch/powerpc/kernel/module_64.c |  3 ++-
 include/linux/module.h          |  6 ++++--
 kernel/module/version.c         | 21 +++++++++------------
 scripts/export_report.pl        |  9 +++++----
 scripts/mod/modpost.c           | 33 +++++++++++++++++++++++----------
 5 files changed, 43 insertions(+), 29 deletions(-)
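To make the entry layout concrete, here is a small user-space sketch of encoding one entry the way add_versions() in the diff computes it: a u32 "next", a u32 "crc", then the name padded with zeros to a multiple of 4. It is only an illustration; encode_version_entry(), padded_name_len() and put_le32() are hypothetical names, and the fields are written little-endian for the sake of a fixed example, whereas modpost converts to the target's byte order. By this arithmetic the entry for "symbol" is 8 + 8 = 16 bytes, so its "next" field would hold 0x10.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Name length including the NUL, rounded up to a multiple of 4. */
static size_t padded_name_len(size_t name_len)
{
	return (name_len + 1 + 3) & ~(size_t)3;
}

/* Store a 32-bit value least-significant byte first. */
static void put_le32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v & 0xff);
	p[1] = (unsigned char)((v >> 8) & 0xff);
	p[2] = (unsigned char)((v >> 16) & 0xff);
	p[3] = (unsigned char)((v >> 24) & 0xff);
}

/*
 * Hypothetical encoder: write one entry for `name`/`crc` into `out`
 * (capacity `cap`) and return its size, or 0 if it does not fit.
 */
static size_t encode_version_entry(unsigned char *out, size_t cap,
				   const char *name, uint32_t crc)
{
	size_t name_len, padded, total;

	assert(out != NULL && name != NULL);

	name_len = strlen(name);
	padded = padded_name_len(name_len);
	total = 8 + padded;
	if (total > cap || total > UINT32_MAX)
		return 0;

	put_le32(out, (uint32_t)total);	/* offset to the next entry */
	put_le32(out + 4, crc);
	memcpy(out + 8, name, name_len);
	/* Zero fill covers the NUL terminator and the alignment padding. */
	memset(out + 8 + name_len, 0, padded - name_len);
	return total;
}
```

A 3-character name occupies 12 bytes and a 6-character name 16 bytes, compared with a fixed 64 bytes per entry in the old layout.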