Message ID | 20210607023839.26387-1-mark-pk.tsai@mediatek.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | recordmcount: avoid using ABS symbol as reference | expand |
On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > Avoid using ABS symbol, which won't be relocate, as reference. > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). > > Section Headers: > [Nr] Name Type Address Off Size ES Flg Lk Inf Al > [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 > [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 > A RELA section's r_info field points to the section to which it applies. This is why in the example above section #65522 points to section #65521. This has nothing to do with the numerical value of SHN_ABS. > find_secsym_ndx, which use r_info in rela section to find the reference > symbol, may take ABS symbol as base. > > Symbol table '.symtab' contains 453285 entries: > Num: Value Size Type Bind Vis Ndx Name > 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT ABS section_count > > Which cause an invalid address in __mcount_loc. > Could you give a better account of the error you are trying to address? Also, arm64 no longer defines a section_count symbol (since v5.11), so please make sure that the diagnostics of the issue you are addressing are accurate for mainline. > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> > --- > scripts/recordmcount.h | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > index f9b19524da11..9b69167fb7ff 100644 > --- a/scripts/recordmcount.h > +++ b/scripts/recordmcount.h > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx, > for (symp = sym0, t = nsym; t; --t, ++symp) { > unsigned int const st_bind = ELF_ST_BIND(symp->st_info); > > + /* avoid absolute symbols */ > + if (symp->st_shndx == SHN_ABS) > + continue; > + > if (txtndx == get_symindex(symp, symtab, symtab_shndx) > /* avoid STB_WEAK */ > && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) { > -- > 2.18.0 > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > Avoid using ABS symbol, which won't be relocate, as reference. > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). > > > > Section Headers: > > [Nr] Name Type Address Off Size ES Flg Lk Inf Al > > [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 > > [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 > > > > A RELA section's r_info field points to the section to which it > applies. This is why in the example above section #65522 points to > section #65521. This has nothing to do with the numerical value of > SHN_ABS. If the r_info of RELA section is 65521(0xfff1), find_secsym_ndx() will use it to find the base symbol. And in the symbol search loop in find_secsym_ndx(), get_symindex will return 0xfff1 if the symbol is in ABS section. In this case, find_secsym_ndx() will return a absolute symbol as base, which won't be relocate, if an ABS symbol is found before the real symbol in section 65521. > > > find_secsym_ndx, which use r_info in rela section to find the reference > > symbol, may take ABS symbol as base. > > > > Symbol table '.symtab' contains 453285 entries: > > Num: Value Size Type Bind Vis Ndx Name > > 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT ABS section_count > > > > Which cause an invalid address in __mcount_loc. > > > > Could you give a better account of the error you are trying to address? > > Also, arm64 no longer defines a section_count symbol (since v5.11), so > please make sure that the diagnostics of the issue you are addressing > are accurate for mainline. > My kernel version is 5.4.61. But as I explained, I suppose mainline also have this issue. > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> > > --- > > scripts/recordmcount.h | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > > index f9b19524da11..9b69167fb7ff 100644 > > --- a/scripts/recordmcount.h > > +++ b/scripts/recordmcount.h > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx, > > for (symp = sym0, t = nsym; t; --t, ++symp) { > > unsigned int const st_bind = ELF_ST_BIND(symp->st_info); > > > > + /* avoid absolute symbols */ > > + if (symp->st_shndx == SHN_ABS) > > + continue; > > + > > if (txtndx == get_symindex(symp, symtab, symtab_shndx) > > /* avoid STB_WEAK */ > > && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > Avoid using ABS symbol, which won't be relocate, as reference. > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). > > > > > > Section Headers: > > > [Nr] Name Type Address Off Size ES Flg Lk Inf Al > > > [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 > > > [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 > > > > > > > A RELA section's r_info field points to the section to which it > > applies. This is why in the example above section #65522 points to > > section #65521. This has nothing to do with the numerical value of > > SHN_ABS. > > If the r_info of RELA section is 65521(0xfff1), > find_secsym_ndx() will use it to find the base symbol. > But what does that have to do with the sh_info field of the RELA section's Elf_Shdr struct? IOW, what is the relevance of section #65521 here? > And in the symbol search loop in find_secsym_ndx(), get_symindex will > return 0xfff1 if the symbol is in ABS section. > > In this case, find_secsym_ndx() will return a absolute symbol as > base, which won't be relocate, if an ABS symbol is found before the > real symbol in section 65521. > I see your point here. > > > > > find_secsym_ndx, which use r_info in rela section to find the reference > > > symbol, may take ABS symbol as base. > > > > > > Symbol table '.symtab' contains 453285 entries: > > > Num: Value Size Type Bind Vis Ndx Name > > > 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT ABS section_count > > > > > > Which cause an invalid address in __mcount_loc. > > > > > > > Could you give a better account of the error you are trying to address? > > > > Also, arm64 no longer defines a section_count symbol (since v5.11), so > > please make sure that the diagnostics of the issue you are addressing > > are accurate for mainline. > > > > My kernel version is 5.4.61. > But as I explained, I suppose mainline also have this issue. > Mainline is what we work on. So please base your changes (and your commit log) on mainline. > > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> > > > --- > > > scripts/recordmcount.h | 4 ++++ > > > 1 file changed, 4 insertions(+) > > > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > > > index f9b19524da11..9b69167fb7ff 100644 > > > --- a/scripts/recordmcount.h > > > +++ b/scripts/recordmcount.h > > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx, > > > for (symp = sym0, t = nsym; t; --t, ++symp) { > > > unsigned int const st_bind = ELF_ST_BIND(symp->st_info); > > > > > > + /* avoid absolute symbols */ > > > + if (symp->st_shndx == SHN_ABS) > > > + continue; > > > + > > > if (txtndx == get_symindex(symp, symtab, symtab_shndx) > > > /* avoid STB_WEAK */ > > > && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
> On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > Avoid using ABS symbol, which won't be relocate, as reference. > > > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). > > > > > > > > Section Headers: > > > > [Nr] Name Type Address Off Size ES Flg Lk Inf Al > > > > [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 > > > > [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 > > > > > > > > > > A RELA section's r_info field points to the section to which it > > > applies. This is why in the example above section #65522 points to > > > section #65521. This has nothing to do with the numerical value of > > > SHN_ABS. > > > > If the r_info of RELA section is 65521(0xfff1), Oh sorry, I mean sh_info here. > > find_secsym_ndx() will use it to find the base symbol. > > > > But what does that have to do with the sh_info field of the RELA > section's Elf_Shdr struct? IOW, what is the relevance of section > #65521 here? > So what I mean is the problem occur if the sh_info of a RELA section is #65521. > > And in the symbol search loop in find_secsym_ndx(), get_symindex will > > return 0xfff1 if the symbol is in ABS section. > > > > In this case, find_secsym_ndx() will return a absolute symbol as > > base, which won't be relocate, if an ABS symbol is found before the > > real symbol in section 65521. > > > > I see your point here. > > > > > > > > find_secsym_ndx, which use r_info in rela section to find the reference sh_info. > > > > symbol, may take ABS symbol as base. > > > > > > > > Symbol table '.symtab' contains 453285 entries: > > > > Num: Value Size Type Bind Vis Ndx Name > > > > 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT ABS section_count > > > > > > > > Which cause an invalid address in __mcount_loc. > > > > > > > > > > Could you give a better account of the error you are trying to address? > > > > > > Also, arm64 no longer defines a section_count symbol (since v5.11), so > > > please make sure that the diagnostics of the issue you are addressing > > > are accurate for mainline. > > > > > > > My kernel version is 5.4.61. > > But as I explained, I suppose mainline also have this issue. > > > > Mainline is what we work on. So please base your changes (and your > commit log) on mainline. > I understand it. But the platform I can reproduce the problem is only support to 5.4 LTS now. And port it to the latest mainline kernel have much more work to do, can I just keep this commit log? Or just remove the example I posted in the commit messsage? > > > > > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> > > > > --- > > > > scripts/recordmcount.h | 4 ++++ > > > > 1 file changed, 4 insertions(+) > > > > > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > > > > index f9b19524da11..9b69167fb7ff 100644 > > > > --- a/scripts/recordmcount.h > > > > +++ b/scripts/recordmcount.h > > > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx, > > > > for (symp = sym0, t = nsym; t; --t, ++symp) { > > > > unsigned int const st_bind = ELF_ST_BIND(symp->st_info); > > > > > > > > + /* avoid absolute symbols */ > > > > + if (symp->st_shndx == SHN_ABS) > > > > + continue; > > > > + > > > > if (txtndx == get_symindex(symp, symtab, symtab_shndx) > > > > /* avoid STB_WEAK */ > > > > && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
> > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > > > Avoid using ABS symbol, which won't be relocate, as reference. > > > > > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). > > > > > > > > > > Section Headers: > > > > > [Nr] Name Type Address Off Size ES Flg Lk Inf Al > > > > > [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 > > > > > [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 > > > > > > > > > > > > > A RELA section's r_info field points to the section to which it > > > > applies. This is why in the example above section #65522 points to > > > > section #65521. This has nothing to do with the numerical value of > > > > SHN_ABS. > > > > > > If the r_info of RELA section is 65521(0xfff1), > > Oh sorry, I mean sh_info here. > > > > find_secsym_ndx() will use it to find the base symbol. > > > > > > > But what does that have to do with the sh_info field of the RELA > > section's Elf_Shdr struct? IOW, what is the relevance of section > > #65521 here? > > > > So what I mean is the problem occur if the sh_info of a RELA section > is #65521. Actually the problem occur if the sh_info of a RELA section is in the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE). Maybe I should add a is_shndx_special() to check this like scripts/mod/modpost.h did? > > > > And in the symbol search loop in find_secsym_ndx(), get_symindex will > > > return 0xfff1 if the symbol is in ABS section. > > > > > > In this case, find_secsym_ndx() will return a absolute symbol as > > > base, which won't be relocate, if an ABS symbol is found before the > > > real symbol in section 65521. > > > > > > > I see your point here. > > > > > > > > > > > find_secsym_ndx, which use r_info in rela section to find the reference > > sh_info. > > > > > > symbol, may take ABS symbol as base. > > > > > > > > > > Symbol table '.symtab' contains 453285 entries: > > > > > Num: Value Size Type Bind Vis Ndx Name > > > > > 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT ABS section_count > > > > > > > > > > Which cause an invalid address in __mcount_loc. > > > > > > > > > > > > > Could you give a better account of the error you are trying to address? > > > > > > > > Also, arm64 no longer defines a section_count symbol (since v5.11), so > > > > please make sure that the diagnostics of the issue you are addressing > > > > are accurate for mainline. > > > > > > > > > > My kernel version is 5.4.61. > > > But as I explained, I suppose mainline also have this issue. > > > > > > > Mainline is what we work on. So please base your changes (and your > > commit log) on mainline. > > > > I understand it. > But the platform I can reproduce the problem is only support to 5.4 LTS now. > And port it to the latest mainline kernel have much more work to do, can I just > keep this commit log? Or just remove the example I posted in the commit messsage? > > > > > > > > > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> > > > > > --- > > > > > scripts/recordmcount.h | 4 ++++ > > > > > 1 file changed, 4 insertions(+) > > > > > > > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > > > > > index f9b19524da11..9b69167fb7ff 100644 > > > > > --- a/scripts/recordmcount.h > > > > > +++ b/scripts/recordmcount.h > > > > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx, > > > > > for (symp = sym0, t = nsym; t; --t, ++symp) { > > > > > unsigned int const st_bind = ELF_ST_BIND(symp->st_info); > > > > > > > > > > + /* avoid absolute symbols */ > > > > > + if (symp->st_shndx == SHN_ABS) > > > > > + continue; > > > > > + > > > > > if (txtndx == get_symindex(symp, symtab, symtab_shndx) > > > > > /* avoid STB_WEAK */ > > > > > && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) { >
On Mon, 7 Jun 2021 at 10:06, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > > > > > Avoid using ABS symbol, which won't be relocate, as reference. > > > > > > > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). > > > > > > > > > > > > Section Headers: > > > > > > [Nr] Name Type Address Off Size ES Flg Lk Inf Al > > > > > > [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 > > > > > > [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 > > > > > > > > > > > > > > > > A RELA section's r_info field points to the section to which it > > > > > applies. This is why in the example above section #65522 points to > > > > > section #65521. This has nothing to do with the numerical value of > > > > > SHN_ABS. > > > > > > > > If the r_info of RELA section is 65521(0xfff1), > > > > Oh sorry, I mean sh_info here. > > > > > > find_secsym_ndx() will use it to find the base symbol. > > > > > > > > > > But what does that have to do with the sh_info field of the RELA > > > section's Elf_Shdr struct? IOW, what is the relevance of section > > > #65521 here? > > > > > > > So what I mean is the problem occur if the sh_info of a RELA section > > is #65521. > > Actually the problem occur if the sh_info of a RELA section is in > the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE). > Maybe I should add a is_shndx_special() to check this like > scripts/mod/modpost.h did? > So if I understand all of this correctly, we are running into a fundamental issue here, where the linker emits more sections than the sh_info field can describe, overflowing into the reserved range. I don't think papering over it like this is going to be maintainable going forward. > > > > > > And in the symbol search loop in find_secsym_ndx(), get_symindex will > > > > return 0xfff1 if the symbol is in ABS section. > > > > > > > > In this case, find_secsym_ndx() will return a absolute symbol as > > > > base, which won't be relocate, if an ABS symbol is found before the > > > > real symbol in section 65521. > > > > > > > > > > I see your point here. > > > > > > > > > > > > > > find_secsym_ndx, which use r_info in rela section to find the reference > > > > sh_info. > > > > > > > > symbol, may take ABS symbol as base. > > > > > > > > > > > > Symbol table '.symtab' contains 453285 entries: > > > > > > Num: Value Size Type Bind Vis Ndx Name > > > > > > 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT ABS section_count > > > > > > > > > > > > Which cause an invalid address in __mcount_loc. > > > > > > > > > > > > > > > > Could you give a better account of the error you are trying to address? > > > > > > > > > > Also, arm64 no longer defines a section_count symbol (since v5.11), so > > > > > please make sure that the diagnostics of the issue you are addressing > > > > > are accurate for mainline. > > > > > > > > > > > > > My kernel version is 5.4.61. > > > > But as I explained, I suppose mainline also have this issue. > > > > > > > > > > Mainline is what we work on. So please base your changes (and your > > > commit log) on mainline. > > > > > > > I understand it. > > But the platform I can reproduce the problem is only support to 5.4 LTS now. > > And port it to the latest mainline kernel have much more work to do, can I just > > keep this commit log? Or just remove the example I posted in the commit messsage? > > > > > > > > > > > > > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> > > > > > > --- > > > > > > scripts/recordmcount.h | 4 ++++ > > > > > > 1 file changed, 4 insertions(+) > > > > > > > > > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > > > > > > index f9b19524da11..9b69167fb7ff 100644 > > > > > > --- a/scripts/recordmcount.h > > > > > > +++ b/scripts/recordmcount.h > > > > > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx, > > > > > > for (symp = sym0, t = nsym; t; --t, ++symp) { > > > > > > unsigned int const st_bind = ELF_ST_BIND(symp->st_info); > > > > > > > > > > > > + /* avoid absolute symbols */ > > > > > > + if (symp->st_shndx == SHN_ABS) > > > > > > + continue; > > > > > > + > > > > > > if (txtndx == get_symindex(symp, symtab, symtab_shndx) > > > > > > /* avoid STB_WEAK */ > > > > > > && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) { > >
> On Mon, 7 Jun 2021 at 10:06, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > > > > > > > Avoid using ABS symbol, which won't be relocate, as reference. > > > > > > > > > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). > > > > > > > > > > > > > > Section Headers: > > > > > > > [Nr] Name Type Address Off Size ES Flg Lk Inf Al > > > > > > > [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 > > > > > > > [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 > > > > > > > > > > > > > > > > > > > A RELA section's r_info field points to the section to which it > > > > > > applies. This is why in the example above section #65522 points to > > > > > > section #65521. This has nothing to do with the numerical value of > > > > > > SHN_ABS. > > > > > > > > > > If the r_info of RELA section is 65521(0xfff1), > > > > > > Oh sorry, I mean sh_info here. > > > > > > > > find_secsym_ndx() will use it to find the base symbol. > > > > > > > > > > > > > But what does that have to do with the sh_info field of the RELA > > > > section's Elf_Shdr struct? IOW, what is the relevance of section > > > > #65521 here? > > > > > > > > > > So what I mean is the problem occur if the sh_info of a RELA section > > > is #65521. > > > > Actually the problem occur if the sh_info of a RELA section is in > > the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE). > > Maybe I should add a is_shndx_special() to check this like > > scripts/mod/modpost.h did? > > > > So if I understand all of this correctly, we are running into a > fundamental issue here, where the linker emits more sections than the > sh_info field can describe, overflowing into the reserved range. Actually the problem is about comparing st_shndx and sh_info. When the st_shndx is larger than SHN_LORESERVE, the linker mark it as SHN_XINDEX, and get_symindex() will handle it. But sh_info for a RELA section take the actual shndx which may be in the reserved range. So I suppose before compare the sh_info and st_shndx, we need to do more check in find_secsym_ndx(). > > I don't think papering over it like this is going to be maintainable > going forward. > Do you have suggestion for how to deal with this problem? > > > > > > > > > > And in the symbol search loop in find_secsym_ndx(), get_symindex will > > > > > return 0xfff1 if the symbol is in ABS section. > > > > > > > > > > In this case, find_secsym_ndx() will return a absolute symbol as > > > > > base, which won't be relocate, if an ABS symbol is found before the > > > > > real symbol in section 65521. > > > > > > > > > > > > > I see your point here. > > > > > > > > > > > > > > > > > find_secsym_ndx, which use r_info in rela section to find the reference > > > > > > sh_info. > > > > > > > > > > symbol, may take ABS symbol as base. > > > > > > > > > > > > > > Symbol table '.symtab' contains 453285 entries: > > > > > > > Num: Value Size Type Bind Vis Ndx Name > > > > > > > 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT ABS section_count > > > > > > > > > > > > > > Which cause an invalid address in __mcount_loc. > > > > > > > > > > > > > > > > > > > Could you give a better account of the error you are trying to address? > > > > > > > > > > > > Also, arm64 no longer defines a section_count symbol (since v5.11), so > > > > > > please make sure that the diagnostics of the issue you are addressing > > > > > > are accurate for mainline. > > > > > > > > > > > > > > > > My kernel version is 5.4.61. > > > > > But as I explained, I suppose mainline also have this issue. > > > > > > > > > > > > > Mainline is what we work on. So please base your changes (and your > > > > commit log) on mainline. > > > > > > > > > > I understand it. > > > But the platform I can reproduce the problem is only support to 5.4 LTS now. > > > And port it to the latest mainline kernel have much more work to do, can I just > > > keep this commit log? Or just remove the example I posted in the commit messsage? > > > > > > > > > > > > > > > > > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> > > > > > > > --- > > > > > > > scripts/recordmcount.h | 4 ++++ > > > > > > > 1 file changed, 4 insertions(+) > > > > > > > > > > > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > > > > > > > index f9b19524da11..9b69167fb7ff 100644 > > > > > > > --- a/scripts/recordmcount.h > > > > > > > +++ b/scripts/recordmcount.h > > > > > > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx, > > > > > > > for (symp = sym0, t = nsym; t; --t, ++symp) { > > > > > > > unsigned int const st_bind = ELF_ST_BIND(symp->st_info); > > > > > > > > > > > > > > + /* avoid absolute symbols */ > > > > > > > + if (symp->st_shndx == SHN_ABS) > > > > > > > + continue; > > > > > > > + > > > > > > > if (txtndx == get_symindex(symp, symtab, symtab_shndx) > > > > > > > /* avoid STB_WEAK */ > > > > > > > && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) { > > >
On Mon, Jun 07, 2021 at 11:50:40AM +0200, Ard Biesheuvel wrote: > On Mon, 7 Jun 2021 at 10:06, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > > > > > > > Avoid using ABS symbol, which won't be relocate, as reference. > > > > > > > > > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). > > > > > > > > > > > > > > Section Headers: > > > > > > > [Nr] Name Type Address Off Size ES Flg Lk Inf Al > > > > > > > [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 > > > > > > > [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 > > > > > > > > > > > > > > > > > > > A RELA section's r_info field points to the section to which it > > > > > > applies. This is why in the example above section #65522 points to > > > > > > section #65521. This has nothing to do with the numerical value of > > > > > > SHN_ABS. > > > > > > > > > > If the r_info of RELA section is 65521(0xfff1), > > > > > > Oh sorry, I mean sh_info here. > > > > > > > > find_secsym_ndx() will use it to find the base symbol. > > > > > > > > > > > > > But what does that have to do with the sh_info field of the RELA > > > > section's Elf_Shdr struct? IOW, what is the relevance of section > > > > #65521 here? > > > > > > > > > > So what I mean is the problem occur if the sh_info of a RELA section > > > is #65521. > > > > Actually the problem occur if the sh_info of a RELA section is in > > the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE). > > Maybe I should add a is_shndx_special() to check this like > > scripts/mod/modpost.h did? > > > > So if I understand all of this correctly, we are running into a > fundamental issue here, where the linker emits more sections than the > sh_info field can describe, overflowing into the reserved range. > > I don't think papering over it like this is going to be maintainable > going forward. There's an extended section header index section for just that. And recordmcount actually seems to use that as well. I can't seem to find enough of the thread to figure out what the actual problem is though. The lore archive doesn't have anything prior to this message. One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When SHN_XINDEX, then use .symtab_shndx. Apparently you've found a case where neither is true? In that case objtool seems to use shndx 0. A matching recordmcount patch would be something like this. diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h index f9b19524da11..d99cc0aed6fe 100644 --- a/scripts/recordmcount.h +++ b/scripts/recordmcount.h @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab, unsigned long offset; int index; - if (sym->st_shndx != SHN_XINDEX) + if (sym->st_shndx > SHN_UDEF && + sym->st_shndx < SHN_LORESERVE) return w2(sym->st_shndx); - offset = (unsigned long)sym - (unsigned long)symtab; - index = offset / sizeof(*sym); + if (sym->st_shndx == SHN_XINDEX) { + offset = (unsigned long)sym - (unsigned long)symtab; + index = offset / sizeof(*sym); - return w(symtab_shndx[index]); + return w(symtab_shndx[index]); + } + + return 0; } static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
> On Mon, Jun 07, 2021 at 11:50:40AM +0200, Ard Biesheuvel wrote: > > On Mon, 7 Jun 2021 at 10:06, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > > > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote: > > > > > > > > > > > > > > > > Avoid using ABS symbol, which won't be relocate, as reference. > > > > > > > > > > > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). > > > > > > > > > > > > > > > > Section Headers: > > > > > > > > [Nr] Name Type Address Off Size ES Flg Lk Inf Al > > > > > > > > [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 > > > > > > > > [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 > > > > > > > > > > > > > > > > > > > > > > A RELA section's r_info field points to the section to which it > > > > > > > applies. This is why in the example above section #65522 points to > > > > > > > section #65521. This has nothing to do with the numerical value of > > > > > > > SHN_ABS. > > > > > > > > > > > > If the r_info of RELA section is 65521(0xfff1), > > > > > > > > Oh sorry, I mean sh_info here. > > > > > > > > > > find_secsym_ndx() will use it to find the base symbol. > > > > > > > > > > > > > > > > But what does that have to do with the sh_info field of the RELA > > > > > section's Elf_Shdr struct? IOW, what is the relevance of section > > > > > #65521 here? > > > > > > > > > > > > > So what I mean is the problem occur if the sh_info of a RELA section > > > > is #65521. > > > > > > Actually the problem occur if the sh_info of a RELA section is in > > > the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE). > > > Maybe I should add a is_shndx_special() to check this like > > > scripts/mod/modpost.h did? > > > > > > > So if I understand all of this correctly, we are running into a > > fundamental issue here, where the linker emits more sections than the > > sh_info field can describe, overflowing into the reserved range. > > > > I don't think papering over it like this is going to be maintainable > > going forward. > > There's an extended section header index section for just that. And > recordmcount actually seems to use that as well. > > I can't seem to find enough of the thread to figure out what the actual > problem is though. The lore archive doesn't have anything prior to this > message. > > One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When > SHN_XINDEX, then use .symtab_shndx. > > Apparently you've found a case where neither is true? In that case Yes, that's what my mean. get_symindex returns st_shndx directly even if st_shndx is in the reserve range. So either do not use get_symindex for those symbols or do extra handling for it like the patch you provide will solve the problem. > objtool seems to use shndx 0. A matching recordmcount patch would be > something like this. > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > index f9b19524da11..d99cc0aed6fe 100644 > --- a/scripts/recordmcount.h > +++ b/scripts/recordmcount.h > @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab, > unsigned long offset; > int index; > > - if (sym->st_shndx != SHN_XINDEX) > + if (sym->st_shndx > SHN_UDEF && > + sym->st_shndx < SHN_LORESERVE) > return w2(sym->st_shndx); > > - offset = (unsigned long)sym - (unsigned long)symtab; > - index = offset / sizeof(*sym); > + if (sym->st_shndx == SHN_XINDEX) { > + offset = (unsigned long)sym - (unsigned long)symtab; > + index = offset / sizeof(*sym); > > - return w(symtab_shndx[index]); > + return w(symtab_shndx[index]); > + } > + > + return 0; > } > > static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0) Thanks for the suggestion. Skip all the symbols in the special sections seems fine because those sections should not be processed here.
On Mon, 7 Jun 2021 13:44:21 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > There's an extended section header index section for just that. And > recordmcount actually seems to use that as well. > > I can't seem to find enough of the thread to figure out what the actual > problem is though. The lore archive doesn't have anything prior to this > message. > > One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When > SHN_XINDEX, then use .symtab_shndx. > > Apparently you've found a case where neither is true? In that case > objtool seems to use shndx 0. A matching recordmcount patch would be > something like this. Mark-PK, Does the below patch fix it for you too (if you backport it to your kernel). I much rather have recordmcount match objtool, as one day the two will hopefully merge to one executable. -- Steve > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > index f9b19524da11..d99cc0aed6fe 100644 > --- a/scripts/recordmcount.h > +++ b/scripts/recordmcount.h > @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab, > unsigned long offset; > int index; > > - if (sym->st_shndx != SHN_XINDEX) > + if (sym->st_shndx > SHN_UDEF && > + sym->st_shndx < SHN_LORESERVE) > return w2(sym->st_shndx); > > - offset = (unsigned long)sym - (unsigned long)symtab; > - index = offset / sizeof(*sym); > + if (sym->st_shndx == SHN_XINDEX) { > + offset = (unsigned long)sym - (unsigned long)symtab; > + index = offset / sizeof(*sym); > > - return w(symtab_shndx[index]); > + return w(symtab_shndx[index]); > + } > + > + return 0; > } > > static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
On Mon, Jun 07, 2021 at 01:44:21PM +0200, Peter Zijlstra wrote: > One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When > SHN_XINDEX, then use .symtab_shndx. > > Apparently you've found a case where neither is true? In that case > objtool seems to use shndx 0. A matching recordmcount patch would be > something like this. > Apparently I'm consistently bad at spelling SHM_UNDEF today.. > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > index f9b19524da11..d99cc0aed6fe 100644 > --- a/scripts/recordmcount.h > +++ b/scripts/recordmcount.h > @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab, > unsigned long offset; > int index; > > - if (sym->st_shndx != SHN_XINDEX) > + if (sym->st_shndx > SHN_UDEF && > + sym->st_shndx < SHN_LORESERVE) > return w2(sym->st_shndx); > > - offset = (unsigned long)sym - (unsigned long)symtab; > - index = offset / sizeof(*sym); > + if (sym->st_shndx == SHN_XINDEX) { > + offset = (unsigned long)sym - (unsigned long)symtab; > + index = offset / sizeof(*sym); > > - return w(symtab_shndx[index]); > + return w(symtab_shndx[index]); > + } > + > + return 0; > } > > static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
> On Mon, 7 Jun 2021 13:44:21 +0200 > Peter Zijlstra <peterz@infradead.org> wrote: > > > There's an extended section header index section for just that. And > > recordmcount actually seems to use that as well. > > > > I can't seem to find enough of the thread to figure out what the actual > > problem is though. The lore archive doesn't have anything prior to this > > message. > > > > One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When > > SHN_XINDEX, then use .symtab_shndx. > > > > Apparently you've found a case where neither is true? In that case > > objtool seems to use shndx 0. A matching recordmcount patch would be > > something like this. Hi Peter, Should I resend the below patch as v2? > > Mark-PK, > > Does the below patch fix it for you too (if you backport it to your > kernel). I much rather have recordmcount match objtool, as one day the > two will hopefully merge to one executable. > > -- Steve > Yes, this patch fix it. > > > > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > > index f9b19524da11..d99cc0aed6fe 100644 > > --- a/scripts/recordmcount.h > > +++ b/scripts/recordmcount.h > > @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab, > > unsigned long offset; > > int index; > > > > - if (sym->st_shndx != SHN_XINDEX) > > + if (sym->st_shndx > SHN_UDEF && > > + sym->st_shndx < SHN_LORESERVE) > > return w2(sym->st_shndx); > > > > - offset = (unsigned long)sym - (unsigned long)symtab; > > - index = offset / sizeof(*sym); > > + if (sym->st_shndx == SHN_XINDEX) { > > + offset = (unsigned long)sym - (unsigned long)symtab; > > + index = offset / sizeof(*sym); > > > > - return w(symtab_shndx[index]); > > + return w(symtab_shndx[index]); > > + } > > + > > + return 0; > > } > > > > static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
> On Mon, Jun 07, 2021 at 01:44:21PM +0200, Peter Zijlstra wrote: > > One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When > > SHN_XINDEX, then use .symtab_shndx. > > > > Apparently you've found a case where neither is true? In that case > > objtool seems to use shndx 0. A matching recordmcount patch would be > > something like this. > > > > Apparently I'm consistently bad at spelling SHM_UNDEF today.. I test the below patch and it work for me. I only correct the UNDEF typo without any other modification. Could I push this patch or you will push it? I guess I have to add your signed-off-by. > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h > > index f9b19524da11..d99cc0aed6fe 100644 > > --- a/scripts/recordmcount.h > > +++ b/scripts/recordmcount.h > > @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab, > > unsigned long offset; > > int index; > > > > - if (sym->st_shndx != SHN_XINDEX) > > + if (sym->st_shndx > SHN_UDEF && > > + sym->st_shndx < SHN_LORESERVE) > > return w2(sym->st_shndx); > > > > - offset = (unsigned long)sym - (unsigned long)symtab; > > - index = offset / sizeof(*sym); > > + if (sym->st_shndx == SHN_XINDEX) { > > + offset = (unsigned long)sym - (unsigned long)symtab; > > + index = offset / sizeof(*sym); > > > > - return w(symtab_shndx[index]); > > + return w(symtab_shndx[index]); > > + } > > + > > + return 0; > > } > > > > static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h index f9b19524da11..9b69167fb7ff 100644 --- a/scripts/recordmcount.h +++ b/scripts/recordmcount.h @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx, for (symp = sym0, t = nsym; t; --t, ++symp) { unsigned int const st_bind = ELF_ST_BIND(symp->st_info); + /* avoid absolute symbols */ + if (symp->st_shndx == SHN_ABS) + continue; + if (txtndx == get_symindex(symp, symtab, symtab_shndx) /* avoid STB_WEAK */ && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
Avoid using ABS symbol, which won't be relocate, as reference. On arm64 platform, if there's shndx equals SHN_ABS(0xfff1). Section Headers: [Nr] Name Type Address Off Size ES Flg Lk Inf Al [65521] .text.n_tty_receive_buf PROGBITS 0000000000000000 3cdab520 000054 00 AX 0 0 4 [65522] .rela.text.n_tty_receive_buf RELA 0000000000000000 3cdab578 000030 18 I 152076 65521 8 find_secsym_ndx, which use r_info in rela section to find the reference symbol, may take ABS symbol as base. Symbol table '.symtab' contains 453285 entries: Num: Value Size Type Bind Vis Ndx Name 6: 0000000000000002 0 NOTYPE LOCAL DEFAULT ABS section_count Which cause an invalid address in __mcount_loc. Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> --- scripts/recordmcount.h | 4 ++++ 1 file changed, 4 insertions(+)