Message ID | 20241105145507.613981-1-frediano.ziglio@cloud.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86/boot: Fix build with LLVM toolchain | expand |
On 05.11.2024 15:55, Frediano Ziglio wrote: > This toolchain generates different object and map files. > Account for these changes. At least briefly mentioning what exactly the differences are would be quite nice, imo. > --- a/xen/tools/combine_two_binaries.py > +++ b/xen/tools/combine_two_binaries.py > @@ -67,13 +67,22 @@ if args.exports is not None: > > # Parse mapfile, look for ther symbols we want to export. > if args.mapfile is not None: > - symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') > + symbol_re_clang = \ Is "clang" really appropriate to use here? Even without the alternative being named "gnu", I'd expect "llvm" to be more to the point, as at the linking stage the original language (encoded in "clang") shouldn't matter much anymore. > + re.compile(r'\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s{15,}(\S+)\n') Just wondering: - How stable is their map file format? - Do they not have an option to write GNU-compatible map files? - Why do you declare 5 groups, when you're only after the 1st and last, which would then allow to ... > + symbol_re_gnu = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') > for line in open(args.mapfile): > - m = symbol_re.match(line) > - if not m or m.group(2) not in exports: > + name = None > + m = symbol_re_clang.match(line) > + if m: > + name = m.group(5) > + else: > + m = symbol_re_gnu.match(line) > + if m: > + name = m.group(2) ... uniformly use m.group(2) here (outside of the conditional)? Jan > + if name is None or name not in exports: > continue > addr = int(m.group(1), 16) > - exports[m.group(2)] = addr > + exports[name] = addr > for (name, addr) in exports.items(): > if addr is None: > raise Exception("Required export symbols %s not found" % name)
On Tue, Nov 5, 2024 at 3:32 PM Jan Beulich <jbeulich@suse.com> wrote: > > On 05.11.2024 15:55, Frediano Ziglio wrote: > > This toolchain generates different object and map files. > > Account for these changes. > > At least briefly mentioning what exactly the differences are would be > quite nice, imo. > What about. Object have 3 additional sections which must be handled by the linker script. Map file has 7 columns: VMA, LMA, size, alignment, output section, definition, symbols, for our symbols output section and definition are empty. > > --- a/xen/tools/combine_two_binaries.py > > +++ b/xen/tools/combine_two_binaries.py > > @@ -67,13 +67,22 @@ if args.exports is not None: > > > > # Parse mapfile, look for ther symbols we want to export. > > if args.mapfile is not None: > > - symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') > > + symbol_re_clang = \ > > Is "clang" really appropriate to use here? Even without the alternative > being named "gnu", I'd expect "llvm" to be more to the point, as at > the linking stage the original language (encoded in "clang") shouldn't > matter much anymore. > I'll change to "llvm". > > + re.compile(r'\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s{15,}(\S+)\n') > > Just wondering: > - How stable is their map file format? No idea, unfortunately, map files are not really standard. > - Do they not have an option to write GNU-compatible map files? I try to search, but I could not find such an option. > - Why do you declare 5 groups, when you're only after the 1st and last, > which would then allow to ... > > > + symbol_re_gnu = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') > > for line in open(args.mapfile): > > - m = symbol_re.match(line) > > - if not m or m.group(2) not in exports: > > + name = None > > + m = symbol_re_clang.match(line) > > + if m: > > + name = m.group(5) > > + else: > > + m = symbol_re_gnu.match(line) > > + if m: > > + name = m.group(2) > > ... uniformly use m.group(2) here (outside of the conditional)? > It could be done. The initial idea was testing that VMA and LMA fields should be identical, which gives a bit more certainty about the format used. About map file format, both formats have some headers. Would it be sensible to detect such headers? BTW, probably a mis-detection would cause symbols not to be detected. > Jan > > > + if name is None or name not in exports: > > continue > > addr = int(m.group(1), 16) > > - exports[m.group(2)] = addr > > + exports[name] = addr > > for (name, addr) in exports.items(): > > if addr is None: > > raise Exception("Required export symbols %s not found" % name) > Frediano
On 05.11.2024 17:35, Frediano Ziglio wrote: > On Tue, Nov 5, 2024 at 3:32 PM Jan Beulich <jbeulich@suse.com> wrote: >> >> On 05.11.2024 15:55, Frediano Ziglio wrote: >>> This toolchain generates different object and map files. >>> Account for these changes. >> >> At least briefly mentioning what exactly the differences are would be >> quite nice, imo. >> > > What about. > > Object have 3 additional sections which must be handled by the linker script. I expect these sections are there in both cases. The difference, I assume, is that for the GNU linker they don't need mentioning in the linker script. Maybe that's what you mean to say, but to me at least the sentence can also be interpreted differently. >>> + symbol_re_gnu = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') >>> for line in open(args.mapfile): >>> - m = symbol_re.match(line) >>> - if not m or m.group(2) not in exports: >>> + name = None >>> + m = symbol_re_clang.match(line) >>> + if m: >>> + name = m.group(5) >>> + else: >>> + m = symbol_re_gnu.match(line) >>> + if m: >>> + name = m.group(2) >> >> ... uniformly use m.group(2) here (outside of the conditional)? >> > > It could be done. The initial idea was testing that VMA and LMA fields > should be identical, which gives a bit more certainty about the format > used. > About map file format, both formats have some headers. Would it be > sensible to detect such headers? BTW, probably a mis-detection would > cause symbols not to be detected. Not sure either way, to be honest. Jan
On 05/11/2024 2:55 pm, Frediano Ziglio wrote: > diff --git a/xen/tools/combine_two_binaries.py b/xen/tools/combine_two_binaries.py > index 447c0d3bdb..79ae8900b1 100755 > --- a/xen/tools/combine_two_binaries.py > +++ b/xen/tools/combine_two_binaries.py > @@ -67,13 +67,22 @@ if args.exports is not None: > > # Parse mapfile, look for ther symbols we want to export. > if args.mapfile is not None: > - symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') > + symbol_re_clang = \ > + re.compile(r'\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s{15,}(\S+)\n') > + symbol_re_gnu = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') These are specific to the linker, not the compiler, so really should be _ld and _lld rather than _gnu and _clang. The build environment does have CONFIG_LD_IS_GNU which is used for one LD vs LLD distinction. It seems reasonable to re-use it here (so you're not trying both regexes in the hope that one works), although you'll need to plumb it into the script somehow because it's not exported into the environment currently. But, regexes are utterly opaque, and given that neither Binutils nor LLVM document their map format, you really need to provide at least one example of what a relevant mapfile looks like in a comment. Looking at built-in-32.offset.map in it's gnu form, it's presumably looking for the lines that are just a hex address and a name with no other punctuation. But the lld form is clearly different. ~Andrew
On Tue, Nov 5, 2024 at 5:06 PM Jan Beulich <jbeulich@suse.com> wrote: > > On 05.11.2024 17:35, Frediano Ziglio wrote: > > On Tue, Nov 5, 2024 at 3:32 PM Jan Beulich <jbeulich@suse.com> wrote: > >> > >> On 05.11.2024 15:55, Frediano Ziglio wrote: > >>> This toolchain generates different object and map files. > >>> Account for these changes. > >> > >> At least briefly mentioning what exactly the differences are would be > >> quite nice, imo. > >> > > > > What about. > > > > Object have 3 additional sections which must be handled by the linker script. > > I expect these sections are there in both cases. The difference, I assume, > is that for the GNU linker they don't need mentioning in the linker script. > Maybe that's what you mean to say, but to me at least the sentence can also > be interpreted differently. > Why do you expect such sections? They are used for dynamic symbols in shared objects, we don't use shared objects here. Normal object symbols are not handled by these sections. GNU compiler+linker (we link multiple objects together) do not generate these sections. So the comment looks correct to me. > >>> + symbol_re_gnu = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') > >>> for line in open(args.mapfile): > >>> - m = symbol_re.match(line) > >>> - if not m or m.group(2) not in exports: > >>> + name = None > >>> + m = symbol_re_clang.match(line) > >>> + if m: > >>> + name = m.group(5) > >>> + else: > >>> + m = symbol_re_gnu.match(line) > >>> + if m: > >>> + name = m.group(2) > >> > >> ... uniformly use m.group(2) here (outside of the conditional)? > >> > > > > It could be done. The initial idea was testing that VMA and LMA fields > > should be identical, which gives a bit more certainty about the format > > used. > > About map file format, both formats have some headers. Would it be > > sensible to detect such headers? BTW, probably a mis-detection would > > cause symbols not to be detected. > > Not sure either way, to be honest. > > Jan Mumble... I'm looking for an alternative, maybe I can avoid using map files. Frediano
On Tue Nov 5, 2024 at 7:23 PM GMT, Andrew Cooper wrote: > On 05/11/2024 2:55 pm, Frediano Ziglio wrote: > > diff --git a/xen/tools/combine_two_binaries.py b/xen/tools/combine_two_binaries.py > > index 447c0d3bdb..79ae8900b1 100755 > > --- a/xen/tools/combine_two_binaries.py > > +++ b/xen/tools/combine_two_binaries.py > > @@ -67,13 +67,22 @@ if args.exports is not None: > > > > # Parse mapfile, look for ther symbols we want to export. > > if args.mapfile is not None: > > - symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') > > + symbol_re_clang = \ > > + re.compile(r'\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s{15,}(\S+)\n') > > + symbol_re_gnu = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') > > These are specific to the linker, not the compiler, so really should be > _ld and _lld rather than _gnu and _clang. GNU binutils ships not one, but two linkers. _ld does not reflect the fact that it's not the map format of a linker, but a family of them. Probably shared by mold too. That's a good reason for _gnu to stay _gnu (or if changing, to _binutils; but that's hardly relevant in this patch). Otherwise _clang could become either _llvm or _lld. Given there's a single linker shipped by LLVM the difference is less than cosmetic. > > The build environment does have CONFIG_LD_IS_GNU which is used for one > LD vs LLD distinction. It seems reasonable to re-use it here (so you're > not trying both regexes in the hope that one works), although you'll > need to plumb it into the script somehow because it's not exported into > the environment currently. > > > But, regexes are utterly opaque, and given that neither Binutils nor > LLVM document their map format, you really need to provide at least one > example of what a relevant mapfile looks like in a comment. > > Looking at built-in-32.offset.map in it's gnu form, it's presumably > looking for the lines that are just a hex address and a name with no > other punctuation. > > But the lld form is clearly different. > > ~Andrew Cheers, Alejandro
On 06/11/2024 10:32 am, Alejandro Vallejo wrote: > On Tue Nov 5, 2024 at 7:23 PM GMT, Andrew Cooper wrote: >> On 05/11/2024 2:55 pm, Frediano Ziglio wrote: >>> diff --git a/xen/tools/combine_two_binaries.py b/xen/tools/combine_two_binaries.py >>> index 447c0d3bdb..79ae8900b1 100755 >>> --- a/xen/tools/combine_two_binaries.py >>> +++ b/xen/tools/combine_two_binaries.py >>> @@ -67,13 +67,22 @@ if args.exports is not None: >>> >>> # Parse mapfile, look for ther symbols we want to export. >>> if args.mapfile is not None: >>> - symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') >>> + symbol_re_clang = \ >>> + re.compile(r'\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s{15,}(\S+)\n') >>> + symbol_re_gnu = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') >> These are specific to the linker, not the compiler, so really should be >> _ld and _lld rather than _gnu and _clang. > GNU binutils ships not one, but two linkers. Yes that's technically true. But while only one of them is remotely capable of linking a kernel, it's also not very relevant. Neither Gold, or indeed Mold which you mention too, are liable to be LD within Xen for the foreseeable future. ~Andrew
On 06.11.2024 07:56, Frediano Ziglio wrote: > On Tue, Nov 5, 2024 at 5:06 PM Jan Beulich <jbeulich@suse.com> wrote: >> >> On 05.11.2024 17:35, Frediano Ziglio wrote: >>> On Tue, Nov 5, 2024 at 3:32 PM Jan Beulich <jbeulich@suse.com> wrote: >>>> >>>> On 05.11.2024 15:55, Frediano Ziglio wrote: >>>>> This toolchain generates different object and map files. >>>>> Account for these changes. >>>> >>>> At least briefly mentioning what exactly the differences are would be >>>> quite nice, imo. >>>> >>> >>> What about. >>> >>> Object have 3 additional sections which must be handled by the linker script. >> >> I expect these sections are there in both cases. The difference, I assume, >> is that for the GNU linker they don't need mentioning in the linker script. >> Maybe that's what you mean to say, but to me at least the sentence can also >> be interpreted differently. > > Why do you expect such sections? They are used for dynamic symbols in > shared objects, we don't use shared objects here. Normal object > symbols are not handled by these sections. GNU compiler+linker (we > link multiple objects together) do not generate these sections. So the > comment looks correct to me. About every ELF object will have .symtab and .strtab, and many also a separate .shstrtab. There's nothing "dynamic" about them. IOW - I'm confused by your reply. Jan
On Wed, Nov 6, 2024 at 10:59 AM Jan Beulich <jbeulich@suse.com> wrote: > > On 06.11.2024 07:56, Frediano Ziglio wrote: > > On Tue, Nov 5, 2024 at 5:06 PM Jan Beulich <jbeulich@suse.com> wrote: > >> > >> On 05.11.2024 17:35, Frediano Ziglio wrote: > >>> On Tue, Nov 5, 2024 at 3:32 PM Jan Beulich <jbeulich@suse.com> wrote: > >>>> > >>>> On 05.11.2024 15:55, Frediano Ziglio wrote: > >>>>> This toolchain generates different object and map files. > >>>>> Account for these changes. > >>>> > >>>> At least briefly mentioning what exactly the differences are would be > >>>> quite nice, imo. > >>>> > >>> > >>> What about. > >>> > >>> Object have 3 additional sections which must be handled by the linker script. > >> > >> I expect these sections are there in both cases. The difference, I assume, > >> is that for the GNU linker they don't need mentioning in the linker script. > >> Maybe that's what you mean to say, but to me at least the sentence can also > >> be interpreted differently. > > > > Why do you expect such sections? They are used for dynamic symbols in > > shared objects, we don't use shared objects here. Normal object > > symbols are not handled by these sections. GNU compiler+linker (we > > link multiple objects together) do not generate these sections. So the > > comment looks correct to me. > > About every ELF object will have .symtab and .strtab, and many also a > separate .shstrtab. There's nothing "dynamic" about them. IOW - I'm > confused by your reply. > > Jan I checked the object files and there are no such sections using GNU toolchain. Frediano
On 06.11.2024 12:34, Frediano Ziglio wrote: > On Wed, Nov 6, 2024 at 10:59 AM Jan Beulich <jbeulich@suse.com> wrote: >> >> On 06.11.2024 07:56, Frediano Ziglio wrote: >>> On Tue, Nov 5, 2024 at 5:06 PM Jan Beulich <jbeulich@suse.com> wrote: >>>> >>>> On 05.11.2024 17:35, Frediano Ziglio wrote: >>>>> On Tue, Nov 5, 2024 at 3:32 PM Jan Beulich <jbeulich@suse.com> wrote: >>>>>> >>>>>> On 05.11.2024 15:55, Frediano Ziglio wrote: >>>>>>> This toolchain generates different object and map files. >>>>>>> Account for these changes. >>>>>> >>>>>> At least briefly mentioning what exactly the differences are would be >>>>>> quite nice, imo. >>>>>> >>>>> >>>>> What about. >>>>> >>>>> Object have 3 additional sections which must be handled by the linker script. >>>> >>>> I expect these sections are there in both cases. The difference, I assume, >>>> is that for the GNU linker they don't need mentioning in the linker script. >>>> Maybe that's what you mean to say, but to me at least the sentence can also >>>> be interpreted differently. >>> >>> Why do you expect such sections? They are used for dynamic symbols in >>> shared objects, we don't use shared objects here. Normal object >>> symbols are not handled by these sections. GNU compiler+linker (we >>> link multiple objects together) do not generate these sections. So the >>> comment looks correct to me. >> >> About every ELF object will have .symtab and .strtab, and many also a >> separate .shstrtab. There's nothing "dynamic" about them. IOW - I'm >> confused by your reply. > > I checked the object files and there are no such sections using GNU toolchain. I think I checked every *.o that's under boot/, and they all have these three sections. Can you clarify which one(s) specifically you checked? Jan
On Wed, Nov 6, 2024 at 11:45 AM Jan Beulich <jbeulich@suse.com> wrote: > > On 06.11.2024 12:34, Frediano Ziglio wrote: > > On Wed, Nov 6, 2024 at 10:59 AM Jan Beulich <jbeulich@suse.com> wrote: > >> > >> On 06.11.2024 07:56, Frediano Ziglio wrote: > >>> On Tue, Nov 5, 2024 at 5:06 PM Jan Beulich <jbeulich@suse.com> wrote: > >>>> > >>>> On 05.11.2024 17:35, Frediano Ziglio wrote: > >>>>> On Tue, Nov 5, 2024 at 3:32 PM Jan Beulich <jbeulich@suse.com> wrote: > >>>>>> > >>>>>> On 05.11.2024 15:55, Frediano Ziglio wrote: > >>>>>>> This toolchain generates different object and map files. > >>>>>>> Account for these changes. > >>>>>> > >>>>>> At least briefly mentioning what exactly the differences are would be > >>>>>> quite nice, imo. > >>>>>> > >>>>> > >>>>> What about. > >>>>> > >>>>> Object have 3 additional sections which must be handled by the linker script. > >>>> > >>>> I expect these sections are there in both cases. The difference, I assume, > >>>> is that for the GNU linker they don't need mentioning in the linker script. > >>>> Maybe that's what you mean to say, but to me at least the sentence can also > >>>> be interpreted differently. > >>> > >>> Why do you expect such sections? They are used for dynamic symbols in > >>> shared objects, we don't use shared objects here. Normal object > >>> symbols are not handled by these sections. GNU compiler+linker (we > >>> link multiple objects together) do not generate these sections. So the > >>> comment looks correct to me. > >> > >> About every ELF object will have .symtab and .strtab, and many also a > >> separate .shstrtab. There's nothing "dynamic" about them. IOW - I'm > >> confused by your reply. > > > > I checked the object files and there are no such sections using GNU toolchain. > > I think I checked every *.o that's under boot/, and they all have these three > sections. Can you clarify which one(s) specifically you checked? > > Jan $ gcc --version gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ ld --version GNU ld (GNU Binutils for Ubuntu) 2.38 Copyright (C) 2022 Free Software Foundation, Inc. This program is free software; you may redistribute it under the terms of the GNU General Public Licence version 3 or (at your option) a later version. This program has absolutely no warranty. $ find xen/normal/ xen/pvh/ -name \*.o | xargs -ifilename sh -c 'objdump -x filename' | grep -e \\. shstrtab -e \\.strtab -e \\.symtab (xen/normal and xen/pvh are the build directory, with different configurations) I'm saying that's possibly why the linker scripts didn't need to specify these sections. Frediano
On Wed, Nov 06, 2024 at 11:58:19AM +0000, Frediano Ziglio wrote: > On Wed, Nov 6, 2024 at 11:45 AM Jan Beulich <jbeulich@suse.com> wrote: > > > > On 06.11.2024 12:34, Frediano Ziglio wrote: > > > On Wed, Nov 6, 2024 at 10:59 AM Jan Beulich <jbeulich@suse.com> wrote: > > >> > > >> On 06.11.2024 07:56, Frediano Ziglio wrote: > > >>> On Tue, Nov 5, 2024 at 5:06 PM Jan Beulich <jbeulich@suse.com> wrote: > > >>>> > > >>>> On 05.11.2024 17:35, Frediano Ziglio wrote: > > >>>>> On Tue, Nov 5, 2024 at 3:32 PM Jan Beulich <jbeulich@suse.com> wrote: > > >>>>>> > > >>>>>> On 05.11.2024 15:55, Frediano Ziglio wrote: > > >>>>>>> This toolchain generates different object and map files. > > >>>>>>> Account for these changes. > > >>>>>> > > >>>>>> At least briefly mentioning what exactly the differences are would be > > >>>>>> quite nice, imo. > > >>>>>> > > >>>>> > > >>>>> What about. > > >>>>> > > >>>>> Object have 3 additional sections which must be handled by the linker script. > > >>>> > > >>>> I expect these sections are there in both cases. The difference, I assume, > > >>>> is that for the GNU linker they don't need mentioning in the linker script. > > >>>> Maybe that's what you mean to say, but to me at least the sentence can also > > >>>> be interpreted differently. > > >>> > > >>> Why do you expect such sections? They are used for dynamic symbols in > > >>> shared objects, we don't use shared objects here. Normal object > > >>> symbols are not handled by these sections. GNU compiler+linker (we > > >>> link multiple objects together) do not generate these sections. So the > > >>> comment looks correct to me. > > >> > > >> About every ELF object will have .symtab and .strtab, and many also a > > >> separate .shstrtab. There's nothing "dynamic" about them. IOW - I'm > > >> confused by your reply. > > > > > > I checked the object files and there are no such sections using GNU toolchain. > > > > I think I checked every *.o that's under boot/, and they all have these three > > sections. Can you clarify which one(s) specifically you checked? > > > > Jan > > $ gcc --version > gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 > Copyright (C) 2021 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > $ ld --version > GNU ld (GNU Binutils for Ubuntu) 2.38 > Copyright (C) 2022 Free Software Foundation, Inc. > This program is free software; you may redistribute it under the terms of > the GNU General Public Licence version 3 or (at your option) a later version. > This program has absolutely no warranty. > > $ find xen/normal/ xen/pvh/ -name \*.o | xargs -ifilename sh -c > 'objdump -x filename' | grep -e \\. > shstrtab -e \\.strtab -e \\.symtab > > (xen/normal and xen/pvh are the build directory, with different configurations) > > I'm saying that's possibly why the linker scripts didn't need to > specify these sections. objdump -x doesn't seem to list .symtab, .strtab or .shstrtab, but readelf -S does: # readelf -SW xen/arch/x86/boot/reloc.o [...] [11] .symtab SYMTAB 00000000 0004b0 000190 10 12 21 4 [12] .strtab STRTAB 00000000 000640 000092 00 0 0 1 [13] .shstrtab STRTAB 00000000 000734 000078 00 0 0 1 # objdump -x xen/arch/x86/boot/reloc.o [...] Sections: Idx Name Size VMA LMA File off Algn 0 .group 00000008 00000000 00000000 00000034 2**2 CONTENTS, READONLY, GROUP, LINK_ONCE_DISCARD 1 .text 0000041d 00000000 00000000 00000040 2**4 CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE 2 .data 00000000 00000000 00000000 0000045d 2**0 CONTENTS, ALLOC, LOAD, DATA 3 .bss 00000000 00000000 00000000 0000045d 2**0 ALLOC 4 .rodata 00000024 00000000 00000000 00000460 2**2 CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA 5 .text.__x86.get_pc_thunk.bx 00000004 00000000 00000000 00000484 2**0 CONTENTS, ALLOC, LOAD, READONLY, CODE 6 .comment 00000028 00000000 00000000 00000488 2**0 CONTENTS, READONLY 7 .note.GNU-stack 00000000 00000000 00000000 000004b0 2**0 CONTENTS, READONLY It also seems to skip .rel sections. It doesn't seem reliable to use objdump to get ELF sections. Regards, Roger.
On 06.11.2024 12:58, Frediano Ziglio wrote: > On Wed, Nov 6, 2024 at 11:45 AM Jan Beulich <jbeulich@suse.com> wrote: >> >> On 06.11.2024 12:34, Frediano Ziglio wrote: >>> On Wed, Nov 6, 2024 at 10:59 AM Jan Beulich <jbeulich@suse.com> wrote: >>>> >>>> On 06.11.2024 07:56, Frediano Ziglio wrote: >>>>> On Tue, Nov 5, 2024 at 5:06 PM Jan Beulich <jbeulich@suse.com> wrote: >>>>>> >>>>>> On 05.11.2024 17:35, Frediano Ziglio wrote: >>>>>>> On Tue, Nov 5, 2024 at 3:32 PM Jan Beulich <jbeulich@suse.com> wrote: >>>>>>>> >>>>>>>> On 05.11.2024 15:55, Frediano Ziglio wrote: >>>>>>>>> This toolchain generates different object and map files. >>>>>>>>> Account for these changes. >>>>>>>> >>>>>>>> At least briefly mentioning what exactly the differences are would be >>>>>>>> quite nice, imo. >>>>>>>> >>>>>>> >>>>>>> What about. >>>>>>> >>>>>>> Object have 3 additional sections which must be handled by the linker script. >>>>>> >>>>>> I expect these sections are there in both cases. The difference, I assume, >>>>>> is that for the GNU linker they don't need mentioning in the linker script. >>>>>> Maybe that's what you mean to say, but to me at least the sentence can also >>>>>> be interpreted differently. >>>>> >>>>> Why do you expect such sections? They are used for dynamic symbols in >>>>> shared objects, we don't use shared objects here. Normal object >>>>> symbols are not handled by these sections. GNU compiler+linker (we >>>>> link multiple objects together) do not generate these sections. So the >>>>> comment looks correct to me. >>>> >>>> About every ELF object will have .symtab and .strtab, and many also a >>>> separate .shstrtab. There's nothing "dynamic" about them. IOW - I'm >>>> confused by your reply. >>> >>> I checked the object files and there are no such sections using GNU toolchain. >> >> I think I checked every *.o that's under boot/, and they all have these three >> sections. Can you clarify which one(s) specifically you checked? > > $ gcc --version > gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 > Copyright (C) 2021 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > $ ld --version > GNU ld (GNU Binutils for Ubuntu) 2.38 > Copyright (C) 2022 Free Software Foundation, Inc. > This program is free software; you may redistribute it under the terms of > the GNU General Public Licence version 3 or (at your option) a later version. > This program has absolutely no warranty. > > $ find xen/normal/ xen/pvh/ -name \*.o | xargs -ifilename sh -c > 'objdump -x filename' | grep -e \\. > shstrtab -e \\.strtab -e \\.symtab > > (xen/normal and xen/pvh are the build directory, with different configurations) > > I'm saying that's possibly why the linker scripts didn't need to > specify these sections. Just to mention it here as well - objdump's -x option doesn't include "control" sections. Considering the help text for -x this feels like a bug. However, as documentation has it: "Display all available header information, including the symbol table and relocation entries." the symbol table and possible relocations _are_ being displayed, just not as part of "Sections:". Jan
diff --git a/xen/arch/x86/boot/build32.lds.S b/xen/arch/x86/boot/build32.lds.S index f20fc18977..2e565180d5 100644 --- a/xen/arch/x86/boot/build32.lds.S +++ b/xen/arch/x86/boot/build32.lds.S @@ -66,6 +66,15 @@ SECTIONS *(.comment.*) *(.note.*) } + .shstrtab : { + *(.shstrtab) + } + .strtab : { + *(.strtab) + } + .symtab : { + *(.symtab) + } /* Dynamic linkage sections. Collected simply so we can check they're empty. */ .got : { *(.got) diff --git a/xen/tools/combine_two_binaries.py b/xen/tools/combine_two_binaries.py index 447c0d3bdb..79ae8900b1 100755 --- a/xen/tools/combine_two_binaries.py +++ b/xen/tools/combine_two_binaries.py @@ -67,13 +67,22 @@ if args.exports is not None: # Parse mapfile, look for ther symbols we want to export. if args.mapfile is not None: - symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') + symbol_re_clang = \ + re.compile(r'\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s{15,}(\S+)\n') + symbol_re_gnu = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n') for line in open(args.mapfile): - m = symbol_re.match(line) - if not m or m.group(2) not in exports: + name = None + m = symbol_re_clang.match(line) + if m: + name = m.group(5) + else: + m = symbol_re_gnu.match(line) + if m: + name = m.group(2) + if name is None or name not in exports: continue addr = int(m.group(1), 16) - exports[m.group(2)] = addr + exports[name] = addr for (name, addr) in exports.items(): if addr is None: raise Exception("Required export symbols %s not found" % name)
This toolchain generates different object and map files. Account for these changes. Added sections need to have special type so we put them in separate sections as linker will copy type from input sections. Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com> --- xen/arch/x86/boot/build32.lds.S | 9 +++++++++ xen/tools/combine_two_binaries.py | 17 +++++++++++++---- 2 files changed, 22 insertions(+), 4 deletions(-)