diff mbox series

[v2,2/3] x86/boot: Uses nm command instead of map file to get symbols

Message ID 20241106114150.1432512-3-frediano.ziglio@cloud.com (mailing list archive)
State Superseded
Headers show
Series x86/boot: Fix build with LLVM toolchain | expand

Commit Message

Frediano Ziglio Nov. 6, 2024, 11:41 a.m. UTC
Map file format is not standard making it code readind it
not portable and potentially hard to maintain.
Use nm command instead to get list of symbols; specifically
BSD format as it does not truncate symbols names like sysv one.

Fixes: aa9045e77130 ('x86/boot: Rework how 32bit C is linked/included for early boot')

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
---
 xen/arch/x86/boot/Makefile        |  5 +++--
 xen/tools/combine_two_binaries.py | 28 ++++++++++++++++++----------
 2 files changed, 21 insertions(+), 12 deletions(-)

Comments

Andrew Cooper Nov. 6, 2024, 12:10 p.m. UTC | #1
On 06/11/2024 11:41 am, Frediano Ziglio wrote:
> Map file format is not standard making it code readind it
> not portable and potentially hard to maintain.

I think you want to include a sentence along the lines of
"combine_two_binaries.py only understands GNU LD's format, and does not
work with LLVM's LLD."

That makes it more clear why this this gets a Fixes tag.

I'd also suggest having the following sentence in separate paragraph for
clarity.

> Use nm command instead to get list of symbols; specifically
> BSD format as it does not truncate symbols names like sysv one.
>
> Fixes: aa9045e77130 ('x86/boot: Rework how 32bit C is linked/included for early boot')
>
> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
> ---
>  xen/arch/x86/boot/Makefile        |  5 +++--
>  xen/tools/combine_two_binaries.py | 28 ++++++++++++++++++----------
>  2 files changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
> index 777b4befeb..01100a4b72 100644
> --- a/xen/arch/x86/boot/Makefile
> +++ b/xen/arch/x86/boot/Makefile
> @@ -67,7 +67,8 @@ $(obj)/built-in-32.tmp.o: $(obj32)
>  # If possible we use --orphan-handling=error option to make sure we account
>  # for all possible sections from C code.
>  $(obj)/built-in-32.%.bin: $(obj)/build32.%.lds $(obj)/built-in-32.tmp.o
> -	$(LD32) $(orphan-handling-y) -N -T $< -Map $(@:bin=map) -o $(@:bin=o) $(filter %.o,$^)
> +	$(LD32) $(orphan-handling-y) -N -T $< -o $(@:bin=o) $(filter %.o,$^)
> +	$(NM) -p --format=bsd $(@:bin=o) > $(@:bin=nm)
>  	$(OBJCOPY) -j .text -O binary $(@:bin=o) $@
>  	rm -f $(@:bin=o)
>  
> @@ -79,7 +80,7 @@ cmd_combine = \
>                --script    $(obj)/build32.base.lds \
>                --bin1      $(obj)/built-in-32.base.bin \
>                --bin2      $(obj)/built-in-32.offset.bin \
> -              --map       $(obj)/built-in-32.base.map \
> +              --symbols   $(obj)/built-in-32.base.nm \
>                --exports   cmdline_parse_early,reloc,reloc_trampoline32 \
>                --output    $@
>  
> diff --git a/xen/tools/combine_two_binaries.py b/xen/tools/combine_two_binaries.py
> index 447c0d3bdb..db02494b28 100755
> --- a/xen/tools/combine_two_binaries.py
> +++ b/xen/tools/combine_two_binaries.py
> @@ -28,8 +28,8 @@ parser.add_argument('--text-diff', dest='text_diff',
>                      help='Difference between code section start')
>  parser.add_argument('--output', dest='output',
>                      help='Output file')
> -parser.add_argument('--map', dest='mapfile',
> -                    help='Map file to read for symbols to export')
> +parser.add_argument('--symbols', dest='symbols_file',
> +                    help='Nm command output to read for symbols to export')

We call the output of $(NM) uniformly .map elsewhere in Xen, even the
top level System.map

I'd suggest retaining the .map extension, and --map argument, and you
can probably just say help='Map file (NM) to ...' for the help text to
make it explicit.

That in turn reduces the churn ...


>  parser.add_argument('--exports', dest='exports',
>                      help='Symbols to export')
>  parser.add_argument('--section-header', dest='section_header',
> @@ -65,15 +65,23 @@ exports = []
>  if args.exports is not None:
>      exports = dict([(name, None) for name in args.exports.split(',')])
>  
> -# Parse mapfile, look for ther symbols we want to export.
> -if args.mapfile is not None:
> -    symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n')
> -    for line in open(args.mapfile):
> -        m = symbol_re.match(line)
> -        if not m or m.group(2) not in exports:
> +# Parse symbols file, look for symbols we want to export.
> +if args.symbols_file is not None:

... here.

> +    dummy_start = -1
> +    for line in open(args.symbols_file):
> +        v = line.split()
> +        if len(v) != 3 or v[1].upper() != 'T':
>              continue

A slightly nicer way of doing this is:

    parts = line.split()

    if len(parts) != 3:
        continue

    addr, type, sym = parts

which means you have more legible code blow.


> -        addr = int(m.group(1), 16)
> -        exports[m.group(2)] = addr
> +        addr = int(v[0], 16)
> +        if v[2] == 'dummy_start':
> +            dummy_start = addr
> +            continue
> +        if v[2] not in exports:
> +            continue
> +        exports[v[2]] = addr
> +    if dummy_start != 0:
> +        raise Exception("dummy_start symbol expected to be present and 0")
> +
>  for (name, addr) in exports.items():
>      if addr is None:
>          raise Exception("Required export symbols %s not found" % name)

Something to consider.  Instead of special casing dummy_start in several
ways, you could, insert it into exports to begin with, then check

    if exports["dummy_start"] != 0:
        raise Exception("dummy_start symbol expected to be present and 0")
    del exports["dummy_start"]

after which you're back to just the real --exports in exports[].


All of this said, it definitely looks like a much more robust solution
to the problem.

~Andrew
diff mbox series

Patch

diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
index 777b4befeb..01100a4b72 100644
--- a/xen/arch/x86/boot/Makefile
+++ b/xen/arch/x86/boot/Makefile
@@ -67,7 +67,8 @@  $(obj)/built-in-32.tmp.o: $(obj32)
 # If possible we use --orphan-handling=error option to make sure we account
 # for all possible sections from C code.
 $(obj)/built-in-32.%.bin: $(obj)/build32.%.lds $(obj)/built-in-32.tmp.o
-	$(LD32) $(orphan-handling-y) -N -T $< -Map $(@:bin=map) -o $(@:bin=o) $(filter %.o,$^)
+	$(LD32) $(orphan-handling-y) -N -T $< -o $(@:bin=o) $(filter %.o,$^)
+	$(NM) -p --format=bsd $(@:bin=o) > $(@:bin=nm)
 	$(OBJCOPY) -j .text -O binary $(@:bin=o) $@
 	rm -f $(@:bin=o)
 
@@ -79,7 +80,7 @@  cmd_combine = \
               --script    $(obj)/build32.base.lds \
               --bin1      $(obj)/built-in-32.base.bin \
               --bin2      $(obj)/built-in-32.offset.bin \
-              --map       $(obj)/built-in-32.base.map \
+              --symbols   $(obj)/built-in-32.base.nm \
               --exports   cmdline_parse_early,reloc,reloc_trampoline32 \
               --output    $@
 
diff --git a/xen/tools/combine_two_binaries.py b/xen/tools/combine_two_binaries.py
index 447c0d3bdb..db02494b28 100755
--- a/xen/tools/combine_two_binaries.py
+++ b/xen/tools/combine_two_binaries.py
@@ -28,8 +28,8 @@  parser.add_argument('--text-diff', dest='text_diff',
                     help='Difference between code section start')
 parser.add_argument('--output', dest='output',
                     help='Output file')
-parser.add_argument('--map', dest='mapfile',
-                    help='Map file to read for symbols to export')
+parser.add_argument('--symbols', dest='symbols_file',
+                    help='Nm command output to read for symbols to export')
 parser.add_argument('--exports', dest='exports',
                     help='Symbols to export')
 parser.add_argument('--section-header', dest='section_header',
@@ -65,15 +65,23 @@  exports = []
 if args.exports is not None:
     exports = dict([(name, None) for name in args.exports.split(',')])
 
-# Parse mapfile, look for ther symbols we want to export.
-if args.mapfile is not None:
-    symbol_re = re.compile(r'\s{15,}0x([0-9a-f]+)\s+(\S+)\n')
-    for line in open(args.mapfile):
-        m = symbol_re.match(line)
-        if not m or m.group(2) not in exports:
+# Parse symbols file, look for symbols we want to export.
+if args.symbols_file is not None:
+    dummy_start = -1
+    for line in open(args.symbols_file):
+        v = line.split()
+        if len(v) != 3 or v[1].upper() != 'T':
             continue
-        addr = int(m.group(1), 16)
-        exports[m.group(2)] = addr
+        addr = int(v[0], 16)
+        if v[2] == 'dummy_start':
+            dummy_start = addr
+            continue
+        if v[2] not in exports:
+            continue
+        exports[v[2]] = addr
+    if dummy_start != 0:
+        raise Exception("dummy_start symbol expected to be present and 0")
+
 for (name, addr) in exports.items():
     if addr is None:
         raise Exception("Required export symbols %s not found" % name)