Message ID | 20230719155235.244478-6-deller@gmx.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [PULL,1/5] linux-user: Fix qemu brk() to not zero bytes on current page | expand |
19.07.2023 18:52, Helge Deller wrote: > qemu-user crashes immediately when running static binaries on the armhf > architecture. The problem is the memory layout where the executable is > loaded before the interpreter library, in which case the reserved brk > region clashes with the interpreter code and is released before qemu > tries to start the program. > > At load time qemu calculates a brk value for interpreter and executable > each. The fix is to choose the higher one of both. > > Signed-off-by: Helge Deller <deller@gmx.de> > Cc: Andreas Schwab <schwab@suse.de> > Cc: qemu-stable@nongnu.org > Reported-by: Venkata.Pyla@toshiba-tsip.com > Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981 > --- > linux-user/elfload.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/linux-user/elfload.c b/linux-user/elfload.c > index a26200d9f3..94951630b1 100644 > --- a/linux-user/elfload.c > +++ b/linux-user/elfload.c > @@ -3615,6 +3615,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info) > > if (elf_interpreter) { > load_elf_interp(elf_interpreter, &interp_info, bprm->buf); > + /* > + * adjust brk address if the interpreter was loaded above the main > + * executable, e.g. happens with static binaries on armhf > + */ > + if (interp_info.brk > info->brk) { > + info->brk = interp_info.brk; > + } So, this is kinda amusing. This broke arm64, ppc64el and s390x: arm64$ ./qemu-aarch64 /bin/sh -c '/bin/ls -dCFl *[t]* >/dev/null' qemu: uncaught target signal 11 (Segmentation fault) - core dumped Segmentation fault (it was just a quick test from debian qemu-user package). Reverting this patch makes it work again.. *Sigh*. (This is currently broken in debian unstable too, - this is how I noticed the breakage). Thanks, /mjt
21.07.2023 18:14, Michael Tokarev пишет: > 19.07.2023 18:52, Helge Deller wrote: >> qemu-user crashes immediately when running static binaries on the armhf >> architecture. The problem is the memory layout where the executable is >> loaded before the interpreter library, in which case the reserved brk >> region clashes with the interpreter code and is released before qemu >> tries to start the program. >> >> At load time qemu calculates a brk value for interpreter and executable >> each. The fix is to choose the higher one of both. >> >> Signed-off-by: Helge Deller <deller@gmx.de> >> Cc: Andreas Schwab <schwab@suse.de> >> Cc: qemu-stable@nongnu.org >> Reported-by: Venkata.Pyla@toshiba-tsip.com >> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981 >> --- >> linux-user/elfload.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/linux-user/elfload.c b/linux-user/elfload.c >> index a26200d9f3..94951630b1 100644 >> --- a/linux-user/elfload.c >> +++ b/linux-user/elfload.c >> @@ -3615,6 +3615,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info) >> >> if (elf_interpreter) { >> load_elf_interp(elf_interpreter, &interp_info, bprm->buf); >> + /* >> + * adjust brk address if the interpreter was loaded above the main >> + * executable, e.g. happens with static binaries on armhf >> + */ >> + if (interp_info.brk > info->brk) { >> + info->brk = interp_info.brk; >> + } I've added printf() inside this if() block, on arm64 it produces: fixing brk: interp_info.brk=0x5502875358 info=>brk=0x5500032ec0 FWIW, /mjt
On 7/21/23 17:14, Michael Tokarev wrote: > 19.07.2023 18:52, Helge Deller wrote: >> qemu-user crashes immediately when running static binaries on the armhf >> architecture. The problem is the memory layout where the executable is >> loaded before the interpreter library, in which case the reserved brk >> region clashes with the interpreter code and is released before qemu >> tries to start the program. >> >> At load time qemu calculates a brk value for interpreter and executable >> each. The fix is to choose the higher one of both. >> >> Signed-off-by: Helge Deller <deller@gmx.de> >> Cc: Andreas Schwab <schwab@suse.de> >> Cc: qemu-stable@nongnu.org >> Reported-by: Venkata.Pyla@toshiba-tsip.com >> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981 >> --- >> linux-user/elfload.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/linux-user/elfload.c b/linux-user/elfload.c >> index a26200d9f3..94951630b1 100644 >> --- a/linux-user/elfload.c >> +++ b/linux-user/elfload.c >> @@ -3615,6 +3615,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info) >> >> if (elf_interpreter) { >> load_elf_interp(elf_interpreter, &interp_info, bprm->buf); >> + /* >> + * adjust brk address if the interpreter was loaded above the main >> + * executable, e.g. happens with static binaries on armhf >> + */ >> + if (interp_info.brk > info->brk) { >> + info->brk = interp_info.brk; >> + } > > So, this is kinda amusing. > This broke arm64, ppc64el and s390x: > > arm64$ ./qemu-aarch64 /bin/sh -c '/bin/ls -dCFl *[t]* >/dev/null' > qemu: uncaught target signal 11 (Segmentation fault) - core dumped > Segmentation fault > > (it was just a quick test from debian qemu-user package). > > Reverting this patch makes it work again.. > > *Sigh*. Argh, that's really unfortunate. I just tested myself. Running static busybox binary did work for me: # ./qemu-aarch64 busybox BusyBox v1.30.1 (Debian 1:1.30.1-6+b3) multi-call binary. BusyBox is copyrighted by many authors between 1998-2015. .... I'd like to test dynamic binary as well, but I'm currently failing to set up an aarch64 chroot here. Sadly I won't have time to do any further testing until sunday evening (travelling over the weekend). Maybe someone else can try? I leave it up to Peter if he wants to revert that patch right now, or if it can wait a few days until I'm back? Helge
22.07.2023 00:37, Helge Deller wrote: .. >> So, this is kinda amusing. >> This broke arm64, ppc64el and s390x: >> >> arm64$ ./qemu-aarch64 /bin/sh -c '/bin/ls -dCFl *[t]* >/dev/null' >> qemu: uncaught target signal 11 (Segmentation fault) - core dumped >> Segmentation fault Note: I run it on native arm64, so it is arm64 on arm64, - this is a quick test I had from the debian qemu autopkgtest (which run /bin/ls under qemu and natively and compares the result). I haven't tried to reproduce it locally on amd64 host, - I did it on a debian arm64 porterbox (which I was logged on anyway to debug a different issue on armel, with qemu git tree already cloned). > Argh, that's really unfortunate. > I just tested myself. > Running static busybox binary did work for me: > # ./qemu-aarch64 busybox It didn't trigger with ls, but it did when I run something from emulated /bin/sh. This whole email was more like a heads-up/warning, to collect more details later, - and maybe someone knows what the problem is already if it is obvious. .. > Maybe someone else can try? I leave it up to Peter if he wants to revert > that patch right now, or if it can wait a few days until I'm back? For the time being, how about a quick-n-hacky band-aid, to include this fixup only where the original prob actually triggered in the first place? Like, if the target is armel? Something like #if defined(TARGET_ARM) && !defined(TARGET_AARCH64) or what's the right preprocessor condition is? Thanks, /mjt
On 7/21/23 17:14, Michael Tokarev wrote: > 19.07.2023 18:52, Helge Deller wrote: >> qemu-user crashes immediately when running static binaries on the armhf >> architecture. The problem is the memory layout where the executable is >> loaded before the interpreter library, in which case the reserved brk >> region clashes with the interpreter code and is released before qemu >> tries to start the program. >> >> At load time qemu calculates a brk value for interpreter and executable >> each. The fix is to choose the higher one of both. >> >> Signed-off-by: Helge Deller <deller@gmx.de> >> Cc: Andreas Schwab <schwab@suse.de> >> Cc: qemu-stable@nongnu.org >> Reported-by: Venkata.Pyla@toshiba-tsip.com >> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981 >> --- >> linux-user/elfload.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/linux-user/elfload.c b/linux-user/elfload.c >> index a26200d9f3..94951630b1 100644 >> --- a/linux-user/elfload.c >> +++ b/linux-user/elfload.c >> @@ -3615,6 +3615,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info) >> >> if (elf_interpreter) { >> load_elf_interp(elf_interpreter, &interp_info, bprm->buf); >> + /* >> + * adjust brk address if the interpreter was loaded above the main >> + * executable, e.g. happens with static binaries on armhf >> + */ >> + if (interp_info.brk > info->brk) { >> + info->brk = interp_info.brk; >> + } > > So, this is kinda amusing. > This broke arm64, ppc64el and s390x: > > arm64$ ./qemu-aarch64 /bin/sh -c '/bin/ls -dCFl *[t]* >/dev/null' > qemu: uncaught target signal 11 (Segmentation fault) - core dumped > Segmentation fault > > (it was just a quick test from debian qemu-user package). > > Reverting this patch makes it work again.. It seems the whole loading of binaries on aarch64 is somewhat wrong. My patch just triggers the whole thing to blow up. This is how the memory-map looks on physical hardware: deller@amdahl:~$ uname -a Linux amdahl 5.10.0-23-arm64 #1 SMP Debian 5.10.179-2 (2023-07-14) aarch64 GNU/Linux deller@amdahl:~$ cat /proc/self/maps aaaaafb70000-aaaaafb78000 r-xp 00000000 fe:02 131743 /bin/cat aaaaafb88000-aaaaafb89000 r--p 00008000 fe:02 131743 /bin/cat aaaaafb89000-aaaaafb8a000 rw-p 00009000 fe:02 131743 /bin/cat aaaae022b000-aaaae024c000 rw-p 00000000 00:00 0 [heap] ffff9ce78000-ffff9ce9a000 rw-p 00000000 00:00 0 ffff9ce9a000-ffff9cff6000 r-xp 00000000 fe:02 790863 /lib/aarch64-linux-gnu/libc-2.31.so ffff9cff6000-ffff9d005000 ---p 0015c000 fe:02 790863 /lib/aarch64-linux-gnu/libc-2.31.so ffff9d005000-ffff9d009000 r--p 0015b000 fe:02 790863 /lib/aarch64-linux-gnu/libc-2.31.so ffff9d009000-ffff9d00b000 rw-p 0015f000 fe:02 790863 /lib/aarch64-linux-gnu/libc-2.31.so ffff9d00b000-ffff9d00e000 rw-p 00000000 00:00 0 ffff9d00e000-ffff9d02f000 r-xp 00000000 fe:02 790820 /lib/aarch64-linux-gnu/ld-2.31.so ffff9d033000-ffff9d035000 rw-p 00000000 00:00 0 ffff9d03c000-ffff9d03e000 r--p 00000000 00:00 0 [vvar] ffff9d03e000-ffff9d03f000 r-xp 00000000 00:00 0 [vdso] ffff9d03f000-ffff9d040000 r--p 00021000 fe:02 790820 /lib/aarch64-linux-gnu/ld-2.31.so ffff9d040000-ffff9d042000 rw-p 00022000 fe:02 790820 /lib/aarch64-linux-gnu/ld-2.31.so ffffe9ea3000-ffffe9ec4000 rw-p 00000000 00:00 0 [stack] this is on qemu-linux-user-aarch64: I have no name!@paq:/# cat /proc/self/maps 5500000000-5500009000 r-xp 00000000 08:01 2380521 /usr/bin/cat 5500009000-550001f000 ---p 00000000 00:00 0 550001f000-5500020000 r--p 0000f000 08:01 2380521 /usr/bin/cat 5500020000-5500021000 rw-p 00010000 08:01 2380521 /usr/bin/cat 5500021000-5500042000 rw-p 00000000 00:00 0 5502021000-5502022000 ---p 00000000 00:00 0 5502022000-5502822000 rw-p 00000000 00:00 0 [stack] 5502822000-5502848000 r-xp 00000000 08:01 2382563 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1 5502848000-5502860000 ---p 00000000 00:00 0 5502860000-5502862000 r--p 0002e000 08:01 2382563 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1 5502862000-5502864000 rw-p 00030000 08:01 2382563 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1 5502864000-5502865000 r-xp 00000000 00:00 0 5502865000-5502867000 rw-p 00000000 00:00 0 5502870000-55029f7000 r-xp 00000000 08:01 2382566 /usr/lib/aarch64-linux-gnu/libc.so.6 55029f7000-5502a0d000 ---p 00187000 08:01 2382566 /usr/lib/aarch64-linux-gnu/libc.so.6 5502a0d000-5502a10000 r--p 0018d000 08:01 2382566 /usr/lib/aarch64-linux-gnu/libc.so.6 5502a10000-5502a12000 rw-p 00190000 08:01 2382566 /usr/lib/aarch64-linux-gnu/libc.so.6 5502a12000-5502a1f000 rw-p 00000000 00:00 0 Here I got: interp_info.brk 0x5502863358 info->brk 0x5500020458 diff 0x2842f00 40 MB I think we need to make sure that the shared libs (ld, glibc) gets loaded at the top of the memory to free up heap space. Right now (without my patch) there were 40MB heap, with my patch if clashed with ld.so. Helge
diff --git a/linux-user/elfload.c b/linux-user/elfload.c index a26200d9f3..94951630b1 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -3615,6 +3615,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info) if (elf_interpreter) { load_elf_interp(elf_interpreter, &interp_info, bprm->buf); + /* + * adjust brk address if the interpreter was loaded above the main + * executable, e.g. happens with static binaries on armhf + */ + if (interp_info.brk > info->brk) { + info->brk = interp_info.brk; + } /* If the program interpreter is one of these two, then assume an iBCS2 image. Otherwise assume a native linux image. */
qemu-user crashes immediately when running static binaries on the armhf architecture. The problem is the memory layout where the executable is loaded before the interpreter library, in which case the reserved brk region clashes with the interpreter code and is released before qemu tries to start the program. At load time qemu calculates a brk value for interpreter and executable each. The fix is to choose the higher one of both. Signed-off-by: Helge Deller <deller@gmx.de> Cc: Andreas Schwab <schwab@suse.de> Cc: qemu-stable@nongnu.org Reported-by: Venkata.Pyla@toshiba-tsip.com Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981 --- linux-user/elfload.c | 7 +++++++ 1 file changed, 7 insertions(+) -- 2.41.0