diff mbox series

linux-user: Tolerate CONFIG_LSM_MMAP_MIN_ADDR

Message ID 20241021121820.483535-1-iii@linux.ibm.com (mailing list archive)
State New
Headers show
Series linux-user: Tolerate CONFIG_LSM_MMAP_MIN_ADDR | expand

Commit Message

Ilya Leoshkevich Oct. 21, 2024, 12:17 p.m. UTC
Running qemu-i386 on a system running with SELinux in enforcing mode
fails with:

    qemu-i386: tests/tcg/i386-linux-user/sigreturn-sigmask: Unable to find a guest_base to satisfy all guest address mapping requirements
      00000000-ffffffff

The reason is that main() determines mmap_min_addr from
/proc/sys/vm/mmap_min_addr, but SELinux additionally defines
CONFIG_LSM_MMAP_MIN_ADDR, which is normally larger: 32K or 64K, but,
in general, can be anything. There is no portable way to query its
value: /boot/config, /proc/config and /proc/config.gz are distro- and
environment-specific.

For maximum compatibility, probing is required. Use pgb_find_fallback()
for this purpose. The downside of this approach is that mmap_min_addr
remains incorrect, but there don't seem to be any practical
consequences from this. If a correct mmap_min_addr will be required in
the future, probing will need to be moved to linux-user main().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 linux-user/elfload.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Richard Henderson Oct. 22, 2024, 2:34 a.m. UTC | #1
On 10/21/24 05:17, Ilya Leoshkevich wrote:
> Running qemu-i386 on a system running with SELinux in enforcing mode
> fails with:
> 
>      qemu-i386: tests/tcg/i386-linux-user/sigreturn-sigmask: Unable to find a guest_base to satisfy all guest address mapping requirements
>        00000000-ffffffff
> 
> The reason is that main() determines mmap_min_addr from
> /proc/sys/vm/mmap_min_addr, but SELinux additionally defines
> CONFIG_LSM_MMAP_MIN_ADDR, which is normally larger: 32K or 64K, but,
> in general, can be anything. There is no portable way to query its
> value: /boot/config, /proc/config and /proc/config.gz are distro- and
> environment-specific.
> 
> For maximum compatibility, probing is required. Use pgb_find_fallback()
> for this purpose. The downside of this approach is that mmap_min_addr
> remains incorrect, but there don't seem to be any practical
> consequences from this. If a correct mmap_min_addr will be required in
> the future, probing will need to be moved to linux-user main().
> 
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>   linux-user/elfload.c | 4 ++++
>   1 file changed, 4 insertions(+)

This is

     https://gitlab.com/qemu-project/qemu/-/issues/2598

which we closed as a system configuration / kernel bug.

I'm open to working around the issue, because I can see it coming up again and again.

In pgb_find_fallback, we use a skip value of 4M or 4G, using that skip as the base at 
which to begin the search.  I think it might be better to use this as the initial start 
point in pgb_find_itree as well, rather than mmap_min_addr.

Since I never had a setup in which this triggered, would you be willing to test such a change?


r~
Ilya Leoshkevich Oct. 22, 2024, 9:50 a.m. UTC | #2
On Mon, 2024-10-21 at 19:34 -0700, Richard Henderson wrote:
> On 10/21/24 05:17, Ilya Leoshkevich wrote:
> > Running qemu-i386 on a system running with SELinux in enforcing
> > mode
> > fails with:
> > 
> >      qemu-i386: tests/tcg/i386-linux-user/sigreturn-sigmask: Unable
> > to find a guest_base to satisfy all guest address mapping
> > requirements
> >        00000000-ffffffff
> > 
> > The reason is that main() determines mmap_min_addr from
> > /proc/sys/vm/mmap_min_addr, but SELinux additionally defines
> > CONFIG_LSM_MMAP_MIN_ADDR, which is normally larger: 32K or 64K,
> > but,
> > in general, can be anything. There is no portable way to query its
> > value: /boot/config, /proc/config and /proc/config.gz are distro-
> > and
> > environment-specific.
> > 
> > For maximum compatibility, probing is required. Use
> > pgb_find_fallback()
> > for this purpose. The downside of this approach is that
> > mmap_min_addr
> > remains incorrect, but there don't seem to be any practical
> > consequences from this. If a correct mmap_min_addr will be required
> > in
> > the future, probing will need to be moved to linux-user main().
> > 
> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > ---
> >   linux-user/elfload.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> 
> This is
> 
>      https://gitlab.com/qemu-project/qemu/-/issues/2598
> 
> which we closed as a system configuration / kernel bug.
> 
> I'm open to working around the issue, because I can see it coming up
> again and again.
> 
> In pgb_find_fallback, we use a skip value of 4M or 4G, using that
> skip as the base at 
> which to begin the search.  I think it might be better to use this as
> the initial start 
> point in pgb_find_itree as well, rather than mmap_min_addr.
> 
> Since I never had a setup in which this triggered, would you be
> willing to test such a change?
> 
> 
> r~

The environment is a trixie container running on Fedora 40.

The following works, I can send a v2 if that's the preferred way to go:

--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -2898,7 +2898,7 @@ static uintptr_t pgb_try_itree(const PGBAddrs
*ga, uintptr_t base,
 static uintptr_t pgb_find_itree(const PGBAddrs *ga, IntervalTreeRoot
*root,
                                 uintptr_t align, uintptr_t brk)
 {
-    uintptr_t last = mmap_min_addr;
+    uintptr_t last = sizeof(uintptr_t) == 4 ? MiB : GiB;
     uintptr_t base, skip;
 
     while (true) {

But just for my understanding, what is wrong with the current approach?
The intention here is to fix the weird case without affecting the happy
path. It also looks natural to try the fallback once the normal
handling fails.
Richard Henderson Oct. 22, 2024, 2:33 p.m. UTC | #3
On 10/22/24 02:50, Ilya Leoshkevich wrote:
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -2898,7 +2898,7 @@ static uintptr_t pgb_try_itree(const PGBAddrs
> *ga, uintptr_t base,
>   static uintptr_t pgb_find_itree(const PGBAddrs *ga, IntervalTreeRoot
> *root,
>                                   uintptr_t align, uintptr_t brk)
>   {
> -    uintptr_t last = mmap_min_addr;
> +    uintptr_t last = sizeof(uintptr_t) == 4 ? MiB : GiB;
>       uintptr_t base, skip;
>   
>       while (true) {
> 
> But just for my understanding, what is wrong with the current approach?
> The intention here is to fix the weird case without affecting the happy
> path.


Once the identity map fails, the magnitude of guest_base does not matter.  I've always 
been a bit uncomfortable with butting right up against mmap_min_addr.


> It also looks natural to try the fallback once the normal
> handling fails.

The normal path is supposed to have complete knowledge of the address space.  There should 
be no need to fall back to blind probing.  To me that does not seem natural at all.


r~
diff mbox series

Patch

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 6cef8db3b53..f6fb937d68a 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -2968,6 +2968,10 @@  static void pgb_dynamic(const char *image_name, uintptr_t guest_loaddr,
 
         ret = pgb_find_itree(&ga, root, align, brk);
         free_self_maps(root);
+
+        if (ret == -1) {
+            ret = pgb_find_fallback(&ga, align, brk);
+        }
     }
 
     if (ret == -1) {