Message ID | Y5bRnRaiSOUKRjdW@p100 (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | linux-user: Add emulation for MADV_WIPEONFORK and MADV_KEEPONFORK in madvise() | expand |
On Mon, Dec 12, 2022 at 08:00:45AM +0100, Helge Deller wrote: > Both parameters have a different value on the parisc platform, so first > translate the target value into a host value for usage in the native > madvise() syscall. > > Those parameters are often used by security sensitive applications (e.g. > tor browser, boringssl, ...) which expect the call to return a proper > return code on failure, so return -EINVAL if qemu fails to forward the > syscall to the host OS. > > Tested with testcase of tor browser when running hppa-linux guest on > x86-64 host. > > Signed-off-by: Helge Deller <deller@gmx.de> > > diff --git a/linux-user/mmap.c b/linux-user/mmap.c > index 10f5079331..c75342108c 100644 > --- a/linux-user/mmap.c > +++ b/linux-user/mmap.c > @@ -901,11 +901,25 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice) > return -TARGET_EINVAL; > } > > + /* Translate for some architectures which have different MADV_xxx values */ > + switch (advice) { > + case TARGET_MADV_DONTNEED: /* alpha */ > + advice = MADV_DONTNEED; > + break; > + case TARGET_MADV_WIPEONFORK: /* parisc */ > + advice = MADV_WIPEONFORK; > + break; > + case TARGET_MADV_KEEPONFORK: /* parisc */ > + advice = MADV_KEEPONFORK; > + break; > + /* we do not care about the other MADV_xxx values yet */ > + } > + > /* > * A straight passthrough may not be safe because qemu sometimes turns > * private file-backed mappings into anonymous mappings. > * > - * This is a hint, so ignoring and returning success is ok. > + * For MADV_DONTNEED, which is a hint, ignoring and returning success is ok. Actually, MADV_DONTNEED is one of the few values, which is not always a hint - it can be used to e.g. zero out pages. As the next paragraph states, strictly speaking, MADV_DONTNEED is currently broken, because it can indeed be ignored without indication in some cases, but it's still arguably better than not honoring it at all. > * > * This breaks MADV_DONTNEED, completely implementing which is quite > * complicated. However, there is one low-hanging fruit: mappings that are > @@ -913,11 +927,17 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice) > * passthrough is safe, so do it. > */ > mmap_lock(); > - if (advice == TARGET_MADV_DONTNEED && > - can_passthrough_madv_dontneed(start, end)) { > - ret = get_errno(madvise(g2h_untagged(start), len, MADV_DONTNEED)); > - if (ret == 0) { > - page_reset_target_data(start, start + len); > + switch (advice) { > + case MADV_WIPEONFORK: > + case MADV_KEEPONFORK: > + ret = -EINVAL; > + /* fall through */ > + case MADV_DONTNEED: > + if (can_passthrough_madv_dontneed(start, end)) { > + ret = get_errno(madvise(g2h_untagged(start), len, advice)); > + if ((advice == MADV_DONTNEED) && (ret == 0)) { > + page_reset_target_data(start, start + len); > + } > } > } > mmap_unlock(); > Nit: maybe rename can_passthrough_madv_dontneed() to can_passthrough(), since now it's used not only for MADV_DONTNEED? With the MADV_DONTNEED comment change: Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
On 12/12/22 22:16, Ilya Leoshkevich wrote: > On Mon, Dec 12, 2022 at 08:00:45AM +0100, Helge Deller wrote: >> Both parameters have a different value on the parisc platform, so first >> translate the target value into a host value for usage in the native >> madvise() syscall. >> >> Those parameters are often used by security sensitive applications (e.g. >> tor browser, boringssl, ...) which expect the call to return a proper >> return code on failure, so return -EINVAL if qemu fails to forward the >> syscall to the host OS. >> >> Tested with testcase of tor browser when running hppa-linux guest on >> x86-64 host. >> >> Signed-off-by: Helge Deller <deller@gmx.de> >> >> diff --git a/linux-user/mmap.c b/linux-user/mmap.c >> index 10f5079331..c75342108c 100644 >> --- a/linux-user/mmap.c >> +++ b/linux-user/mmap.c >> @@ -901,11 +901,25 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice) >> return -TARGET_EINVAL; >> } >> >> + /* Translate for some architectures which have different MADV_xxx values */ >> + switch (advice) { >> + case TARGET_MADV_DONTNEED: /* alpha */ >> + advice = MADV_DONTNEED; >> + break; >> + case TARGET_MADV_WIPEONFORK: /* parisc */ >> + advice = MADV_WIPEONFORK; >> + break; >> + case TARGET_MADV_KEEPONFORK: /* parisc */ >> + advice = MADV_KEEPONFORK; >> + break; >> + /* we do not care about the other MADV_xxx values yet */ >> + } >> + >> /* >> * A straight passthrough may not be safe because qemu sometimes turns >> * private file-backed mappings into anonymous mappings. >> * >> - * This is a hint, so ignoring and returning success is ok. >> + * For MADV_DONTNEED, which is a hint, ignoring and returning success is ok. > > Actually, MADV_DONTNEED is one of the few values, which is not always a > hint - it can be used to e.g. zero out pages. Right, it _should_ zero out pages and return 0, or otherwise return failure. I think the problem is that some userspace apps will then sadly break if we change the current behaviour.... Anyway, in this patch I didn't wanted to touch MAD_DONTNEED. > As the next paragraph states, strictly speaking, MADV_DONTNEED is > currently broken, because it can indeed be ignored without indication > in some cases, but it's still arguably better than not honoring it at > all. Yep. >> * >> * This breaks MADV_DONTNEED, completely implementing which is quite >> * complicated. However, there is one low-hanging fruit: mappings that are >> @@ -913,11 +927,17 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice) >> * passthrough is safe, so do it. >> */ >> mmap_lock(); >> - if (advice == TARGET_MADV_DONTNEED && >> - can_passthrough_madv_dontneed(start, end)) { >> - ret = get_errno(madvise(g2h_untagged(start), len, MADV_DONTNEED)); >> - if (ret == 0) { >> - page_reset_target_data(start, start + len); >> + switch (advice) { >> + case MADV_WIPEONFORK: >> + case MADV_KEEPONFORK: >> + ret = -EINVAL; >> + /* fall through */ >> + case MADV_DONTNEED: >> + if (can_passthrough_madv_dontneed(start, end)) { >> + ret = get_errno(madvise(g2h_untagged(start), len, advice)); >> + if ((advice == MADV_DONTNEED) && (ret == 0)) { >> + page_reset_target_data(start, start + len); >> + } >> } >> } >> mmap_unlock(); >> > > Nit: maybe rename can_passthrough_madv_dontneed() to can_passthrough(), > since now it's used not only for MADV_DONTNEED? Maybe can_passthrough_madvise() is better? > With the MADV_DONTNEED comment change: Just for me to understand correctly: You propose that I shouldn't touch that comment in my followup-patch, right? That's ok. > Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Thanks! Helge
On Mon, Dec 12, 2022 at 10:49:24PM +0100, Helge Deller wrote: > On 12/12/22 22:16, Ilya Leoshkevich wrote: > > On Mon, Dec 12, 2022 at 08:00:45AM +0100, Helge Deller wrote: > > > Both parameters have a different value on the parisc platform, so first > > > translate the target value into a host value for usage in the native > > > madvise() syscall. > > > > > > Those parameters are often used by security sensitive applications (e.g. > > > tor browser, boringssl, ...) which expect the call to return a proper > > > return code on failure, so return -EINVAL if qemu fails to forward the > > > syscall to the host OS. > > > > > > Tested with testcase of tor browser when running hppa-linux guest on > > > x86-64 host. > > > > > > Signed-off-by: Helge Deller <deller@gmx.de> > > > > > > diff --git a/linux-user/mmap.c b/linux-user/mmap.c > > > index 10f5079331..c75342108c 100644 > > > --- a/linux-user/mmap.c > > > +++ b/linux-user/mmap.c > > > @@ -901,11 +901,25 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice) > > > return -TARGET_EINVAL; > > > } > > > > > > + /* Translate for some architectures which have different MADV_xxx values */ > > > + switch (advice) { > > > + case TARGET_MADV_DONTNEED: /* alpha */ > > > + advice = MADV_DONTNEED; > > > + break; > > > + case TARGET_MADV_WIPEONFORK: /* parisc */ > > > + advice = MADV_WIPEONFORK; > > > + break; > > > + case TARGET_MADV_KEEPONFORK: /* parisc */ > > > + advice = MADV_KEEPONFORK; > > > + break; > > > + /* we do not care about the other MADV_xxx values yet */ > > > + } > > > + > > > /* > > > * A straight passthrough may not be safe because qemu sometimes turns > > > * private file-backed mappings into anonymous mappings. > > > * > > > - * This is a hint, so ignoring and returning success is ok. > > > + * For MADV_DONTNEED, which is a hint, ignoring and returning success is ok. > > > > Actually, MADV_DONTNEED is one of the few values, which is not always a > > hint - it can be used to e.g. zero out pages. > > Right, it _should_ zero out pages and return 0, or otherwise return failure. > I think the problem is that some userspace apps will then sadly break if we > change the current behaviour.... > > Anyway, in this patch I didn't wanted to touch MAD_DONTNEED. > > > As the next paragraph states, strictly speaking, MADV_DONTNEED is > > currently broken, because it can indeed be ignored without indication > > in some cases, but it's still arguably better than not honoring it at > > all. > > Yep. > > > > * > > > * This breaks MADV_DONTNEED, completely implementing which is quite > > > * complicated. However, there is one low-hanging fruit: mappings that are > > > @@ -913,11 +927,17 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice) > > > * passthrough is safe, so do it. > > > */ > > > mmap_lock(); > > > - if (advice == TARGET_MADV_DONTNEED && > > > - can_passthrough_madv_dontneed(start, end)) { > > > - ret = get_errno(madvise(g2h_untagged(start), len, MADV_DONTNEED)); > > > - if (ret == 0) { > > > - page_reset_target_data(start, start + len); > > > + switch (advice) { > > > + case MADV_WIPEONFORK: > > > + case MADV_KEEPONFORK: > > > + ret = -EINVAL; > > > + /* fall through */ > > > + case MADV_DONTNEED: > > > + if (can_passthrough_madv_dontneed(start, end)) { > > > + ret = get_errno(madvise(g2h_untagged(start), len, advice)); > > > + if ((advice == MADV_DONTNEED) && (ret == 0)) { > > > + page_reset_target_data(start, start + len); > > > + } > > > } > > > } > > > mmap_unlock(); > > > > > > > Nit: maybe rename can_passthrough_madv_dontneed() to can_passthrough(), > > since now it's used not only for MADV_DONTNEED? > > Maybe can_passthrough_madvise() is better? Sounds good to me as well. The idea with PAGE_PASSTHROUGH was that we should be able to passthrough anything; but with this patch we still use it only for madvise(), and indicating it in the name makes sense. > > > With the MADV_DONTNEED comment change: > > Just for me to understand correctly: > You propose that I shouldn't touch that comment in my followup-patch, right? > That's ok. Either that, or maybe make it more precise - strictly speaking, it's not correct at the moment anyway. Maybe something like this? Most advice values are hints, so ignoring and returning success is ok. However, this would break MADV_DONTNEED, MADV_WIPEONFORK and MADV_KEEPONFORK ... > > Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> > > Thanks! > Helge
diff --git a/linux-user/mmap.c b/linux-user/mmap.c index 10f5079331..c75342108c 100644 --- a/linux-user/mmap.c +++ b/linux-user/mmap.c @@ -901,11 +901,25 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice) return -TARGET_EINVAL; } + /* Translate for some architectures which have different MADV_xxx values */ + switch (advice) { + case TARGET_MADV_DONTNEED: /* alpha */ + advice = MADV_DONTNEED; + break; + case TARGET_MADV_WIPEONFORK: /* parisc */ + advice = MADV_WIPEONFORK; + break; + case TARGET_MADV_KEEPONFORK: /* parisc */ + advice = MADV_KEEPONFORK; + break; + /* we do not care about the other MADV_xxx values yet */ + } + /* * A straight passthrough may not be safe because qemu sometimes turns * private file-backed mappings into anonymous mappings. * - * This is a hint, so ignoring and returning success is ok. + * For MADV_DONTNEED, which is a hint, ignoring and returning success is ok. * * This breaks MADV_DONTNEED, completely implementing which is quite * complicated. However, there is one low-hanging fruit: mappings that are @@ -913,11 +927,17 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice) * passthrough is safe, so do it. */ mmap_lock(); - if (advice == TARGET_MADV_DONTNEED && - can_passthrough_madv_dontneed(start, end)) { - ret = get_errno(madvise(g2h_untagged(start), len, MADV_DONTNEED)); - if (ret == 0) { - page_reset_target_data(start, start + len); + switch (advice) { + case MADV_WIPEONFORK: + case MADV_KEEPONFORK: + ret = -EINVAL; + /* fall through */ + case MADV_DONTNEED: + if (can_passthrough_madv_dontneed(start, end)) { + ret = get_errno(madvise(g2h_untagged(start), len, advice)); + if ((advice == MADV_DONTNEED) && (ret == 0)) { + page_reset_target_data(start, start + len); + } } } mmap_unlock();
Both parameters have a different value on the parisc platform, so first translate the target value into a host value for usage in the native madvise() syscall. Those parameters are often used by security sensitive applications (e.g. tor browser, boringssl, ...) which expect the call to return a proper return code on failure, so return -EINVAL if qemu fails to forward the syscall to the host OS. Tested with testcase of tor browser when running hppa-linux guest on x86-64 host. Signed-off-by: Helge Deller <deller@gmx.de>