Message ID | 1442271047-4908-5-git-send-email-palmer@dabbelt.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, Sep 14, 2015 at 03:50:38PM -0700, Palmer Dabbelt wrote:
> This used to be hidden behind CONFIG_MMAP_ALLOW_UNINITIALIZED, so
> userspace wouldn't actually ever see it be non-zero. While I could
> have kept hiding it, the man pages seem to indicate that
> MAP_UNINITIALIZED should be visible:
>
>   mmap(2)
>     MAP_UNINITIALIZED (since Linux 2.6.33)
>       Don't clear anonymous pages. This flag is intended to improve
>       performance on embedded devices. This flag is honored only if the
>       kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIALIZED
>       option. Because of the security implications, that option is
>       normally enabled only on embedded devices (i.e., devices where one
>       has complete control of the contents of user memory).
>
> and since the only time it shows up in my /usr/include is in this
> header I believe this should have been visible to userspace (as
> non-zero, which wouldn't do anything when or'd into the flags) all
> along.

Are you sure about "wouldn't do anything"?
Suspiciously, 0x4000000 is also (1 << MAP_HUGE_SHIFT). I'm not sure if any
architecture has order-1 huge pages, but still looks like we have conflict
here.

I think it's harmful to expose non-zero MAP_UNINITIALIZED to system which
potentially can handle multiple users. Or non-trivial user space in
general.

Should we leave it at least under '#ifndef CONFIG_MMU'? I don't think it's
possible to have single ABI for MMU and MMU-less systems anyway. And we
can avoid conflict with MAP_HUGE_SHIFT this way.

P.S. MAP_UNINITIALIZED itself looks very broken to me. I probably need dig
mailing list on why it was allowed. But that's other topic.
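[Annotation] The collision described above can be checked with a few lines of C. This is only a sketch: the constants are copied in under a SKETCH_ prefix (the prefix and the helper name are ours, not the kernel's) so the snippet does not depend on which headers are installed. It assumes the usual Linux encoding, where the 6 bits starting at bit 26 of the mmap() flags hold log2 of the requested huge page size (21 for 2 MB, for example).

```c
#include <assert.h>

/* Values mirrored from the Linux UAPI headers; the SKETCH_ prefix is ours. */
#define SKETCH_MAP_UNINITIALIZED 0x4000000UL
#define SKETCH_MAP_HUGE_SHIFT    26
#define SKETCH_MAP_HUGE_MASK     0x3fUL

/* Compile-time check of the observation in the message above:
 * the flag value is exactly the lowest bit of the huge page size field. */
static_assert(SKETCH_MAP_UNINITIALIZED == (1UL << SKETCH_MAP_HUGE_SHIFT),
              "MAP_UNINITIALIZED overlaps the MAP_HUGE_SHIFT field");

/* Hypothetical helper: extract the size field (log2 of the huge page size)
 * that callers encode in the upper bits of the mmap() flags. */
static unsigned long sketch_huge_order(unsigned long flags)
{
    return (flags >> SKETCH_MAP_HUGE_SHIFT) & SKETCH_MAP_HUGE_MASK;
}
```

Calling sketch_huge_order(SKETCH_MAP_UNINITIALIZED) yields 1, which is the "order-1" reading mentioned above. As far as we understand, the size field is only consulted when MAP_HUGETLB is also set, so the overlap would matter only for callers that combine the two flags.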
On Mon, 14 Sep 2015 17:23:58 PDT (-0700), kirill@shutemov.name wrote:
> On Mon, Sep 14, 2015 at 03:50:38PM -0700, Palmer Dabbelt wrote:
>> This used to be hidden behind CONFIG_MMAP_ALLOW_UNINITIALIZED, so
>> userspace wouldn't actually ever see it be non-zero. While I could
>> have kept hiding it, the man pages seem to indicate that
>> MAP_UNINITIALIZED should be visible:
>>
>>   mmap(2)
>>     MAP_UNINITIALIZED (since Linux 2.6.33)
>>       Don't clear anonymous pages. This flag is intended to improve
>>       performance on embedded devices. This flag is honored only if the
>>       kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIALIZED
>>       option. Because of the security implications, that option is
>>       normally enabled only on embedded devices (i.e., devices where one
>>       has complete control of the contents of user memory).
>>
>> and since the only time it shows up in my /usr/include is in this
>> header I believe this should have been visible to userspace (as
>> non-zero, which wouldn't do anything when or'd into the flags) all
>> along.
>
> Are you sure about "wouldn't do anything"?

That was bad writing on my part. I'd originally written something like

  "I believe this should have been visible to userspace all along",

but then added the ()'s. I meant to say:

 * I think MAP_UNINITIALIZED should have been non-zero in userspace.
 * MAP_UNINITIALIZED was zero in userspace.
 * A zero MAP_UNINITIALIZED does nothing when OR'd in.

> Suspiciously, 0x4000000 is also (1 << MAP_HUGE_SHIFT). I'm not sure if any
> architecture has order-1 huge pages, but still looks like we have conflict
> here.
>
> I think it's harmful to expose non-zero MAP_UNINITIALIZED to system which
> potentially can handle multiple users. Or non-trivial user space in
> general.

This doesn't have MAP_UNINITIALIZED do anything by default, it just
defines the flag the same way on all systems. I was under the
impression that this just happened if I set MAP_UNINITIALIZED.

Looking at MAP_HUGE_SHIFT in mmap.c, that's definitely why my mmap() test
case ignored the set MAP_UNINITIALIZED on my PC.

I'm going to make this

  #ifndef MAP_UNINITIALIZED
  #define MAP_UNINITIALIZED 0
  #endif

and then leave Xtensa's port alone. This is what Arnd suggested
originally, sorry for the extra work!

> Should we leave it at least under '#ifndef CONFIG_MMU'? I don't think it's
> possible to have single ABI for MMU and MMU-less systems anyway. And we
> can avoid conflict with MAP_HUGE_SHIFT this way.

The whole goal here was to eliminate "#ifndef CONFIG_*" from the
user-visible headers. This all started because I got bit by a very
similar-looking bug (see patch #1), so I'd prefer not to go down that
route.

> P.S. MAP_UNINITIALIZED itself looks very broken to me. I probably need dig
> mailing list on why it was allowed.
> But that's other topic.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
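[Annotation] From the userspace side, the zero-valued fallback proposed above might be used roughly as follows. This is a sketch under assumptions, not code from the patch series: sketch_alloc_scratch is a hypothetical helper, and the snippet assumes a Linux libc where MAP_ANONYMOUS is available once _DEFAULT_SOURCE is defined.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallback in the spirit of the message above: if the installed headers do
 * not provide the flag, define it to 0 so OR-ing it in is a no-op. */
#ifndef MAP_UNINITIALIZED
#define MAP_UNINITIALIZED 0
#endif

/* Hypothetical helper: map `len` bytes of private anonymous memory and tell
 * the kernel that non-zeroed pages are acceptable. Returns NULL on failure.
 * The caller must not rely on the contents either way. */
static void *sketch_alloc_scratch(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_UNINITIALIZED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

With the fallback in place, the same source builds against headers that define the flag and headers that do not. Whether the pages actually arrive uncleared still depends on the kernel configuration, as the man page text quoted above says.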
On Tue, Sep 15, 2015 at 03:23:58AM +0300, Kirill A. Shutemov wrote:
> On Mon, Sep 14, 2015 at 03:50:38PM -0700, Palmer Dabbelt wrote:
> > This used to be hidden behind CONFIG_MMAP_ALLOW_UNINITIALIZED, so
> > userspace wouldn't actually ever see it be non-zero. While I could
> > have kept hiding it, the man pages seem to indicate that
> > MAP_UNINITIALIZED should be visible:
> >
> >   mmap(2)
> >     MAP_UNINITIALIZED (since Linux 2.6.33)
> >       Don't clear anonymous pages. This flag is intended to improve
> >       performance on embedded devices. This flag is honored only if the
> >       kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIALIZED
> >       option. Because of the security implications, that option is
> >       normally enabled only on embedded devices (i.e., devices where one
> >       has complete control of the contents of user memory).
> >
> > and since the only time it shows up in my /usr/include is in this
> > header I believe this should have been visible to userspace (as
> > non-zero, which wouldn't do anything when or'd into the flags) all
> > along.
>
> Are you sure about "wouldn't do anything"?
> Suspiciously, 0x4000000 is also (1 << MAP_HUGE_SHIFT). I'm not sure if any
> architecture has order-1 huge pages, but still looks like we have conflict
> here.
>
> I think it's harmful to expose non-zero MAP_UNINITIALIZED to system which
> potentially can handle multiple users. Or non-trivial user space in
> general.

The flag should always exist. If it was defined to conflict with
something else, that's a serious ABI problem. But the flag
should always exist, even if the kernel ends up ignoring it.

> Should we leave it at least under '#ifndef CONFIG_MMU'? I don't think it's
> possible to have single ABI for MMU and MMU-less systems anyway. And we
> can avoid conflict with MAP_HUGE_SHIFT this way.

No; even if you have an MMU (which is useful for things like fork()), a
system without user separation (for instance, without CONFIG_MULTIUSER)
can reasonably use MAP_UNINITIALIZED.

> P.S. MAP_UNINITIALIZED itself looks very broken to me. I probably need dig
> mailing list on why it was allowed.

That's what the config option *and* explicit flag are for; there are
more than enough warnings about the implications.

- Josh Triplett
On Mon, Sep 14, 2015 at 10:19:19PM -0700, Josh Triplett wrote:
> On Tue, Sep 15, 2015 at 03:23:58AM +0300, Kirill A. Shutemov wrote:
> > On Mon, Sep 14, 2015 at 03:50:38PM -0700, Palmer Dabbelt wrote:
> > > This used to be hidden behind CONFIG_MMAP_ALLOW_UNINITIALIZED, so
> > > userspace wouldn't actually ever see it be non-zero. While I could
> > > have kept hiding it, the man pages seem to indicate that
> > > MAP_UNINITIALIZED should be visible:
> > >
> > >   mmap(2)
> > >     MAP_UNINITIALIZED (since Linux 2.6.33)
> > >       Don't clear anonymous pages. This flag is intended to improve
> > >       performance on embedded devices. This flag is honored only if the
> > >       kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIALIZED
> > >       option. Because of the security implications, that option is
> > >       normally enabled only on embedded devices (i.e., devices where one
> > >       has complete control of the contents of user memory).
> > >
> > > and since the only time it shows up in my /usr/include is in this
> > > header I believe this should have been visible to userspace (as
> > > non-zero, which wouldn't do anything when or'd into the flags) all
> > > along.
> >
> > Are you sure about "wouldn't do anything"?
> > Suspiciously, 0x4000000 is also (1 << MAP_HUGE_SHIFT). I'm not sure if any
> > architecture has order-1 huge pages, but still looks like we have conflict
> > here.
> >
> > I think it's harmful to expose non-zero MAP_UNINITIALIZED to system which
> > potentially can handle multiple users. Or non-trivial user space in
> > general.
>
> The flag should always exist.

Sure. And 0 is perfectly fine value for the flag. Like with MAP_FILE.

> If it was defined to conflict with
> something else, that's a serious ABI problem. But the flag
> should always exist, even if the kernel ends up ignoring it.
>
> > Should we leave it at least under '#ifndef CONFIG_MMU'? I don't think it's
> > possible to have single ABI for MMU and MMU-less systems anyway. And we
> > can avoid conflict with MAP_HUGE_SHIFT this way.
>
> No; even if you have an MMU (which is useful for things like fork()), a
> system without user separation (for instance, without CONFIG_MULTIUSER)
> can reasonably use MAP_UNINITIALIZED.

Can? Yes. Reasonably? I don't think so.

> > P.S. MAP_UNINITIALIZED itself looks very broken to me. I probably need dig
> > mailing list on why it was allowed.
>
> That's what the config option *and* explicit flag are for; there are
> more than enough warnings about the implications.

I think it's misdesigned. It doesn't require explicit opt-in from a
process who owned the page allocated in MAP_UNINITIALIZED mapping before.

	#define MAP_LEAK_ME_SOME_DATA MAP_UNINITIALIZED
On Tue, Sep 15, 2015 at 12:42:00PM +0300, Kirill A. Shutemov wrote:
> On Mon, Sep 14, 2015 at 10:19:19PM -0700, Josh Triplett wrote:
> > On Tue, Sep 15, 2015 at 03:23:58AM +0300, Kirill A. Shutemov wrote:
> > > On Mon, Sep 14, 2015 at 03:50:38PM -0700, Palmer Dabbelt wrote:
> > > > This used to be hidden behind CONFIG_MMAP_ALLOW_UNINITIALIZED, so
> > > > userspace wouldn't actually ever see it be non-zero. While I could
> > > > have kept hiding it, the man pages seem to indicate that
> > > > MAP_UNINITIALIZED should be visible:
> > > >
> > > >   mmap(2)
> > > >     MAP_UNINITIALIZED (since Linux 2.6.33)
> > > >       Don't clear anonymous pages. This flag is intended to improve
> > > >       performance on embedded devices. This flag is honored only if the
> > > >       kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIALIZED
> > > >       option. Because of the security implications, that option is
> > > >       normally enabled only on embedded devices (i.e., devices where one
> > > >       has complete control of the contents of user memory).
> > > >
> > > > and since the only time it shows up in my /usr/include is in this
> > > > header I believe this should have been visible to userspace (as
> > > > non-zero, which wouldn't do anything when or'd into the flags) all
> > > > along.
> > >
> > > Are you sure about "wouldn't do anything"?
> > > Suspiciously, 0x4000000 is also (1 << MAP_HUGE_SHIFT). I'm not sure if any
> > > architecture has order-1 huge pages, but still looks like we have conflict
> > > here.
> > >
> > > I think it's harmful to expose non-zero MAP_UNINITIALIZED to system which
> > > potentially can handle multiple users. Or non-trivial user space in
> > > general.
> >
> > The flag should always exist.
>
> Sure. And 0 is perfectly fine value for the flag. Like with MAP_FILE.

Rephrasing: the flag should always exist with the correct value.
Whether the kernel handles it or not, the kernel *headers* shouldn't
change to match the kernel, not least of which because they don't
necessarily match the running kernel. Just like we define the
prototypes for syscalls that the running kernel may return ENOSYS for.

> > If it was defined to conflict with
> > something else, that's a serious ABI problem. But the flag
> > should always exist, even if the kernel ends up ignoring it.
> >
> > > Should we leave it at least under '#ifndef CONFIG_MMU'? I don't think it's
> > > possible to have single ABI for MMU and MMU-less systems anyway. And we
> > > can avoid conflict with MAP_HUGE_SHIFT this way.
> >
> > No; even if you have an MMU (which is useful for things like fork()), a
> > system without user separation (for instance, without CONFIG_MULTIUSER)
> > can reasonably use MAP_UNINITIALIZED.
>
> Can? Yes. Reasonably? I don't think so.

Not all systems care. Otherwise you should be complaining more bitterly
about options like CONFIG_MMU=n, which (*gasp*) allow access to
*arbitrary memory*.

> > > P.S. MAP_UNINITIALIZED itself looks very broken to me. I probably need dig
> > > mailing list on why it was allowed.
> >
> > That's what the config option *and* explicit flag are for; there are
> > more than enough warnings about the implications.
>
> I think it's misdesigned. It doesn't require explicit opt-in from a
> process who owned the page allocated in MAP_UNINITIALIZED mapping before.
>
> 	#define MAP_LEAK_ME_SOME_DATA MAP_UNINITIALIZED

Hence why it has a config option. The userspace option exists primarily
because otherwise userspace might get surprised by receiving a
non-zeroed page. On a system with the config option turned on,
processes have access to arbitrary freed memory, as long as they say
they can handle not having their memory pre-zeroed.

- Josh Triplett
Josh Triplett <josh@joshtriplett.org> wrote:

> > Sure. And 0 is perfectly fine value for the flag. Like with MAP_FILE.
>
> Rephrasing: the flag should always exist with the correct value.
> Whether the kernel handles it or not, the kernel *headers* shouldn't
> change to match the kernel, not least of which because they don't
> necessarily match the running kernel. Just like we define the
> prototypes for syscalls that the running kernel may return ENOSYS for.

Josh is correct. CONFIG_xxx *should* *not* be seen in UAPI headers, except
inside #ifdef __KERNEL__ guards under special circumstances - and #ifdef
__KERNEL__ guards *should* *not* be seen in UAPI headers except under special
circumstances.

In terms of such special circumstances, take a peek in
include/uapi/linux/acct.h at struct acct with this:

	/* m68k had no padding here. */
#if !defined(CONFIG_M68K) || !defined(__KERNEL__)
	__u16		ac_ahz;			/* AHZ */
#endif

in the middle of it... Or include/{,uapi/}linux/agpgart.h where it defines
two different but same-named variants of several structs.

Now, some of these - particularly things like the latter - can be fixed by
someone who has the time.

David
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index 201aec0e0446..2cbc1e717082 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -55,11 +55,9 @@
 #define MAP_NONBLOCK	0x20000		/* do not block on IO */
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */
-#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
+#ifndef MAP_UNINITIALIZED
 # define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be
 					 * uninitialized */
-#else
-# define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
 
 /*
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index ddc3b36f1046..7aeeb12db193 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -19,10 +19,8 @@
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x20		/* don't use a file */
-#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
+#ifndef MAP_UNINITIALIZED
 # define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be uninitialized */
-#else
-# define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
 
 #define MS_ASYNC	1		/* sync memory asynchronously */
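
[Annotation] The behavioural difference between the removed and the added preprocessor logic can be reproduced outside the kernel tree. The following is a sketch only: both variants are re-created under SKETCH_ names (ours, so they cannot clash with real headers), and it assumes an ordinary userspace build in which no CONFIG_* symbol is defined on the compiler command line.

```c
#include <assert.h>

/* Old logic: userspace builds never define CONFIG_* symbols,
 * so the #else branch is always taken and the flag is 0. */
#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
# define SKETCH_OLD_MAP_UNINITIALIZED 0x4000000
#else
# define SKETCH_OLD_MAP_UNINITIALIZED 0x0
#endif

/* New logic from the patch: the value no longer depends on kernel
 * configuration, only on whether something already defined the flag. */
#ifndef SKETCH_NEW_MAP_UNINITIALIZED
# define SKETCH_NEW_MAP_UNINITIALIZED 0x4000000
#endif

/* With the old header, OR-ing the flag into other flags changes nothing. */
static_assert((0x22 | SKETCH_OLD_MAP_UNINITIALIZED) == 0x22,
              "old userspace-visible value is a no-op");

/* Small accessors so the two values can be inspected at run time. */
static unsigned long sketch_old_flag(void) { return SKETCH_OLD_MAP_UNINITIALIZED; }
static unsigned long sketch_new_flag(void) { return SKETCH_NEW_MAP_UNINITIALIZED; }
```

In such a build sketch_old_flag() returns 0 and sketch_new_flag() returns 0x4000000, which is the change in userspace-visible value that the thread above debates.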