Message ID | 20230218211433.26859-26-rick.p.edgecombe@intel.com (mailing list archive) |
---|---|
State | New |
Series | Shadow stacks for userspace |
On Sat, Feb 18, 2023 at 01:14:17PM -0800, Rick Edgecombe wrote:
> The x86 Control-flow Enforcement Technology (CET) feature includes a new
> type of memory called shadow stack. This shadow stack memory has some
> unusual properties, which require some core mm changes to function
> properly.
>
> One of the properties is that the shadow stack pointer (SSP), which is a
> CPU register that points to the shadow stack like the stack pointer points
> to the stack, can't be pointing outside of the 32 bit address space when
> the CPU is executing in 32 bit mode. It is desirable to prevent executing
> in 32 bit mode when shadow stack is enabled because the kernel can't easily
> support 32 bit signals.
>
> On x86 it is possible to transition to 32 bit mode without any special
> interaction with the kernel, by doing a "far call" to a 32 bit segment.
> So the shadow stack implementation can use this address space behavior
> as a feature, by enforcing that shadow stack memory is always created
> outside of the 32 bit address space. This way userspace will trigger a
> general protection fault which will in turn trigger a segfault if it
> tries to transition to 32 bit mode with shadow stack enabled.
>
> This provides a clean error generating border for the user if they
> attempt to do 32 bit mode shadow stack, rather than leave the kernel in a
> half working state for userspace to be surprised by.
>
> So to allow future shadow stack enabling patches to map shadow stacks
> out of the 32 bit address space, introduce MAP_ABOVE4G. The behavior
> is pretty much like MAP_32BIT, except that it has the opposite address
> range. There are a few differences though.
>
> If both MAP_32BIT and MAP_ABOVE4G are provided, the kernel will use the
> MAP_ABOVE4G behavior. Like MAP_32BIT, MAP_ABOVE4G is ignored in a 32 bit
> syscall.

Should the interface refuse to accept both set instead?

Reviewed-by: Kees Cook <keescook@chromium.org>

On Sun, 2023-02-19 at 12:43 -0800, Kees Cook wrote:
> On Sat, Feb 18, 2023 at 01:14:17PM -0800, Rick Edgecombe wrote:
> > [...]
> > If both MAP_32BIT and MAP_ABOVE4G are provided, the kernel will use the
> > MAP_ABOVE4G behavior. Like MAP_32BIT, MAP_ABOVE4G is ignored in a 32 bit
> > syscall.
>
> Should the interface refuse to accept both set instead?

I guess that might be less surprising. But I think doing this would
require either adding logic to core mm or a new arch breakout.

I actually kind of wish there was an easy way to keep this flag from
being used from userspace and just be a kernel only thing. It is only
used internally in this series and there isn't any known use for
userspace.

>
> Reviewed-by: Kees Cook <keescook@chromium.org>

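For readers following the question above, here is a minimal user-space sketch of what "refuse to accept both set" could look like. It is not part of the series. The helper name `sketch_check_map_flags()` and the choice of `-EINVAL` are assumptions made for illustration, and the thread does not settle where such a check would live (core mm or an arch hook). Only the two flag values are taken from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Flag values as defined by the patch in arch/x86/include/uapi/asm/mman.h. */
#define SKETCH_MAP_32BIT   0x40u
#define SKETCH_MAP_ABOVE4G 0x80u

/*
 * Hypothetical helper (not in the series): reject a request that asks
 * for both address-range restrictions at once, instead of silently
 * letting one of them win. The -EINVAL return is an assumption.
 */
int sketch_check_map_flags(unsigned int flags)
{
	const unsigned int both = SKETCH_MAP_32BIT | SKETCH_MAP_ABOVE4G;

	if ((flags & both) == both)
		return -EINVAL;
	return 0;
}

/* Small self-check that exercises each flag combination. */
void sketch_check_map_flags_demo(void)
{
	assert(sketch_check_map_flags(0) == 0);
	assert(sketch_check_map_flags(SKETCH_MAP_32BIT) == 0);
	assert(sketch_check_map_flags(SKETCH_MAP_ABOVE4G) == 0);
	assert(sketch_check_map_flags(SKETCH_MAP_32BIT | SKETCH_MAP_ABOVE4G) == -EINVAL);
}
```

Calling `sketch_check_map_flags_demo()` from a `main()` runs the checks. The patch as posted does not do this; it keeps the "MAP_ABOVE4G wins" behavior described in the commit message.
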
diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h
index 775dbd3aff73..5a0256e73f1e 100644
--- a/arch/x86/include/uapi/asm/mman.h
+++ b/arch/x86/include/uapi/asm/mman.h
@@ -3,6 +3,7 @@
 #define _ASM_X86_MMAN_H
 
 #define MAP_32BIT	0x40		/* only give out 32bit addresses */
+#define MAP_ABOVE4G	0x80		/* only map above 4GB */
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
 #define arch_calc_vm_prot_bits(prot, key) (		\
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 8cc653ffdccd..06378b5682c1 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -193,7 +193,11 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
 	info.length = len;
-	info.low_limit = PAGE_SIZE;
+	if (!in_32bit_syscall() && (flags & MAP_ABOVE4G))
+		info.low_limit = 0x100000000;
+	else
+		info.low_limit = PAGE_SIZE;
+
 	info.high_limit = get_mmap_base(0);
 
 	/*
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 58b3abd457a3..32156daa985a 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -15,6 +15,9 @@
 #ifndef MAP_32BIT
 #define MAP_32BIT 0
 #endif
+#ifndef MAP_ABOVE4G
+#define MAP_ABOVE4G 0
+#endif
 #ifndef MAP_HUGE_2MB
 #define MAP_HUGE_2MB 0
 #endif
@@ -50,6 +53,7 @@
 		| MAP_STACK \
 		| MAP_HUGETLB \
 		| MAP_32BIT \
+		| MAP_ABOVE4G \
 		| MAP_HUGE_2MB \
 		| MAP_HUGE_1GB)
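
To make the sys_x86_64.c hunk easier to reason about, the following is a rough user-space model of the `info.low_limit` selection only. The names `model_low_limit()`, `MODEL_PAGE_SIZE` and `MODEL_4G` are made up for this sketch, the `in_32bit` parameter stands in for the kernel's `in_32bit_syscall()`, and a 4 KiB page size is assumed. It does not model the rest of `arch_get_unmapped_area_topdown()`, including how MAP_32BIT is handled elsewhere in that function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag value as defined by the patch in arch/x86/include/uapi/asm/mman.h. */
#define MODEL_MAP_ABOVE4G 0x80u

/* Assumption for this sketch: 4 KiB pages. */
#define MODEL_PAGE_SIZE 0x1000ull
/* The 4 GiB boundary the patch uses as the new low limit. */
#define MODEL_4G        0x100000000ull

/*
 * Hypothetical model of the low_limit choice in the hunk above.
 * in_32bit stands in for in_32bit_syscall(); like MAP_32BIT, the
 * flag is ignored when the mapping is requested via a 32 bit syscall.
 */
uint64_t model_low_limit(unsigned int flags, bool in_32bit)
{
	if (!in_32bit && (flags & MODEL_MAP_ABOVE4G))
		return MODEL_4G;
	return MODEL_PAGE_SIZE;
}

/* Small self-check covering the four flag / syscall-mode combinations. */
void model_low_limit_demo(void)
{
	assert(model_low_limit(0, false) == MODEL_PAGE_SIZE);
	assert(model_low_limit(MODEL_MAP_ABOVE4G, false) == MODEL_4G);
	assert(model_low_limit(MODEL_MAP_ABOVE4G, true) == MODEL_PAGE_SIZE);
	assert(model_low_limit(0, true) == MODEL_PAGE_SIZE);
}
```

Calling `model_low_limit_demo()` from a `main()` runs the checks. The point of the model is that the search window for a 64 bit caller passing MAP_ABOVE4G starts at 4 GiB rather than at the first page, so any address the allocator returns lies outside the 32 bit address space.
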