From patchwork Tue Jul 18 22:33:33 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Garnier
X-Patchwork-Id: 9850095
From: Thomas Garnier
To: Herbert Xu , "David S . Miller" , Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , Peter Zijlstra , Josh Poimboeuf , Thomas Garnier , Arnd Bergmann , Matthias Kaehlcke , Boris Ostrovsky , Juergen Gross , Paolo Bonzini , =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= , Joerg Roedel , Andy Lutomirski , Borislav Petkov , "Kirill A . Shutemov" , Brian Gerst , Borislav Petkov , Christian Borntraeger , "Rafael J . Wysocki" , Len Brown , Pavel Machek , Tejun Heo , Christoph Lameter , Kees Cook , Paul Gortmaker , Chris Metcalf , "Paul E . McKenney" , Andrew Morton , Christopher Li , Dou Liyang , Masahiro Yamada , Daniel Borkmann , Markus Trippelsdorf , Peter Foley , Steven Rostedt , Tim Chen , Ard Biesheuvel , Catalin Marinas , Matthew Wilcox , Michal Hocko , Rob Landley , Jiri Kosina , "H . J .
Lu" , Paul Bolle , Baoquan He , Daniel Micay
Date: Tue, 18 Jul 2017 15:33:33 -0700
Message-Id: <20170718223333.110371-23-thgarnie@google.com>
X-Mailer: git-send-email 2.13.2.932.g7449e964c-goog
In-Reply-To: <20170718223333.110371-1-thgarnie@google.com>
References: <20170718223333.110371-1-thgarnie@google.com>
Cc: linux-arch@vger.kernel.org, kvm@vger.kernel.org, linux-pm@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, linux-sparse@vger.kernel.org, linux-crypto@vger.kernel.org, kernel-hardening@lists.openwall.com, xen-devel@lists.xenproject.org
Subject: [Xen-devel] [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB

Add a new CONFIG_RANDOMIZE_BASE_LARGE option to benefit from PIE support. It
increases the KASLR range from 1GB to 3GB. The new range starts at
0xffffffff00000000, just above the EFI memory region. This option is off by
default.

The boot code is adapted to create the appropriate page table spanning three
PUD pages.

The relocation table uses 64-bit integers generated with the updated
relocation tool with the large-reloc option.

Signed-off-by: Thomas Garnier
---
 arch/x86/Kconfig                     | 21 +++++++++++++++++++++
 arch/x86/boot/compressed/Makefile    |  5 +++++
 arch/x86/boot/compressed/misc.c      | 10 +++++++++-
 arch/x86/include/asm/page_64_types.h |  9 +++++++++
 arch/x86/kernel/head64.c             | 18 ++++++++++++++----
 arch/x86/kernel/head_64.S            | 11 ++++++++++-
 6 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 60d161391d5a..8054eef76dfc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2096,6 +2096,27 @@ config X86_MODULE_PLTS
 	select X86_MODULE_MODEL_LARGE
 	select HAVE_MOD_ARCH_SPECIFIC
 
+config RANDOMIZE_BASE_LARGE
+	bool "Increase the randomization range of the kernel image"
+	depends on X86_64 && RANDOMIZE_BASE
+	select X86_PIE
+	select X86_MODULE_PLTS if MODULES
+	default n
+	---help---
+	  Build the kernel as a Position Independent Executable (PIE) and
+	  increase the available randomization range from 1GB to 3GB.
+
+	  This option impacts performance on kernel CPU intensive workloads up
+	  to 10% due to PIE generated code. Impact on user-mode processes and
+	  typical usage would be significantly less (0.50% when you build the
+	  kernel).
+
+	  The kernel and modules will generate slightly more assembly (1 to 2%
+	  increase on the .text sections). The vmlinux binary will be
+	  significantly smaller due to fewer relocations.
+
+	  If unsure say N
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 2c860ad4fe06..8f4317864e98 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -111,7 +111,12 @@ $(obj)/vmlinux.bin: vmlinux FORCE
 
 targets += $(patsubst $(obj)/%,%,$(vmlinux-objs-y)) vmlinux.bin.all vmlinux.relocs
 
+# Large randomization requires a bigger relocation table
+ifeq ($(CONFIG_RANDOMIZE_BASE_LARGE),y)
+CMD_RELOCS = arch/x86/tools/relocs --large-reloc
+else
 CMD_RELOCS = arch/x86/tools/relocs
+endif
 quiet_cmd_relocs = RELOCS $@
       cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $<
 $(obj)/vmlinux.relocs: vmlinux FORCE
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index a0838ab929f2..0a0c80ab1842 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -170,10 +170,18 @@ void __puthex(unsigned long value)
 }
 
 #if CONFIG_X86_NEED_RELOCS
+
+/* Large randomization goes lower than -2G and uses a large relocation table */
+#ifdef CONFIG_RANDOMIZE_BASE_LARGE
+typedef long rel_t;
+#else
+typedef int rel_t;
+#endif
+
 static void handle_relocations(void *output, unsigned long output_len,
 			       unsigned long virt_addr)
 {
-	int *reloc;
+	rel_t *reloc;
 	unsigned long delta, map, ptr;
 	unsigned long min_addr = (unsigned long)output;
 	unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 3f5f08b010d0..6b65f846dd64 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -48,7 +48,11 @@
 #define __PAGE_OFFSET		__PAGE_OFFSET_BASE
 #endif /* CONFIG_RANDOMIZE_MEMORY */
 
+#ifdef CONFIG_RANDOMIZE_BASE_LARGE
+#define __START_KERNEL_map	_AC(0xffffffff00000000, UL)
+#else
 #define __START_KERNEL_map	_AC(0xffffffff80000000, UL)
+#endif /* CONFIG_RANDOMIZE_BASE_LARGE */
 
 /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
 #ifdef CONFIG_X86_5LEVEL
@@ -65,9 +69,14 @@
  * 512MiB by default, leaving 1.5GiB for modules once the page tables
  * are fully set up. If kernel ASLR is configured, it can extend the
  * kernel page table mapping, reducing the size of the modules area.
+ * On PIE, we relocate the binary 2G lower so add this extra space.
  */
 #if defined(CONFIG_RANDOMIZE_BASE)
+#ifdef CONFIG_RANDOMIZE_BASE_LARGE
+#define KERNEL_IMAGE_SIZE	(_AC(3, UL) * 1024 * 1024 * 1024)
+#else
 #define KERNEL_IMAGE_SIZE	(1024 * 1024 * 1024)
+#endif
 #else
 #define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
 #endif
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 4103e90ff128..235c3f7b46c7 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -39,6 +39,7 @@ static unsigned int __initdata next_early_pgt;
 pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
 
 #define __head	__section(.head.text)
+#define pud_count(x)	(((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)
 
 static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
 {
@@ -54,6 +55,8 @@ unsigned long _text_offset = (unsigned long)(_text - __START_KERNEL_map);
 void __head notrace __startup_64(unsigned long physaddr)
 {
 	unsigned long load_delta, *p;
+	unsigned long level3_kernel_start, level3_kernel_count;
+	unsigned long level3_fixmap_start;
 	pgdval_t *pgd;
 	p4dval_t *p4d;
 	pudval_t *pud;
@@ -74,6 +77,11 @@ void __head notrace __startup_64(unsigned long physaddr)
 	if (load_delta & ~PMD_PAGE_MASK)
 		for (;;);
 
+	/* Look at the randomization spread to adapt page table used */
+	level3_kernel_start = pud_index(__START_KERNEL_map);
+	level3_kernel_count = pud_count(KERNEL_IMAGE_SIZE);
+	level3_fixmap_start = level3_kernel_start + level3_kernel_count;
+
 	/* Fixup the physical addresses in the page table */
 
 	pgd = fixup_pointer(&early_top_pgt, physaddr);
@@ -85,8 +93,9 @@ void __head notrace __startup_64(unsigned long physaddr)
 	}
 
 	pud = fixup_pointer(&level3_kernel_pgt, physaddr);
-	pud[510] += load_delta;
-	pud[511] += load_delta;
+	for (i = 0; i < level3_kernel_count; i++)
+		pud[level3_kernel_start + i] += load_delta;
+	pud[level3_fixmap_start] += load_delta;
 
 	pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
 	pmd[506] += load_delta;
@@ -137,7 +146,7 @@ void __head notrace __startup_64(unsigned long physaddr)
 	 */
 
 	pmd = fixup_pointer(level2_kernel_pgt, physaddr);
-	for (i = 0; i < PTRS_PER_PMD; i++) {
+	for (i = 0; i < PTRS_PER_PMD * level3_kernel_count; i++) {
 		if (pmd[i] & _PAGE_PRESENT)
 			pmd[i] += load_delta;
 	}
@@ -268,7 +277,8 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 	 */
 	BUILD_BUG_ON(MODULES_VADDR < __START_KERNEL_map);
 	BUILD_BUG_ON(MODULES_VADDR - __START_KERNEL_map < KERNEL_IMAGE_SIZE);
-	BUILD_BUG_ON(MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_RANDOMIZE_BASE_LARGE) &&
+		     MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
 	BUILD_BUG_ON((__START_KERNEL_map & ~PMD_MASK) != 0);
 	BUILD_BUG_ON((MODULES_VADDR & ~PMD_MASK) != 0);
 	BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 4d0a7e68bfe8..e8b2d6706eca 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -39,11 +39,15 @@
 
 #define p4d_index(x)	(((x) >> P4D_SHIFT) & (PTRS_PER_P4D-1))
 #define pud_index(x)	(((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+#define pud_count(x)	(((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)
 
 PGD_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE)
 PGD_START_KERNEL = pgd_index(__START_KERNEL_map)
 L3_START_KERNEL = pud_index(__START_KERNEL_map)
 
+/* Adapt page table L3 space based on range of randomization */
+L3_KERNEL_ENTRY_COUNT = pud_count(KERNEL_IMAGE_SIZE)
+
 	.text
 	__HEAD
 	.code64
@@ -396,7 +400,12 @@ NEXT_PAGE(level4_kernel_pgt)
 NEXT_PAGE(level3_kernel_pgt)
 	.fill	L3_START_KERNEL,8,0
 	/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
-	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
+	i = 0
+	.rept	L3_KERNEL_ENTRY_COUNT
+	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE \
+		+ PAGE_SIZE*i
+	i = i + 1
+	.endr
 	.quad	level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
 
 NEXT_PAGE(level2_kernel_pgt)