From patchwork Wed Oct 11 20:30:27 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Garnier X-Patchwork-Id: 10000573 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 3DF096037F for ; Wed, 11 Oct 2017 20:33:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3221C28B66 for ; Wed, 11 Oct 2017 20:33:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2617428B6D; Wed, 11 Oct 2017 20:33:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.6 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, RCVD_IN_DNSWL_MED, RCVD_IN_SORBS_SPAM, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 710B728B66 for ; Wed, 11 Oct 2017 20:33:40 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1e2NfX-0001JU-4s; Wed, 11 Oct 2017 20:31:55 +0000 Received: from mail6.bemta6.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1e2NfW-0001Gb-Bo for xen-devel@lists.xenproject.org; Wed, 11 Oct 2017 20:31:54 +0000 Received: from [193.109.254.147] (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256 bits)) by server-2.bemta-6.messagelabs.com id 92/2C-02373-9BF7ED95; Wed, 11 Oct 2017 20:31:53 +0000 X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFtrPIsWRWlGSWpSXmKPExsVyMfTAWt0d9fc iDdYf0rL4vmUykwOjx+EPV1gCGKNYM/OS8isSWDP+TvrMVrDHtuLVLqMGxrkmXYxcHEICMxkl vr/pYwZxWAResUgc7ljGBuJICPSzSuz+/AHI4QRysiTmNh+GstMkls6dxQRhV0j8mvMDLC4ko CSxdcNSZoix/xglug5uZgRJsAloSexpmM8EkhAROCEssenDb0YQh1ngDJPE8b6TrCBVwgKhEh dXfQPrYBFQlbjZ1ww2llfAUuLtkT3MEOssJI4dO8wCYnMCxb9vPMQMsdpCYuWs86wTGAUXMDK sYlQvTi0qSy3SNdFLKspMzyjJTczM0TU0MNPLTS0uTkxPzUlMKtZLzs/dxAgMOgYg2MHYfdn/ EKMkB5OSKO/DmHuRQnxJ+SmVGYnFGfFFpTmpxYcYZTg4lCR4jYFBLCRYlJqeWpGWmQMMf5i0B AePkghvUx1Qmre4IDG3ODMdInWK0Z7jwp1Lf5g4Duy5BSQf3bgLJDtuAkkhlrz8vFQpcd6HIG 0CIG0ZpXlwQ2HxeolRVkqYlxHoTCGegtSi3MwSVPlXjOIcjErCvIdBpvBk5pXA7X4FdBYT0Fm iaXdAzipJREhJNTAWffir+2EJj0fNzaf35UUnLP5ZvPeMpRJna5/1Sc9n1jn/Qn/7OL99PJ9F IsR257Jv/DMOfJjpsMjvWSjjK5d9P//u37boS9EzJrWIO/sjf19U833/tnZ2nofY6tPBf5u5N /55HhD4S5hZd4vqnW28d1RUjVa7LPW7t/i3Q0hM5LZzFydMf+z/X4mlOCPRUIu5qDgRAPg+Z3 7SAgAA X-Env-Sender: thgarnie@google.com X-Msg-Ref: server-4.tower-27.messagelabs.com!1507753911!110447266!1 X-Originating-IP: [209.85.192.173] X-SpamReason: No, hits=0.0 required=7.0 tests= X-StarScan-Received: X-StarScan-Version: 9.4.45; banners=-,-,- X-VirusChecked: Checked Received: (qmail 22367 invoked from network); 11 Oct 2017 20:31:52 -0000 Received: from mail-pf0-f173.google.com (HELO mail-pf0-f173.google.com) (209.85.192.173) by server-4.tower-27.messagelabs.com with AES128-GCM-SHA256 encrypted SMTP; 11 Oct 2017 20:31:52 -0000 Received: by mail-pf0-f173.google.com with SMTP id m28so1939019pfi.11 for ; Wed, 11 Oct 2017 13:31:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TIBXddbB7brqoudFKH/Kj0d1+0lIKo3bYHIzsJ8CRhs=; b=UOxRjGgZmj1yNIMKRH43QhrovaWSr6mAlNFwAuqQHAwF5Dfp8AL1yx5fqjDsFArVw5 8pdcFb560MnzDS0SQozXQeL4JuDI44PgQiuc5c6MB4LPTHN9KMbpphjHpBG/6PpY9rrQ 0IIp9J7/M9s3lgdKqWa7CLvvDmKRCE9ygqS+b54AkUvKnRZILlcRoXKroZtpL4F40Nqx O8QvhoE6u6NJD9yUL4ZXig6zsGoklDWbmy1RpJnRX0CIJT5lbZO2DP9OD20DZsSlkqYt yB9Tvazr3lkvcYQhI7gYceki+EOgR90tmxJfEeRsn3M0fKLCJHXchZ/S9lZf5QKepDzt VVvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TIBXddbB7brqoudFKH/Kj0d1+0lIKo3bYHIzsJ8CRhs=; b=JaEqr5/Y1TbU/QSeeiZhFJH2P9WpX0hvT+nbDEzigK8Xq60jHIBKzt70yR3/lzFNpY PgYdvPBD504nVxslzSCkTokxBQoNjLuc6GEvJKKXLMd1+dOBPJr83M5YpbVeBsDTG1Eq PGJYNVRJ3djbAkn7Dq0dV8PDEeZ4P7WDKXAK011hCHxkgDIRb1KpvYWMCwcBvoxe1oHn 6NQx5VTrvCoZkXhwOuhee3wheSoP3lL7STcHsrmMx/y/Ma9pyZDAu1WxUTjkfmv96+yR 9dHVpEXxCSCQ+j9I4ImMFh0QvbAi1BAZ0dYTI17z9U8uD/v7I7i+xe9jRmM/fB6Be59g wRbw== X-Gm-Message-State: AMCzsaXlMFDGPfM9SzleTMvUQtoCR2sVWdg01S/v4KJsgWVpv5j8z2DK IAjzubr2Lpl568wr3JrMQmMveg== X-Google-Smtp-Source: AOwi7QA6iMI9sZ/EX+eiID1Dlf8RQ4Z3EJdUFuMGcIStfYvY5lOnvN8RNL3YxTrmor3DBTyOw1bQpg== X-Received: by 10.84.134.164 with SMTP id 33mr162054plh.323.1507753908878; Wed, 11 Oct 2017 13:31:48 -0700 (PDT) Received: from skynet.sea.corp.google.com ([172.31.92.33]) by smtp.gmail.com with ESMTPSA id n12sm20691913pfb.149.2017.10.11.13.31.47 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 11 Oct 2017 13:31:47 -0700 (PDT) From: Thomas Garnier To: Herbert Xu , "David S . Miller" , Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , Peter Zijlstra , Josh Poimboeuf , Arnd Bergmann , Thomas Garnier , Kees Cook , Andrey Ryabinin , Matthias Kaehlcke , Tom Lendacky , Andy Lutomirski , "Kirill A . Shutemov" , Borislav Petkov , "Rafael J . Wysocki" , Len Brown , Pavel Machek , Juergen Gross , Chris Wright , Alok Kataria , Rusty Russell , Tejun Heo , Christoph Lameter , Boris Ostrovsky , Paul Gortmaker , Andrew Morton , Alexey Dobriyan , "Paul E . McKenney" , Nicolas Pitre , Borislav Petkov , "Luis R . Rodriguez" , Greg Kroah-Hartman , Christopher Li , Steven Rostedt , Jason Baron , Mika Westerberg , Dou Liyang , "Rafael J . Wysocki" , Lukas Wunner , Masahiro Yamada , Alexei Starovoitov , Daniel Borkmann , Markus Trippelsdorf , Paolo Bonzini , =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= , Joerg Roedel , Rik van Riel , David Howells , Ard Biesheuvel , Waiman Long , Kyle Huey , Jonathan Corbet , Michal Hocko , Peter Foley , Paul Bolle , Jiri Kosina , "H . J . Lu" , Rob Landley , Baoquan He , =?UTF-8?q?Jan=20H=20=2E=20Sch=C3=B6nherr?= , Daniel Micay Date: Wed, 11 Oct 2017 13:30:27 -0700 Message-Id: <20171011203027.11248-28-thgarnie@google.com> X-Mailer: git-send-email 2.15.0.rc0.271.g36b669edcc-goog In-Reply-To: <20171011203027.11248-1-thgarnie@google.com> References: <20171011203027.11248-1-thgarnie@google.com> Cc: linux-arch@vger.kernel.org, kvm@vger.kernel.org, linux-pm@vger.kernel.org, x86@kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-sparse@vger.kernel.org, linux-crypto@vger.kernel.org, kernel-hardening@lists.openwall.com, xen-devel@lists.xenproject.org Subject: [Xen-devel] [PATCH v1 27/27] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB X-BeenThere: xen-devel@lists.xen.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" X-Virus-Scanned: ClamAV using ClamSMTP Add a new CONFIG_RANDOMIZE_BASE_LARGE option to benefit from PIE support. It increases the KASLR range from 1GB to 3GB. The new range stars at 0xffffffff00000000 just above the EFI memory region. This option is off by default. The boot code is adapted to create the appropriate page table spanning three PUD pages. The relocation table uses 64-bit integers generated with the updated relocation tool with the large-reloc option. Signed-off-by: Thomas Garnier --- arch/x86/Kconfig | 21 +++++++++++++++++++++ arch/x86/boot/compressed/Makefile | 5 +++++ arch/x86/boot/compressed/misc.c | 10 +++++++++- arch/x86/include/asm/page_64_types.h | 9 +++++++++ arch/x86/kernel/head64.c | 15 ++++++++++++--- arch/x86/kernel/head_64.S | 11 ++++++++++- 6 files changed, 66 insertions(+), 5 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index bbd28a46ab55..54627dd06348 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2155,6 +2155,27 @@ config X86_PIE select DYNAMIC_MODULE_BASE select MODULE_REL_CRCS if MODVERSIONS +config RANDOMIZE_BASE_LARGE + bool "Increase the randomization range of the kernel image" + depends on X86_64 && RANDOMIZE_BASE + select X86_PIE + select X86_MODULE_PLTS if MODULES + default n + ---help--- + Build the kernel as a Position Independent Executable (PIE) and + increase the available randomization range from 1GB to 3GB. + + This option impacts performance on kernel CPU intensive workloads up + to 10% due to PIE generated code. Impact on user-mode processes and + typical usage would be significantly less (0.50% when you build the + kernel). + + The kernel and modules will generate slightly more assembly (1 to 2% + increase on the .text sections). The vmlinux binary will be + significantly smaller due to less relocations. + + If unsure say N + config HOTPLUG_CPU bool "Support for hot-pluggable CPUs" depends on SMP diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 8a958274b54c..94dfee5a7cd2 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -112,7 +112,12 @@ $(obj)/vmlinux.bin: vmlinux FORCE targets += $(patsubst $(obj)/%,%,$(vmlinux-objs-y)) vmlinux.bin.all vmlinux.relocs +# Large randomization require bigger relocation table +ifeq ($(CONFIG_RANDOMIZE_BASE_LARGE),y) +CMD_RELOCS = arch/x86/tools/relocs --large-reloc +else CMD_RELOCS = arch/x86/tools/relocs +endif quiet_cmd_relocs = RELOCS $@ cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $< $(obj)/vmlinux.relocs: vmlinux FORCE diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index c14217cd0155..c1ac9f2e283d 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -169,10 +169,18 @@ void __puthex(unsigned long value) } #if CONFIG_X86_NEED_RELOCS + +/* Large randomization go lower than -2G and use large relocation table */ +#ifdef CONFIG_RANDOMIZE_BASE_LARGE +typedef long rel_t; +#else +typedef int rel_t; +#endif + static void handle_relocations(void *output, unsigned long output_len, unsigned long virt_addr) { - int *reloc; + rel_t *reloc; unsigned long delta, map, ptr; unsigned long min_addr = (unsigned long)output; unsigned long max_addr = min_addr + (VO___bss_start - VO__text); diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h index 3f5f08b010d0..6b65f846dd64 100644 --- a/arch/x86/include/asm/page_64_types.h +++ b/arch/x86/include/asm/page_64_types.h @@ -48,7 +48,11 @@ #define __PAGE_OFFSET __PAGE_OFFSET_BASE #endif /* CONFIG_RANDOMIZE_MEMORY */ +#ifdef CONFIG_RANDOMIZE_BASE_LARGE +#define __START_KERNEL_map _AC(0xffffffff00000000, UL) +#else #define __START_KERNEL_map _AC(0xffffffff80000000, UL) +#endif /* CONFIG_RANDOMIZE_BASE_LARGE */ /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */ #ifdef CONFIG_X86_5LEVEL @@ -65,9 +69,14 @@ * 512MiB by default, leaving 1.5GiB for modules once the page tables * are fully set up. If kernel ASLR is configured, it can extend the * kernel page table mapping, reducing the size of the modules area. + * On PIE, we relocate the binary 2G lower so add this extra space. */ #if defined(CONFIG_RANDOMIZE_BASE) +#ifdef CONFIG_RANDOMIZE_BASE_LARGE +#define KERNEL_IMAGE_SIZE (_AC(3, UL) * 1024 * 1024 * 1024) +#else #define KERNEL_IMAGE_SIZE (1024 * 1024 * 1024) +#endif #else #define KERNEL_IMAGE_SIZE (512 * 1024 * 1024) #endif diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index b6363f0d11a7..d603d0f5a40a 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -39,6 +39,7 @@ static unsigned int __initdata next_early_pgt; pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX); #define __head __section(.head.text) +#define pud_count(x) (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT) static void __head *fixup_pointer(void *ptr, unsigned long physaddr) { @@ -56,6 +57,8 @@ unsigned long __head notrace __startup_64(unsigned long physaddr, { unsigned long load_delta, *p; unsigned long pgtable_flags; + unsigned long level3_kernel_start, level3_kernel_count; + unsigned long level3_fixmap_start; pgdval_t *pgd; p4dval_t *p4d; pudval_t *pud; @@ -83,6 +86,11 @@ unsigned long __head notrace __startup_64(unsigned long physaddr, /* Include the SME encryption mask in the fixup value */ load_delta += sme_get_me_mask(); + /* Look at the randomization spread to adapt page table used */ + level3_kernel_start = pud_index(__START_KERNEL_map); + level3_kernel_count = pud_count(KERNEL_IMAGE_SIZE); + level3_fixmap_start = level3_kernel_start + level3_kernel_count; + /* Fixup the physical addresses in the page table */ pgd = fixup_pointer(&early_top_pgt, physaddr); @@ -94,8 +102,9 @@ unsigned long __head notrace __startup_64(unsigned long physaddr, } pud = fixup_pointer(&level3_kernel_pgt, physaddr); - pud[510] += load_delta; - pud[511] += load_delta; + for (i = 0; i < level3_kernel_count; i++) + pud[level3_kernel_start + i] += load_delta; + pud[level3_fixmap_start] += load_delta; pmd = fixup_pointer(level2_fixmap_pgt, physaddr); pmd[506] += load_delta; @@ -150,7 +159,7 @@ unsigned long __head notrace __startup_64(unsigned long physaddr, */ pmd = fixup_pointer(level2_kernel_pgt, physaddr); - for (i = 0; i < PTRS_PER_PMD; i++) { + for (i = 0; i < PTRS_PER_PMD * level3_kernel_count; i++) { if (pmd[i] & _PAGE_PRESENT) pmd[i] += load_delta; } diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index df5198e310fc..7918ffefc9c9 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -39,11 +39,15 @@ #define p4d_index(x) (((x) >> P4D_SHIFT) & (PTRS_PER_P4D-1)) #define pud_index(x) (((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1)) +#define pud_count(x) (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT) PGD_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE) PGD_START_KERNEL = pgd_index(__START_KERNEL_map) L3_START_KERNEL = pud_index(__START_KERNEL_map) +/* Adapt page table L3 space based on range of randomization */ +L3_KERNEL_ENTRY_COUNT = pud_count(KERNEL_IMAGE_SIZE) + .text __HEAD .code64 @@ -413,7 +417,12 @@ NEXT_PAGE(level4_kernel_pgt) NEXT_PAGE(level3_kernel_pgt) .fill L3_START_KERNEL,8,0 /* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */ - .quad level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC + i = 0 + .rept L3_KERNEL_ENTRY_COUNT + .quad level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC \ + + PAGE_SIZE*i + i = i + 1 + .endr .quad level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC NEXT_PAGE(level2_kernel_pgt)