From patchwork Tue Jul 18 22:33:32 2017
X-Patchwork-Submitter: Thomas Garnier
X-Patchwork-Id: 9850093
From: Thomas Garnier <thgarnie@google.com>
To: Herbert Xu, "David S . Miller", Thomas Gleixner, Ingo Molnar,
 "H . Peter Anvin", Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
 Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
 Paolo Bonzini, Radim Krčmář, Joerg Roedel, Andy Lutomirski,
 Borislav Petkov, "Kirill A . Shutemov", Brian Gerst, Borislav Petkov,
 Christian Borntraeger, "Rafael J . Wysocki", Len Brown, Pavel Machek,
 Tejun Heo, Christoph Lameter, Kees Cook, Paul Gortmaker, Chris Metcalf,
 "Paul E . McKenney", Andrew Morton, Christopher Li, Dou Liyang,
 Masahiro Yamada, Daniel Borkmann, Markus Trippelsdorf, Peter Foley,
 Steven Rostedt, Tim Chen, Ard Biesheuvel, Catalin Marinas,
 Matthew Wilcox, Michal Hocko, Rob Landley, Jiri Kosina, "H . J . Lu",
 Paul Bolle, Baoquan He, Daniel Micay
Date: Tue, 18 Jul 2017 15:33:32 -0700
Message-Id: <20170718223333.110371-22-thgarnie@google.com>
In-Reply-To: <20170718223333.110371-1-thgarnie@google.com>
References: <20170718223333.110371-1-thgarnie@google.com>
Cc: linux-arch@vger.kernel.org, kvm@vger.kernel.org,
 linux-pm@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org,
 linux-sparse@vger.kernel.org, linux-crypto@vger.kernel.org,
 kernel-hardening@lists.openwall.com, xen-devel@lists.xenproject.org
Subject: [Xen-devel] [RFC 21/22] x86/module: Add support for mcmodel large
 and PLTs

With PIE support and the extended KASLR range, modules may be placed
further away from the kernel than before, breaking the assumptions of
mcmodel=kernel.

Add an option to build modules with mcmodel=large. The generated module
code then makes no assumptions about its placement in memory.

Despite this option, modules still expect kernel functions to be within
2G and generate relative calls. To solve this issue, the arm64 PLT code
was adapted for x86_64. When a relative relocation goes outside its range,
a dynamic PLT entry is used to correctly jump to the destination.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/Kconfig              |  10 +++
 arch/x86/Makefile             |  10 ++-
 arch/x86/include/asm/module.h |  16 ++++
 arch/x86/kernel/Makefile      |   2 +
 arch/x86/kernel/module-plts.c | 198 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/module.c      |  18 ++--
 arch/x86/kernel/module.lds    |   4 +
 7 files changed, 251 insertions(+), 7 deletions(-)
 create mode 100644 arch/x86/kernel/module-plts.c
 create mode 100644 arch/x86/kernel/module.lds

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b26ee6751021..60d161391d5a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2086,6 +2086,16 @@ config X86_PIE
 	select DEFAULT_HIDDEN
 	select MODULE_REL_CRCS if MODVERSIONS
 
+config X86_MODULE_MODEL_LARGE
+	bool
+	depends on X86_64 && X86_PIE
+
+config X86_MODULE_PLTS
+	bool
+	depends on X86_64
+	select X86_MODULE_MODEL_LARGE
+	select HAVE_MOD_ARCH_SPECIFIC
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 452a9621af8f..72a90da0149a 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -138,10 +138,18 @@ else
         KBUILD_CFLAGS += -mno-red-zone
 ifdef CONFIG_X86_PIE
         KBUILD_CFLAGS += -fPIC
-        KBUILD_CFLAGS_MODULE += -fno-PIC -mcmodel=kernel
+        KBUILD_CFLAGS_MODULE += -fno-PIC
 else
         KBUILD_CFLAGS += -mcmodel=kernel
 endif
+ifdef CONFIG_X86_MODULE_MODEL_LARGE
+        KBUILD_CFLAGS_MODULE += -mcmodel=large
+else
+        KBUILD_CFLAGS_MODULE += -mcmodel=kernel
+endif
+ifdef CONFIG_X86_MODULE_PLTS
+        KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/x86/kernel/module.lds
+endif
 
         # -funit-at-a-time shrinks the kernel .text considerably
         # unfortunately it makes reading oopses harder.
diff --git a/arch/x86/include/asm/module.h b/arch/x86/include/asm/module.h
index e3b7819caeef..d054c37656ea 100644
--- a/arch/x86/include/asm/module.h
+++ b/arch/x86/include/asm/module.h
@@ -61,4 +61,20 @@
 # define MODULE_ARCH_VERMAGIC MODULE_PROC_FAMILY
 #endif
 
+#ifdef CONFIG_X86_MODULE_PLTS
+struct mod_plt_sec {
+	struct elf64_shdr	*plt;
+	int			plt_num_entries;
+	int			plt_max_entries;
+};
+
+struct mod_arch_specific {
+	struct mod_plt_sec	core;
+	struct mod_plt_sec	init;
+};
+#endif
+
+u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+			  Elf64_Sym *sym);
+
 #endif /* _ASM_X86_MODULE_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index a01892bdd61a..e294aefb747c 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -142,4 +142,6 @@ ifeq ($(CONFIG_X86_64),y)
 
 	obj-$(CONFIG_PCI_MMCONFIG)	+= mmconf-fam10h_64.o
 	obj-y				+= vsmp_64.o
+
+	obj-$(CONFIG_X86_MODULE_PLTS)	+= module-plts.o
 endif
diff --git a/arch/x86/kernel/module-plts.c b/arch/x86/kernel/module-plts.c
new file mode 100644
index 000000000000..bbf11771f424
--- /dev/null
+++ b/arch/x86/kernel/module-plts.c
@@ -0,0 +1,198 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Generate PLT entries for out-of-bound PC-relative relocations. It is required
+ * when a module can be mapped more than 2G away from the kernel.
+ *
+ * Based on arm64 module-plts implementation.
+ */
+
+#include <linux/elf.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sort.h>
+
+/* jmp    QWORD PTR [rip+0xfffffffffffffff2] */
+const u8 jmp_target[] = { 0xFF, 0x25, 0xF2, 0xFF, 0xFF, 0xFF };
+
+struct plt_entry {
+	u64 target;			/* Hold the target address */
+	u8 jmp[sizeof(jmp_target)];	/* jmp opcode to target */
+};
+
+static bool in_init(const struct module *mod, void *loc)
+{
+	return (u64)loc - (u64)mod->init_layout.base < mod->init_layout.size;
+}
+
+u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+			  Elf64_Sym *sym)
+{
+	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
+							  &mod->arch.init;
+	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
+	int i = pltsec->plt_num_entries;
+	u64 ret;
+
+	/*
+	 *
+	 * jmp    QWORD PTR [rip+0xfffffffffffffff2] # Target address
+	 */
+	plt[i].target = sym->st_value;
+	memcpy(plt[i].jmp, jmp_target, sizeof(jmp_target));
+
+	/*
+	 * Check if the entry we just created is a duplicate. Given that the
+	 * relocations are sorted, this will be the last entry we allocated.
+	 * (if one exists).
+	 */
+	if (i > 0 && plt[i].target == plt[i - 1].target) {
+		ret = (u64)&plt[i - 1].jmp;
+	} else {
+		pltsec->plt_num_entries++;
+		BUG_ON(pltsec->plt_num_entries > pltsec->plt_max_entries);
+		ret = (u64)&plt[i].jmp;
+	}
+
+	return ret + rela->r_addend;
+}
+
+#define cmp_3way(a,b)	((a) < (b) ? -1 : (a) > (b))
+
+static int cmp_rela(const void *a, const void *b)
+{
+	const Elf64_Rela *x = a, *y = b;
+	int i;
+
+	/* sort by type, symbol index and addend */
+	i = cmp_3way(ELF64_R_TYPE(x->r_info), ELF64_R_TYPE(y->r_info));
+	if (i == 0)
+		i = cmp_3way(ELF64_R_SYM(x->r_info), ELF64_R_SYM(y->r_info));
+	if (i == 0)
+		i = cmp_3way(x->r_addend, y->r_addend);
+	return i;
+}
+
+static bool duplicate_rel(const Elf64_Rela *rela, int num)
+{
+	/*
+	 * Entries are sorted by type, symbol index and addend. That means
+	 * that, if a duplicate entry exists, it must be in the preceding
+	 * slot.
+	 */
+	return num > 0 && cmp_rela(rela + num, rela + num - 1) == 0;
+}
+
+static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela *rela, int num,
+			       Elf64_Word dstidx)
+{
+	unsigned int ret = 0;
+	Elf64_Sym *s;
+	int i;
+
+	for (i = 0; i < num; i++) {
+		switch (ELF64_R_TYPE(rela[i].r_info)) {
+		case R_X86_64_PC32:
+			/*
+			 * We only have to consider branch targets that resolve
+			 * to symbols that are defined in a different section.
+			 * This is not simply a heuristic, it is a fundamental
+			 * limitation, since there is no guaranteed way to emit
+			 * PLT entries sufficiently close to the branch if the
+			 * section size exceeds the range of a branch
+			 * instruction. So ignore relocations against defined
+			 * symbols if they live in the same section as the
+			 * relocation target.
+			 */
+			s = syms + ELF64_R_SYM(rela[i].r_info);
+			if (s->st_shndx == dstidx)
+				break;
+
+			/*
+			 * Jump relocations with non-zero addends against
+			 * undefined symbols are supported by the ELF spec, but
+			 * do not occur in practice (e.g., 'jump n bytes past
+			 * the entry point of undefined function symbol f').
+			 * So we need to support them, but there is no need to
+			 * take them into consideration when trying to optimize
+			 * this code. So let's only check for duplicates when
+			 * the addend is zero: this allows us to record the PLT
+			 * entry address in the symbol table itself, rather than
+			 * having to search the list for duplicates each time we
+			 * emit one.
+			 */
+			if (rela[i].r_addend != 0 || !duplicate_rel(rela, i))
+				ret++;
+			break;
+		}
+	}
+	return ret;
+}
+
+int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+			      char *secstrings, struct module *mod)
+{
+	unsigned long core_plts = 0;
+	unsigned long init_plts = 0;
+	Elf64_Sym *syms = NULL;
+	int i;
+
+	/*
+	 * Find the empty .plt section so we can expand it to store the PLT
+	 * entries. Record the symtab address as well.
+	 */
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
+			mod->arch.core.plt = sechdrs + i;
+		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
+			mod->arch.init.plt = sechdrs + i;
+		else if (sechdrs[i].sh_type == SHT_SYMTAB)
+			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
+	}
+
+	if (!mod->arch.core.plt || !mod->arch.init.plt) {
+		pr_err("%s: module PLT section(s) missing\n", mod->name);
+		return -ENOEXEC;
+	}
+	if (!syms) {
+		pr_err("%s: module symtab section missing\n", mod->name);
+		return -ENOEXEC;
+	}
+
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		Elf64_Rela *rels = (void *)ehdr + sechdrs[i].sh_offset;
+		int numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
+		Elf64_Shdr *dstsec = sechdrs + sechdrs[i].sh_info;
+
+		if (sechdrs[i].sh_type != SHT_RELA)
+			continue;
+
+		/* sort by type, symbol index and addend */
+		sort(rels, numrels, sizeof(Elf64_Rela), cmp_rela, NULL);
+
+		if (strncmp(secstrings + dstsec->sh_name, ".init", 5) != 0)
+			core_plts += count_plts(syms, rels, numrels,
+						sechdrs[i].sh_info);
+		else
+			init_plts += count_plts(syms, rels, numrels,
+						sechdrs[i].sh_info);
+	}
+
+	mod->arch.core.plt->sh_type = SHT_NOBITS;
+	mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.core.plt->sh_size = (core_plts + 1) * sizeof(struct plt_entry);
+	mod->arch.core.plt_num_entries = 0;
+	mod->arch.core.plt_max_entries = core_plts;
+
+	mod->arch.init.plt->sh_type = SHT_NOBITS;
+	mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.init.plt->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
+	mod->arch.init.plt_num_entries = 0;
+	mod->arch.init.plt_max_entries = init_plts;
+
+	return 0;
+}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index f67bd3205df7..a2b31973572b 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -186,10 +186,15 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		case R_X86_64_PC32:
 			val -= (u64)loc;
 			*(u32 *)loc = val;
-#if 0
-			if ((s64)val != *(s32 *)loc)
-				goto overflow;
-#endif
+			if (IS_ENABLED(CONFIG_X86_MODULE_MODEL_LARGE) &&
+			    (s64)val != *(s32 *)loc) {
+				val = module_emit_plt_entry(me, loc, &rel[i],
+							    sym);
+				val -= (u64)loc;
+				*(u32 *)loc = val;
+				if ((s64)val != *(s32 *)loc)
+					goto overflow;
+			}
 			break;
 		default:
 			pr_err("%s: Unknown rela relocation: %llu\n",
@@ -202,8 +207,9 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 overflow:
 	pr_err("overflow in relocation type %d val %Lx\n",
 	       (int)ELF64_R_TYPE(rel[i].r_info), val);
-	pr_err("`%s' likely not compiled with -mcmodel=kernel\n",
-	       me->name);
+	pr_err("`%s' likely not compiled with -mcmodel=%s\n",
+	       me->name,
+	       IS_ENABLED(CONFIG_X86_MODULE_MODEL_LARGE) ? "large" : "kernel");
 	return -ENOEXEC;
 }
 #endif
diff --git a/arch/x86/kernel/module.lds b/arch/x86/kernel/module.lds
new file mode 100644
index 000000000000..f7c9781a9d48
--- /dev/null
+++ b/arch/x86/kernel/module.lds
@@ -0,0 +1,4 @@
+SECTIONS {
+	.plt (NOLOAD) : { BYTE(0) }
+	.init.plt (NOLOAD) : { BYTE(0) }
+}
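
Note on the PLT entry encoding: the 0xF2 byte in jmp_target and the
R_X86_64_PC32 overflow test in apply_relocate_add() both come down to
32-bit signed displacement arithmetic. The following is a minimal
user-space sketch of that arithmetic, not part of the patch. It mirrors
struct plt_entry with stdint types; the helper names plt_jmp_disp(),
encoded_disp(), fits_pc32() and self_check() are hypothetical and exist
only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bytes as jmp_target in module-plts.c: FF 25 = jmp [rip+disp32]. */
static const uint8_t jmp_target[] = { 0xFF, 0x25, 0xF2, 0xFF, 0xFF, 0xFF };

/* User-space mirror of the patch's struct plt_entry (assumption: same layout). */
struct plt_entry {
	uint64_t target;
	uint8_t jmp[sizeof(jmp_target)];
};

/*
 * Displacement the jmp needs in order to load plt_entry.target.
 * RIP-relative operands are measured from the end of the instruction.
 */
static int64_t plt_jmp_disp(void)
{
	int64_t insn_end = (int64_t)(offsetof(struct plt_entry, jmp) +
				     sizeof(jmp_target));

	return (int64_t)offsetof(struct plt_entry, target) - insn_end;
}

/* Sign-extend the little-endian disp32 stored after the two opcode bytes. */
static int64_t encoded_disp(void)
{
	uint32_t raw = (uint32_t)jmp_target[2] |
		       (uint32_t)jmp_target[3] << 8 |
		       (uint32_t)jmp_target[4] << 16 |
		       (uint32_t)jmp_target[5] << 24;

	return (int64_t)raw - ((raw & 0x80000000u) ? ((int64_t)1 << 32) : 0);
}

/* Range form of the patch's (s64)val != *(s32 *)loc overflow test. */
static int fits_pc32(int64_t val)
{
	return val >= INT32_MIN && val <= INT32_MAX;
}

/* Hypothetical self-test tying the encoded bytes to the struct layout. */
static void self_check(void)
{
	assert(plt_jmp_disp() == -14);
	assert(encoded_disp() == plt_jmp_disp());
	assert(fits_pc32(INT32_MAX));
	assert(!fits_pc32((int64_t)INT32_MAX + 1));
}
```

Calling self_check() from a test program confirms that the encoded
displacement points back at the target field: the jmp starts 8 bytes into
the entry and is 6 bytes long, so the operand is 0 - 14 = -14, which is
0xFFFFFFF2 as a 32-bit value. A PC-relative value that fails fits_pc32()
is the case in which the patch falls back to a PLT entry.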