From patchwork Wed Sep 23 17:38:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795341 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 942CA112E for ; Wed, 23 Sep 2020 17:40:10 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id 922B720B1F for ; Wed, 23 Sep 2020 17:40:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 922B720B1F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernel-hardening-return-19972-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 1738 invoked by uid 550); 23 Sep 2020 17:40:07 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 1635 invoked from network); 23 Sep 2020 17:40:06 -0000 IronPort-SDR: ZVVxduvQm71L0iMvfk3iYZ0pw2dG9iSd3k0Yr6bHpjvXF7VxgvaUn+bd5KRdowbGXJK5lx5VUG 3dXuIn87eF6g== X-IronPort-AV: E=McAfee;i="6000,8403,9753"; a="161903130" X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="161903130" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False IronPort-SDR: QAI0UBYfGdr7CH5Q2h95uVpDuXwHnthM15CGbtYVBHxf40B3vz5i7pSEIQbLIWF+LnaoeTqTA5 sNCVdV9zDERA== X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="309992877" From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, "H. 
Peter Anvin" Cc: arjan@linux.intel.com, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi , Tony Luck Subject: [PATCH v5 01/10] x86: tools/relocs: Support >64K section headers Date: Wed, 23 Sep 2020 10:38:55 -0700 Message-Id: <20200923173905.11219-2-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0 While the relocs tool already supports finding the total number of section headers if vmlinux exceeds 64K sections, it fails to read the extended symbol table to get section header indexes for symbols, causing incorrect symbol table indexes to be used when there are > 64K symbols. Parse the elf file to read the extended symbol table info, and then replace all direct references to st_shndx with calls to sym_index(), which will determine whether the value can be read directly or whether the value should be pulled out of the extended table. Signed-off-by: Kristen Carlson Accardi Reviewed-by: Kees Cook Reviewed-by: Tony Luck Tested-by: Tony Luck Acked-by: H. 
Peter Anvin (Intel) --- arch/x86/tools/relocs.c | 104 ++++++++++++++++++++++++++++++---------- 1 file changed, 78 insertions(+), 26 deletions(-) diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c index ce7188cbdae5..31b2d151aa63 100644 --- a/arch/x86/tools/relocs.c +++ b/arch/x86/tools/relocs.c @@ -14,6 +14,10 @@ static Elf_Ehdr ehdr; static unsigned long shnum; static unsigned int shstrndx; +static unsigned int shsymtabndx; +static unsigned int shxsymtabndx; + +static int sym_index(Elf_Sym *sym); struct relocs { uint32_t *offset; @@ -32,6 +36,7 @@ struct section { Elf_Shdr shdr; struct section *link; Elf_Sym *symtab; + Elf32_Word *xsymtab; Elf_Rel *reltab; char *strtab; }; @@ -265,7 +270,7 @@ static const char *sym_name(const char *sym_strtab, Elf_Sym *sym) name = sym_strtab + sym->st_name; } else { - name = sec_name(sym->st_shndx); + name = sec_name(sym_index(sym)); } return name; } @@ -335,6 +340,23 @@ static uint64_t elf64_to_cpu(uint64_t val) #define elf_xword_to_cpu(x) elf32_to_cpu(x) #endif +static int sym_index(Elf_Sym *sym) +{ + Elf_Sym *symtab = secs[shsymtabndx].symtab; + Elf32_Word *xsymtab = secs[shxsymtabndx].xsymtab; + unsigned long offset; + int index; + + if (sym->st_shndx != SHN_XINDEX) + return sym->st_shndx; + + /* calculate offset of sym from head of table. 
*/ + offset = (unsigned long)sym - (unsigned long)symtab; + index = offset / sizeof(*sym); + + return elf32_to_cpu(xsymtab[index]); +} + static void read_ehdr(FILE *fp) { if (fread(&ehdr, sizeof(ehdr), 1, fp) != 1) { @@ -468,31 +490,60 @@ static void read_strtabs(FILE *fp) static void read_symtabs(FILE *fp) { int i,j; + for (i = 0; i < shnum; i++) { struct section *sec = &secs[i]; - if (sec->shdr.sh_type != SHT_SYMTAB) { + int num_syms; + + switch (sec->shdr.sh_type) { + case SHT_SYMTAB_SHNDX: + sec->xsymtab = malloc(sec->shdr.sh_size); + if (!sec->xsymtab) { + die("malloc of %d bytes for xsymtab failed\n", + sec->shdr.sh_size); + } + if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) { + die("Seek to %d failed: %s\n", + sec->shdr.sh_offset, strerror(errno)); + } + if (fread(sec->xsymtab, 1, sec->shdr.sh_size, fp) + != sec->shdr.sh_size) { + die("Cannot read extended symbol table: %s\n", + strerror(errno)); + } + shxsymtabndx = i; + continue; + + case SHT_SYMTAB: + num_syms = sec->shdr.sh_size / sizeof(Elf_Sym); + + sec->symtab = malloc(sec->shdr.sh_size); + if (!sec->symtab) { + die("malloc of %d bytes for symtab failed\n", + sec->shdr.sh_size); + } + if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) { + die("Seek to %d failed: %s\n", + sec->shdr.sh_offset, strerror(errno)); + } + if (fread(sec->symtab, 1, sec->shdr.sh_size, fp) + != sec->shdr.sh_size) { + die("Cannot read symbol table: %s\n", + strerror(errno)); + } + for (j = 0; j < num_syms; j++) { + Elf_Sym *sym = &sec->symtab[j]; + + sym->st_name = elf_word_to_cpu(sym->st_name); + sym->st_value = elf_addr_to_cpu(sym->st_value); + sym->st_size = elf_xword_to_cpu(sym->st_size); + sym->st_shndx = elf_half_to_cpu(sym->st_shndx); + } + shsymtabndx = i; + continue; + + default: continue; - } - sec->symtab = malloc(sec->shdr.sh_size); - if (!sec->symtab) { - die("malloc of %d bytes for symtab failed\n", - sec->shdr.sh_size); - } - if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) { - die("Seek to %d failed: %s\n", - 
sec->shdr.sh_offset, strerror(errno)); - } - if (fread(sec->symtab, 1, sec->shdr.sh_size, fp) - != sec->shdr.sh_size) { - die("Cannot read symbol table: %s\n", - strerror(errno)); - } - for (j = 0; j < sec->shdr.sh_size/sizeof(Elf_Sym); j++) { - Elf_Sym *sym = &sec->symtab[j]; - sym->st_name = elf_word_to_cpu(sym->st_name); - sym->st_value = elf_addr_to_cpu(sym->st_value); - sym->st_size = elf_xword_to_cpu(sym->st_size); - sym->st_shndx = elf_half_to_cpu(sym->st_shndx); } } } @@ -759,13 +810,14 @@ static void percpu_init(void) */ static int is_percpu_sym(ElfW(Sym) *sym, const char *symname) { - return (sym->st_shndx == per_cpu_shndx) && + int shndx = sym_index(sym); + + return (shndx == per_cpu_shndx) && strcmp(symname, "__init_begin") && strcmp(symname, "__per_cpu_load") && strncmp(symname, "init_per_cpu_", 13); } - static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym, const char *symname) { @@ -1088,7 +1140,7 @@ static int do_reloc_info(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym, sec_name(sec->shdr.sh_info), rel_type(ELF_R_TYPE(rel->r_info)), symname, - sec_name(sym->st_shndx)); + sec_name(sym_index(sym))); return 0; } From patchwork Wed Sep 23 17:38:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795345 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EFE95112E for ; Wed, 23 Sep 2020 17:40:18 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id 4F8E220B1F for ; Wed, 23 Sep 2020 17:40:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4F8E220B1F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass 
smtp.mailfrom=kernel-hardening-return-19973-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 1877 invoked by uid 550); 23 Sep 2020 17:40:08 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 1767 invoked from network); 23 Sep 2020 17:40:07 -0000 IronPort-SDR: s2hhLme555rwW3f7dRB+GngFEyFEzSLlchLD1JdF7SHT87hWajeCTuyNU3fpgw5eXrz5mxzQXd VWmRn64r2z9g== X-IronPort-AV: E=McAfee;i="6000,8403,9753"; a="161903136" X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="161903136" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False IronPort-SDR: l/p7iSt4D/0o7SpwXtXWSrDNVwVnREFcqk9shbPX11s02ofuyv57DcHWdIF87fDkIrTiMrnvG8 rXC6RVZcGEEw== X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="309992887" From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, "H. Peter Anvin" Cc: arjan@linux.intel.com, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi Subject: [PATCH v5 02/10] x86/boot: Allow a "silent" kaslr random byte fetch Date: Wed, 23 Sep 2020 10:38:56 -0700 Message-Id: <20200923173905.11219-3-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0 From: Kees Cook Under earlyprintk, each RNG call produces a debug report line. When shuffling hundreds of functions, this is not useful information (each line is identical and tells us nothing new). Instead, allow for a NULL "purpose" to suppress the debug reporting. 
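As an illustration only (not the patch itself), the NULL-"purpose" convention described above can be sketched in userspace C. The names debug_log, fake_entropy() and get_random_long() are hypothetical stand-ins; the real code uses the boot-time debug_putstr() and mixes RDRAND/RDTSC/i8254 entropy.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the boot-time debug output: append to a buffer
 * so the effect of a call can be observed. */
static char debug_log[256];

static void debug_putstr(const char *s)
{
	strncat(debug_log, s, sizeof(debug_log) - strlen(debug_log) - 1);
}

/* Hypothetical entropy source standing in for RDRAND/RDTSC/i8254. */
static unsigned long fake_entropy(void)
{
	static unsigned long state = 0x12345678UL;

	state = state * 1103515245UL + 12345UL;
	return state;
}

/* Produce a random value; emit the debug report only when the caller
 * supplies a non-NULL purpose string. */
static unsigned long get_random_long(const char *purpose)
{
	unsigned long random = fake_entropy();

	if (purpose) {
		debug_putstr(purpose);
		debug_putstr(" KASLR using fake source...\n");
	}

	return random;
}
```

In this sketch a caller that shuffles many functions would pass NULL on every iteration and leave the log untouched, while a one-off caller such as get_random_long("Physical") still gets its report line.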
Signed-off-by: Kees Cook Signed-off-by: Kristen Carlson Accardi --- arch/x86/lib/kaslr.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/x86/lib/kaslr.c b/arch/x86/lib/kaslr.c index a53665116458..2b3eb8c948a3 100644 --- a/arch/x86/lib/kaslr.c +++ b/arch/x86/lib/kaslr.c @@ -56,11 +56,14 @@ unsigned long kaslr_get_random_long(const char *purpose) unsigned long raw, random = get_boot_seed(); bool use_i8254 = true; - debug_putstr(purpose); - debug_putstr(" KASLR using"); + if (purpose) { + debug_putstr(purpose); + debug_putstr(" KASLR using"); + } if (has_cpuflag(X86_FEATURE_RDRAND)) { - debug_putstr(" RDRAND"); + if (purpose) + debug_putstr(" RDRAND"); if (rdrand_long(&raw)) { random ^= raw; use_i8254 = false; @@ -68,7 +71,8 @@ unsigned long kaslr_get_random_long(const char *purpose) } if (has_cpuflag(X86_FEATURE_TSC)) { - debug_putstr(" RDTSC"); + if (purpose) + debug_putstr(" RDTSC"); raw = rdtsc(); random ^= raw; @@ -76,7 +80,8 @@ unsigned long kaslr_get_random_long(const char *purpose) } if (use_i8254) { - debug_putstr(" i8254"); + if (purpose) + debug_putstr(" i8254"); random ^= i8254(); } @@ -86,7 +91,8 @@ unsigned long kaslr_get_random_long(const char *purpose) : "a" (random), "rm" (mix_const)); random += raw; - debug_putstr("...\n"); + if (purpose) + debug_putstr("...\n"); return random; } From patchwork Wed Sep 23 17:38:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795347 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E32B2618 for ; Wed, 23 Sep 2020 17:40:28 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id 4505C20B1F for ; Wed, 23 Sep 2020 17:40:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 
mail.kernel.org 4505C20B1F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernel-hardening-return-19974-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 5610 invoked by uid 550); 23 Sep 2020 17:40:23 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 5501 invoked from network); 23 Sep 2020 17:40:22 -0000 IronPort-SDR: Dw5jrwNM+8ko+XpH9x21J2tNFSluxII/6j24uzk4wTuQyFicjMn873e2oFDT7+pPfNlVddQO0E WCJHRzP7xQ4Q== X-IronPort-AV: E=McAfee;i="6000,8403,9753"; a="222548898" X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="222548898" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False IronPort-SDR: FE1hSON25G6Tvgbqpi7Vg0+F9ucWFU0hDup715ktVBXKfJP0FaBnXW8V5iOFFVYiHwTpjB948b jp54cm1zd7LQ== X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="309992993" From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, Masahiro Yamada , Michal Marek , x86@kernel.org, "H. Peter Anvin" , Arnd Bergmann Cc: arjan@linux.intel.com, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi , Tony Luck , linux-kbuild@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v5 03/10] x86: Makefile: Add build and config option for CONFIG_FG_KASLR Date: Wed, 23 Sep 2020 10:38:57 -0700 Message-Id: <20200923173905.11219-4-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0 Allow user to select CONFIG_FG_KASLR if dependencies are met. 
Change the Makefile to build with -ffunction-sections if CONFIG_FG_KASLR is enabled. While the only architecture that supports CONFIG_FG_KASLR does not currently enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION, make sure these two features play nicely together in the future: if CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is selected together with CONFIG_FG_KASLR, the function sections will not be consolidated back into .text. Thanks to Kees Cook for the dead code elimination changes. Signed-off-by: Kristen Carlson Accardi Reviewed-by: Tony Luck Reviewed-by: Kees Cook Tested-by: Tony Luck --- Makefile | 6 +++++- arch/x86/Kconfig | 4 ++++ include/asm-generic/vmlinux.lds.h | 16 ++++++++++++++-- init/Kconfig | 14 ++++++++++++++ 4 files changed, 37 insertions(+), 3 deletions(-) diff --git a/Makefile b/Makefile index 2b66d3398878..0c116b833fd5 100644 --- a/Makefile +++ b/Makefile @@ -878,10 +878,14 @@ KBUILD_CFLAGS += $(call cc-option, -fno-inline-functions-called-once) endif ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION -KBUILD_CFLAGS_KERNEL += -ffunction-sections -fdata-sections +KBUILD_CFLAGS_KERNEL += -fdata-sections LDFLAGS_vmlinux += --gc-sections endif +ifneq ($(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION)$(CONFIG_FG_KASLR),) +KBUILD_CFLAGS += -ffunction-sections +endif + ifdef CONFIG_SHADOW_CALL_STACK CC_FLAGS_SCS := -fsanitize=shadow-call-stack KBUILD_CFLAGS += $(CC_FLAGS_SCS) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 7101ac64bb20..ff0f90d0421f 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -374,6 +374,10 @@ config CC_HAS_SANE_STACKPROTECTOR We have to make sure stack protector is unconditionally disabled if the compiler produces broken code. 
+config ARCH_HAS_FG_KASLR + def_bool y + depends on RANDOMIZE_BASE && X86_64 + menu "Processor type and features" config ZONE_DMA diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 5430febd34be..afd5cdf79a3a 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -93,14 +93,12 @@ * sections to be brought in with rodata. */ #ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION -#define TEXT_MAIN .text .text.[0-9a-zA-Z_]* #define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..LPBX* #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]* #define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* #define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]* #else -#define TEXT_MAIN .text #define DATA_MAIN .data #define SDATA_MAIN .sdata #define RODATA_MAIN .rodata @@ -108,6 +106,20 @@ #define SBSS_MAIN .sbss #endif +/* + * Both LD_DEAD_CODE_DATA_ELIMINATION and CONFIG_FG_KASLR options enable + * -ffunction-sections, which produces separately named .text sections. In + * the case of CONFIG_FG_KASLR, they need to stay distinct so they can be + * separately randomized. Without CONFIG_FG_KASLR, the separate .text + * sections can be collected back into a common section, which makes the + * resulting image slightly smaller. + */ +#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) && !defined(CONFIG_FG_KASLR) +#define TEXT_MAIN .text .text.[0-9a-zA-Z_]* +#else +#define TEXT_MAIN .text +#endif + /* * GCC 4.5 and later have a 32 bytes section alignment for structures. * Except GCC 4.9, that feels the need to align on 64 bytes. 
diff --git a/init/Kconfig b/init/Kconfig index d6a0b31b13dc..81220973b064 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2019,6 +2019,20 @@ config PROFILING config TRACEPOINTS bool +config FG_KASLR + bool "Function Granular Kernel Address Space Layout Randomization" + depends on $(cc-option, -ffunction-sections) + depends on ARCH_HAS_FG_KASLR + default n + help + This option improves the randomness of the kernel text + over basic Kernel Address Space Layout Randomization (KASLR) + by reordering the kernel text at boot time. This feature + uses information generated at compile time to re-layout the + kernel text section at boot time at function level granularity. + + If unsure, say N. + endmenu # General setup source "arch/Kconfig" From patchwork Wed Sep 23 17:38:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795349 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A3795618 for ; Wed, 23 Sep 2020 17:40:36 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id 02F852067B for ; Wed, 23 Sep 2020 17:40:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 02F852067B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernel-hardening-return-19975-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 5699 invoked by uid 550); 23 Sep 2020 17:40:24 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 5633 invoked from network); 23 
Sep 2020 17:40:23 -0000 IronPort-SDR: uHVEBkXT3v1vA87jw7rwRT+jVBo/hvRuy36EUjKg3R0QgRwoucuFmd7YBzVr8PyasdqHgoCqu9 5ZMLZmaNnuuQ== X-IronPort-AV: E=McAfee;i="6000,8403,9753"; a="222548979" X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="222548979" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False IronPort-SDR: HF6l80Vl4GPB6WBckXE/1zi6edCdb0E0SoZ0b2AIqsy1uPUzAKQS8Xx8vNJW5nsqHkQEO/idC8 xMkCI9K6Mq7w== X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="309993016" From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, "H. Peter Anvin" , Arnd Bergmann Cc: arjan@linux.intel.com, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi , Tony Luck , linux-arch@vger.kernel.org Subject: [PATCH v5 04/10] x86: Make sure _etext includes function sections Date: Wed, 23 Sep 2020 10:38:58 -0700 Message-Id: <20200923173905.11219-5-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0 When using -ffunction-sections to place each function in its own text section so it can be randomized at load time, the linker considers these .text.* sections "orphaned sections", and will place them after the first similar section (.text). 
In order to accurately represent the end of the text section and the orphaned sections, _etext must be moved so that it is after both .text and .text.*. The text size must also be calculated to include both .text and .text.*. Signed-off-by: Kristen Carlson Accardi Reviewed-by: Tony Luck Tested-by: Tony Luck Reviewed-by: Kees Cook --- arch/x86/kernel/vmlinux.lds.S | 17 +++++++++++++++-- include/asm-generic/vmlinux.lds.h | 2 +- 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index 9a03e5b23135..b0718eef283f 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -146,9 +146,22 @@ SECTIONS #endif } :text =0xcccc - /* End of text section, which should occupy whole number of pages */ - _etext = .; + /* + * -ffunction-sections creates .text.* sections, which are considered + * "orphan sections" and added after the first similar section (.text). + * Placing this ALIGN statement before _etext causes the address of + * _etext to be below that of all the .text.* orphaned sections + */ . = ALIGN(PAGE_SIZE); + _etext = .; + + /* + * the size of the .text section is used to calculate the address + * range for orc lookups. If we just use SIZEOF(.text), we will + * miss all the .text.* sections. Calculate the size using _etext + * and _stext and save the value for later. + */ + text_size = _etext - _stext; X86_ALIGN_RODATA_BEGIN RO_DATA(PAGE_SIZE) diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index afd5cdf79a3a..6f7239e033e8 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -863,7 +863,7 @@ . = ALIGN(4); \ .orc_lookup : AT(ADDR(.orc_lookup) - LOAD_OFFSET) { \ orc_lookup = .; \ - . += (((SIZEOF(.text) + LOOKUP_BLOCK_SIZE - 1) / \ + . 
+= (((text_size + LOOKUP_BLOCK_SIZE - 1) / \ LOOKUP_BLOCK_SIZE) + 1) * 4; \ orc_lookup_end = .; \ } From patchwork Wed Sep 23 17:38:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795353 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E5298618 for ; Wed, 23 Sep 2020 17:40:45 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id 0AB8220665 for ; Wed, 23 Sep 2020 17:40:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0AB8220665 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernel-hardening-return-19976-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 9373 invoked by uid 550); 23 Sep 2020 17:40:40 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 9253 invoked from network); 23 Sep 2020 17:40:39 -0000 IronPort-SDR: DPLyWYXGBmtgucW/cCvuxGog7kqJmjb/8aHrpYYf8QrIdg5c8odlruCXxP8oGjZyd8hx1MkUfh IYG73NKglcag== X-IronPort-AV: E=McAfee;i="6000,8403,9753"; a="140436832" X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="140436832" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False IronPort-SDR: +8lD5YwemFKokF65cK5bwQ/58UJWSjwBpYG4/ZBG3wzyKFaShf3R0y4rg3Kv2SzvFBdPeBFHFL FcOpr9syJplw== X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="309993059" From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, 
"H. Peter Anvin" Cc: arjan@linux.intel.com, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi , Tony Luck Subject: [PATCH v5 05/10] x86/tools: Add relative relocs for randomized functions Date: Wed, 23 Sep 2020 10:38:59 -0700 Message-Id: <20200923173905.11219-6-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0 When reordering functions, the relative offsets for relocs that are either in the randomized sections, or refer to the randomized sections will need to be adjusted. Add code to detect whether a reloc satisfies these cases, and if so, add them to the appropriate reloc list. Signed-off-by: Kristen Carlson Accardi Reviewed-by: Tony Luck Tested-by: Tony Luck Reviewed-by: Kees Cook --- arch/x86/boot/compressed/Makefile | 7 +++++- arch/x86/tools/relocs.c | 41 ++++++++++++++++++++++++++++--- arch/x86/tools/relocs.h | 4 +-- arch/x86/tools/relocs_common.c | 15 +++++++---- 4 files changed, 55 insertions(+), 12 deletions(-) diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index ff7894f39e0e..a83935c2fff4 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -122,6 +122,11 @@ $(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE $(call if_changed,check-and-link-vmlinux) OBJCOPYFLAGS_vmlinux.bin := -R .comment -S + +ifdef CONFIG_FG_KASLR + RELOCS_ARGS += --fg-kaslr +endif + $(obj)/vmlinux.bin: vmlinux FORCE $(call if_changed,objcopy) @@ -129,7 +134,7 @@ targets += $(patsubst $(obj)/%,%,$(vmlinux-objs-y)) vmlinux.bin.all vmlinux.relo CMD_RELOCS = arch/x86/tools/relocs quiet_cmd_relocs = RELOCS $@ - cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $< + cmd_relocs = $(CMD_RELOCS) $(RELOCS_ARGS) $< > $@;$(CMD_RELOCS) $(RELOCS_ARGS) --abs-relocs $< $(obj)/vmlinux.relocs: vmlinux 
FORCE $(call if_changed,relocs) diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c index 31b2d151aa63..e0665038742e 100644 --- a/arch/x86/tools/relocs.c +++ b/arch/x86/tools/relocs.c @@ -42,6 +42,8 @@ struct section { }; static struct section *secs; +static int fgkaslr_mode; + static const char * const sym_regex_kernel[S_NSYMTYPES] = { /* * Following symbols have been audited. There values are constant and do @@ -818,6 +820,32 @@ static int is_percpu_sym(ElfW(Sym) *sym, const char *symname) strncmp(symname, "init_per_cpu_", 13); } +static int is_function_section(struct section *sec) +{ + const char *name; + + if (!fgkaslr_mode) + return 0; + + name = sec_name(sec->shdr.sh_info); + + return(!strncmp(name, ".text.", 6)); +} + +static int is_randomized_sym(ElfW(Sym) *sym) +{ + const char *name; + + if (!fgkaslr_mode) + return 0; + + if (sym->st_shndx > shnum) + return 0; + + name = sec_name(sym_index(sym)); + return(!strncmp(name, ".text.", 6)); +} + static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym, const char *symname) { @@ -842,13 +870,17 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym, case R_X86_64_PC32: case R_X86_64_PLT32: /* - * PC relative relocations don't need to be adjusted unless - * referencing a percpu symbol. + * we need to keep pc relative relocations for sections which + * might be randomized, and for the percpu section. + * We also need to keep relocations for any offset which might + * reference an address in a section which has been randomized. * * NB: R_X86_64_PLT32 can be treated as R_X86_64_PC32. 
*/ - if (is_percpu_sym(sym, symname)) + if (is_function_section(sec) || is_randomized_sym(sym) || + is_percpu_sym(sym, symname)) add_reloc(&relocs32neg, offset); + break; case R_X86_64_PC64: @@ -1158,8 +1190,9 @@ static void print_reloc_info(void) void process(FILE *fp, int use_real_mode, int as_text, int show_absolute_syms, int show_absolute_relocs, - int show_reloc_info) + int show_reloc_info, int fgkaslr) { + fgkaslr_mode = fgkaslr; regex_init(use_real_mode); read_ehdr(fp); read_shdrs(fp); diff --git a/arch/x86/tools/relocs.h b/arch/x86/tools/relocs.h index 43c83c0fd22c..f582895c04dd 100644 --- a/arch/x86/tools/relocs.h +++ b/arch/x86/tools/relocs.h @@ -31,8 +31,8 @@ enum symtype { void process_32(FILE *fp, int use_real_mode, int as_text, int show_absolute_syms, int show_absolute_relocs, - int show_reloc_info); + int show_reloc_info, int fgkaslr); void process_64(FILE *fp, int use_real_mode, int as_text, int show_absolute_syms, int show_absolute_relocs, - int show_reloc_info); + int show_reloc_info, int fgkaslr); #endif /* RELOCS_H */ diff --git a/arch/x86/tools/relocs_common.c b/arch/x86/tools/relocs_common.c index 6634352a20bc..a80efa2f53ff 100644 --- a/arch/x86/tools/relocs_common.c +++ b/arch/x86/tools/relocs_common.c @@ -12,14 +12,14 @@ void die(char *fmt, ...) 
static void usage(void) { - die("relocs [--abs-syms|--abs-relocs|--reloc-info|--text|--realmode]" \ - " vmlinux\n"); + die("relocs [--abs-syms|--abs-relocs|--reloc-info|--text|--realmode|" \ + "--fg-kaslr] vmlinux\n"); } int main(int argc, char **argv) { int show_absolute_syms, show_absolute_relocs, show_reloc_info; - int as_text, use_real_mode; + int as_text, use_real_mode, fgkaslr_opt; const char *fname; FILE *fp; int i; @@ -30,6 +30,7 @@ int main(int argc, char **argv) show_reloc_info = 0; as_text = 0; use_real_mode = 0; + fgkaslr_opt = 0; fname = NULL; for (i = 1; i < argc; i++) { char *arg = argv[i]; @@ -54,6 +55,10 @@ int main(int argc, char **argv) use_real_mode = 1; continue; } + if (strcmp(arg, "--fg-kaslr") == 0) { + fgkaslr_opt = 1; + continue; + } } else if (!fname) { fname = arg; @@ -75,11 +80,11 @@ int main(int argc, char **argv) if (e_ident[EI_CLASS] == ELFCLASS64) process_64(fp, use_real_mode, as_text, show_absolute_syms, show_absolute_relocs, - show_reloc_info); + show_reloc_info, fgkaslr_opt); else process_32(fp, use_real_mode, as_text, show_absolute_syms, show_absolute_relocs, - show_reloc_info); + show_reloc_info, fgkaslr_opt); fclose(fp); return 0; } From patchwork Wed Sep 23 17:39:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795357 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 52344618 for ; Wed, 23 Sep 2020 17:40:55 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id A55202067B for ; Wed, 23 Sep 2020 17:40:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A55202067B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass 
smtp.mailfrom=kernel-hardening-return-19977-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 9543 invoked by uid 550); 23 Sep 2020 17:40:42 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 9405 invoked from network); 23 Sep 2020 17:40:40 -0000 IronPort-SDR: 6FG/TeE4LR2VUw9J/eVaJUvJMcvrb4LSoGW/nI0FV7JmT+TEpH02ngUaC71IOPGsGpaKJybUx7 Yi9l1S0sBrUQ== X-IronPort-AV: E=McAfee;i="6000,8403,9753"; a="140436884" X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="140436884" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False IronPort-SDR: fZzUQIEHpQ8Ru67dYVx+/L5ydlgDqSuix7iws7AnqcrQrWWS5G8Hj/7bcwjbQ5i9JyzvGEWPwf Qm9s2ozYM5TQ== X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="309993115" From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, "H. Peter Anvin" Cc: arjan@linux.intel.com, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi Subject: [PATCH v5 06/10] x86/boot/compressed: Avoid duplicate malloc() implementations Date: Wed, 23 Sep 2020 10:39:00 -0700 Message-Id: <20200923173905.11219-7-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0 From: Kees Cook The preboot malloc() (and free()) implementation in include/linux/decompress/mm.h (which is also included by the static decompressors) is static. 
This is fine when the only thing interested in using malloc() is the decompression code, but the x86 preboot environment uses malloc() in a couple of places, leading to a potential collision when the static copies of the available memory region ("malloc_ptr") get reset to the global "free_mem_ptr" value. As it happens, the existing usage pattern was safe because each user did one malloc() and one free() before returning, and the calls were not nested: extract_kernel() (misc.c) choose_random_location() (kaslr.c) mem_avoid_init() handle_mem_options() malloc() ... free() ... parse_elf() (misc.c) malloc() ... free() Adding FGKASLR, however, will insert additional malloc() calls local to fgkaslr.c in the middle of parse_elf()'s malloc()/free() pair: parse_elf() (misc.c) malloc() if (...) { layout_randomized_image(output, &ehdr, phdrs); malloc() <- boom ... else layout_image(output, &ehdr, phdrs); free() To avoid collisions, there must be a single implementation of malloc(). Adjust include/linux/decompress/mm.h so that visibility can be controlled, provide prototypes in misc.h, and implement the functions in misc.c. This also results in a small size savings: $ size vmlinux.before vmlinux.after text data bss dec hex filename 8842314 468 178320 9021102 89a6ae vmlinux.before 8842240 468 178320 9021028 89a664 vmlinux.after Signed-off-by: Kees Cook Signed-off-by: Kristen Carlson Accardi --- arch/x86/boot/compressed/kaslr.c | 4 ---- arch/x86/boot/compressed/misc.c | 3 +++ arch/x86/boot/compressed/misc.h | 2 ++ include/linux/decompress/mm.h | 12 ++++++++++-- 4 files changed, 15 insertions(+), 6 deletions(-) diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c index dde7cb3724df..e811071ce5d2 100644 --- a/arch/x86/boot/compressed/kaslr.c +++ b/arch/x86/boot/compressed/kaslr.c @@ -32,10 +32,6 @@ #include #include -/* Macros used by the included decompressor code below.
*/ -#define STATIC -#include - #ifdef CONFIG_X86_5LEVEL unsigned int __pgtable_l5_enabled; unsigned int pgdir_shift __ro_after_init = 39; diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index e478e40fbe5a..dc396321eba8 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -28,6 +28,9 @@ /* Macros used by the included decompressor code below. */ #define STATIC static +/* Define an externally visible malloc()/free(). */ +#define MALLOC_VISIBLE +#include /* * Provide definitions of memzero and memmove as some of the decompressors will diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h index 726e264410ff..81fbc8d686fa 100644 --- a/arch/x86/boot/compressed/misc.h +++ b/arch/x86/boot/compressed/misc.h @@ -39,6 +39,8 @@ /* misc.c */ extern memptr free_mem_ptr; extern memptr free_mem_end_ptr; +extern void *malloc(int size); +extern void free(void *where); extern struct boot_params *boot_params; void __putstr(const char *s); void __puthex(unsigned long value); diff --git a/include/linux/decompress/mm.h b/include/linux/decompress/mm.h index 868e9eacd69e..9192986b1a73 100644 --- a/include/linux/decompress/mm.h +++ b/include/linux/decompress/mm.h @@ -25,13 +25,21 @@ #define STATIC_RW_DATA static #endif +/* + * When an architecture needs to share the malloc()/free() implementation + * between compilation units, it needs to have non-local visibility. 
+ */ +#ifndef MALLOC_VISIBLE +#define MALLOC_VISIBLE static +#endif + /* A trivial malloc implementation, adapted from * malloc by Hannu Savolainen 1993 and Matthias Urlichs 1994 */ STATIC_RW_DATA unsigned long malloc_ptr; STATIC_RW_DATA int malloc_count; -static void *malloc(int size) +MALLOC_VISIBLE void *malloc(int size) { void *p; @@ -52,7 +60,7 @@ static void *malloc(int size) return p; } -static void free(void *where) +MALLOC_VISIBLE void free(void *where) { malloc_count--; if (!malloc_count) From patchwork Wed Sep 23 17:39:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795359 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B019112E for ; Wed, 23 Sep 2020 17:41:18 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id 014CE20665 for ; Wed, 23 Sep 2020 17:41:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 014CE20665 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernel-hardening-return-19978-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 13516 invoked by uid 550); 23 Sep 2020 17:41:15 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 13465 invoked from network); 23 Sep 2020 17:41:13 -0000 IronPort-SDR: 4OWcyA1Qd7lqvVteWAfgByLCDGoFXRnvkYIUFEEbsdHUWTSV1SiARTqpBC8oBN0rX4G086/n5f zVip2sQk9QHw== X-IronPort-AV: E=McAfee;i="6000,8403,9753"; a="140437084" X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; 
d="scan'208";a="140437084" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False IronPort-SDR: h+DSrt7nmpiaxngAXkvc3wwB9kya92bD2NQ1TUnQfIRWig7qWMN48Sd3sNV7WccKQJTsZCnYva qffHr5+zrzOw== X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="309993180" From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, Jonathan Corbet , x86@kernel.org, "H. Peter Anvin" Cc: arjan@linux.intel.com, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi , Tony Luck , linux-doc@vger.kernel.org Subject: [PATCH v5 07/10] x86: Add support for function granular KASLR Date: Wed, 23 Sep 2020 10:39:01 -0700 Message-Id: <20200923173905.11219-8-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0 This commit contains the changes required to re-layout the kernel text sections generated by -ffunction-sections shortly after decompression. Documentation of the feature is also added. After decompression, the decompressed image's elf headers are parsed. In order to manually update certain data structures that are built with relative offsets during the kernel build process, certain symbols are not stripped by objcopy and their location is retained in the elf symbol tables. These addresses are saved. If the image was built with -ffunction-sections, there will be ELF section headers present which contain information about the address range of each section. Anything that is not broken out into function sections (i.e. is consolidated into .text) is left in its original location, but any other executable section which begins with ".text." is located and shuffled randomly within the remaining text segment address range.
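As an illustration only (not part of this patch), the shuffle step described above can be sketched in plain userspace C. The patch's shuffle_sections() runs a Fisher-Yates pass over a list of section indexes and draws its random values from kaslr_get_random_long(); in the sketch below, next_random() and is_permutation() are hypothetical stand-ins invented for the demo, and the fixed-seed xorshift generator is an assumption that is not suitable for real randomization.

```c
#include <stdint.h>

/* Hypothetical stand-in for the boot-time RNG; fixed seed, demo only. */
static uint64_t rng_state = 0x9e3779b97f4a7c15ULL;

static unsigned long next_random(void)
{
	/* xorshift64 step: deterministic and NOT cryptographically secure. */
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 7;
	rng_state ^= rng_state << 17;
	return (unsigned long)rng_state;
}

/*
 * Fisher-Yates shuffle of an index list. The section array itself can
 * stay sorted by address; only the order of the indexes is randomized.
 */
static void shuffle_indexes(int *list, int size)
{
	int i;

	for (i = size - 1; i > 0; i--) {
		unsigned long j = next_random() % (unsigned long)(i + 1);
		int tmp = list[i];

		list[i] = list[j];
		list[j] = tmp;
	}
}

/* Return 1 if list holds each of 0..size-1 exactly once (size <= 64). */
static int is_permutation(const int *list, int size)
{
	uint64_t seen = 0;
	int i;

	if (size < 0 || size > 64)
		return 0;

	for (i = 0; i < size; i++) {
		if (list[i] < 0 || list[i] >= size)
			return 0;
		if (seen & (1ULL << list[i]))
			return 0;
		seen |= 1ULL << list[i];
	}
	return 1;
}
```

Shuffling indexes rather than the section headers keeps the header array sorted by original address, which a binary search over it (as adjust_address() does with bsearch()) depends on.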
After the sections have been copied to their new locations, but before relocations have been applied, the kallsyms tables must be updated to reflect the new symbol locations. Because it is expected that these tables will be sorted by address, the kallsyms tables will need to be sorted after the update. When applying relocations, the address of the relocation needs to be adjusted by the offset from the original location of the section that was randomized to its new location. In addition, if a value at that relocation was a location in the text segment that was randomized, its value will be adjusted to a new location. After relocations have been applied, the exception table must be updated with new symbol locations, and then re-sorted by the new address. The orc table will have been updated as part of applying relocations, but since it is expected to be sorted by address, it will need to be resorted. Signed-off-by: Kristen Carlson Accardi Reviewed-by: Tony Luck Tested-by: Tony Luck Reviewed-by: Kees Cook --- .../admin-guide/kernel-parameters.txt | 6 + Documentation/security/fgkaslr.rst | 172 ++++ Documentation/security/index.rst | 1 + arch/x86/boot/compressed/Makefile | 2 + arch/x86/boot/compressed/fgkaslr.c | 898 ++++++++++++++++++ arch/x86/boot/compressed/misc.c | 154 ++- arch/x86/boot/compressed/misc.h | 28 + arch/x86/boot/compressed/utils.c | 11 + arch/x86/boot/compressed/vmlinux.symbols | 17 + arch/x86/include/asm/boot.h | 17 +- include/uapi/linux/elf.h | 1 + 11 files changed, 1279 insertions(+), 28 deletions(-) create mode 100644 Documentation/security/fgkaslr.rst create mode 100644 arch/x86/boot/compressed/fgkaslr.c create mode 100644 arch/x86/boot/compressed/utils.c create mode 100644 arch/x86/boot/compressed/vmlinux.symbols diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index a1068742a6df..6742c301479a 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++
b/Documentation/admin-guide/kernel-parameters.txt @@ -2105,6 +2105,12 @@ kernel and module base offset ASLR (Address Space Layout Randomization). + nofgkaslr [KNL] + When CONFIG_FG_KASLR is set, this parameter + disables kernel function granular ASLR + (Address Space Layout Randomization). + See Documentation/security/fgkaslr.rst. + kasan_multi_shot [KNL] Enforce KASAN (Kernel Address Sanitizer) to print report on every invalid memory access. Without this diff --git a/Documentation/security/fgkaslr.rst b/Documentation/security/fgkaslr.rst new file mode 100644 index 000000000000..50dc24f675b5 --- /dev/null +++ b/Documentation/security/fgkaslr.rst @@ -0,0 +1,172 @@ +.. SPDX-License-Identifier: GPL-2.0 + +===================================================================== +Function Granular Kernel Address Space Layout Randomization (fgkaslr) +===================================================================== + +:Date: 6 April 2020 +:Author: Kristen Accardi + +Kernel Address Space Layout Randomization (KASLR) was merged into the kernel +with the objective of increasing the difficulty of code reuse attacks. Code +reuse attacks reuse existing code snippets to get around existing memory +protections. They exploit software bugs which expose addresses of useful code +snippets to control the flow of execution for their own nefarious purposes. +KASLR as it was originally implemented moves the entire kernel code text as a +unit at boot time in order to make addresses less predictable. The order of the +code within the segment is unchanged - only the base address is shifted. There +are a few shortcomings to this algorithm. + +1. Low Entropy - there are only so many locations the kernel can fit in. This + means an attacker could guess without too much trouble. +2. Knowledge of a single address can reveal the offset of the base address, + exposing all other locations for a published/known kernel image. +3. Info leaks abound.
+ +Finer grained ASLR has been proposed as a way to make ASLR more resistant +to info leaks. It is not a new concept at all, and there are many variations +possible. Function reordering is an implementation of finer grained ASLR +which randomizes the layout of an address space on a function level +granularity. The term "fgkaslr" is used in this document to refer to the +technique of function reordering when used with KASLR, as well as finer grained +KASLR in general. + +The objective of this patch set is to improve a technology that is already +merged into the kernel (KASLR). This code will not prevent all code reuse +attacks, and should be considered as one of several tools that can be used. + +Implementation Details +====================== + +The over-arching objective of the fgkaslr implementation is incremental +improvement over the existing KASLR algorithm. It is designed to work with +the existing solution, and there are two main areas where code changes occur: +Build time, and Load time. + +Build time +---------- + +GCC has had an option to place functions into individual .text sections +for many years now (-ffunction-sections). This option is used to implement +function reordering at load time. The final compiled vmlinux retains all the +section headers, which can be used to help find the address ranges of each +function. Using this information and an expanded table of relocation addresses, +individual text sections can be shuffled immediately after decompression. +Some data tables inside the kernel that have assumptions about order +require sorting after the update. In order to modify these tables, +a few key symbols from the objcopy symbol stripping process are preserved +for use after shuffling the text segments.
Any special input sections which are +defined by the kernel build process and collected into the .text output +segment are left unmodified and will still be present inside the .text segment, +unrandomized other than normal base address randomization. + +Load time +--------- + +The boot kernel was modified to parse the vmlinux elf file after +decompression to check for symbols for modifying data tables, and to +look for any .text.* sections to randomize. The sections are then shuffled, +and tables are updated or resorted. The existing code which updated relocation +addresses was modified to account for not just a fixed delta from the load +address, but the offset that the function section was moved to. This requires +inspection of each address to see if it was impacted by a randomization. + +In order to hide the new layout, symbols reported through /proc/kallsyms will +be displayed in a random order. + +Performance Impact +================== + +There are two areas where function reordering can impact performance: boot +time latency, and run time performance. + +Boot time latency +----------------- + +This implementation of finer grained KASLR impacts the boot time of the kernel +in several places. It requires additional parsing of the kernel ELF file to +obtain the section headers of the sections to be randomized. It calls the +random number generator for each section to be randomized to determine that +section's new memory location. It copies the decompressed kernel into a new +area of memory to avoid corruption when laying out the newly randomized +sections. It increases the number of relocations the kernel has to perform at +boot time vs. standard KASLR, and it also requires a lookup on each address +that needs to be relocated to see if it was in a randomized section and needs +to be adjusted by a new offset. Finally, it re-sorts a few data tables that +are required to be sorted by address. 
+ +Booting a test VM on a modern, well appointed system showed an increase in +latency of approximately 1 second. + +Run time +-------- + +The performance impact at run-time of function reordering varies by workload. +Randomly reordering the functions will cause an increase in cache misses +for some workloads. Some workloads perform significantly worse under FGKASLR, +while others stay the same or even improve. In general, it will depend on the +code flow whether or not finer grained KASLR will impact a workload, and how +the underlying code was designed. Because the layout changes per boot, each +time a system is rebooted the performance of a workload may change. + +Image Size +========== + +fgkaslr increases the size of the kernel binary due to the extra section +headers that are included, as well as the extra relocations that need to +be added. You can expect fgkaslr to increase the size of the resulting +vmlinux by about 3%, and the compressed image (bzImage) by 15%. + +Memory Usage +============ + +fgkaslr increases the amount of heap that is required at boot time, +although this extra memory is released when the kernel has finished +decompression. As a result, it may not be appropriate to use this feature +on systems without much memory. + +Building +======== + +To enable fine grained KASLR, you need to have the following config options +set (including all the ones you would use to build normal KASLR): + +``CONFIG_FG_KASLR=y`` + +fgkaslr for the kernel is only supported for the X86_64 architecture. + +Modules +======= + +Modules are randomized similarly to the rest of the kernel by shuffling +the sections at load time prior to moving them into memory. The module must +also have been built with the -ffunction-sections compiler option. + +Although fgkaslr for the kernel is only supported for the X86_64 architecture, +it is possible to use fgkaslr with modules on other architectures.
To enable +this feature, select the following config option: + +``CONFIG_MODULE_FG_KASLR`` + +This option is selected automatically for X86_64 when CONFIG_FG_KASLR is set. + +Disabling +========= + +Disabling normal kaslr using the nokaslr command line option also disables +fgkaslr. In addition, it is possible to disable fgkaslr separately by booting +with "nofgkaslr" on the commandline. + +Further Information +=================== + +There are a lot of academic papers which explore finer grained ASLR. +This paper in particular contributed significantly to the implementation design. + +Selfrando: Securing the Tor Browser against De-anonymization Exploits, +M. Conti, S. Crane, T. Frassetto, et al. + +For more information on how function layout impacts performance, see: + +Optimizing Function Placement for Large-Scale Data-Center Applications, +G. Ottoni, B. Maher diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst index 8129405eb2cc..19677beb33d4 100644 --- a/Documentation/security/index.rst +++ b/Documentation/security/index.rst @@ -7,6 +7,7 @@ Security Documentation credentials IMA-templates + fgkaslr keys/index lsm lsm-development diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index a83935c2fff4..03306595aabe 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -85,6 +85,7 @@ vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/kernel_info.o $(obj)/head_$(BITS).o vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o +vmlinux-objs-$(CONFIG_FG_KASLR) += $(obj)/utils.o $(obj)/fgkaslr.o ifdef CONFIG_X86_64 vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr_64.o vmlinux-objs-y += $(obj)/mem_encrypt.o @@ -125,6 +126,7 @@ OBJCOPYFLAGS_vmlinux.bin := -R .comment -S ifdef CONFIG_FG_KASLR RELOCS_ARGS += --fg-kaslr + OBJCOPYFLAGS += --keep-symbols=$(srctree)/$(src)/vmlinux.symbols endif $(obj)/vmlinux.bin: 
vmlinux FORCE diff --git a/arch/x86/boot/compressed/fgkaslr.c b/arch/x86/boot/compressed/fgkaslr.c new file mode 100644 index 000000000000..406c86b55de2 --- /dev/null +++ b/arch/x86/boot/compressed/fgkaslr.c @@ -0,0 +1,898 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * fgkaslr.c + * + * This contains the routines needed to reorder the kernel text section + * at boot time. + */ +#include "misc.h" +#include "error.h" +#include "pgtable.h" +#include "../string.h" +#include "../voffset.h" +#include +#include +#include "../../include/asm/extable.h" +#include "../../include/asm/orc_types.h" + +/* + * Longest parameter of 'fgkaslr=' is 'off' right now, plus an extra '\0' + * for termination. + */ +#define MAX_FGKASLR_ARG_LENGTH 4 +static int nofgkaslr; + +/* + * Use normal definitions of mem*() from string.c. There are already + * included header files which expect a definition of memset() and by + * the time we define memset macro, it is too late. + */ +#undef memcpy +#undef memset +#define memzero(s, n) memset((s), 0, (n)) +#define memmove memmove + +void *memmove(void *dest, const void *src, size_t n); + +static unsigned long percpu_start; +static unsigned long percpu_end; + +static long addr_kallsyms_names; +static long addr_kallsyms_offsets; +static long addr_kallsyms_num_syms; +static long addr_kallsyms_relative_base; +static long addr_kallsyms_markers; +static long addr___start___ex_table; +static long addr___stop___ex_table; +static long addr__stext; +static long addr__etext; +static long addr__sinittext; +static long addr__einittext; +static long addr___start_orc_unwind_ip; +static long addr___stop_orc_unwind_ip; +static long addr___start_orc_unwind; + +/* addresses in mapped address space */ +static int *base; +static u8 *names; +static unsigned long relative_base; +static unsigned int *markers_addr; + +struct kallsyms_name { + u8 len; + u8 indecis[256]; +}; + +static struct kallsyms_name *names_table; + +static struct orc_entry *cur_orc_table; +static int 
*cur_orc_ip_table; + +/* Array of pointers to sections headers for randomized sections */ +Elf_Shdr **sections; + +/* Number of elements in the randomized section header array (sections) */ +static int sections_size; + +/* Array of all section headers, randomized or otherwise */ +static Elf_Shdr *sechdrs; + +static bool is_orc_unwind(long addr) +{ + if (addr >= addr___start_orc_unwind_ip && + addr < addr___stop_orc_unwind_ip) + return true; + return false; +} + +static bool is_text(long addr) +{ + if ((addr >= addr__stext && addr < addr__etext) || + (addr >= addr__sinittext && addr < addr__einittext)) + return true; + return false; +} + +bool is_percpu_addr(long pc, long offset) +{ + unsigned long ptr; + long address; + + address = pc + offset + 4; + + ptr = (unsigned long)address; + + if (ptr >= percpu_start && ptr < percpu_end) + return true; + + return false; +} + +static int cmp_section_addr(const void *a, const void *b) +{ + unsigned long ptr = (unsigned long)a; + Elf_Shdr *s = *(Elf_Shdr **)b; + unsigned long end = s->sh_addr + s->sh_size; + + if (ptr >= s->sh_addr && ptr < end) + return 0; + + if (ptr < s->sh_addr) + return -1; + + return 1; +} + +static int cmp_section_addr_orc(const void *a, const void *b) +{ + unsigned long ptr = (unsigned long)a; + Elf_Shdr *s = *(Elf_Shdr **)b; + unsigned long end = s->sh_addr + s->sh_size; + + /* orc relocations can be one past the end of the section */ + if (ptr >= s->sh_addr && ptr <= end) + return 0; + + if (ptr < s->sh_addr) + return -1; + + return 1; +} + +/* + * Discover if the orc_unwind address is in a randomized section and if so, + * adjust by the saved offset. 
+ */ +Elf_Shdr *adjust_address_orc(long *address) +{ + Elf_Shdr **s; + Elf_Shdr *shdr; + + if (nofgkaslr) + return NULL; + + s = bsearch((const void *)*address, sections, sections_size, sizeof(*s), + cmp_section_addr_orc); + if (s) { + shdr = *s; + *address += shdr->sh_offset; + return shdr; + } + + return NULL; +} + +/* + * Discover if the address is in a randomized section and if so, adjust + * by the saved offset. + */ +Elf_Shdr *adjust_address(long *address) +{ + Elf_Shdr **s; + Elf_Shdr *shdr; + + if (nofgkaslr) + return NULL; + + s = bsearch((const void *)*address, sections, sections_size, sizeof(*s), + cmp_section_addr); + if (s) { + shdr = *s; + *address += shdr->sh_offset; + return shdr; + } + + return NULL; +} + +void adjust_relative_offset(long pc, long *value, Elf_Shdr *section) +{ + Elf_Shdr *s; + long address; + + if (nofgkaslr) + return; + + /* + * sometimes we are updating a relative offset that would + * normally be relative to the next instruction (such as a call). + * In this case to calculate the target, you need to add 32bits to + * the pc to get the next instruction value. However, sometimes + * targets are just data that was stored in a table such as ksymtab + * or cpu alternatives. In this case our target is not relative to + * the next instruction. + */ + + /* Calculate the address that this offset would call. */ + if (!is_text(pc)) + address = pc + *value; + else + address = pc + *value + 4; + + /* + * orc ip addresses are sorted at build time after relocs have + * been applied, making the relocs no longer valid. Skip any + * relocs for the orc_unwind_ip table. These will be updated + * separately. + */ + if (is_orc_unwind(pc)) + return; + + s = adjust_address(&address); + + /* + * if the address is in section that was randomized, + * we need to adjust the offset. 
+ */ + if (s) + *value += s->sh_offset; + + /* + * If the PC that this offset was calculated for was in a section + * that has been randomized, the value needs to be adjusted by the + * same amount as the randomized section was adjusted from its original + * location. + */ + if (section) + *value -= section->sh_offset; +} + +static void kallsyms_swp(void *a, void *b, int size) +{ + int idx1, idx2; + int temp; + struct kallsyms_name name_a; + + /* Determine our index into the array. */ + idx1 = (int *)a - base; + idx2 = (int *)b - base; + temp = base[idx1]; + base[idx1] = base[idx2]; + base[idx2] = temp; + + /* Swap the names table. */ + memcpy(&name_a, &names_table[idx1], sizeof(name_a)); + memcpy(&names_table[idx1], &names_table[idx2], + sizeof(struct kallsyms_name)); + memcpy(&names_table[idx2], &name_a, sizeof(struct kallsyms_name)); +} + +static int kallsyms_cmp(const void *a, const void *b) +{ + int addr_a, addr_b; + unsigned long uaddr_a, uaddr_b; + + addr_a = *(int *)a; + addr_b = *(int *)b; + + if (addr_a >= 0) + uaddr_a = addr_a; + if (addr_b >= 0) + uaddr_b = addr_b; + + if (addr_a < 0) + uaddr_a = relative_base - 1 - addr_a; + if (addr_b < 0) + uaddr_b = relative_base - 1 - addr_b; + + if (uaddr_b > uaddr_a) + return -1; + + return 0; +} + +static void deal_with_names(int num_syms) +{ + int num_bytes; + int i, j; + int offset; + + /* we should have num_syms kallsyms_name entries */ + num_bytes = num_syms * sizeof(*names_table); + names_table = malloc(num_syms * sizeof(*names_table)); + if (!names_table) { + debug_putstr("\nbytes requested: "); + debug_puthex(num_bytes); + error("\nunable to allocate space for names table\n"); + } + + /* read all the names entries */ + offset = 0; + for (i = 0; i < num_syms; i++) { + names_table[i].len = names[offset]; + offset++; + for (j = 0; j < names_table[i].len; j++) { + names_table[i].indecis[j] = names[offset]; + offset++; + } + } +} + +static void write_sorted_names(int num_syms) +{ + int i, j; + int offset = 0;
+ unsigned int *markers; + + /* + * we are going to need to regenerate the markers table, which is a + * table of offsets into the compressed stream every 256 symbols. + * this code copied almost directly from scripts/kallsyms.c + */ + markers = malloc(sizeof(unsigned int) * ((num_syms + 255) / 256)); + if (!markers) { + debug_putstr("\nfailed to allocate heap space of "); + debug_puthex(sizeof(unsigned int) * ((num_syms + 255) / 256)); + debug_putstr(" bytes\n"); + error("Unable to allocate space for markers table"); + } + + for (i = 0; i < num_syms; i++) { + if ((i & 0xFF) == 0) + markers[i >> 8] = offset; + + names[offset] = (u8)names_table[i].len; + offset++; + for (j = 0; j < names_table[i].len; j++) { + names[offset] = (u8)names_table[i].indecis[j]; + offset++; + } + } + + /* write new markers table over old one */ + for (i = 0; i < ((num_syms + 255) >> 8); i++) + markers_addr[i] = markers[i]; + + free(markers); + free(names_table); +} + +static void sort_kallsyms(unsigned long map) +{ + int num_syms; + int i; + + debug_putstr("\nRe-sorting kallsyms...\n"); + + num_syms = *(int *)(addr_kallsyms_num_syms + map); + base = (int *)(addr_kallsyms_offsets + map); + relative_base = *(unsigned long *)(addr_kallsyms_relative_base + map); + markers_addr = (unsigned int *)(addr_kallsyms_markers + map); + names = (u8 *)(addr_kallsyms_names + map); + + /* + * the kallsyms table was generated prior to any randomization. + * it is a bunch of offsets from "relative base". In order for + * us to check if a symbol has an address that was in a randomized + * section, we need to reconstruct the address to its original + * value prior to handle_relocations. + */ + for (i = 0; i < num_syms; i++) { + unsigned long addr; + int new_base; + + /* + * according to kernel/kallsyms.c, positive offsets are absolute + * values and negative offsets are relative to the base.
+ */ + if (base[i] >= 0) + addr = base[i]; + else + addr = relative_base - 1 - base[i]; + + if (adjust_address(&addr)) { + /* here we need to recalculate the offset */ + new_base = relative_base - 1 - addr; + base[i] = new_base; + } + } + + /* + * here we need to read in all the kallsyms_names info + * so that we can regenerate it. + */ + deal_with_names(num_syms); + + sort(base, num_syms, sizeof(int), kallsyms_cmp, kallsyms_swp); + + /* write the newly sorted names table over the old one */ + write_sorted_names(num_syms); +} + +/* + * We need to include this file here rather than in utils.c because + * some of the helper functions in extable.c are used to update + * the extable below and are defined as "static" in extable.c + */ +#include "../../../../lib/extable.c" + +static inline unsigned long +ex_fixup_handler(const struct exception_table_entry *x) +{ + return ((unsigned long)&x->handler + x->handler); +} + +static inline unsigned long +ex_fixup_addr(const struct exception_table_entry *x) +{ + return ((unsigned long)&x->fixup + x->fixup); +} + +static void update_ex_table(unsigned long map) +{ + struct exception_table_entry *start_ex_table = + (struct exception_table_entry *)(addr___start___ex_table + map); + struct exception_table_entry *stop_ex_table = + (struct exception_table_entry *)(addr___stop___ex_table + map); + int num_entries = + (addr___stop___ex_table - addr___start___ex_table) / + sizeof(struct exception_table_entry); + int i; + + debug_putstr("\nUpdating exception table..."); + for (i = 0; i < num_entries; i++) { + unsigned long insn = ex_to_insn(&start_ex_table[i]); + unsigned long fixup = ex_fixup_addr(&start_ex_table[i]); + unsigned long handler = ex_fixup_handler(&start_ex_table[i]); + unsigned long addr; + Elf_Shdr *s; + + /* check each address to see if it needs adjusting */ + addr = insn - map; + s = adjust_address(&addr); + if (s) + start_ex_table[i].insn += s->sh_offset; + + addr = fixup - map; + s = adjust_address(&addr); + if (s) +
start_ex_table[i].fixup += s->sh_offset; + + addr = handler - map; + s = adjust_address(&addr); + if (s) + start_ex_table[i].handler += s->sh_offset; + } +} + +static void sort_ex_table(unsigned long map) +{ + struct exception_table_entry *start_ex_table = + (struct exception_table_entry *)(addr___start___ex_table + map); + struct exception_table_entry *stop_ex_table = + (struct exception_table_entry *)(addr___stop___ex_table + map); + + debug_putstr("\nRe-sorting exception table..."); + + sort_extable(start_ex_table, stop_ex_table); +} + +static inline unsigned long orc_ip(const int *ip) +{ + return (unsigned long)ip + *ip; +} + +static void orc_sort_swap(void *_a, void *_b, int size) +{ + struct orc_entry *orc_a, *orc_b; + struct orc_entry orc_tmp; + int *a = _a, *b = _b, tmp; + int delta = _b - _a; + + /* Swap the .orc_unwind_ip entries: */ + tmp = *a; + *a = *b + delta; + *b = tmp - delta; + + /* Swap the corresponding .orc_unwind entries: */ + orc_a = cur_orc_table + (a - cur_orc_ip_table); + orc_b = cur_orc_table + (b - cur_orc_ip_table); + orc_tmp = *orc_a; + *orc_a = *orc_b; + *orc_b = orc_tmp; +} + +static int orc_sort_cmp(const void *_a, const void *_b) +{ + struct orc_entry *orc_a; + const int *a = _a, *b = _b; + unsigned long a_val = orc_ip(a); + unsigned long b_val = orc_ip(b); + + if (a_val > b_val) + return 1; + if (a_val < b_val) + return -1; + + /* + * The "weak" section terminator entries need to always be on the left + * to ensure the lookup code skips them in favor of real entries. + * These terminator entries exist to handle any gaps created by + * whitelisted .o files which didn't get objtool generation. + */ + orc_a = cur_orc_table + (a - cur_orc_ip_table); + return orc_a->sp_reg == ORC_REG_UNDEFINED && !orc_a->end ? 
-1 : 1; +} + +static void update_orc_table(unsigned long map) +{ + int i; + int num_entries = + (addr___stop_orc_unwind_ip - addr___start_orc_unwind_ip) / sizeof(int); + + cur_orc_ip_table = (int *)(addr___start_orc_unwind_ip + map); + cur_orc_table = (struct orc_entry *)(addr___start_orc_unwind + map); + + debug_putstr("\nUpdating orc tables...\n"); + for (i = 0; i < num_entries; i++) { + unsigned long ip = orc_ip(&cur_orc_ip_table[i]); + Elf_Shdr *s; + + /* check each address to see if it needs adjusting */ + ip = ip - map; + + /* + * objtool places terminator entries just outside the end of + * the section. To identify an orc_unwind_ip address that might + * need adjusting, the address should be compared differently + * than a normal address. + */ + s = adjust_address_orc(&ip); + if (s) + cur_orc_ip_table[i] += s->sh_offset; + + } +} + +static void sort_orc_table(unsigned long map) +{ + int num_entries = + (addr___stop_orc_unwind_ip - addr___start_orc_unwind_ip) / sizeof(int); + + cur_orc_ip_table = (int *)(addr___start_orc_unwind_ip + map); + cur_orc_table = (struct orc_entry *)(addr___start_orc_unwind + map); + + debug_putstr("\nRe-sorting orc tables...\n"); + sort(cur_orc_ip_table, num_entries, sizeof(int), orc_sort_cmp, + orc_sort_swap); +} + +void post_relocations_cleanup(unsigned long map) +{ + if (!nofgkaslr) { + update_ex_table(map); + sort_ex_table(map); + update_orc_table(map); + sort_orc_table(map); + } + + /* + * maybe one day free will do something. 
So, we "free" this memory + * in either case + */ + free(sections); + free(sechdrs); +} + +void pre_relocations_cleanup(unsigned long map) +{ + if (nofgkaslr) + return; + + sort_kallsyms(map); +} + +static void shuffle_sections(int *list, int size) +{ + int i; + unsigned long j; + int temp; + + for (i = size - 1; i > 0; i--) { + j = kaslr_get_random_long(NULL) % (i + 1); + + temp = list[i]; + list[i] = list[j]; + list[j] = temp; + } +} + +static void move_text(int num_sections, char *secstrings, Elf_Shdr *text, + void *source, void *dest, Elf64_Phdr *phdr) +{ + unsigned long adjusted_addr; + int copy_bytes; + void *stash; + Elf_Shdr **sorted_sections; + int *index_list; + int i, j; + + memmove(dest, source + text->sh_offset, text->sh_size); + copy_bytes = text->sh_size; + dest += text->sh_size; + adjusted_addr = text->sh_addr + text->sh_size; + + /* + * we leave the sections sorted in their original order + * by s->sh_addr, but shuffle the indexes in a random + * order for copying. + */ + index_list = malloc(sizeof(int) * num_sections); + if (!index_list) + error("Failed to allocate space for index list"); + + for (i = 0; i < num_sections; i++) + index_list[i] = i; + + shuffle_sections(index_list, num_sections); + + /* + * to avoid overwriting earlier sections before they can get + * copied to dest, stash everything into a buffer first. + * this will cause our source address to be off by + * phdr->p_offset though, so we'll adjust s->sh_offset below. + * + * TBD: ideally we'd simply decompress higher up so that our + * copy wasn't in danger of overwriting anything important. + */ + stash = malloc(phdr->p_filesz); + if (!stash) + error("Failed to allocate space for text stash"); + + memcpy(stash, source + phdr->p_offset, phdr->p_filesz); + + /* now we'd walk through the sections. 
*/ + for (j = 0; j < num_sections; j++) { + unsigned long aligned_addr; + Elf_Shdr *s; + const char *sname; + void *src; + int pad_bytes; + + s = sections[index_list[j]]; + + sname = secstrings + s->sh_name; + + /* align addr for this section */ + aligned_addr = ALIGN(adjusted_addr, s->sh_addralign); + + /* + * copy out of stash, so adjust offset + */ + src = stash + s->sh_offset - phdr->p_offset; + + /* + * Fill any space between sections with int3 + */ + pad_bytes = aligned_addr - adjusted_addr; + memset(dest, 0xcc, pad_bytes); + + dest = (void *)ALIGN((unsigned long)dest, s->sh_addralign); + + memmove(dest, src, s->sh_size); + + dest += s->sh_size; + copy_bytes += s->sh_size + pad_bytes; + adjusted_addr = aligned_addr + s->sh_size; + + /* we can blow away sh_offset for our own uses */ + s->sh_offset = aligned_addr - s->sh_addr; + } + + free(index_list); + + /* + * move remainder of text segment. Ok to just use original source + * here since this area is untouched. + */ + memmove(dest, source + text->sh_offset + copy_bytes, + phdr->p_filesz - copy_bytes); + free(stash); +} + +#define GET_SYM(name) \ + do { \ + if (!addr_ ## name) { \ + if (strcmp(#name, strtab + sym->st_name) == 0) {\ + addr_ ## name = sym->st_value; \ + continue; \ + } \ + } \ + } while (0) + +static void parse_symtab(Elf64_Sym *symtab, char *strtab, long num_syms) +{ + Elf64_Sym *sym; + + if (!symtab || !strtab) + return; + + debug_putstr("\nLooking for symbols... "); + + /* + * walk through the symbol table looking for the symbols + * that we care about. 
+ */ + for (sym = symtab; --num_syms >= 0; sym++) { + if (!sym->st_name) + continue; + + GET_SYM(kallsyms_num_syms); + GET_SYM(kallsyms_offsets); + GET_SYM(kallsyms_relative_base); + GET_SYM(kallsyms_names); + GET_SYM(kallsyms_markers); + GET_SYM(_stext); + GET_SYM(_etext); + GET_SYM(_sinittext); + GET_SYM(_einittext); + GET_SYM(__start_orc_unwind_ip); + GET_SYM(__stop_orc_unwind_ip); + GET_SYM(__start_orc_unwind); + GET_SYM(__start___ex_table); + GET_SYM(__stop___ex_table); + } +} + +void layout_randomized_image(void *output, Elf64_Ehdr *ehdr, Elf64_Phdr *phdrs) +{ + Elf64_Phdr *phdr; + Elf_Shdr *s; + Elf_Shdr *text = NULL; + Elf_Shdr *percpu = NULL; + char *secstrings; + const char *sname; + int num_sections = 0; + Elf64_Sym *symtab = NULL; + char *strtab = NULL; + long num_syms = 0; + void *dest; + int i; + char arg[MAX_FGKASLR_ARG_LENGTH]; + Elf_Shdr shdr; + unsigned long shnum; + unsigned int shstrndx; + + debug_putstr("\nParsing ELF section headers... "); + + /* + * Even though fgkaslr may have been disabled, we still + * need to parse through the section headers to get the + * start and end of the percpu section. This is because + * if we were built with CONFIG_FG_KASLR, there are more + * relative relocations present in vmlinux.relocs than + * just the percpu, and only the percpu relocs need to be + * adjusted when using just normal base address kaslr. 
+ */ + if (cmdline_find_option_bool("nofgkaslr")) { + warn("FG_KASLR disabled on cmdline."); + nofgkaslr = 1; + } + + /* read the first section header */ + shnum = ehdr->e_shnum; + shstrndx = ehdr->e_shstrndx; + if (shnum == SHN_UNDEF || shstrndx == SHN_XINDEX) { + memcpy(&shdr, output + ehdr->e_shoff, sizeof(shdr)); + if (shnum == SHN_UNDEF) + shnum = shdr.sh_size; + if (shstrndx == SHN_XINDEX) + shstrndx = shdr.sh_link; + } + + /* we are going to need to allocate space for the section headers */ + sechdrs = malloc(sizeof(*sechdrs) * shnum); + if (!sechdrs) + error("Failed to allocate space for shdrs"); + + sections = malloc(sizeof(*sections) * shnum); + if (!sections) + error("Failed to allocate space for section pointers"); + + memcpy(sechdrs, output + ehdr->e_shoff, + sizeof(*sechdrs) * shnum); + + /* we need to allocate space for the section string table */ + s = &sechdrs[shstrndx]; + + secstrings = malloc(s->sh_size); + if (!secstrings) + error("Failed to allocate space for shstr"); + + memcpy(secstrings, output + s->sh_offset, s->sh_size); + + /* + * now we need to walk through the section headers and collect the + * sizes of the .text sections to be randomized. 
+ */ + for (i = 0; i < shnum; i++) { + s = &sechdrs[i]; + sname = secstrings + s->sh_name; + + if (s->sh_type == SHT_SYMTAB) { + /* only one symtab per image */ + if (symtab) + error("Unexpected duplicate symtab"); + + symtab = malloc(s->sh_size); + if (!symtab) + error("Failed to allocate space for symtab"); + + memcpy(symtab, output + s->sh_offset, s->sh_size); + num_syms = s->sh_size / sizeof(*symtab); + continue; + } + + if (s->sh_type == SHT_STRTAB && i != ehdr->e_shstrndx) { + if (strtab) + error("Unexpected duplicate strtab"); + + strtab = malloc(s->sh_size); + if (!strtab) + error("Failed to allocate space for strtab"); + + memcpy(strtab, output + s->sh_offset, s->sh_size); + } + + if (!strcmp(sname, ".text")) { + if (text) + error("Unexpected duplicate .text section"); + + text = s; + continue; + } + + if (!strcmp(sname, ".data..percpu")) { + /* get start addr for later */ + percpu = s; + continue; + } + + if (!(s->sh_flags & SHF_ALLOC) || + !(s->sh_flags & SHF_EXECINSTR) || + !(strstarts(sname, ".text"))) + continue; + + sections[num_sections] = s; + + num_sections++; + } + sections[num_sections] = NULL; + sections_size = num_sections; + + parse_symtab(symtab, strtab, num_syms); + + for (i = 0; i < ehdr->e_phnum; i++) { + phdr = &phdrs[i]; + + switch (phdr->p_type) { + case PT_LOAD: + if ((phdr->p_align % 0x200000) != 0) + error("Alignment of LOAD segment isn't multiple of 2MB"); + dest = output; + dest += (phdr->p_paddr - LOAD_PHYSICAL_ADDR); + if (!nofgkaslr && + (text && phdr->p_offset == text->sh_offset)) { + move_text(num_sections, secstrings, text, + output, dest, phdr); + } else { + if (percpu && + phdr->p_offset == percpu->sh_offset) { + percpu_start = percpu->sh_addr; + percpu_end = percpu_start + + phdr->p_filesz; + } + memmove(dest, output + phdr->p_offset, + phdr->p_filesz); + } + break; + default: /* Ignore other PT_* */ + break; + } + } + + /* we need to keep the section info to redo relocs */ + free(secstrings); + + free(phdrs); +} diff 
--git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index dc396321eba8..949b8e75b13a 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -207,10 +207,21 @@ static void handle_relocations(void *output, unsigned long output_len, if (IS_ENABLED(CONFIG_X86_64)) delta = virt_addr - LOAD_PHYSICAL_ADDR; - if (!delta) { - debug_putstr("No relocation needed... "); - return; + /* + * it is possible to have delta be zero and still have enabled + * fg kaslr. We need to perform relocations for fgkaslr regardless + * of whether the base address has moved. + */ + if (!IS_ENABLED(CONFIG_FG_KASLR) || + cmdline_find_option_bool("nokaslr")) { + if (!delta) { + debug_putstr("No relocation needed... "); + return; + } } + + pre_relocations_cleanup(map); + debug_putstr("Performing relocations... "); /* @@ -234,35 +245,106 @@ static void handle_relocations(void *output, unsigned long output_len, */ for (reloc = output + output_len - sizeof(*reloc); *reloc; reloc--) { long extended = *reloc; + long value; + + /* + * if using fgkaslr, we might have moved the address + * of the relocation. Check it to see if it needs adjusting + * from the original address. + */ + adjust_address(&extended); + extended += map; ptr = (unsigned long)extended; if (ptr < min_addr || ptr > max_addr) error("32-bit relocation outside of kernel!\n"); - *(uint32_t *)ptr += delta; + value = *(int32_t *)ptr; + + /* + * If using fgkaslr, the value of the relocation + * might need to be changed because it referred + * to an address that has moved. + */ + adjust_address(&value); + + value += delta; + + *(uint32_t *)ptr = value; } #ifdef CONFIG_X86_64 while (*--reloc) { long extended = *reloc; + long value; + long oldvalue; + Elf64_Shdr *s; + + /* + * if using fgkaslr, we might have moved the address + * of the relocation. Check it to see if it needs adjusting + * from the original address. 
+ */ + s = adjust_address(&extended); + extended += map; ptr = (unsigned long)extended; if (ptr < min_addr || ptr > max_addr) error("inverse 32-bit relocation outside of kernel!\n"); - *(int32_t *)ptr -= delta; + value = *(int32_t *)ptr; + oldvalue = value; + + /* + * If using fgkaslr, these relocs will contain + * relative offsets which might need to be + * changed because they refer + * to an address that has moved. + */ + adjust_relative_offset(*reloc, &value, s); + + /* + * only percpu symbols need to have their values adjusted for + * base address kaslr since relative offsets within the .text + * and .text.* sections are ok wrt each other. + */ + if (is_percpu_addr(*reloc, oldvalue)) + value -= delta; + + *(int32_t *)ptr = value; } for (reloc--; *reloc; reloc--) { long extended = *reloc; + long value; + + /* + * if using fgkaslr, we might have moved the address + * of the relocation. Check it to see if it needs adjusting + * from the original address. + */ + adjust_address(&extended); + extended += map; ptr = (unsigned long)extended; if (ptr < min_addr || ptr > max_addr) error("64-bit relocation outside of kernel!\n"); - *(uint64_t *)ptr += delta; + value = *(int64_t *)ptr; + + /* + * If using fgkaslr, the value of the relocation + * might need to be changed because it referred + * to an address that has moved.
+ */ + adjust_address(&value); + + value += delta; + + *(uint64_t *)ptr = value; } + post_relocations_cleanup(map); #endif } #else @@ -271,6 +353,35 @@ static inline void handle_relocations(void *output, unsigned long output_len, { } #endif +static void layout_image(void *output, Elf_Ehdr *ehdr, Elf_Phdr *phdrs) +{ + int i; + void *dest; + Elf_Phdr *phdr; + + for (i = 0; i < ehdr->e_phnum; i++) { + phdr = &phdrs[i]; + + switch (phdr->p_type) { + case PT_LOAD: +#ifdef CONFIG_X86_64 + if ((phdr->p_align % 0x200000) != 0) + error("Alignment of LOAD segment isn't multiple of 2MB"); +#endif +#ifdef CONFIG_RELOCATABLE + dest = output; + dest += (phdr->p_paddr - LOAD_PHYSICAL_ADDR); +#else + dest = (void *)(phdr->p_paddr); +#endif + memmove(dest, output + phdr->p_offset, phdr->p_filesz); + break; + default: /* Ignore other PT_* */ + break; + } + } +} + static void parse_elf(void *output) { #ifdef CONFIG_X86_64 @@ -282,6 +393,7 @@ static void parse_elf(void *output) #endif void *dest; int i; + int nokaslr; memcpy(&ehdr, output, sizeof(ehdr)); if (ehdr.e_ident[EI_MAG0] != ELFMAG0 || @@ -292,6 +404,12 @@ static void parse_elf(void *output) return; } + if (IS_ENABLED(CONFIG_FG_KASLR)) { + nokaslr = cmdline_find_option_bool("nokaslr"); + if (nokaslr) + warn("FG_KASLR disabled: 'nokaslr' on cmdline."); + } + debug_putstr("Parsing ELF... 
"); phdrs = malloc(sizeof(*phdrs) * ehdr.e_phnum); @@ -300,26 +418,10 @@ static void parse_elf(void *output) memcpy(phdrs, output + ehdr.e_phoff, sizeof(*phdrs) * ehdr.e_phnum); - for (i = 0; i < ehdr.e_phnum; i++) { - phdr = &phdrs[i]; - - switch (phdr->p_type) { - case PT_LOAD: -#ifdef CONFIG_X86_64 - if ((phdr->p_align % 0x200000) != 0) - error("Alignment of LOAD segment isn't multiple of 2MB"); -#endif -#ifdef CONFIG_RELOCATABLE - dest = output; - dest += (phdr->p_paddr - LOAD_PHYSICAL_ADDR); -#else - dest = (void *)(phdr->p_paddr); -#endif - memmove(dest, output + phdr->p_offset, phdr->p_filesz); - break; - default: /* Ignore other PT_* */ break; - } - } + if (IS_ENABLED(CONFIG_FG_KASLR) && !nokaslr) + layout_randomized_image(output, &ehdr, phdrs); + else + layout_image(output, &ehdr, phdrs); free(phdrs); } diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h index 81fbc8d686fa..fd0c63cfaa4a 100644 --- a/arch/x86/boot/compressed/misc.h +++ b/arch/x86/boot/compressed/misc.h @@ -76,6 +76,34 @@ struct mem_vector { unsigned long long size; }; +#ifdef CONFIG_X86_64 +#define Elf_Ehdr Elf64_Ehdr +#define Elf_Phdr Elf64_Phdr +#define Elf_Shdr Elf64_Shdr +#else +#define Elf_Ehdr Elf32_Ehdr +#define Elf_Phdr Elf32_Phdr +#define Elf_Shdr Elf32_Shdr +#endif + +#if CONFIG_FG_KASLR +void layout_randomized_image(void *output, Elf_Ehdr *ehdr, Elf_Phdr *phdrs); +void pre_relocations_cleanup(unsigned long map); +void post_relocations_cleanup(unsigned long map); +Elf_Shdr *adjust_address(long *address); +void adjust_relative_offset(long pc, long *value, Elf_Shdr *section); +bool is_percpu_addr(long pc, long offset); +#else +static inline void layout_randomized_image(void *output, Elf_Ehdr *ehdr, + Elf_Phdr *phdrs) { } +static inline void pre_relocations_cleanup(unsigned long map) { } +static inline void post_relocations_cleanup(unsigned long map) { } +static inline Elf_Shdr *adjust_address(long *address) { return NULL; } +static inline void 
adjust_relative_offset(long pc, long *value, + Elf_Shdr *section) { } +static inline bool is_percpu_addr(long pc, long offset) { return true; } +#endif /* CONFIG_FG_KASLR */ + #if CONFIG_RANDOMIZE_BASE /* kaslr.c */ void choose_random_location(unsigned long input, diff --git a/arch/x86/boot/compressed/utils.c b/arch/x86/boot/compressed/utils.c new file mode 100644 index 000000000000..726f5b9092dc --- /dev/null +++ b/arch/x86/boot/compressed/utils.c @@ -0,0 +1,11 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * utils.c + * + * This contains various libraries that are needed for fgkaslr + */ +#define __DISABLE_EXPORTS +#define _LINUX_KPROBES_H +#define NOKPROBE_SYMBOL(fname) +#include "../../../../lib/sort.c" +#include "../../../../lib/bsearch.c" diff --git a/arch/x86/boot/compressed/vmlinux.symbols b/arch/x86/boot/compressed/vmlinux.symbols new file mode 100644 index 000000000000..cc86e79a2a3d --- /dev/null +++ b/arch/x86/boot/compressed/vmlinux.symbols @@ -0,0 +1,17 @@ +kallsyms_offsets +kallsyms_addresses +kallsyms_num_syms +kallsyms_relative_base +kallsyms_names +kallsyms_token_table +kallsyms_token_index +kallsyms_markers +__start___ex_table +__stop___ex_table +_sinittext +_einittext +_stext +_etext +__start_orc_unwind_ip +__stop_orc_unwind_ip +__start_orc_unwind diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h index 9191280d9ea3..227551dcf34b 100644 --- a/arch/x86/include/asm/boot.h +++ b/arch/x86/include/asm/boot.h @@ -24,6 +24,18 @@ # error "Invalid value for CONFIG_PHYSICAL_ALIGN" #endif +#ifdef CONFIG_FG_KASLR +/* + * We need extra boot heap when using fgkaslr because we make a copy + * of the original decompressed kernel to avoid issues with writing + * over ourselves when shuffling the sections. We also need extra + * space for resorting kallsyms after shuffling. This value could + * be decreased if free() would release memory properly, or if we + * could avoid the kernel copy. 
It would need to be increased if we + * find additional tables that need to be resorted. + */ +# define BOOT_HEAP_SIZE 0x4000000 +#else /* !CONFIG_FG_KASLR */ #if defined(CONFIG_KERNEL_BZIP2) # define BOOT_HEAP_SIZE 0x400000 #elif defined(CONFIG_KERNEL_ZSTD) @@ -34,8 +46,9 @@ */ # define BOOT_HEAP_SIZE 0x30000 #else -# define BOOT_HEAP_SIZE 0x10000 -#endif +# define BOOT_HEAP_SIZE 0x10000 +#endif /* CONFIG_KERNEL_BZIP2 */ +#endif /* CONFIG_FG_KASLR */ #ifdef CONFIG_X86_64 # define BOOT_STACK_SIZE 0x4000 diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index 22220945a5fd..b219ab04ff3b 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -299,6 +299,7 @@ typedef struct elf64_phdr { #define SHN_LIVEPATCH 0xff20 #define SHN_ABS 0xfff1 #define SHN_COMMON 0xfff2 +#define SHN_XINDEX 0xffff #define SHN_HIRESERVE 0xffff typedef struct elf32_shdr { From patchwork Wed Sep 23 17:39:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795361 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EB250618 for ; Wed, 23 Sep 2020 17:41:28 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id 1E3C6206D9 for ; Wed, 23 Sep 2020 17:41:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1E3C6206D9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernel-hardening-return-19979-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 13659 invoked by uid 550); 23 Sep 2020 17:41:16 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: 
Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 13520 invoked from network); 23 Sep 2020 17:41:15 -0000 From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de Cc: arjan@linux.intel.com, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi, Tony Luck Subject: [PATCH v5 08/10] kallsyms: Hide layout Date: Wed, 23 Sep 2020 10:39:02 -0700 Message-Id: <20200923173905.11219-9-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0

This patch makes /proc/kallsyms display symbols in a random order, rather than sorted by address, in order to hide the newly randomized address layout.
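As a rough illustration of the idea (a userspace sketch, not the kernel code in this patch): the display order can be decoupled from the address order by building an identity list of symbol positions and permuting it with a Fisher-Yates shuffle. Here rand() is an assumption standing in for the kernel's get_random_bytes(), and make_shuffled_index() is a hypothetical helper name used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Fisher-Yates shuffle of an index list. rand() is an assumption made
 * for this userspace sketch; it is not what the kernel would use.
 */
void shuffle_indexes(size_t *indexes, size_t size)
{
	size_t i;

	if (size < 2)
		return;

	for (i = size - 1; i > 0; i--) {
		/* pick a random slot in [0, i] and swap it into place */
		size_t j = (size_t)rand() % (i + 1);
		size_t tmp = indexes[i];

		indexes[i] = indexes[j];
		indexes[j] = tmp;
	}
}

/*
 * Hypothetical helper: allocate the identity list 0..total-1 and
 * shuffle it. Returns NULL if the allocation fails; the caller frees
 * the result.
 */
size_t *make_shuffled_index(size_t total)
{
	size_t *indexes;
	size_t pos;

	assert(total > 0);

	indexes = malloc(total * sizeof(*indexes));
	if (!indexes)
		return NULL;

	for (pos = 0; pos < total; pos++)
		indexes[pos] = pos;

	shuffle_indexes(indexes, total);
	return indexes;
}
```

Walking the returned list from start to end visits every position exactly once, in an order that says nothing about relative placement in memory.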
Signed-off-by: Kristen Carlson Accardi Reviewed-by: Tony Luck Tested-by: Tony Luck --- kernel/kallsyms.c | 163 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 162 insertions(+), 1 deletion(-) diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c index 4fb15fa96734..6047771ad408 100644 --- a/kernel/kallsyms.c +++ b/kernel/kallsyms.c @@ -448,6 +448,12 @@ struct kallsym_iter { int show_value; }; +struct kallsyms_shuffled_iter { + struct kallsym_iter iter; + loff_t total_syms; + loff_t shuffled_index[]; +}; + int __weak arch_get_kallsym(unsigned int symnum, unsigned long *value, char *type, char *name) { @@ -695,7 +701,7 @@ bool kallsyms_show_value(const struct cred *cred) } } -static int kallsyms_open(struct inode *inode, struct file *file) +static int __kallsyms_open(struct inode *inode, struct file *file) { /* * We keep iterator in m->private, since normal case is to @@ -716,6 +722,161 @@ static int kallsyms_open(struct inode *inode, struct file *file) return 0; } +/* + * When function granular kaslr is enabled, we need to print out the symbols + * at random so we don't reveal the new layout. + */ +#if defined(CONFIG_FG_KASLR) +static int update_random_pos(struct kallsyms_shuffled_iter *s_iter, + loff_t pos, loff_t *new_pos) +{ + loff_t new; + + if (pos >= s_iter->total_syms) + return 0; + + new = s_iter->shuffled_index[pos]; + + /* + * normally this would be done as part of update_iter, however, + * we want to avoid triggering this in the event that new is + * zero since we don't want to blow away our pos end indicators. 
+ */ + if (new == 0) { + s_iter->iter.name[0] = '\0'; + s_iter->iter.nameoff = get_symbol_offset(new); + s_iter->iter.pos = new; + } + + *new_pos = new; + return 1; +} + +static void *shuffled_start(struct seq_file *m, loff_t *pos) +{ + struct kallsyms_shuffled_iter *s_iter = m->private; + loff_t new_pos; + + if (!update_random_pos(s_iter, *pos, &new_pos)) + return NULL; + + return s_start(m, &new_pos); +} + +static void *shuffled_next(struct seq_file *m, void *p, loff_t *pos) +{ + struct kallsyms_shuffled_iter *s_iter = m->private; + loff_t new_pos; + + (*pos)++; + + if (!update_random_pos(s_iter, *pos, &new_pos)) + return NULL; + + if (!update_iter(m->private, new_pos)) + return NULL; + + return p; +} + +/* + * shuffle_index_list() + * Use a Fisher-Yates algorithm to shuffle a list of symbol indexes. + */ +static void shuffle_index_list(loff_t *indexes, loff_t size) +{ + int i; + unsigned int j; + loff_t temp; + + for (i = size - 1; i > 0; i--) { + /* pick a random index from 0 to i */ + get_random_bytes(&j, sizeof(j)); + j = j % (i + 1); + + temp = indexes[i]; + indexes[i] = indexes[j]; + indexes[j] = temp; + } +} + +static const struct seq_operations kallsyms_shuffled_op = { + .start = shuffled_start, + .next = shuffled_next, + .stop = s_stop, + .show = s_show +}; + +static int kallsyms_random_open(struct inode *inode, struct file *file) +{ + loff_t pos; + struct kallsyms_shuffled_iter *shuffled_iter; + struct kallsym_iter iter; + bool show_value; + + /* + * If privileged, go ahead and use the normal algorithm for + * displaying symbols + */ + show_value = kallsyms_show_value(file->f_cred); + if (show_value) + return __kallsyms_open(inode, file); + + /* + * we need to figure out how many extra symbols there are + * to print out past kallsyms_num_syms + */ + pos = kallsyms_num_syms; + reset_iter(&iter, 0); + do { + if (!update_iter(&iter, pos)) + break; + pos++; + } while (1); + + /* + * add storage space for an array of loff_t equal to the size + * of the total
number of symbols we need to print + */ + shuffled_iter = __seq_open_private(file, &kallsyms_shuffled_op, + sizeof(*shuffled_iter) + + (sizeof(pos) * pos)); + if (!shuffled_iter) + return -ENOMEM; + + reset_iter(&shuffled_iter->iter, 0); + shuffled_iter->iter.show_value = show_value; + shuffled_iter->total_syms = pos; + + /* + * the existing update_iter algorithm requires that we + * are either moving along increasing pos sequentially, + * or that these values are correct. Since these values + * were discovered above, initialize our new iter so we + * can use update_iter non-sequentially. + */ + shuffled_iter->iter.pos_arch_end = iter.pos_arch_end; + shuffled_iter->iter.pos_mod_end = iter.pos_mod_end; + shuffled_iter->iter.pos_ftrace_mod_end = iter.pos_ftrace_mod_end; + + /* + * initialize the array with all possible pos values, then + * shuffle the array so that the values will display in a random + * order. + */ + for (pos = 0; pos < shuffled_iter->total_syms; pos++) + shuffled_iter->shuffled_index[pos] = pos; + + shuffle_index_list(shuffled_iter->shuffled_index, shuffled_iter->total_syms); + + return 0; +} + +#define kallsyms_open kallsyms_random_open +#else +#define kallsyms_open __kallsyms_open +#endif /* CONFIG_FG_KASLR */ + #ifdef CONFIG_KGDB_KDB const char *kdb_walk_kallsyms(loff_t *pos) { From patchwork Wed Sep 23 17:39:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795363 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B5176112E for ; Wed, 23 Sep 2020 17:41:39 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id DFB752067B for ; Wed, 23 Sep 2020 17:41:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DFB752067B 
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernel-hardening-return-19980-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 15470 invoked by uid 550); 23 Sep 2020 17:41:27 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 15390 invoked from network); 23 Sep 2020 17:41:26 -0000 From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, "H. Peter Anvin", Jessica Yu Cc: arjan@linux.intel.com, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi, Ard Biesheuvel, Tony Luck Subject: [PATCH v5 09/10] module: Reorder functions Date: Wed, 23 Sep 2020 10:39:03 -0700 Message-Id: <20200923173905.11219-10-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0

Introduce a new config option to allow modules to be re-ordered by function.
This option can be enabled independently of the kernel text KASLR or FG_KASLR settings so that it can be used by architectures that do not support either of these features. This option will be selected by default if CONFIG_FG_KASLR is selected. If a module has functions split out into separate text sections (i.e. compiled with the -ffunction-sections flag), reorder the functions to provide some code diversification to modules. Signed-off-by: Kristen Carlson Accardi Reviewed-by: Kees Cook Acked-by: Ard Biesheuvel Tested-by: Ard Biesheuvel Reviewed-by: Tony Luck Tested-by: Tony Luck Acked-by: Jessica Yu Tested-by: Jessica Yu --- arch/x86/Makefile | 9 +++++ init/Kconfig | 12 +++++++ kernel/kallsyms.c | 2 +- kernel/module.c | 85 +++++++++++++++++++++++++++++++++++++++++++++-- 4 files changed, 105 insertions(+), 3 deletions(-) diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 4346ffb2e39f..9d13980c9f81 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -47,6 +47,15 @@ export REALMODE_CFLAGS # e.g.: obj-y += foo_$(BITS).o export BITS +ifdef CONFIG_X86_NEED_RELOCS + LDFLAGS_vmlinux := --emit-relocs --discard-none +endif + +ifndef CONFIG_FG_KASLR + ifdef CONFIG_MODULE_FG_KASLR + KBUILD_CFLAGS_MODULE += -ffunction-sections + endif +endif # # Prevent GCC from generating any FP code by mistake. # diff --git a/init/Kconfig b/init/Kconfig index 81220973b064..0b380962a2db 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2023,6 +2023,7 @@ config FG_KASLR bool "Function Granular Kernel Address Space Layout Randomization" depends on $(cc-option, -ffunction-sections) depends on ARCH_HAS_FG_KASLR + select MODULE_FG_KASLR default n help This option improves the randomness of the kernel text @@ -2307,6 +2308,17 @@ config UNUSED_KSYMS_WHITELIST one per line. The path can be absolute, or relative to the kernel source tree. 
+config MODULE_FG_KASLR + depends on $(cc-option, -ffunction-sections) + bool "Module Function Granular Layout Randomization" + help + This option randomizes the module text section by reordering the text + section by function at module load time. In order to use this + feature, the module must have been compiled with the + -ffunction-sections compiler flag. + + If unsure, say N. + endif # MODULES config MODULES_TREE_LOOKUP diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c index 6047771ad408..039ce953986c 100644 --- a/kernel/kallsyms.c +++ b/kernel/kallsyms.c @@ -726,7 +726,7 @@ static int __kallsyms_open(struct inode *inode, struct file *file) * When function granular kaslr is enabled, we need to print out the symbols * at random so we don't reveal the new layout. */ -#if defined(CONFIG_FG_KASLR) +#if defined(CONFIG_FG_KASLR) || defined(CONFIG_MODULE_FG_KASLR) static int update_random_pos(struct kallsyms_shuffled_iter *s_iter, loff_t pos, loff_t *new_pos) { diff --git a/kernel/module.c b/kernel/module.c index 1c5cff34d9f2..55a061ba19ab 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -56,6 +56,7 @@ #include #include #include +#include #include #include "module-internal.h" @@ -1584,7 +1585,7 @@ static void free_sect_attrs(struct module_sect_attrs *sect_attrs) for (section = 0; section < sect_attrs->nsections; section++) kfree(sect_attrs->attrs[section].battr.attr.name); - kfree(sect_attrs); + kvfree(sect_attrs); } static void add_sect_attrs(struct module *mod, const struct load_info *info) @@ -1601,7 +1602,7 @@ static void add_sect_attrs(struct module *mod, const struct load_info *info) size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded), sizeof(sect_attrs->grp.bin_attrs[0])); size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.bin_attrs[0]); - sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL); + sect_attrs = kvzalloc(size[0] + size[1], GFP_KERNEL); if (sect_attrs == NULL) return; @@ -2435,6 +2436,83 @@ static long get_offset(struct module *mod, unsigned 
int *size, return ret; } +/* + * shuffle_text_list() + * Use a Fisher-Yates algorithm to shuffle a list of text sections. + */ +static void shuffle_text_list(Elf_Shdr **list, int size) +{ + int i; + unsigned int j; + Elf_Shdr *temp; + + for (i = size - 1; i > 0; i--) { + /* + * pick a random index from 0 to i + */ + get_random_bytes(&j, sizeof(j)); + j = j % (i + 1); + + temp = list[i]; + list[i] = list[j]; + list[j] = temp; + } +} + +/* + * randomize_text() + * Look through the core section looking for executable code sections. + * Store sections in an array and then shuffle the sections + * to reorder the functions. + */ +static void randomize_text(struct module *mod, struct load_info *info) +{ + int i; + int num_text_sections = 0; + Elf_Shdr **text_list; + int size = 0; + int max_sections = info->hdr->e_shnum; + unsigned int sec = find_sec(info, ".text"); + + if (sec == 0) + return; + + text_list = kvmalloc_array(max_sections, sizeof(*text_list), GFP_KERNEL); + if (!text_list) + return; + + for (i = 0; i < max_sections; i++) { + Elf_Shdr *shdr = &info->sechdrs[i]; + const char *sname = info->secstrings + shdr->sh_name; + + if (!(shdr->sh_flags & SHF_ALLOC) || + !(shdr->sh_flags & SHF_EXECINSTR) || + strstarts(sname, ".init")) + continue; + + text_list[num_text_sections] = shdr; + num_text_sections++; + } + + shuffle_text_list(text_list, num_text_sections); + + for (i = 0; i < num_text_sections; i++) { + Elf_Shdr *shdr = text_list[i]; + + /* + * get_offset has a section index for its last + * argument, that is only used by arch_mod_section_prepend(), + * which is only defined by parisc. Since this type + * of randomization isn't supported on parisc, we can + * safely pass in zero as the last argument, as it is + * ignored. + */ + shdr->sh_entsize = get_offset(mod, &size, shdr, 0); + } + + kvfree(text_list); +} + /* Lay out the SHF_ALLOC sections in a way not dissimilar to how ld might -- code, read-only data, read-write data, small data.
Tally sizes, and place the offsets into sh_entsize fields: high bit means it @@ -2525,6 +2603,9 @@ static void layout_sections(struct module *mod, struct load_info *info) break; } } + + if (IS_ENABLED(CONFIG_MODULE_FG_KASLR)) + randomize_text(mod, info); } static void set_license(struct module *mod, const char *license) From patchwork Wed Sep 23 17:39:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristen Carlson Accardi X-Patchwork-Id: 11795365 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D6147112E for ; Wed, 23 Sep 2020 17:41:49 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id 3887F206D9 for ; Wed, 23 Sep 2020 17:41:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3887F206D9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernel-hardening-return-19981-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 16189 invoked by uid 550); 23 Sep 2020 17:41:38 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 16123 invoked from network); 23 Sep 2020 17:41:37 -0000 IronPort-SDR: dtNsfSLAghKj315EVodv8OCbNWGsxOU3Z9HK0uqkqFqGDxuQPb/FAxx2pGNoWljiF8dZnxocop ZuRt2f2o9XaA== X-IronPort-AV: E=McAfee;i="6000,8403,9753"; a="158372520" X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="158372520" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False IronPort-SDR: 
GZk1FRtVMnmSVB61ksssWSFNFU+xjc7kddUpaUMFeMzOTmuJTaNnLCO4Kbb4lhQ2TY/3WZNv/0 FyFrdXKqoIKw== X-IronPort-AV: E=Sophos;i="5.77,293,1596524400"; d="scan'208";a="309993326" From: Kristen Carlson Accardi To: keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, Josh Poimboeuf , Jiri Kosina , Miroslav Benes , Petr Mladek , Joe Lawrence Cc: arjan@linux.intel.com, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, rick.p.edgecombe@intel.com, Kristen Carlson Accardi , live-patching@vger.kernel.org Subject: [PATCH v5 10/10] livepatch: only match unique symbols when using fgkaslr Date: Wed, 23 Sep 2020 10:39:04 -0700 Message-Id: <20200923173905.11219-11-kristen@linux.intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200923173905.11219-1-kristen@linux.intel.com> References: <20200923173905.11219-1-kristen@linux.intel.com> MIME-Version: 1.0 If any type of function granular randomization is enabled, the sympos algorithm will fail, as it will be impossible to resolve symbols when there are duplicates using the previous symbol position. Override the value of sympos to always be zero if fgkaslr is enabled for either the core kernel or modules, forcing the algorithm to require that only unique symbols are allowed to be patched. Signed-off-by: Kristen Carlson Accardi --- kernel/livepatch/core.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index f76fdb925532..da08e40f2da2 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -170,6 +170,17 @@ static int klp_find_object_symbol(const char *objname, const char *name, kallsyms_on_each_symbol(klp_find_callback, &args); mutex_unlock(&module_mutex); + /* + * If any type of function granular randomization is enabled, it + * will be impossible to resolve symbols when there are duplicates + * using the previous symbol position (i.e. sympos != 0). 
Override + * the value of sympos to always be zero in this case. This will + * force the algorithm to require that only unique symbols are + * allowed to be patched. + */ + if (IS_ENABLED(CONFIG_FG_KASLR) || IS_ENABLED(CONFIG_MODULE_FG_KASLR)) + sympos = 0; + /* * Ensure an address was found. If sympos is 0, ensure symbol is unique; * otherwise ensure the symbol position count matches sympos.