Message ID | 20230505152046.6575-4-mic@digikod.net (mailing list archive) |
---|---|
State | Handled Elsewhere |
Series | Hypervisor-Enforced Kernel Integrity |
On Fri, May 05, 2023 at 05:20:40PM +0200, Mickaël Salaün wrote:
> From: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
>
> Hypervisor Enforced Kernel Integrity (Heki) is a feature that will use
> the hypervisor to enhance guest virtual machine security.
>
> Configuration
> =============
>
> Define the config variables for the feature. This feature depends on
> support from the architecture as well as the hypervisor.
>
> Enabling HEKI
> =============
>
> Define a kernel command line parameter "heki" to turn the feature on or
> off. By default, Heki is on.

For such a newfangled feature can we have it off by default? Especially
when there are unsolved issues around dynamically loaded code.

>
[...]
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 3604074a878b..5cf5a7a97811 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -297,6 +297,7 @@ config X86
>  	select FUNCTION_ALIGNMENT_4B
>  	imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI
>  	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
> +	select ARCH_SUPPORTS_HEKI if X86_64

Why is there a restriction on X86_64?

>
>  config INSTRUCTION_DECODER
>  	def_bool y
> diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
> index a6e8373a5170..42ef1e33b8a5 100644
> --- a/arch/x86/include/asm/sections.h
> +++ b/arch/x86/include/asm/sections.h
[...]
>
> +#ifdef CONFIG_HEKI
> +
> +/*
> + * Gather all of the statically defined sections so heki_late_init() can
> + * protect these sections in the host page table.
> + *
> + * The sections are defined under "SECTIONS" in vmlinux.lds.S
> + * Keep this array in sync with SECTIONS.
> + */

This seems a bit fragile, because it requires constant attention from
people who care about this functionality. Can this table be
automatically generated?

Thanks,
Wei.

> +struct heki_va_range __initdata heki_va_ranges[] = {
> +	{
> +		.va_start = _stext,
> +		.va_end = _etext,
> +		.attributes = HEKI_ATTR_MEM_NOWRITE | HEKI_ATTR_MEM_EXEC,
> +	},
> +	{
> +		.va_start = __start_rodata,
> +		.va_end = __end_rodata,
> +		.attributes = HEKI_ATTR_MEM_NOWRITE,
> +	},
> +#ifdef CONFIG_UNWINDER_ORC
> +	{
> +		.va_start = __start_orc_unwind_ip,
> +		.va_end = __stop_orc_unwind_ip,
> +		.attributes = HEKI_ATTR_MEM_NOWRITE,
> +	},
> +	{
> +		.va_start = __start_orc_unwind,
> +		.va_end = __stop_orc_unwind,
> +		.attributes = HEKI_ATTR_MEM_NOWRITE,
> +	},
> +	{
> +		.va_start = orc_lookup,
> +		.va_end = orc_lookup_end,
> +		.attributes = HEKI_ATTR_MEM_NOWRITE,
> +	},
> +#endif /* CONFIG_UNWINDER_ORC */
> +};
> +
Sorry for the delay. See inline...

On 5/8/23 12:29, Wei Liu wrote:
> On Fri, May 05, 2023 at 05:20:40PM +0200, Mickaël Salaün wrote:
>> From: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
>>
>> Hypervisor Enforced Kernel Integrity (Heki) is a feature that will use
>> the hypervisor to enhance guest virtual machine security.
>>
>> Configuration
>> =============
>>
>> Define the config variables for the feature. This feature depends on
>> support from the architecture as well as the hypervisor.
>>
>> Enabling HEKI
>> =============
>>
>> Define a kernel command line parameter "heki" to turn the feature on or
>> off. By default, Heki is on.
>
> For such a newfangled feature can we have it off by default? Especially
> when there are unsolved issues around dynamically loaded code.
>

Yes. We can certainly do that.

>>
> [...]
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index 3604074a878b..5cf5a7a97811 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -297,6 +297,7 @@ config X86
>>  	select FUNCTION_ALIGNMENT_4B
>>  	imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI
>>  	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
>> +	select ARCH_SUPPORTS_HEKI if X86_64
>
> Why is there a restriction on X86_64?
>

We want to get the PoC working and reviewed on X64 first. We have tested this only on X64 so far.

>>
>>  config INSTRUCTION_DECODER
>>  	def_bool y
>> diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
>> index a6e8373a5170..42ef1e33b8a5 100644
>> --- a/arch/x86/include/asm/sections.h
>> +++ b/arch/x86/include/asm/sections.h
> [...]
>>
>> +#ifdef CONFIG_HEKI
>> +
>> +/*
>> + * Gather all of the statically defined sections so heki_late_init() can
>> + * protect these sections in the host page table.
>> + *
>> + * The sections are defined under "SECTIONS" in vmlinux.lds.S
>> + * Keep this array in sync with SECTIONS.
>> + */
>
> This seems a bit fragile, because it requires constant attention from
> people who care about this functionality. Can this table be
> automatically generated?
>

We realize that. But I don't know of a way this can be automatically generated. Also, the permissions for
each section are specific to the use of that section. The developer who introduces a new section is the
one who will know what the permissions should be.

If anyone has any ideas of how we can generate this table automatically or even just add a build-time check
of some sort, please let us know.

Thanks.

Madhavan

> Thanks,
> Wei.
>
>> +struct heki_va_range __initdata heki_va_ranges[] = {
>> +	{
>> +		.va_start = _stext,
>> +		.va_end = _etext,
>> +		.attributes = HEKI_ATTR_MEM_NOWRITE | HEKI_ATTR_MEM_EXEC,
>> +	},
>> +	{
>> +		.va_start = __start_rodata,
>> +		.va_end = __end_rodata,
>> +		.attributes = HEKI_ATTR_MEM_NOWRITE,
>> +	},
>> +#ifdef CONFIG_UNWINDER_ORC
>> +	{
>> +		.va_start = __start_orc_unwind_ip,
>> +		.va_end = __stop_orc_unwind_ip,
>> +		.attributes = HEKI_ATTR_MEM_NOWRITE,
>> +	},
>> +	{
>> +		.va_start = __start_orc_unwind,
>> +		.va_end = __stop_orc_unwind,
>> +		.attributes = HEKI_ATTR_MEM_NOWRITE,
>> +	},
>> +	{
>> +		.va_start = orc_lookup,
>> +		.va_end = orc_lookup_end,
>> +		.attributes = HEKI_ATTR_MEM_NOWRITE,
>> +	},
>> +#endif /* CONFIG_UNWINDER_ORC */
>> +};
>> +
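
As an illustration of the "check of some sort" asked for above, here is a minimal user-space sketch of a table sanity check. It is not part of the patch. All `sketch_` names are hypothetical, plain integers stand in for the kernel's section symbols, and a 4 KiB page size is assumed. Treating unaligned bounds as an error is also an assumption, made only because the protection discussed in this thread is applied per page frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct heki_va_range, using plain integers. */
struct sketch_range {
	uint64_t start;      /* inclusive */
	uint64_t end;        /* exclusive */
	uint64_t attributes;
};

/*
 * Return the index of the first suspicious entry, or -1 if the table looks
 * sane. An entry is flagged when it is empty or inverted, when either bound
 * is not page aligned, or when it overlaps an earlier entry.
 */
static int sketch_check_ranges(const struct sketch_range *r, size_t n)
{
	assert(r != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (r[i].start >= r[i].end)
			return (int)i;
		if (r[i].start % SKETCH_PAGE_SIZE || r[i].end % SKETCH_PAGE_SIZE)
			return (int)i;
		for (size_t j = 0; j < i; j++) {
			/* Two half-open ranges overlap iff each starts before the other ends. */
			if (r[i].start < r[j].end && r[j].start < r[i].end)
				return (int)i;
		}
	}
	return -1;
}
```

A check like this only catches malformed entries. It cannot detect a section that was added to the linker script but never added to the table, which is the harder part of the concern raised in the review.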
On 17/05/2023 14:47, Madhavan T. Venkataraman wrote:
> Sorry for the delay. See inline...
>
> On 5/8/23 12:29, Wei Liu wrote:
>> On Fri, May 05, 2023 at 05:20:40PM +0200, Mickaël Salaün wrote:
>>> From: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
>>>
>>> Hypervisor Enforced Kernel Integrity (Heki) is a feature that will use
>>> the hypervisor to enhance guest virtual machine security.
>>>
>>> Configuration
>>> =============
>>>
>>> Define the config variables for the feature. This feature depends on
>>> support from the architecture as well as the hypervisor.
>>>
>>> Enabling HEKI
>>> =============
>>>
>>> Define a kernel command line parameter "heki" to turn the feature on or
>>> off. By default, Heki is on.
>>
>> For such a newfangled feature can we have it off by default? Especially
>> when there are unsolved issues around dynamically loaded code.
>>
>
> Yes. We can certainly do that.

By default the Kconfig option should definitely be off. We also need to
change the Kconfig option to only be set if kernel modules, JIT, kprobes
and other dynamic text change features are disabled at build time (see
discussion with Sean).

With this new Kconfig option for the static case, I think the boot
option should be on by default because otherwise it would not really be
possible to switch back to on later without risking silently breaking
users' machines. However, we should rename this option to something like
"heki_static" to be in line with the new Kconfig option.

The goal of Heki is to improve and complement kernel self-protection
mechanisms (which don't have boot time options), and to make it
available to everyone, see
https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings
In practice, it would then be kind of useless to be required to set a
boot option to enable Heki (rather than to disable it).

>
>>>
>> [...]
>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>>> index 3604074a878b..5cf5a7a97811 100644
>>> --- a/arch/x86/Kconfig
>>> +++ b/arch/x86/Kconfig
>>> @@ -297,6 +297,7 @@ config X86
>>>  	select FUNCTION_ALIGNMENT_4B
>>>  	imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI
>>>  	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
>>> +	select ARCH_SUPPORTS_HEKI if X86_64
>>
>> Why is there a restriction on X86_64?
>>
>
> We want to get the PoC working and reviewed on X64 first. We have tested this only on X64 so far.

X86_64 includes Intel CPUs, which can support EPT and MBEC, which are a
requirement for Heki. ARM might have similar features but we're focused
on x86 for now.

As a side note, I only have access to an Intel machine, which means that
I cannot work on AMD support. However, I'll be pleased to implement such
support if I get access to a machine with a recent AMD CPU.

>
>>>
>>>  config INSTRUCTION_DECODER
>>>  	def_bool y
>>> diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
>>> index a6e8373a5170..42ef1e33b8a5 100644
>>> --- a/arch/x86/include/asm/sections.h
>>> +++ b/arch/x86/include/asm/sections.h
>> [...]
>>>
>>> +#ifdef CONFIG_HEKI
>>> +
>>> +/*
>>> + * Gather all of the statically defined sections so heki_late_init() can
>>> + * protect these sections in the host page table.
>>> + *
>>> + * The sections are defined under "SECTIONS" in vmlinux.lds.S
>>> + * Keep this array in sync with SECTIONS.
>>> + */
>>
>> This seems a bit fragile, because it requires constant attention from
>> people who care about this functionality. Can this table be
>> automatically generated?
>>
>
> We realize that. But I don't know of a way this can be automatically generated. Also, the permissions for
> each section are specific to the use of that section. The developer who introduces a new section is the
> one who will know what the permissions should be.
>
> If anyone has any ideas of how we can generate this table automatically or even just add a build-time check
> of some sort, please let us know.

One clean solution might be to parse the vmlinux.lds.S file, extract
sections and their permissions, and fill that into an automatically
generated header file.

Another way to do it would be to extract sections and associated
permissions with objdump, but that could be an issue because of longer
build time.

A better solution would be to extract such sections and associated
permissions at boot time. I guess the kernel already has such helpers
used in early boot.
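
To make the "automatically generated header" idea concrete, here is a minimal user-space sketch of one shape such a header could take: a single macro list from which both the table and its entry count are derived. This is an assumption about a possible design, not something proposed in the thread. All `sketch_` and `SKETCH_` names are hypothetical, and two dummy buffers stand in for the linker-provided section bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ATTR_NOWRITE (1ULL << 0)
#define SKETCH_ATTR_EXEC    (1ULL << 1)

/* Dummy buffers standing in for linker-provided section bounds. */
static char sketch_text[64];
static char sketch_rodata[32];

/*
 * The single list a generator could emit: name, start, end, attributes.
 * Adding a section here updates the table and the count together.
 */
#define SKETCH_SECTIONS(X)                                         \
	X(text, sketch_text, sketch_text + sizeof(sketch_text),    \
	  SKETCH_ATTR_NOWRITE | SKETCH_ATTR_EXEC)                  \
	X(rodata, sketch_rodata,                                   \
	  sketch_rodata + sizeof(sketch_rodata), SKETCH_ATTR_NOWRITE)

struct sketch_va_range {
	const char *name;
	void *va_start;
	void *va_end;
	uint64_t attributes;
};

/* Expand the list once into table entries... */
#define SKETCH_ENTRY(n, s, e, a) { #n, (s), (e), (a) },
static struct sketch_va_range sketch_ranges[] = {
	SKETCH_SECTIONS(SKETCH_ENTRY)
};

/* ...and once more into a compile-time entry count. */
#define SKETCH_COUNT(n, s, e, a) + 1
enum { SKETCH_NUM_SECTIONS = 0 SKETCH_SECTIONS(SKETCH_COUNT) };

static_assert(sizeof(sketch_ranges) / sizeof(sketch_ranges[0]) ==
		      SKETCH_NUM_SECTIONS,
	      "table and section list are out of sync");

/* Look up an entry by section name; returns NULL if it is not listed. */
static const struct sketch_va_range *sketch_lookup(const char *name)
{
	assert(name != NULL);

	for (size_t i = 0; i < (size_t)SKETCH_NUM_SECTIONS; i++) {
		if (strcmp(sketch_ranges[i].name, name) == 0)
			return &sketch_ranges[i];
	}
	return NULL;
}
```

The list itself would still have to come from somewhere (a linker-script parser, objdump output, or a boot-time walk, as discussed above). The sketch only shows how one generated list can feed every consumer so they cannot drift apart.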
diff --git a/Kconfig b/Kconfig
index 745bc773f567..0c844d9bcb03 100644
--- a/Kconfig
+++ b/Kconfig
@@ -29,4 +29,6 @@ source "lib/Kconfig"
 
 source "lib/Kconfig.debug"
 
+source "virt/heki/Kconfig"
+
 source "Documentation/Kconfig"
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3604074a878b..5cf5a7a97811 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -297,6 +297,7 @@ config X86
 	select FUNCTION_ALIGNMENT_4B
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI
 	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
+	select ARCH_SUPPORTS_HEKI if X86_64
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
index a6e8373a5170..42ef1e33b8a5 100644
--- a/arch/x86/include/asm/sections.h
+++ b/arch/x86/include/asm/sections.h
@@ -18,6 +18,10 @@ extern char __end_of_kernel_reserve[];
 
 extern unsigned long _brk_start, _brk_end;
 
+extern int __start_orc_unwind_ip[], __stop_orc_unwind_ip[];
+extern struct orc_entry __start_orc_unwind[], __stop_orc_unwind[];
+extern unsigned int orc_lookup[], orc_lookup_end[];
+
 static inline bool arch_is_kernel_initmem_freed(unsigned long addr)
 {
 	/*
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 88188549647c..f0ddaf24ab63 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -11,6 +11,7 @@
 #include <linux/dma-map-ops.h>
 #include <linux/dmi.h>
 #include <linux/efi.h>
+#include <linux/heki.h>
 #include <linux/ima.h>
 #include <linux/init_ohci1394_dma.h>
 #include <linux/initrd.h>
@@ -850,6 +851,54 @@ static void __init x86_report_nx(void)
 	}
 }
 
+#ifdef CONFIG_HEKI
+
+/*
+ * Gather all of the statically defined sections so heki_late_init() can
+ * protect these sections in the host page table.
+ *
+ * The sections are defined under "SECTIONS" in vmlinux.lds.S
+ * Keep this array in sync with SECTIONS.
+ */
+struct heki_va_range __initdata heki_va_ranges[] = {
+	{
+		.va_start = _stext,
+		.va_end = _etext,
+		.attributes = HEKI_ATTR_MEM_NOWRITE | HEKI_ATTR_MEM_EXEC,
+	},
+	{
+		.va_start = __start_rodata,
+		.va_end = __end_rodata,
+		.attributes = HEKI_ATTR_MEM_NOWRITE,
+	},
+#ifdef CONFIG_UNWINDER_ORC
+	{
+		.va_start = __start_orc_unwind_ip,
+		.va_end = __stop_orc_unwind_ip,
+		.attributes = HEKI_ATTR_MEM_NOWRITE,
+	},
+	{
+		.va_start = __start_orc_unwind,
+		.va_end = __stop_orc_unwind,
+		.attributes = HEKI_ATTR_MEM_NOWRITE,
+	},
+	{
+		.va_start = orc_lookup,
+		.va_end = orc_lookup_end,
+		.attributes = HEKI_ATTR_MEM_NOWRITE,
+	},
+#endif /* CONFIG_UNWINDER_ORC */
+};
+
+void __init heki_arch_init(void)
+{
+	heki.num_static_ranges = ARRAY_SIZE(heki_va_ranges);
+	heki.static_ranges =
+		heki_alloc_pa_ranges(heki_va_ranges, heki.num_static_ranges);
+}
+
+#endif /* CONFIG_HEKI */
+
 /*
  * Determine if we were loaded by an EFI loader. If so, then we have also been
  * passed the efi memmap, systab, etc., so we should use these data structures
diff --git a/include/linux/heki.h b/include/linux/heki.h
new file mode 100644
index 000000000000..e4a3192ba687
--- /dev/null
+++ b/include/linux/heki.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Hypervisor Enforced Kernel Integrity (Heki) - Headers
+ *
+ * Copyright © 2023 Microsoft Corporation
+ */
+
+#ifndef __HEKI_H__
+#define __HEKI_H__
+
+#ifdef CONFIG_HEKI
+
+#include <linux/kvm_types.h>
+
+/* Heki attributes for memory pages. */
+/* clang-format off */
+#define HEKI_ATTR_MEM_NOWRITE		(1ULL << 0)
+#define HEKI_ATTR_MEM_EXEC		(1ULL << 1)
+/* clang-format on */
+
+/*
+ * heki_va_range is used to specify a virtual address range within the kernel
+ * address space along with their attributes.
+ */
+struct heki_va_range {
+	void *va_start;
+	void *va_end;
+	u64 attributes;
+};
+
+/*
+ * heki_pa_range is passed to the VMM or hypervisor so it can be processed by
+ * the VMM or the hypervisor based on range attributes. Examples of ranges:
+ *
+ *	- a range whose permissions need to be set in the host page table
+ *	- a range that contains information needed for authentication
+ *
+ * When an array of these is passed to the Hypervisor or VMM, the array
+ * must be in physically contiguous memory.
+ */
+struct heki_pa_range {
+	gfn_t gfn_start;
+	gfn_t gfn_end;
+	u64 attributes;
+};
+
+/*
+ * A hypervisor that supports Heki will instantiate this structure to
+ * provide hypervisor specific functions for Heki.
+ */
+struct heki_hypervisor {
+	int (*protect_ranges)(struct heki_pa_range *ranges, int num_ranges);
+	int (*lock_crs)(void);
+};
+
+/*
+ * If the architecture supports Heki, it will initialize static_ranges in
+ * early boot.
+ *
+ * If the active hypervisor supports Heki, it will plug its heki_hypervisor
+ * pointer into this heki structure.
+ */
+struct heki {
+	struct heki_pa_range *static_ranges;
+	int num_static_ranges;
+	struct heki_hypervisor *hypervisor;
+};
+
+extern struct heki heki;
+
+void heki_early_init(void);
+void heki_arch_init(void);
+void heki_late_init(void);
+
+struct heki_pa_range *heki_alloc_pa_ranges(struct heki_va_range *va_ranges,
+					   int num_ranges);
+void heki_free_pa_ranges(struct heki_pa_range *pa_ranges, int num_ranges);
+
+#else /* !CONFIG_HEKI */
+
+static inline void heki_early_init(void)
+{
+}
+static inline void heki_late_init(void)
+{
+}
+
+#endif /* CONFIG_HEKI */
+
+#endif /* __HEKI_H__ */
diff --git a/init/main.c b/init/main.c
index e1c3911d7c70..8649dbb07f18 100644
--- a/init/main.c
+++ b/init/main.c
@@ -102,6 +102,7 @@
 #include <linux/stackdepot.h>
 #include <linux/randomize_kstack.h>
 #include <net/net_namespace.h>
+#include <linux/heki.h>
 
 #include <asm/io.h>
 #include <asm/bugs.h>
@@ -999,6 +1000,7 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
 	sort_main_extable();
 	trap_init();
 	mm_init();
+	heki_early_init();
 
 	poking_init();
 	ftrace_init();
@@ -1530,6 +1532,7 @@ static int __ref kernel_init(void *unused)
 	exit_boot_config();
 	free_initmem();
 	mark_readonly();
+	heki_late_init();
 
 	/*
 	 * Kernel mappings are now finalized - update the userspace page-table
diff --git a/virt/Makefile b/virt/Makefile
index 1cfea9436af9..4550dc624466 100644
--- a/virt/Makefile
+++ b/virt/Makefile
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-y	+= lib/
+obj-$(CONFIG_HEKI) += heki/
diff --git a/virt/heki/Kconfig b/virt/heki/Kconfig
new file mode 100644
index 000000000000..9858a827fe17
--- /dev/null
+++ b/virt/heki/Kconfig
@@ -0,0 +1,22 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Hypervisor Enforced Kernel Integrity (HEKI)
+#
+
+config HEKI
+	bool "Hypervisor Enforced Kernel Integrity (Heki)"
+	default y
+	depends on !JUMP_LABEL && ARCH_SUPPORTS_HEKI
+	select KVM_EXTERNAL_WRITE_TRACKING if KVM
+	help
+	  This feature enhances guest virtual machine security by taking
+	  advantage of security features provided by the hypervisor for guests.
+	  This feature is helpful in maintaining guest virtual machine security
+	  even after the guest kernel has been compromised.
+
+config ARCH_SUPPORTS_HEKI
+	bool "Architecture support for HEKI"
+	help
+	  An architecture should select this when it can successfully build
+	  and run with CONFIG_HEKI. That is, it should provide all of the
+	  architecture support required for the HEKI feature.
diff --git a/virt/heki/Makefile b/virt/heki/Makefile
new file mode 100644
index 000000000000..2bc2061c9dfc
--- /dev/null
+++ b/virt/heki/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-y += heki.o
diff --git a/virt/heki/heki.c b/virt/heki/heki.c
new file mode 100644
index 000000000000..c8cb1b84cceb
--- /dev/null
+++ b/virt/heki/heki.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Hypervisor Enforced Kernel Integrity (Heki) - Common code
+ *
+ * Copyright © 2023 Microsoft Corporation
+ */
+
+#include <linux/cache.h>
+#include <linux/heki.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/printk.h>
+#include <linux/types.h>
+#include <linux/vmalloc.h>
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+
+#define pr_fmt(fmt) "heki-guest: " fmt
+
+static bool heki_enabled __ro_after_init = true;
+
+struct heki heki = {};
+
+struct heki_pa_range *heki_alloc_pa_ranges(struct heki_va_range *va_ranges,
+					   int num_ranges)
+{
+	struct heki_pa_range *pa_ranges, *pa_range;
+	struct heki_va_range *va_range;
+	u64 attributes;
+	size_t size;
+	int i;
+
+	size = PAGE_ALIGN(sizeof(struct heki_pa_range) * num_ranges);
+	pa_ranges = alloc_pages_exact(size, GFP_KERNEL);
+	if (!pa_ranges)
+		return NULL;
+
+	for (i = 0; i < num_ranges; i++) {
+		va_range = &va_ranges[i];
+		pa_range = &pa_ranges[i];
+
+		pa_range->gfn_start = PFN_DOWN(__pa_symbol(va_range->va_start));
+		pa_range->gfn_end = PFN_UP(__pa_symbol(va_range->va_end)) - 1;
+		pa_range->attributes = va_range->attributes;
+
+		/*
+		 * WARNING:
+		 * Leaks addresses, should only be kept for development.
+		 */
+		attributes = pa_range->attributes;
+		pr_warn("Configuring GFN 0x%llx-0x%llx with %s\n",
+			pa_range->gfn_start, pa_range->gfn_end,
+			(attributes & HEKI_ATTR_MEM_NOWRITE) ? "[nowrite]" :
+							       "");
+	}
+
+	return pa_ranges;
+}
+
+void heki_free_pa_ranges(struct heki_pa_range *pa_ranges, int num_ranges)
+{
+	size_t size;
+
+	size = PAGE_ALIGN(sizeof(struct heki_pa_range) * num_ranges);
+	free_pages_exact(pa_ranges, size);
+}
+
+void __init heki_early_init(void)
+{
+	if (!heki_enabled) {
+		pr_warn("Disabled\n");
+		return;
+	}
+	pr_warn("Enabled\n");
+
+	heki_arch_init();
+}
+
+void heki_late_init(void)
+{
+	struct heki_hypervisor *hypervisor = heki.hypervisor;
+	int ret;
+
+	if (!heki_enabled)
+		return;
+
+	if (!heki.static_ranges) {
+		pr_warn("Architecture did not initialize static ranges\n");
+		return;
+	}
+
+	/*
+	 * Hypervisor support will be added in the future. When it is, the
+	 * hypervisor will be used to protect guest kernel memory and
+	 * control registers.
+	 */
+
+	if (!hypervisor) {
+		/* This happens for kernels running on bare metal as well. */
+		pr_warn("No hypervisor support\n");
+		goto out;
+	}
+
+	/* Protects statically defined sections in the host page table. */
+	ret = hypervisor->protect_ranges(heki.static_ranges,
+					 heki.num_static_ranges);
+	if (WARN(ret, "Failed to protect static sections: %d\n", ret))
+		goto out;
+	pr_warn("Static sections protected\n");
+
+	/*
+	 * Locks control registers so a compromised guest cannot change
+	 * them.
+	 */
+	ret = hypervisor->lock_crs();
+	if (WARN(ret, "Failed to lock control registers: %d\n", ret))
+		goto out;
+	pr_warn("Control registers locked\n");
+
+out:
+	heki_free_pa_ranges(heki.static_ranges, heki.num_static_ranges);
+	heki.static_ranges = NULL;
+	heki.num_static_ranges = 0;
+}
+
+static int __init heki_parse_config(char *str)
+{
+	if (strtobool(str, &heki_enabled))
+		pr_warn("Invalid option string for heki: '%s'\n", str);
+	return 1;
+}
+
+__setup("heki=", heki_parse_config);
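
The address arithmetic in heki_alloc_pa_ranges() can be worked through in user space. The sketch below mirrors the patch's `PFN_DOWN(start)` and `PFN_UP(end) - 1` computation on plain integers, which yields an inclusive frame range. The `sketch_` names are hypothetical, a 4 KiB page size is assumed, and the physical addresses are supplied directly rather than obtained through `__pa_symbol()`.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Same rounding as the kernel's PFN_DOWN() and PFN_UP() macros. */
static uint64_t sketch_pfn_down(uint64_t pa)
{
	return pa >> SKETCH_PAGE_SHIFT;
}

static uint64_t sketch_pfn_up(uint64_t pa)
{
	return (pa + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

struct sketch_gfn_range {
	uint64_t gfn_start; /* first frame touched by the range */
	uint64_t gfn_end;   /* last frame touched by the range (inclusive) */
};

/* Convert a half-open physical range [pa_start, pa_end) to inclusive frames. */
static struct sketch_gfn_range sketch_to_gfns(uint64_t pa_start, uint64_t pa_end)
{
	struct sketch_gfn_range r;

	assert(pa_end > pa_start);
	r.gfn_start = sketch_pfn_down(pa_start);
	r.gfn_end = sketch_pfn_up(pa_end) - 1;
	return r;
}

/* Number of frames covered by an inclusive frame range. */
static uint64_t sketch_num_pages(struct sketch_gfn_range r)
{
	return r.gfn_end - r.gfn_start + 1;
}
```

For example, the range 0x1000 to 0x3000 maps to frames 1 through 2. An unaligned range such as 0x1800 to 0x2800 also maps to frames 1 through 2, so the frames cover bytes on either side of the range itself. That is worth keeping in mind for sections whose bounds are not page aligned.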