From patchwork Tue Jul 28 13:10:47 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Madhavan T. Venkataraman"
X-Patchwork-Id: 11689141
From: madvenka@linux.microsoft.com
To: kernel-hardening@lists.openwall.com, linux-api@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-integrity@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-security-module@vger.kernel.org, oleg@redhat.com, x86@kernel.org,
 madvenka@linux.microsoft.com
Subject: [PATCH v1 1/4] [RFC] fs/trampfd: Implement the trampoline file descriptor API
Date: Tue, 28 Jul 2020 08:10:47 -0500
Message-Id: <20200728131050.24443-2-madvenka@linux.microsoft.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200728131050.24443-1-madvenka@linux.microsoft.com>
References: <20200728131050.24443-1-madvenka@linux.microsoft.com>

From: "Madhavan T. Venkataraman"

There are many applications that use trampoline code. Trampoline code is
usually placed in a data page or a stack page. In order to execute a
trampoline, the page that contains the trampoline needs to have execute
permissions.

Writable pages with execute permissions provide an attack surface for
hackers. To mitigate this, LSMs such as SELinux may prevent a page from
having both write and execute permissions.

An application may attempt to circumvent this by writing the trampoline
code into a temporary file and mapping the file into its process address
space with just execute permissions. This presents the same opportunity to
hackers as before. LSMs that implement cryptographic verification of files
can prevent such temporary files from being mapped.

Such security mitigations prevent genuine trampoline code from running as
well.

Typically, trampolines simply load some values in some registers and/or
push some values on the stack and jump to a target PC. For such simple
trampolines, an application could request the kernel to do that work
instead of executing trampoline code to do that work.

trampfd allows applications to do exactly this. Such applications can then
run without having to relax security settings for them.

For instance, libffi trampolines can easily be replaced by trampfd. libffi
is used by a variety of applications.
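To illustrate why such simple trampolines can be described by data rather
than by generated code, here is a minimal user-space sketch. It is not part
of this patch and does not use the trampfd API. The names tramp_model,
tramp_model_invoke() and add_base() are hypothetical and exist only for this
illustration. The model assumes that a simple trampoline amounts to "supply
some context values, then transfer control to a target":

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not part of this patch: a simple trampoline is
 * reduced to a context value plus a target to transfer control to.
 */
struct tramp_model {
	long (*target)(void *ctx, long arg);	/* stands in for the target PC */
	void *ctx;				/* stands in for a register value */
};

/* What the trampoline does: supply the context, then call the target. */
static long tramp_model_invoke(const struct tramp_model *t, long arg)
{
	assert(t != NULL && t->target != NULL);
	return t->target(t->ctx, arg);
}

/* Example target: adds a per-trampoline base value to its argument. */
static long add_base(void *ctx, long arg)
{
	return *(long *)ctx + arg;
}
```

Since everything the trampoline does is captured by the two fields of the
structure, no executable code needs to be generated at run time. This is
the kind of information that the register and stack contexts described
below hand to the kernel.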

trampfd_create() system call
----------------------------

A new system call is introduced to create a trampoline. The system call
number for this is 440. The system call is invoked like this:

	int	trampfd;

	trampfd = syscall(440, type, data, flags);

	type	Trampoline type.
	data	Trampoline type-specific data.
	flags	Reserved for future use. Must be 0.

Types of trampolines
--------------------

Different types of trampolines can be defined based on the desired
functionality. In this initial work, the following type is defined:

	TRAMPFD_USER

This implements the simple trampoline type I referred to earlier. The
type-specific structure for TRAMPFD_USER is struct trampfd_user.

Trampoline contexts
-------------------

A trampoline can have one or more contexts associated with it. Contexts
are of two kinds:

	- Contexts that can be specified by the user. These can be added,
	  retrieved and removed by user code.

	- Contexts that are specified by the kernel. These can only be added
	  by the kernel, but they can be read by the user.

In this initial work, I define the following contexts:

User specifiable:

	Register Context
	----------------
	Contains register name-value pairs. When a trampoline is invoked,
	the specified values are loaded in the specified registers. This
	includes the value of the PC register. The kernel defines the
	subset of registers that can be specified.

	Stack Context
	-------------
	Contains data to push on the user stack when a trampoline is
	invoked.

	Allowed PCs
	-----------
	This specifies a list of PCs that the trampoline is allowed to jump
	to. This prevents a hacker from changing the trampoline's target PC
	to an address that is not in the list.

Kernel specified:

	Mapping parameters
	------------------
	Used to map a trampoline into an address space. Mapping parameters
	are determined by the kernel based on the trampoline type and
	type-specific information.

Other contexts can be defined in the future.

How to set and read contexts
----------------------------

A symbolic file offset is associated with each context type.

	TRAMPFD_MAP_OFFSET
	TRAMPFD_REGS_OFFSET
	TRAMPFD_STACK_OFFSET
	TRAMPFD_ALLOWED_PCS_OFFSET

A structure is defined for each context type as well:

	struct trampfd_map
	struct trampfd_regs
	struct trampfd_stack
	struct trampfd_values

To set/retrieve a context, seek to the corresponding offset and
write()/read() the corresponding structure. As a convenience, pread() and
pwrite() can be used so it can be done in one call instead of two. The
mapping parameters can only be read, and the list of allowed PCs can only
be written.

Invoking a trampoline
---------------------

Map the file descriptor into process address space using mmap(). The kernel
returns an address to invoke the trampoline with. The protection for the
mapping is set to PROT_NONE.

Execute the trampoline in one of two ways depending upon what the target PC
points to:

	- Branch to the trampoline address.

	- Use the trampoline address as a function pointer and call it.

Because the user process does not have execute permissions on the
trampoline address, it traps into the kernel. The kernel recognizes it as a
trampoline invocation and performs the action indicated by the trampoline's
type and context.

In the case of TRAMPFD_USER, the kernel loads the user registers with the
values specified in the register context, pushes the values specified in
the stack context on the user stack and sets the user PC to point to the PC
register value in the register context. Then, the process returns to user
land and continues execution at the target PC.

Removing a context
------------------

To remove a context, write the context structure into trampfd but specify a
zero context. For example, for register context, specify the number of
registers as 0. For stack context, specify size of stack data as 0.

Removing a trampoline
---------------------

To remove a trampoline, unmap it and close the file descriptor. When the
last reference on the trampoline goes away, the trampoline is freed.

Sharing trampolines
-------------------

A trampoline created by one thread can be used by other threads sharing the
same address space.

Trampolines, in general, may be shared across processes by the usual
mechanism of sending the file descriptor to another process over a Unix
domain socket.

Architecture support
--------------------

The handling of the trampoline page fault and the setting up of the
register and stack contexts are architecture specific. Architecture
specific patches will implement support for the API.

The signal delivery code in the kernel already implements the elements
needed for this work. That will be leveraged.

Signed-off-by: Madhavan T. Venkataraman
---
 fs/Makefile                       |   1 +
 fs/trampfd/Makefile               |   6 ++
 fs/trampfd/trampfd_data.c         |  43 ++++++++
 fs/trampfd/trampfd_fops.c         | 131 +++++++++++++++++++++++
 fs/trampfd/trampfd_map.c          |  78 ++++++++++++++
 fs/trampfd/trampfd_pcs.c          |  95 +++++++++++++++++
 fs/trampfd/trampfd_regs.c         | 137 ++++++++++++++++++++++++
 fs/trampfd/trampfd_stack.c        | 131 +++++++++++++++++++++++
 fs/trampfd/trampfd_stubs.c        |  41 +++++++
 fs/trampfd/trampfd_syscall.c      |  92 ++++++++++++++++
 include/linux/syscalls.h          |   3 +
 include/linux/trampfd.h           |  82 ++++++++++++++
 include/uapi/asm-generic/unistd.h |   4 +-
 include/uapi/linux/trampfd.h      | 171 ++++++++++++++++++++++++++++++
 init/Kconfig                      |   8 ++
 kernel/sys_ni.c                   |   3 +
 16 files changed, 1025 insertions(+), 1 deletion(-)
 create mode 100644 fs/trampfd/Makefile
 create mode 100644 fs/trampfd/trampfd_data.c
 create mode 100644 fs/trampfd/trampfd_fops.c
 create mode 100644 fs/trampfd/trampfd_map.c
 create mode 100644 fs/trampfd/trampfd_pcs.c
 create mode 100644 fs/trampfd/trampfd_regs.c
 create mode 100644 fs/trampfd/trampfd_stack.c
 create mode 100644 fs/trampfd/trampfd_stubs.c
 create mode 100644 fs/trampfd/trampfd_syscall.c
 create mode 100644 include/linux/trampfd.h
 create mode 100644 include/uapi/linux/trampfd.h

diff --git a/fs/Makefile b/fs/Makefile
index 2ce5112b02c8..227761302000 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -136,3 +136,4 @@ obj-$(CONFIG_EFIVAR_FS) += efivarfs/
 obj-$(CONFIG_EROFS_FS) += erofs/
 obj-$(CONFIG_VBOXSF_FS) += vboxsf/
obj-$(CONFIG_ZONEFS_FS) += zonefs/ +obj-$(CONFIG_TRAMPFD) += trampfd/ diff --git a/fs/trampfd/Makefile b/fs/trampfd/Makefile new file mode 100644 index 000000000000..bdf5e487facc --- /dev/null +++ b/fs/trampfd/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-$(CONFIG_TRAMPFD) += trampfd.o + +trampfd-y += trampfd_data.o trampfd_fops.o trampfd_map.o trampfd_pcs.o +trampfd-y += trampfd_regs.o trampfd_stack.o trampfd_stubs.o trampfd_syscall.o diff --git a/fs/trampfd/trampfd_data.c b/fs/trampfd/trampfd_data.c new file mode 100644 index 000000000000..0a316754cbe4 --- /dev/null +++ b/fs/trampfd/trampfd_data.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - Trampoline type-specific code. + * + * Author: Madhavan T. Venkataraman (madvenka@microsoft.com) + * + * Copyright (C) 2020 Microsoft Corporation. + */ + +#include +#include +#include +#include +#include + +int trampfd_create_data(struct trampfd *trampfd, const void __user *tramp_data) +{ + struct trampfd_map *map = &trampfd->map; + struct trampfd_user *user; + + if (trampfd->type == TRAMPFD_USER) { + user = kmalloc(sizeof(*user), GFP_KERNEL); + if (!user) + return -ENOMEM; + + if (copy_from_user(user, tramp_data, sizeof(*user))) { + kfree(user); + return -EFAULT; + } + if (user->flags || user->reserved) { + kfree(user); + return -EINVAL; + } + trampfd->data = user; + + map->size = PAGE_SIZE; + map->prot = PROT_NONE; + map->flags = MAP_PRIVATE; + map->offset = 0; + map->ioffset = 0; + } + return 0; +} diff --git a/fs/trampfd/trampfd_fops.c b/fs/trampfd/trampfd_fops.c new file mode 100644 index 000000000000..94b82e0da75b --- /dev/null +++ b/fs/trampfd/trampfd_fops.c @@ -0,0 +1,131 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - File operations. + * + * Author: Madhavan T. Venkataraman (madvenka@microsoft.com) + * + * Copyright (C) 2020 Microsoft Corporation. 
+ */ + +#include +#include +#include +#include +#include + +#ifdef CONFIG_PROC_FS +static const char * const trampfd_type_names[TRAMPFD_NUM_TYPES] = { + "TRAMPFD_USER", +}; + +static void trampfd_show_fdinfo(struct seq_file *sfile, struct file *file) +{ + struct trampfd *trampfd = file->private_data; + + seq_printf(sfile, "type: %s\n", trampfd_type_names[trampfd->type]); +} +#endif + +static loff_t trampfd_llseek(struct file *file, loff_t offset, int whence) +{ + struct trampfd *trampfd = file->private_data; + + if (whence != SEEK_SET) + return -EINVAL; + + if ((offset < 0) || (offset >= TRAMPFD_NUM_OFFSETS)) + return -EINVAL; + + mutex_lock(&trampfd->lock); + if (offset != file->f_pos) { + file->f_pos = offset; + file->f_version = 0; + } + mutex_unlock(&trampfd->lock); + return offset; +} + +static ssize_t trampfd_read(struct file *file, char __user *arg, + size_t count, loff_t *ppos) +{ + int rc; + + if (!arg || !count) + return -EINVAL; + + switch (*ppos) { + case TRAMPFD_MAP_OFFSET: + rc = trampfd_get_map(file, arg, count); + break; + + case TRAMPFD_REGS_OFFSET: + rc = trampfd_get_regs(file, arg, count); + break; + + case TRAMPFD_STACK_OFFSET: + rc = trampfd_get_stack(file, arg, count); + break; + + default: + rc = -EINVAL; + goto out; + } +out: + return rc ? rc : (ssize_t) count; +} + +static ssize_t trampfd_write(struct file *file, const char __user *arg, + size_t count, loff_t *ppos) +{ + int rc; + + if (!arg || !count) + return -EINVAL; + + switch (*ppos) { + case TRAMPFD_REGS_OFFSET: + rc = trampfd_set_regs(file, arg, count); + break; + + case TRAMPFD_STACK_OFFSET: + rc = trampfd_set_stack(file, arg, count); + break; + + case TRAMPFD_ALLOWED_PCS_OFFSET: + rc = trampfd_set_allowed_pcs(file, arg, count); + break; + + default: + rc = -EINVAL; + goto out; + } +out: + return rc ? 
rc : (ssize_t) count; +} + +static int trampfd_release(struct inode *inode, struct file *file) +{ + struct trampfd *trampfd = file->private_data; + + if (trampfd->type == TRAMPFD_USER) { + kfree(trampfd->regs); + kfree(trampfd->stack); + kfree(trampfd->allowed_pcs); + } + kfree(trampfd->data); + mutex_destroy(&trampfd->lock); + kmem_cache_free(trampfd_cache, trampfd); + return 0; +} + +const struct file_operations trampfd_fops = { +#ifdef CONFIG_PROC_FS + .show_fdinfo = trampfd_show_fdinfo, +#endif + .llseek = trampfd_llseek, + .read = trampfd_read, + .write = trampfd_write, + .release = trampfd_release, + .mmap = trampfd_mmap, + .get_unmapped_area = trampfd_get_unmapped_area, +}; diff --git a/fs/trampfd/trampfd_map.c b/fs/trampfd/trampfd_map.c new file mode 100644 index 000000000000..1a156c850ca8 --- /dev/null +++ b/fs/trampfd/trampfd_map.c @@ -0,0 +1,78 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - Memory mapping. + * + * Author: Madhavan T. Venkataraman (madvenka@microsoft.com) + * + * Copyright (C) 2020 Microsoft Corporation. + */ + +#include +#include +#include +#include +#include +#include + +int trampfd_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct trampfd *trampfd = file->private_data; + + if (trampfd->type == TRAMPFD_USER) { + /* + * These mappings are special mappings that should not be + * merged or inherited. No physical page is currently allocated + * to these mappings. So, there is nothing to read/write. + * When the trampoline is invoked, an execute fault must be + * encountered so the kernel can intercept the invocation and + * set up user context. 
+ */ + if (vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) + return -EINVAL; + vma->vm_flags = VM_SPECIAL | VM_DONTCOPY | VM_DONTDUMP; + } + vma->vm_private_data = trampfd; + return 0; +} + +unsigned long +trampfd_get_unmapped_area(struct file *file, unsigned long orig_addr, + unsigned long len, unsigned long pgoff, + unsigned long flags) +{ + struct trampfd *trampfd = file->private_data; + struct trampfd_map *map = &trampfd->map; + unsigned long map_pgoff = map->offset >> PAGE_SHIFT; + + const typeof_member(struct file_operations, get_unmapped_area) + get_area = current->mm->get_unmapped_area; + + if (len != map->size || pgoff != map_pgoff || (flags != map->flags)) + return -EINVAL; + + return get_area(file, orig_addr, len, pgoff, flags); +} + +/* + * Retrieve the mapping parameters of a trampoline. + */ +int trampfd_get_map(struct file *file, char __user *arg, size_t count) +{ + struct trampfd *trampfd = file->private_data; + + if (count != sizeof(trampfd->map)) + return -EINVAL; + if (copy_to_user(arg, &trampfd->map, count)) + return -EFAULT; + return 0; +} + +bool is_trampfd_vma(struct vm_area_struct *vma) +{ + struct file *file = vma->vm_file; + + if (!file) + return false; + return !strcmp(file->f_path.dentry->d_name.name, trampfd_name); +} +EXPORT_SYMBOL_GPL(is_trampfd_vma); diff --git a/fs/trampfd/trampfd_pcs.c b/fs/trampfd/trampfd_pcs.c new file mode 100644 index 000000000000..0ed36fd2169f --- /dev/null +++ b/fs/trampfd/trampfd_pcs.c @@ -0,0 +1,95 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - Allowed PCs context. + * + * Author: Madhavan T. Venkataraman (madvenka@microsoft.com) + * + * Copyright (C) 2020 Microsoft Corporation. + */ + +#include +#include +#include +#include + +/* + * Copy list of allowed PCs from the user and validate it. 
+ */ +static int trampfd_copy_allowed_pcs(struct trampfd_values *allowed_pcs, + const void __user *arg, size_t count) +{ + u32 npcs; + size_t size; + u64 *values; + int i; + + if (copy_from_user(allowed_pcs, arg, count)) + return -EFAULT; + + if (allowed_pcs->reserved) + return -EINVAL; + + npcs = allowed_pcs->nvalues; + if (npcs > TRAMPFD_MAX_PCS) + return -EINVAL; + + size = sizeof(*allowed_pcs); + size += npcs * sizeof(u64); + if (size != count) + return -EINVAL; + + values = allowed_pcs->values; + for (i = 0; i < npcs; i++) { + if (!values[i]) + return -EINVAL; + } + + return 0; +} + +/* + * Set the allowed PCs for a trampoline. If the trampoline has a register + * context at this point, the PC register value in that register context is + * not checked against this list of allowed PCs. + */ +int trampfd_set_allowed_pcs(struct file *file, const char __user *arg, + size_t count) +{ + struct trampfd *trampfd = file->private_data; + struct trampfd_values *allowed_pcs, *cur_allowed_pcs; + int rc; + + if (count < sizeof(*allowed_pcs) || count > TRAMPFD_MAX_PCS_SIZE) + return -EINVAL; + + allowed_pcs = kmalloc(count, GFP_KERNEL); + if (!allowed_pcs) + return -ENOMEM; + + rc = trampfd_copy_allowed_pcs(allowed_pcs, arg, count); + if (rc) + goto out; + + /* + * If number of PCs is 0, there is no new PCS to set. + */ + if (!allowed_pcs->nvalues) { + kfree(allowed_pcs); + allowed_pcs = NULL; + } + + /* + * Swap the new PCs with the current one and free the current one, + * if any. 
+ */ + mutex_lock(&trampfd->lock); + + cur_allowed_pcs = trampfd->allowed_pcs; + trampfd->allowed_pcs = allowed_pcs; + allowed_pcs = cur_allowed_pcs; + + mutex_unlock(&trampfd->lock); +out: + kfree(allowed_pcs); + return rc; +} diff --git a/fs/trampfd/trampfd_regs.c b/fs/trampfd/trampfd_regs.c new file mode 100644 index 000000000000..35114d647385 --- /dev/null +++ b/fs/trampfd/trampfd_regs.c @@ -0,0 +1,137 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - Register context. + * + * Author: Madhavan T. Venkataraman (madvenka@microsoft.com) + * + * Copyright (C) 2020 Microsoft Corporation. + */ + +#include +#include +#include +#include + +/* + * Copy context from the user and validate it. + */ +static int trampfd_copy_regs(struct trampfd_regs *regs, const void __user *arg, + size_t count) +{ + u32 nregs; + size_t size; + + if (copy_from_user(regs, arg, count)) + return -EFAULT; + + if (regs->reserved) + return -EINVAL; + + nregs = regs->nregs; + if (nregs > TRAMPFD_MAX_REGS) + return -EINVAL; + + size = sizeof(*regs); + size += nregs * sizeof(struct trampfd_reg); + if (size != count) + return -EINVAL; + + if (nregs && !trampfd_valid_regs(regs)) + return -EINVAL; + return 0; +} + +/* + * Set the register context for a trampoline. + */ +int trampfd_set_regs(struct file *file, const char __user *arg, size_t count) +{ + struct trampfd *trampfd = file->private_data; + struct trampfd_regs *regs, *cur_regs; + int rc; + + if (count < sizeof(*regs) || count > TRAMPFD_MAX_REGS_SIZE) + return -EINVAL; + + regs = kmalloc(count, GFP_KERNEL); + if (!regs) + return -ENOMEM; + + rc = trampfd_copy_regs(regs, arg, count); + if (rc) + goto out; + + /* + * If nregs is 0, there is no new register context to set. + */ + if (!regs->nregs) { + kfree(regs); + regs = NULL; + } + + /* + * Swap the new register context with the current one and free the + * current one, if any. + */ + mutex_lock(&trampfd->lock); + + /* + * Check if the specified PC is allowed. 
+ */ + if (!regs || trampfd_allowed_pc(trampfd, regs)) { + cur_regs = trampfd->regs; + trampfd->regs = regs; + regs = cur_regs; + } else { + rc = -EINVAL; + } + + mutex_unlock(&trampfd->lock); +out: + kfree(regs); + return rc; +} + +/* + * Retrieve the register context of a trampoline. + */ +int trampfd_get_regs(struct file *file, char __user *arg, size_t count) +{ + struct trampfd *trampfd = file->private_data; + struct trampfd_regs *regs, *cur_regs; + size_t size; + int rc = 0; + + if (count < sizeof(*regs) || count > TRAMPFD_MAX_REGS_SIZE) + return -EINVAL; + + regs = kmalloc(count, GFP_KERNEL); + if (!regs) + return -ENOMEM; + + mutex_lock(&trampfd->lock); + + /* + * Copy the current register context into a local buffer so we can + * copy it to the user outside the lock. + */ + cur_regs = trampfd->regs; + if (cur_regs) { + size = sizeof(*cur_regs); + size += sizeof(struct trampfd_reg) * cur_regs->nregs; + if (size > count) + size = count; + memcpy(regs, cur_regs, size); + } else { + size = sizeof(*regs); + memset(regs, 0, size); + } + + mutex_unlock(&trampfd->lock); + + if (copy_to_user(arg, regs, size)) + rc = -EFAULT; + + kfree(regs); + return rc; +} diff --git a/fs/trampfd/trampfd_stack.c b/fs/trampfd/trampfd_stack.c new file mode 100644 index 000000000000..032c5ed70d57 --- /dev/null +++ b/fs/trampfd/trampfd_stack.c @@ -0,0 +1,131 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - Stack context. + * + * Author: Madhavan T. Venkataraman (madvenka@microsoft.com) + * + * Copyright (C) 2020 Microsoft Corporation. + */ + +#include +#include +#include +#include + +/* + * Copy context from the user and validate it. 
+ */
+static int trampfd_copy_stack(struct trampfd_stack *stack,
+			      const void __user *arg, size_t count)
+{
+	size_t size;
+
+	if (copy_from_user(stack, arg, count))
+		return -EFAULT;
+
+	if (stack->reserved)
+		return -EINVAL;
+
+	size = stack->size;
+	if (size > TRAMPFD_MAX_DATA_SIZE)
+		return -EINVAL;
+
+	size += sizeof(*stack);
+	if (size != count)
+		return -EINVAL;
+
+	if (!stack->size)
+		return 0;
+
+	if ((stack->flags & ~TRAMPFD_SFLAGS) ||
+	    stack->offset > TRAMPFD_MAX_STACK_OFFSET)
+		return -EINVAL;
+	return 0;
+}
+
+/*
+ * Set the stack context for a trampoline.
+ */
+int trampfd_set_stack(struct file *file, const char __user *arg, size_t count)
+{
+	struct trampfd *trampfd = file->private_data;
+	struct trampfd_stack *stack, *cur_stack;
+	int rc;
+
+	if (count < sizeof(*stack) || count > TRAMPFD_MAX_STACK_SIZE)
+		return -EINVAL;
+
+	stack = kmalloc(count, GFP_KERNEL);
+	if (!stack)
+		return -ENOMEM;
+
+	rc = trampfd_copy_stack(stack, arg, count);
+	if (rc)
+		goto out;
+
+	/*
+	 * If size is 0, there is no new stack context to set.
+	 */
+	if (!stack->size) {
+		kfree(stack);
+		stack = NULL;
+	}
+
+	/*
+	 * Swap the new stack context with the current one and free the
+	 * current one, if any.
+	 */
+	mutex_lock(&trampfd->lock);
+
+	cur_stack = trampfd->stack;
+	trampfd->stack = stack;
+	stack = cur_stack;
+
+	mutex_unlock(&trampfd->lock);
+out:
+	kfree(stack);
+	return rc;
+}
+
+/*
+ * Retrieve the stack context of a trampoline.
+ */
+int trampfd_get_stack(struct file *file, char __user *arg, size_t count)
+{
+	struct trampfd *trampfd = file->private_data;
+	struct trampfd_stack *stack, *cur_stack;
+	size_t size;
+	int rc = 0;
+
+	if (count < sizeof(*stack) || count > TRAMPFD_MAX_STACK_SIZE)
+		return -EINVAL;
+
+	stack = kmalloc(count, GFP_KERNEL);
+	if (!stack)
+		return -ENOMEM;
+
+	mutex_lock(&trampfd->lock);
+
+	/*
+	 * Copy the current stack context into a local buffer so we can
+	 * copy it to the user outside the lock.
+ */ + cur_stack = trampfd->stack; + if (cur_stack) { + size = sizeof(*cur_stack) + cur_stack->size; + if (size > count) + size = count; + memcpy(stack, cur_stack, size); + } else { + size = sizeof(*stack); + memset(stack, 0, size); + } + + mutex_unlock(&trampfd->lock); + + if (copy_to_user(arg, stack, size)) + rc = -EFAULT; + + kfree(stack); + return rc; +} diff --git a/fs/trampfd/trampfd_stubs.c b/fs/trampfd/trampfd_stubs.c new file mode 100644 index 000000000000..8ca29dccbbf7 --- /dev/null +++ b/fs/trampfd/trampfd_stubs.c @@ -0,0 +1,41 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - Stub functions. + * + * Author: Madhavan T. Venkataraman (madvenka@microsoft.com) + * + * Copyright (C) 2020 Microsoft Corporation. + */ + +#include + +/* + * Stub for the arch function that checks if a trampoline type is supported + * by the architecture. Return an error for all types that require architecture + * support. Return success for the rest as they are generic. + */ +int __attribute__((weak)) trampfd_check_arch(struct trampfd *trampfd) +{ + if (trampfd->type == TRAMPFD_USER) + return -EINVAL; + return 0; +} + +/* + * Stub for the arch function that checks if a specified register context + * is valid. + */ +bool __attribute__((weak)) trampfd_valid_regs(struct trampfd_regs *regs) +{ + return false; +} + +/* + * Stub for the arch function that checks if the PC register in a specified + * register context is allowed. + */ +bool __attribute__((weak)) trampfd_allowed_pc(struct trampfd *trampfd, + struct trampfd_regs *regs) +{ + return false; +} diff --git a/fs/trampfd/trampfd_syscall.c b/fs/trampfd/trampfd_syscall.c new file mode 100644 index 000000000000..675460afc521 --- /dev/null +++ b/fs/trampfd/trampfd_syscall.c @@ -0,0 +1,92 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - System call. + * + * Author: Madhavan T. Venkataraman (madvenka@microsoft.com) + * + * Copyright (C) 2020 Microsoft Corporation. 
+ */ + +#include +#include +#include +#include +#include +#include +#include + +char *trampfd_name = "[trampfd]"; + +struct kmem_cache *trampfd_cache; + +SYSCALL_DEFINE3(trampfd_create, + int, tramp_type, + const void __user *, tramp_data, + unsigned int, flags) +{ + struct trampfd *trampfd; + struct file *file; + int fd, rc = 0; + + if (!trampfd_cache) + return -ENOMEM; + + /* + * Flags are for future use. + */ + if (flags || !tramp_data) + return -EINVAL; + + if (tramp_type < 0 || tramp_type >= TRAMPFD_NUM_TYPES) + return -EINVAL; + + trampfd = kmem_cache_zalloc(trampfd_cache, GFP_KERNEL); + if (!trampfd) + return -ENOMEM; + + mutex_init(&trampfd->lock); + trampfd->type = tramp_type; + + rc = trampfd_create_data(trampfd, tramp_data); + if (rc) + goto freetramp; + + rc = trampfd_check_arch(trampfd); + if (rc) + goto freedata; + + rc = get_unused_fd_flags(O_CLOEXEC); + if (rc < 0) + goto freedata; + fd = rc; + + file = anon_inode_getfile(trampfd_name, &trampfd_fops, trampfd, O_RDWR); + if (IS_ERR(file)) { + rc = PTR_ERR(file); + goto freefd; + } + file->f_mode |= (FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE); + + fd_install(fd, file); + return fd; +freefd: + put_unused_fd(fd); +freedata: + kfree(trampfd->data); +freetramp: + kmem_cache_free(trampfd_cache, trampfd); + return rc; +} + +int __init trampfd_init(void) +{ + trampfd_cache = kmem_cache_create("trampfd_cache", + sizeof(struct trampfd), 0, SLAB_HWCACHE_ALIGN, NULL); + + if (trampfd_cache == NULL) { + pr_warn("%s: kmem_cache_create failed", __func__); + return -ENOMEM; + } + return 0; +} +core_initcall(trampfd_init); diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index b951a87da987..25ddf29477bc 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -1005,6 +1005,9 @@ asmlinkage long sys_pidfd_send_signal(int pidfd, int sig, siginfo_t __user *info, unsigned int flags); asmlinkage long sys_pidfd_getfd(int pidfd, int fd, unsigned int flags); +asmlinkage long 
sys_trampfd_create(int tramp_type, + const void __user *tramp_data, + unsigned int flags); /* * Architecture-specific system calls diff --git a/include/linux/trampfd.h b/include/linux/trampfd.h new file mode 100644 index 000000000000..383d7eeda2d1 --- /dev/null +++ b/include/linux/trampfd.h @@ -0,0 +1,82 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Trampoline File Descriptor - Internal structures and definitions. + * + * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com) + * + * Copyright (c) 2020, Microsoft Corporation. + */ +#ifndef _LINUX_TRAMPFD_H +#define _LINUX_TRAMPFD_H + +#include + +#define TRAMPFD_MAX_REGS_SIZE \ + (sizeof(struct trampfd_regs) + \ + (sizeof(struct trampfd_reg) * TRAMPFD_MAX_REGS)) + +#define TRAMPFD_MAX_STACK_SIZE \ + (sizeof(struct trampfd_stack) + TRAMPFD_MAX_DATA_SIZE) + +#define TRAMPFD_MAX_PCS_SIZE \ + (sizeof(struct trampfd_values) + sizeof(u64) * TRAMPFD_MAX_PCS) + +/* + * Trampoline structure. + */ +struct trampfd { + struct mutex lock; /* to serialize access */ + enum trampfd_type type; /* type of trampoline */ + void *data; /* type specific data */ + struct trampfd_map map; /* mmap() parameters */ + struct trampfd_regs *regs; /* register context */ + struct trampfd_stack *stack; /* stack context */ + struct trampfd_values *allowed_pcs; /* allowed PCs */ +}; + +#ifdef CONFIG_TRAMPFD + +/* Trampoline mapping */ +int trampfd_mmap(struct file *file, struct vm_area_struct *vma); +unsigned long trampfd_get_unmapped_area(struct file *file, + unsigned long orig_addr, + unsigned long len, + unsigned long pgoff, + unsigned long flags); +bool is_trampfd_vma(struct vm_area_struct *vma); + +/* Trampoline context */ +int trampfd_get_map(struct file *file, char __user *arg, size_t count); +int trampfd_set_regs(struct file *file, const char __user *arg, size_t count); +int trampfd_get_regs(struct file *file, char __user *arg, size_t count); +int trampfd_set_stack(struct file *file, const char __user *arg, size_t count); +int 
trampfd_get_stack(struct file *file, char __user *arg, size_t count); +int trampfd_set_allowed_pcs(struct file *file, const char __user *arg, + size_t count); + +/* Arch functions */ +bool trampfd_fault(struct vm_area_struct *vma, struct pt_regs *pt_regs); +bool trampfd_valid_regs(struct trampfd_regs *regs); +bool trampfd_allowed_pc(struct trampfd *trampfd, struct trampfd_regs *regs); +int trampfd_check_arch(struct trampfd *trampfd); + +/* Trampoline type-specific */ +int trampfd_create_data(struct trampfd *trampfd, const void __user *tramp_data); + +extern char *trampfd_name; +extern struct kmem_cache *trampfd_cache; +extern const struct file_operations trampfd_fops; + +#define USERPTR(ptr) ((void __user *)(uintptr_t)(ptr)) + +#else + +static inline bool trampfd_fault(struct vm_area_struct *vma, + struct pt_regs *pt_regs) +{ + return false; +} + +#endif /* CONFIG_TRAMPFD */ + +#endif /* _LINUX_TRAMPFD_H */ diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index f4a01305d9a6..14e526a45624 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -857,9 +857,11 @@ __SYSCALL(__NR_openat2, sys_openat2) __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd) #define __NR_faccessat2 439 __SYSCALL(__NR_faccessat2, sys_faccessat2) +#define __NR_trampfd_create 440 +__SYSCALL(__NR_trampfd_create, sys_trampfd_create) #undef __NR_syscalls -#define __NR_syscalls 440 +#define __NR_syscalls 441 /* * 32 bit systems traditionally used different diff --git a/include/uapi/linux/trampfd.h b/include/uapi/linux/trampfd.h new file mode 100644 index 000000000000..bf9a6ef3683b --- /dev/null +++ b/include/uapi/linux/trampfd.h @@ -0,0 +1,171 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* + * Trampoline File Descriptor - API structures and definitions. + * + * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com) + * + * Copyright (c) 2020, Microsoft Corporation. 
+ */ +#ifndef _UAPI_LINUX_TRAMPFD_H +#define _UAPI_LINUX_TRAMPFD_H + +#include +#include + +/* + * All structure fields are defined so that they are the same width and at the + * same structure offset on 32-bit and 64-bit to avoid compat code. + * + * All fields named "reserved" must be set to 0. They are there primarily for + * alignment, but they may be used in the future. + */ + +/* ------------------------- Types of Trampolines ------------------------- */ + +/* + * TRAMPFD_USER + * User programs use the kernel as a trampoline to set up a user context + * and jump to a user function. This trampoline type can be used to + * replace user trampoline code. + */ +enum trampfd_type { + TRAMPFD_USER, + TRAMPFD_NUM_TYPES, +}; + +/* ---------------------------- Context offsets ---------------------------- */ + +/* + * A trampoline has different types of context associated with it. Each context + * type has a symbolic offset into trampfd. The context can be read from or + * written to at its symbolic offset in trampfd. + * + * TRAMPFD_MAP_OFFSET + * To read trampoline mapping parameters - struct trampfd_map. + * + * TRAMPFD_REGS_OFFSET + * To read/write trampoline register context - struct trampfd_regs. + * + * TRAMPFD_STACK_OFFSET + * To read/write trampoline stack context - struct trampfd_stack. + * + * TRAMPFD_ALLOWED_PCS_OFFSET + * To write a list of allowed PCs - struct trampfd_values. + */ +enum trampfd_offsets { + TRAMPFD_MAP_OFFSET, + TRAMPFD_REGS_OFFSET, + TRAMPFD_STACK_OFFSET, + TRAMPFD_ALLOWED_PCS_OFFSET, + TRAMPFD_NUM_OFFSETS, +}; + +/* ------------------- Trampoline type specific data -------------------- */ + +/* + * For TRAMPFD_USER. + */ +struct trampfd_user { + __u32 flags; /* for future enhancements */ + __u32 reserved; +}; + +/* ------------------- Trampoline mapping parameters ---------------------- */ + +/* + * Since the kernel implements the trampoline object, the kernel specifies + * how a trampoline should be mapped.
User code must obtain these parameters + * and do an mmap() to map the trampoline. The first four parameters are used + * in the mmap() call. User code must add ioffset to the address returned by + * mmap() to get the actual invocation address for the trampoline. + */ +struct trampfd_map { + __u32 size; /* Size of the mapping */ + __u32 prot; /* memory protection */ + __u32 flags; /* map flags */ + __u32 offset; /* file offset */ + __u32 ioffset; /* invocation offset */ + __u32 reserved; +}; + +/* -------------------------- Register context -------------------------- */ + +/* + * A register context may be specified for a trampoline, if applicable + * to the trampoline type. E.g., TRAMPFD_USER. The register context is + * an array of name-value pairs. When a trampoline is invoked, its user + * registers are loaded with the specified values. Register names are + * architecture specific and can be found in for architectures + * that support trampolines. Enumerations reg_32_name and reg_64_name in + * refer to 32-bit and 64-bit respectively. + */ +struct trampfd_reg { + __u32 name; /* Register name */ + __u32 reserved; + __u64 value; /* Register value */ +}; + +/* + * Register context. It is a variable sized structure sized by the number + * of registers. + */ +struct trampfd_regs { + __u32 nregs; /* Number of registers */ + __u32 reserved; + struct trampfd_reg regs[0]; /* Array of registers */ +}; + +#define TRAMPFD_MAX_REGS 40 + +/* ---------------------------- Stack context ---------------------------- */ + +/* + * A stack context may be specified for a trampoline, if applicable + * to the trampoline type. E.g., TRAMPFD_USER. The stack context contains + * a data buffer. When a trampoline is invoked, the specified data is pushed + * on the stack at a specified offset from the current stack pointer. + * Optionally, the stack pointer can be moved to the top of the data. 
+ * + * This is a variable sized structure sized by the amount of data that is + * to be pushed on the user stack. + */ +struct trampfd_stack { + __u32 flags; /* TRAMPFD_SFLAGS */ + __u32 offset; /* Offset from top of stack */ + __u32 size; /* Size of data to push */ + __u32 reserved; + __u8 data[0]; /* Data to push on the stack */ +}; + +#define TRAMPFD_MAX_DATA_SIZE 64 +#define TRAMPFD_MAX_STACK_OFFSET 256 + +/* + * Stack context flags: + * + * TRAMPFD_SET_SP + * After pushing the data to user stack, move the stack pointer to the + * base of the data pushed. Note that the kernel will align the stack + * pointer based on the alignment requirements of the architecture. + */ +#define TRAMPFD_SET_SP 0x1 +#define TRAMPFD_SFLAGS (TRAMPFD_SET_SP) + +/* ---------------------------- Values context ---------------------------- */ + +/* + * Some contexts may be just a list of values. For instance, the user can + * specify a list of allowed PCs for a trampoline. The following structure + * is used for those contexts. + */ +struct trampfd_values { + __u32 nvalues; /* number of values */ + __u32 reserved; + __u64 values[0]; /* Array of values */ +}; + +#define TRAMPFD_MAX_PCS 16 + +/* -------------------------------------------------------------------------- */ + +#endif /* _UAPI_LINUX_TRAMPFD_H */ diff --git a/init/Kconfig b/init/Kconfig index 0498af567f70..783a0b98fce1 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2313,3 +2313,11 @@ config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE # . config ARCH_HAS_SYSCALL_WRAPPER def_bool n + +config TRAMPFD + bool "Enable trampfd_create() system call" + depends on MMU + help + Enable the trampfd_create() system call that allows a process to + map trampolines within its address space that can be invoked + with the help of the kernel. 
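As an illustration of the variable-sized register context declared in include/uapi/linux/trampfd.h above, here is a minimal user-space sketch. It is not part of the patch. It assumes local mirror definitions of struct trampfd_reg and struct trampfd_regs (a real program would take them from the uapi header of a kernel carrying this RFC), uses a C99 flexible array member where the patch uses regs[0], and the helper names tramp_regs_size() and tramp_regs_alloc() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Local mirrors of the uapi structures proposed in this patch.
 * Assumption: a real program would include the patched <linux/trampfd.h>.
 */
struct trampfd_reg {
	uint32_t name;		/* Register name (arch-specific enum value) */
	uint32_t reserved;	/* Must be 0 */
	uint64_t value;		/* Register value */
};

struct trampfd_regs {
	uint32_t nregs;		/* Number of registers */
	uint32_t reserved;	/* Must be 0 */
	struct trampfd_reg regs[];	/* Array of registers */
};

#define TRAMPFD_MAX_REGS 40

/* Hypothetical helper: size in bytes of a context holding nregs entries. */
size_t tramp_regs_size(uint32_t nregs)
{
	return sizeof(struct trampfd_regs) +
	       (size_t)nregs * sizeof(struct trampfd_reg);
}

/*
 * Hypothetical helper: allocate a zeroed register context so that every
 * "reserved" field already holds 0. The caller fills in name/value pairs
 * and frees the buffer with free().
 */
struct trampfd_regs *tramp_regs_alloc(uint32_t nregs)
{
	struct trampfd_regs *regs;

	assert(nregs <= TRAMPFD_MAX_REGS);
	regs = calloc(1, tramp_regs_size(nregs));
	if (regs)
		regs->nregs = nregs;
	return regs;
}
```

Going by the header comments, a buffer built this way would then be written to the trampoline fd at TRAMPFD_REGS_OFFSET. That step needs a kernel with this series applied and is not shown here.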
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index 3b69a560a7ac..136acf9234a3 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -349,6 +349,9 @@ COND_SYSCALL(pkey_mprotect); COND_SYSCALL(pkey_alloc); COND_SYSCALL(pkey_free); +/* Trampoline fd */ +COND_SYSCALL(trampfd_create); + /* * Architecture specific weak syscall entries.
From patchwork Tue Jul 28 13:10:48 2020
X-Patchwork-Submitter: "Madhavan T. Venkataraman"
X-Patchwork-Id: 11689149
From: madvenka@linux.microsoft.com
To: kernel-hardening@lists.openwall.com, linux-api@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-integrity@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org, oleg@redhat.com, x86@kernel.org, madvenka@linux.microsoft.com
Subject: [PATCH v1 2/4] [RFC] x86/trampfd: Provide support for the trampoline file descriptor
Date: Tue, 28 Jul 2020 08:10:48 -0500
Message-Id: <20200728131050.24443-3-madvenka@linux.microsoft.com>
In-Reply-To: <20200728131050.24443-1-madvenka@linux.microsoft.com>
References: <20200728131050.24443-1-madvenka@linux.microsoft.com>

From: "Madhavan T. Venkataraman"

Implement 32-bit and 64-bit X86 support for the trampoline file descriptor.

- Define architecture specific register names
- Handle the trampoline invocation page fault
- Setup the user register context on trampoline invocation
- Setup the user stack context on trampoline invocation

Signed-off-by: Madhavan T.
Venkataraman --- arch/x86/entry/syscalls/syscall_32.tbl | 1 + arch/x86/entry/syscalls/syscall_64.tbl | 1 + arch/x86/include/uapi/asm/ptrace.h | 38 +++ arch/x86/kernel/Makefile | 2 + arch/x86/kernel/trampfd.c | 313 +++++++++++++++++++++++++ arch/x86/mm/fault.c | 11 + 6 files changed, 366 insertions(+) create mode 100644 arch/x86/kernel/trampfd.c diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index d8f8a1a69ed1..77eb50414591 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -443,3 +443,4 @@ 437 i386 openat2 sys_openat2 438 i386 pidfd_getfd sys_pidfd_getfd 439 i386 faccessat2 sys_faccessat2 +440 i386 trampfd_create sys_trampfd_create diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index 78847b32e137..9d962de1d21f 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -360,6 +360,7 @@ 437 common openat2 sys_openat2 438 common pidfd_getfd sys_pidfd_getfd 439 common faccessat2 sys_faccessat2 +440 common trampfd_create sys_trampfd_create # # x32-specific system call numbers start at 512 to avoid cache impact diff --git a/arch/x86/include/uapi/asm/ptrace.h b/arch/x86/include/uapi/asm/ptrace.h index 85165c0edafc..b031598f857e 100644 --- a/arch/x86/include/uapi/asm/ptrace.h +++ b/arch/x86/include/uapi/asm/ptrace.h @@ -9,6 +9,44 @@ #ifndef __ASSEMBLY__ +/* + * These register names are to be used by 32-bit applications. + */ +enum reg_32_name { + x32_eax, + x32_ebx, + x32_ecx, + x32_edx, + x32_esi, + x32_edi, + x32_ebp, + x32_eip, + x32_max, +}; + +/* + * These register names are to be used by 64-bit applications. 
+ */ +enum reg_64_name { + x64_rax = x32_max, + x64_rbx, + x64_rcx, + x64_rdx, + x64_rsi, + x64_rdi, + x64_rbp, + x64_r8, + x64_r9, + x64_r10, + x64_r11, + x64_r12, + x64_r13, + x64_r14, + x64_r15, + x64_rip, + x64_max, +}; + #ifdef __i386__ /* this struct defines the way the registers are stored on the stack during a system call. */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index e77261db2391..5d968ac4c7d9 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -157,3 +157,5 @@ ifeq ($(CONFIG_X86_64),y) endif obj-$(CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT) += ima_arch.o + +obj-$(CONFIG_TRAMPFD) += trampfd.o diff --git a/arch/x86/kernel/trampfd.c b/arch/x86/kernel/trampfd.c new file mode 100644 index 000000000000..f6b5507134d2 --- /dev/null +++ b/arch/x86/kernel/trampfd.c @@ -0,0 +1,313 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - X86 support. + * + * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com) + * + * Copyright (c) 2020, Microsoft Corporation. 
+ */ + +#include +#include +#include +#include + +/* ---------------------------- Register Context ---------------------------- */ + +static inline bool is_compat(void) +{ + return (IS_ENABLED(CONFIG_X86_32) || + (IS_ENABLED(CONFIG_COMPAT) && test_thread_flag(TIF_ADDR32))); +} + +static void set_reg_32(struct pt_regs *pt_regs, u32 name, u64 value) +{ + switch (name) { + case x32_eax: + pt_regs->ax = (unsigned long)value; + break; + case x32_ebx: + pt_regs->bx = (unsigned long)value; + break; + case x32_ecx: + pt_regs->cx = (unsigned long)value; + break; + case x32_edx: + pt_regs->dx = (unsigned long)value; + break; + case x32_esi: + pt_regs->si = (unsigned long)value; + break; + case x32_edi: + pt_regs->di = (unsigned long)value; + break; + case x32_ebp: + pt_regs->bp = (unsigned long)value; + break; + case x32_eip: + pt_regs->ip = (unsigned long)value; + break; + default: + WARN(1, "%s: Illegal register name %d\n", __func__, name); + break; + } +} + +#ifdef __i386__ + +static void set_reg_64(struct pt_regs *pt_regs, u32 name, u64 value) +{ +} + +#else + +static void set_reg_64(struct pt_regs *pt_regs, u32 name, u64 value) +{ + switch (name) { + case x64_rax: + pt_regs->ax = (unsigned long)value; + break; + case x64_rbx: + pt_regs->bx = (unsigned long)value; + break; + case x64_rcx: + pt_regs->cx = (unsigned long)value; + break; + case x64_rdx: + pt_regs->dx = (unsigned long)value; + break; + case x64_rsi: + pt_regs->si = (unsigned long)value; + break; + case x64_rdi: + pt_regs->di = (unsigned long)value; + break; + case x64_rbp: + pt_regs->bp = (unsigned long)value; + break; + case x64_r8: + pt_regs->r8 = (unsigned long)value; + break; + case x64_r9: + pt_regs->r9 = (unsigned long)value; + break; + case x64_r10: + pt_regs->r10 = (unsigned long)value; + break; + case x64_r11: + pt_regs->r11 = (unsigned long)value; + break; + case x64_r12: + pt_regs->r12 = (unsigned long)value; + break; + case x64_r13: + pt_regs->r13 = (unsigned long)value; + break; + case x64_r14: 
+ pt_regs->r14 = (unsigned long)value; + break; + case x64_r15: + pt_regs->r15 = (unsigned long)value; + break; + case x64_rip: + pt_regs->ip = (unsigned long)value; + break; + default: + WARN(1, "%s: Illegal register name %d\n", __func__, name); + break; + } +} + +#endif /* __i386__ */ + +static void set_regs(struct pt_regs *pt_regs, struct trampfd_regs *tregs) +{ + struct trampfd_reg *reg = tregs->regs; + struct trampfd_reg *reg_end = reg + tregs->nregs; + bool compat = is_compat(); + + for (; reg < reg_end; reg++) { + if (compat) + set_reg_32(pt_regs, reg->name, reg->value); + else + set_reg_64(pt_regs, reg->name, reg->value); + } +} + +/* + * Check if the register names are valid. Check if the user PC has been set. + */ +bool trampfd_valid_regs(struct trampfd_regs *tregs) +{ + struct trampfd_reg *reg = tregs->regs; + struct trampfd_reg *reg_end = reg + tregs->nregs; + int min, max, pc_name; + bool pc_set = false; + + if (is_compat()) { + min = 0; + pc_name = x32_eip; + max = x32_max; + } else { + min = x32_max; + pc_name = x64_rip; + max = x64_max; + } + + for (; reg < reg_end; reg++) { + if (reg->name < min || reg->name >= max || reg->reserved) + return false; + if (reg->name == pc_name && reg->value) + pc_set = true; + } + return pc_set; +} +EXPORT_SYMBOL_GPL(trampfd_valid_regs); + +/* + * Check if the PC specified in a register context is allowed. + */ +bool trampfd_allowed_pc(struct trampfd *trampfd, struct trampfd_regs *tregs) +{ + struct trampfd_reg *reg = tregs->regs; + struct trampfd_reg *reg_end = reg + tregs->nregs; + struct trampfd_values *allowed_pcs = trampfd->allowed_pcs; + u64 *allowed_values, pc_value = 0; + u32 nvalues, pc_name; + int i; + + if (!allowed_pcs) + return true; + + pc_name = is_compat() ? x32_eip : x64_rip; + + /* + * Find the PC register and its value. If the PC register has been + * specified multiple times, only the last one counts. 
+ */ + for (; reg < reg_end; reg++) { + if (reg->name == pc_name) + pc_value = reg->value; + } + + allowed_values = allowed_pcs->values; + nvalues = allowed_pcs->nvalues; + + for (i = 0; i < nvalues; i++) { + if (pc_value == allowed_values[i]) + return true; + } + return false; +} +EXPORT_SYMBOL_GPL(trampfd_allowed_pc); + +/* ---------------------------- Stack Context ---------------------------- */ + +static int push_data(struct pt_regs *pt_regs, struct trampfd_stack *tstack) +{ + unsigned long sp; + + sp = user_stack_pointer(pt_regs) - tstack->size - tstack->offset; + if (tstack->flags & TRAMPFD_SET_SP) { + if (is_compat()) + sp = ((sp + 4) & -16ul) - 4; + else + sp = round_down(sp, 16) - 8; + } + + if (!access_ok(sp, user_stack_pointer(pt_regs) - sp)) + return -EFAULT; + + if (copy_to_user(USERPTR(sp), tstack->data, tstack->size)) + return -EFAULT; + + if (tstack->flags & TRAMPFD_SET_SP) + user_stack_pointer_set(pt_regs, sp); + + return 0; +} + +/* ---------------------------- Fault Handlers ---------------------------- */ + +static int trampfd_user_fault(struct trampfd *trampfd, + struct vm_area_struct *vma, + struct pt_regs *pt_regs) +{ + char buf[TRAMPFD_MAX_STACK_SIZE]; + struct trampfd_regs *tregs; + struct trampfd_stack *tstack = NULL; + unsigned long addr; + size_t size; + int rc = 0; + + mutex_lock(&trampfd->lock); + + /* + * Execution of the trampoline must start at the offset specified by + * the kernel. + */ + addr = vma->vm_start + trampfd->map.ioffset; + if (addr != pt_regs->ip) { + rc = -EINVAL; + goto unlock; + } + + /* + * At a minimum, the user PC register must be specified for a + * user trampoline. + */ + tregs = trampfd->regs; + if (!tregs) { + rc = -EINVAL; + goto unlock; + } + + /* + * Set the register context for the trampoline. + */ + set_regs(pt_regs, tregs); + + if (trampfd->stack) { + /* + * Copy the stack context into a local buffer and push stack + * data after dropping the lock.
+ */ + size = sizeof(*trampfd->stack) + trampfd->stack->size; + tstack = (struct trampfd_stack *) buf; + memcpy(tstack, trampfd->stack, size); + } +unlock: + mutex_unlock(&trampfd->lock); + + if (!rc && tstack) { + mmap_read_unlock(vma->vm_mm); + rc = push_data(pt_regs, tstack); + mmap_read_lock(vma->vm_mm); + } + return rc; +} + +/* + * Handle it if it is a trampoline fault. + */ +bool trampfd_fault(struct vm_area_struct *vma, struct pt_regs *pt_regs) +{ + struct trampfd *trampfd; + + if (!is_trampfd_vma(vma)) + return false; + trampfd = vma->vm_private_data; + + if (trampfd->type == TRAMPFD_USER) + return !trampfd_user_fault(trampfd, vma, pt_regs); + return false; +} +EXPORT_SYMBOL_GPL(trampfd_fault); + +/* ------------------------- Arch Initialization ------------------------- */ + +int trampfd_check_arch(struct trampfd *trampfd) +{ + return 0; +} +EXPORT_SYMBOL_GPL(trampfd_check_arch); diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 1ead568c0101..a1432ee2a1a2 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -18,6 +18,7 @@ #include /* faulthandler_disabled() */ #include /* efi_recover_from_page_fault()*/ #include +#include /* trampoline invocation */ #include /* boot_cpu_has, ... */ #include /* dotraplinkage, ... */ @@ -1142,6 +1143,7 @@ void do_user_addr_fault(struct pt_regs *regs, struct mm_struct *mm; vm_fault_t fault, major = 0; unsigned int flags = FAULT_FLAG_DEFAULT; + unsigned long tflags = X86_PF_INSTR | X86_PF_USER; tsk = current; mm = tsk->mm; @@ -1275,6 +1277,15 @@ void do_user_addr_fault(struct pt_regs *regs, */ good_area: if (unlikely(access_error(hw_error_code, vma))) { + /* + * If it is a user execute fault, it could be a trampoline + * invocation. 
+ */ + if ((hw_error_code & tflags) == tflags && + trampfd_fault(vma, regs)) { + mmap_read_unlock(mm); + return; + } bad_area_access_error(regs, hw_error_code, address, vma); return; }
From patchwork Tue Jul 28 13:10:49 2020
X-Patchwork-Submitter: "Madhavan T. Venkataraman"
X-Patchwork-Id: 11689151
From: madvenka@linux.microsoft.com
To: kernel-hardening@lists.openwall.com, linux-api@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-integrity@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org, oleg@redhat.com, x86@kernel.org, madvenka@linux.microsoft.com
Subject: [PATCH v1 3/4] [RFC] arm64/trampfd: Provide support for the trampoline file descriptor
Date: Tue, 28 Jul 2020 08:10:49 -0500
Message-Id: <20200728131050.24443-4-madvenka@linux.microsoft.com>
In-Reply-To: <20200728131050.24443-1-madvenka@linux.microsoft.com>
References: <20200728131050.24443-1-madvenka@linux.microsoft.com>

From: "Madhavan T. Venkataraman"

Implement 64-bit ARM support for the trampoline file descriptor.

- Define architecture specific register names
- Handle the trampoline invocation page fault
- Setup the user register context on trampoline invocation
- Setup the user stack context on trampoline invocation

Signed-off-by: Madhavan T.
Venkataraman --- arch/arm64/include/asm/ptrace.h | 9 + arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 2 + arch/arm64/include/uapi/asm/ptrace.h | 57 ++++++ arch/arm64/kernel/Makefile | 2 + arch/arm64/kernel/trampfd.c | 278 +++++++++++++++++++++++++++ arch/arm64/mm/fault.c | 15 +- 7 files changed, 361 insertions(+), 4 deletions(-) create mode 100644 arch/arm64/kernel/trampfd.c diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index 953b6a1ce549..dad6cdbd59c6 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -232,6 +232,15 @@ static inline unsigned long user_stack_pointer(struct pt_regs *regs) return regs->sp; } +static inline void user_stack_pointer_set(struct pt_regs *regs, + unsigned long val) +{ + if (compat_user_mode(regs)) + regs->compat_sp = val; + else + regs->sp = val; +} + extern int regs_query_register_offset(const char *name); extern unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs, unsigned int n); diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index 3b859596840d..b3b2019f8d16 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -38,7 +38,7 @@ #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5) #define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800) -#define __NR_compat_syscalls 440 +#define __NR_compat_syscalls 441 #endif #define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index 6d95d0c8bf2f..821ddcaf9683 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -885,6 +885,8 @@ __SYSCALL(__NR_openat2, sys_openat2) __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd) #define __NR_faccessat2 439 __SYSCALL(__NR_faccessat2, sys_faccessat2) +#define __NR_trampfd_create 440 +__SYSCALL(__NR_trampfd_create, sys_trampfd_create) /* * Please add new compat syscalls above this comment and update diff 
--git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h index 42cbe34d95ce..f4d1974dd795 100644 --- a/arch/arm64/include/uapi/asm/ptrace.h +++ b/arch/arm64/include/uapi/asm/ptrace.h @@ -88,6 +88,63 @@ struct user_pt_regs { __u64 pstate; }; +/* + * These register names are to be used by 32-bit applications. + */ +enum reg_32_name { + arm_r0, + arm_r1, + arm_r2, + arm_r3, + arm_r4, + arm_r5, + arm_r6, + arm_r7, + arm_r8, + arm_r9, + arm_r10, + arm_ip, + arm_pc, + arm_max, +}; + +/* + * These register names are to be used by 64-bit applications. + */ +enum reg_64_name { + arm64_r0 = arm_max, + arm64_r1, + arm64_r2, + arm64_r3, + arm64_r4, + arm64_r5, + arm64_r6, + arm64_r7, + arm64_r8, + arm64_r9, + arm64_r10, + arm64_r11, + arm64_r12, + arm64_r13, + arm64_r14, + arm64_r15, + arm64_r16, + arm64_r17, + arm64_r18, + arm64_r19, + arm64_r20, + arm64_r21, + arm64_r22, + arm64_r23, + arm64_r24, + arm64_r25, + arm64_r26, + arm64_r27, + arm64_r28, + arm64_pc, + arm64_max, +}; + struct user_fpsimd_state { __uint128_t vregs[32]; __u32 fpsr; diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index a561cbb91d4d..18d373fb1208 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -71,3 +71,5 @@ extra-y += $(head-y) vmlinux.lds ifeq ($(CONFIG_DEBUG_EFI),y) AFLAGS_head.o += -DVMLINUX_PATH="\"$(realpath $(objtree)/vmlinux)\"" endif + +obj-$(CONFIG_TRAMPFD) += trampfd.o diff --git a/arch/arm64/kernel/trampfd.c b/arch/arm64/kernel/trampfd.c new file mode 100644 index 000000000000..d79e749e0c30 --- /dev/null +++ b/arch/arm64/kernel/trampfd.c @@ -0,0 +1,278 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - ARM64 support. + * + * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com) + * + * Copyright (c) 2020, Microsoft Corporation. 
+ */ + +#include +#include +#include + +/* ---------------------------- Register Context ---------------------------- */ + +static inline bool is_compat(void) +{ + return is_compat_thread(task_thread_info(current)); +} + +static void set_reg_32(struct pt_regs *pt_regs, u32 name, u64 value) +{ + switch (name) { + case arm_r0: + case arm_r1: + case arm_r2: + case arm_r3: + case arm_r4: + case arm_r5: + case arm_r6: + case arm_r7: + case arm_r8: + case arm_r9: + case arm_r10: + pt_regs->regs[name] = (__u64)value; + break; + case arm_ip: + pt_regs->regs[arm64_r16 - arm_max] = (__u64)value; + break; + case arm_pc: + pt_regs->pc = (__u64)value; + break; + default: + WARN(1, "%s: Illegal register name %d\n", __func__, name); + break; + } +} + +static void set_reg_64(struct pt_regs *pt_regs, u32 name, u64 value) +{ + switch (name) { + case arm64_r0: + case arm64_r1: + case arm64_r2: + case arm64_r3: + case arm64_r4: + case arm64_r5: + case arm64_r6: + case arm64_r7: + case arm64_r8: + case arm64_r9: + case arm64_r10: + case arm64_r11: + case arm64_r12: + case arm64_r13: + case arm64_r14: + case arm64_r15: + case arm64_r16: + case arm64_r17: + case arm64_r18: + case arm64_r19: + case arm64_r20: + case arm64_r21: + case arm64_r22: + case arm64_r23: + case arm64_r24: + case arm64_r25: + case arm64_r26: + case arm64_r27: + case arm64_r28: + pt_regs->regs[name - arm_max] = (__u64)value; + break; + case arm64_pc: + pt_regs->pc = (__u64)value; + break; + default: + WARN(1, "%s: Illegal register name %d\n", __func__, name); + break; + } +} + +static void set_regs(struct pt_regs *pt_regs, struct trampfd_regs *tregs) +{ + struct trampfd_reg *reg = tregs->regs; + struct trampfd_reg *reg_end = reg + tregs->nregs; + bool compat = is_compat(); + + for (; reg < reg_end; reg++) { + if (compat) + set_reg_32(pt_regs, reg->name, reg->value); + else + set_reg_64(pt_regs, reg->name, reg->value); + } +} + +/* + * Check if the register names are valid. Check if the user PC has been set. 
+ */ +bool trampfd_valid_regs(struct trampfd_regs *tregs) +{ + struct trampfd_reg *reg = tregs->regs; + struct trampfd_reg *reg_end = reg + tregs->nregs; + int min, max, pc_name; + bool pc_set = false; + + if (is_compat()) { + min = 0; + pc_name = arm_pc; + max = arm_max; + } else { + min = arm_max; + pc_name = arm64_pc; + max = arm64_max; + } + + for (; reg < reg_end; reg++) { + if (reg->name < min || reg->name >= max || reg->reserved) + return false; + if (reg->name == pc_name && reg->value) + pc_set = true; + } + return pc_set; +} +EXPORT_SYMBOL_GPL(trampfd_valid_regs); + +/* + * Check if the PC specified in a register context is allowed. + */ +bool trampfd_allowed_pc(struct trampfd *trampfd, struct trampfd_regs *tregs) +{ + struct trampfd_reg *reg = tregs->regs; + struct trampfd_reg *reg_end = reg + tregs->nregs; + struct trampfd_values *allowed_pcs = trampfd->allowed_pcs; + u64 *allowed_values, pc_value = 0; + u32 nvalues, pc_name; + int i; + + if (!allowed_pcs) + return true; + + pc_name = is_compat() ? arm_pc : arm64_pc; + + /* + * Find the PC register and its value. If the PC register has been + * specified multiple times, only the last one counts. 
+ */ + for (; reg < reg_end; reg++) { + if (reg->name == pc_name) + pc_value = reg->value; + } + + allowed_values = allowed_pcs->values; + nvalues = allowed_pcs->nvalues; + + for (i = 0; i < nvalues; i++) { + if (pc_value == allowed_values[i]) + return true; + } + return false; +} +EXPORT_SYMBOL_GPL(trampfd_allowed_pc); + +/* ---------------------------- Stack Context ---------------------------- */ + +static int push_data(struct pt_regs *pt_regs, struct trampfd_stack *tstack) +{ + unsigned long sp; + + sp = user_stack_pointer(pt_regs) - tstack->size - tstack->offset; + if (tstack->flags & TRAMPFD_SET_SP) + sp = round_down(sp, 16); + + if (!access_ok((void *)sp, user_stack_pointer(pt_regs) - sp)) + return -EFAULT; + + if (copy_to_user(USERPTR(sp), tstack->data, tstack->size)) + return -EFAULT; + + if (tstack->flags & TRAMPFD_SET_SP) + user_stack_pointer_set(pt_regs, sp); + + return 0; +} + +/* ---------------------------- Fault Handlers ---------------------------- */ + +static int trampfd_user_fault(struct trampfd *trampfd, + struct vm_area_struct *vma, + struct pt_regs *pt_regs) +{ + char buf[TRAMPFD_MAX_STACK_SIZE]; + struct trampfd_regs *tregs; + struct trampfd_stack *tstack = NULL; + unsigned long addr; + size_t size; + int rc = 0; + + mutex_lock(&trampfd->lock); + + /* + * Execution of the trampoline must start at the offset specified by + * the kernel. + */ + addr = vma->vm_start + trampfd->map.ioffset; + if (addr != pt_regs->pc) { + rc = -EINVAL; + goto unlock; + } + + /* + * At a minimum, the user PC register must be specified for a + * user trampoline. + */ + tregs = trampfd->regs; + if (!tregs) { + rc = -EINVAL; + goto unlock; + } + + /* + * Set the register context for the trampoline. + */ + set_regs(pt_regs, tregs); + + if (trampfd->stack) { + /* + * Copy the stack context into a local buffer and push stack + * data after dropping the lock.
+ */ + size = sizeof(*trampfd->stack) + trampfd->stack->size; + tstack = (struct trampfd_stack *) buf; + memcpy(tstack, trampfd->stack, size); + } +unlock: + mutex_unlock(&trampfd->lock); + + if (!rc && tstack) { + mmap_read_unlock(vma->vm_mm); + rc = push_data(pt_regs, tstack); + mmap_read_lock(vma->vm_mm); + } + return rc; +} + +/* + * Handle it if it is a trampoline fault. + */ +bool trampfd_fault(struct vm_area_struct *vma, struct pt_regs *pt_regs) +{ + struct trampfd *trampfd; + + if (!is_trampfd_vma(vma)) + return false; + trampfd = vma->vm_private_data; + + if (trampfd->type == TRAMPFD_USER) + return !trampfd_user_fault(trampfd, vma, pt_regs); + return false; +} +EXPORT_SYMBOL_GPL(trampfd_fault); + +/* ---------------------------- Miscellaneous ---------------------------- */ + +int trampfd_check_arch(struct trampfd *trampfd) +{ + return 0; +} +EXPORT_SYMBOL_GPL(trampfd_check_arch); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 8afb238ff335..6e5e3193919a 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include @@ -404,7 +405,8 @@ static void do_bad_area(unsigned long addr, unsigned int esr, struct pt_regs *re #define VM_FAULT_BADACCESS 0x020000 static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr, - unsigned int mm_flags, unsigned long vm_flags) + unsigned int mm_flags, unsigned long vm_flags, + struct pt_regs *regs) { struct vm_area_struct *vma = find_vma(mm, addr); @@ -426,8 +428,15 @@ static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr, * Check that the permissions on the VMA allow for the fault which * occurred. */ - if (!(vma->vm_flags & vm_flags)) + if (!(vma->vm_flags & vm_flags)) { + /* + * If it is an execute fault, it could be a trampoline + * invocation. 
+ */ + if ((vm_flags & VM_EXEC) && trampfd_fault(vma, regs)) + return 0; return VM_FAULT_BADACCESS; + } return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags); } @@ -516,7 +525,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, #endif } - fault = __do_page_fault(mm, addr, mm_flags, vm_flags); + fault = __do_page_fault(mm, addr, mm_flags, vm_flags, regs); major |= fault & VM_FAULT_MAJOR; /* Quick path to respond to signals */
From patchwork Tue Jul 28 13:10:50 2020 X-Patchwork-Submitter: "Madhavan T. Venkataraman" X-Patchwork-Id: 11689157 From: madvenka@linux.microsoft.com To: kernel-hardening@lists.openwall.com, linux-api@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-integrity@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org, oleg@redhat.com, x86@kernel.org, madvenka@linux.microsoft.com Subject: [PATCH v1 4/4] [RFC] arm/trampfd: Provide support for the trampoline file descriptor Date: Tue, 28 Jul 2020 08:10:50 -0500 Message-Id: <20200728131050.24443-5-madvenka@linux.microsoft.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200728131050.24443-1-madvenka@linux.microsoft.com> References: <20200728131050.24443-1-madvenka@linux.microsoft.com> From: "Madhavan T. Venkataraman" Implement 32-bit ARM support for the trampoline file descriptor. - Define architecture specific register names - Handle the trampoline invocation page fault - Setup the user register context on trampoline invocation - Setup the user stack context on trampoline invocation Signed-off-by: Madhavan T.
Venkataraman --- arch/arm/include/uapi/asm/ptrace.h | 20 +++ arch/arm/kernel/Makefile | 1 + arch/arm/kernel/trampfd.c | 214 +++++++++++++++++++++++++++++ arch/arm/mm/fault.c | 12 +- arch/arm/tools/syscall.tbl | 1 + 5 files changed, 246 insertions(+), 2 deletions(-) create mode 100644 arch/arm/kernel/trampfd.c diff --git a/arch/arm/include/uapi/asm/ptrace.h b/arch/arm/include/uapi/asm/ptrace.h index e61c65b4018d..47b1c5e2f32c 100644 --- a/arch/arm/include/uapi/asm/ptrace.h +++ b/arch/arm/include/uapi/asm/ptrace.h @@ -151,6 +151,26 @@ struct pt_regs { #define ARM_r0 uregs[0] #define ARM_ORIG_r0 uregs[17] +/* + * These register names are to be used by 32-bit applications. + */ +enum reg_32_name { + arm_r0, + arm_r1, + arm_r2, + arm_r3, + arm_r4, + arm_r5, + arm_r6, + arm_r7, + arm_r8, + arm_r9, + arm_r10, + arm_ip, + arm_pc, + arm_max, +}; + /* * The size of the user-visible VFP state as seen by PTRACE_GET/SETVFPREGS * and core dumps. diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile index 89e5d864e923..652c54c2f19a 100644 --- a/arch/arm/kernel/Makefile +++ b/arch/arm/kernel/Makefile @@ -105,5 +105,6 @@ obj-$(CONFIG_SMP) += psci_smp.o endif obj-$(CONFIG_HAVE_ARM_SMCCC) += smccc-call.o +obj-$(CONFIG_TRAMPFD) += trampfd.o extra-y := $(head-y) vmlinux.lds diff --git a/arch/arm/kernel/trampfd.c b/arch/arm/kernel/trampfd.c new file mode 100644 index 000000000000..50fc5706e85b --- /dev/null +++ b/arch/arm/kernel/trampfd.c @@ -0,0 +1,214 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trampoline File Descriptor - ARM support. + * + * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com) + * + * Copyright (c) 2020, Microsoft Corporation. 
+ */ + +#include +#include +#include + +/* ---------------------------- Register Context ---------------------------- */ + +static void set_reg(long *uregs, u32 name, u64 value) +{ + switch (name) { + case arm_r0: + case arm_r1: + case arm_r2: + case arm_r3: + case arm_r4: + case arm_r5: + case arm_r6: + case arm_r7: + case arm_r8: + case arm_r9: + case arm_r10: + uregs[name] = (__u64)value; + break; + case arm_ip: + ARM_ip = (__u64)value; + break; + case arm_pc: + ARM_pc = (__u64)value; + break; + default: + WARN(1, "%s: Illegal register name %d\n", __func__, name); + break; + } +} + +static void set_regs(long *uregs, struct trampfd_regs *tregs) +{ + struct trampfd_reg *reg = tregs->regs; + struct trampfd_reg *reg_end = reg + tregs->nregs; + + for (; reg < reg_end; reg++) + set_reg(uregs, reg->name, reg->value); +} + +/* + * Check if the register names are valid. Check if the user PC has been set. + */ +bool trampfd_valid_regs(struct trampfd_regs *tregs) +{ + struct trampfd_reg *reg = tregs->regs; + struct trampfd_reg *reg_end = reg + tregs->nregs; + bool pc_set = false; + + for (; reg < reg_end; reg++) { + if (reg->name >= arm_max || reg->reserved) + return false; + if (reg->name == arm_pc && reg->value) + pc_set = true; + } + return pc_set; +} +EXPORT_SYMBOL_GPL(trampfd_valid_regs); + +/* + * Check if the PC specified in a register context is allowed. + */ +bool trampfd_allowed_pc(struct trampfd *trampfd, struct trampfd_regs *tregs) +{ + struct trampfd_reg *reg = tregs->regs; + struct trampfd_reg *reg_end = reg + tregs->nregs; + struct trampfd_values *allowed_pcs = trampfd->allowed_pcs; + u64 *allowed_values, pc_value = 0; + u32 nvalues, pc_name; + int i; + + if (!allowed_pcs) + return true; + + pc_name = arm_pc; + + /* + * Find the PC register and its value. If the PC register has been + * specified multiple times, only the last one counts. 
+ */ + for (; reg < reg_end; reg++) { + if (reg->name == pc_name) + pc_value = reg->value; + } + + allowed_values = allowed_pcs->values; + nvalues = allowed_pcs->nvalues; + + for (i = 0; i < nvalues; i++) { + if (pc_value == allowed_values[i]) + return true; + } + return false; +} +EXPORT_SYMBOL_GPL(trampfd_allowed_pc); + +/* ---------------------------- Stack Context ---------------------------- */ + +static int push_data(long *uregs, struct trampfd_stack *tstack) +{ + unsigned long sp; + + sp = ARM_sp - tstack->size - tstack->offset; + if (tstack->flags & TRAMPFD_SET_SP) + sp &= ~7; + + if (!access_ok(sp, ARM_sp - sp)) + return -EFAULT; + + if (copy_to_user(USERPTR(sp), tstack->data, tstack->size)) + return -EFAULT; + + if (tstack->flags & TRAMPFD_SET_SP) + ARM_sp = sp; + return 0; +} + +/* ---------------------------- Fault Handlers ---------------------------- */ + +static int trampfd_user_fault(struct trampfd *trampfd, + struct vm_area_struct *vma, + long *uregs) +{ + char buf[TRAMPFD_MAX_STACK_SIZE]; + struct trampfd_regs *tregs; + struct trampfd_stack *tstack = NULL; + unsigned long addr; + size_t size; + int rc = 0; + + mutex_lock(&trampfd->lock); + + /* + * Execution of the trampoline must start at the offset specified by + * the kernel. + */ + addr = vma->vm_start + trampfd->map.ioffset; + if (addr != ARM_pc) { + rc = -EINVAL; + goto unlock; + } + + /* + * At a minimum, the user PC register must be specified for a + * user trampoline. + */ + tregs = trampfd->regs; + if (!tregs) { + rc = -EINVAL; + goto unlock; + } + + /* + * Set the register context for the trampoline. + */ + set_regs(uregs, tregs); + + if (trampfd->stack) { + /* + * Copy the stack context into a local buffer and push stack + * data after dropping the lock.
+ */ + size = sizeof(*trampfd->stack) + trampfd->stack->size; + tstack = (struct trampfd_stack *) buf; + memcpy(tstack, trampfd->stack, size); + } +unlock: + mutex_unlock(&trampfd->lock); + + if (!rc && tstack) { + mmap_read_unlock(vma->vm_mm); + rc = push_data(uregs, tstack); + mmap_read_lock(vma->vm_mm); + } + return rc; +} + +/* + * Handle it if it is a trampoline fault. + */ +bool trampfd_fault(struct vm_area_struct *vma, struct pt_regs *pt_regs) +{ + struct trampfd *trampfd; + unsigned long *uregs = pt_regs->uregs; + + if (!is_trampfd_vma(vma)) + return false; + trampfd = vma->vm_private_data; + + if (trampfd->type == TRAMPFD_USER) + return !trampfd_user_fault(trampfd, vma, uregs); + return false; +} +EXPORT_SYMBOL_GPL(trampfd_fault); + +/* ---------------------------- Miscellaneous ---------------------------- */ + +int trampfd_check_arch(struct trampfd *trampfd) +{ + return 0; +} +EXPORT_SYMBOL_GPL(trampfd_check_arch); diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index c6550eddfce1..21a81d19336b 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -202,7 +203,8 @@ static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma) static vm_fault_t __kprobes __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr, - unsigned int flags, struct task_struct *tsk) + unsigned int flags, struct task_struct *tsk, + struct pt_regs *regs) { struct vm_area_struct *vma; vm_fault_t fault; @@ -220,6 +222,12 @@ __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr, */ good_area: if (access_error(fsr, vma)) { + /* + * If it is an execute fault, it could be a trampoline + * invocation. 
+ */ + if ((fsr & FSR_LNX_PF) && trampfd_fault(vma, regs)) + return 0; fault = VM_FAULT_BADACCESS; goto out; } @@ -290,7 +298,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) #endif } - fault = __do_page_fault(mm, addr, fsr, flags, tsk); + fault = __do_page_fault(mm, addr, fsr, flags, tsk, regs); /* If we need to retry but a fatal signal is pending, handle the * signal first. We do not need to release the mmap_lock because diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index d5cae5ffede0..88cf4c45069a 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -452,3 +452,4 @@ 437 common openat2 sys_openat2 438 common pidfd_getfd sys_pidfd_getfd 439 common faccessat2 sys_faccessat2 +440 common trampfd_create sys_trampfd_create
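
For reviewers who want to check the arithmetic without building a kernel, the allowed-PC scan in trampfd_allowed_pc() and the stack-pointer computation in push_data() can be approximated in user space. The following is only a minimal sketch under stated assumptions, not part of the patch. The sketch_* names and struct sketch_reg are hypothetical stand-ins for the kernel types. SKETCH_PC is 12 because arm_pc is the thirteenth entry of enum reg_32_name above. The 8-byte rounding follows the 32-bit ARM code ("sp &= ~7"); the arm64 code rounds down to 16 instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct trampfd_reg; illustrative only. */
struct sketch_reg {
	uint32_t name;
	uint64_t value;
};

/* arm_pc is entry 12 of enum reg_32_name (arm_r0 == 0 ... arm_ip == 11). */
#define SKETCH_PC 12u

/*
 * Approximates the allowed-PC scan: if the PC register is listed several
 * times, only the last value counts. A NULL allowed list means that no
 * restriction is configured, so every PC is accepted.
 */
static bool sketch_allowed_pc(const struct sketch_reg *regs, size_t nregs,
			      const uint64_t *allowed, size_t nallowed)
{
	uint64_t pc = 0;
	size_t i;

	if (!allowed)
		return true;

	for (i = 0; i < nregs; i++) {
		if (regs[i].name == SKETCH_PC)
			pc = regs[i].value;
	}

	for (i = 0; i < nallowed; i++) {
		if (allowed[i] == pc)
			return true;
	}
	return false;
}

/*
 * Approximates the address computation in the 32-bit push_data(): reserve
 * size + offset bytes below the current SP and, only when the SP itself is
 * to be updated, round the result down to an 8-byte boundary.
 */
static uint32_t sketch_push_sp(uint32_t sp, uint32_t size, uint32_t offset,
			       bool set_sp)
{
	uint32_t new_sp = sp - size - offset;

	if (set_sp)
		new_sp &= ~UINT32_C(7);
	return new_sp;
}
```

With a current SP of 0x1000 and 20 bytes of stack data at offset 0, the data would land at 0xfec when the SP is left alone and at 0xfe8 when TRAMPFD_SET_SP is requested. The sketch deliberately leaves out the access_ok() and copy_to_user() steps, which have no user-space equivalent.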