From patchwork Tue Dec 12 23:17:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 13490089
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com,
 sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org,
 torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com,
 linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 06/11] mseal: add sealing support for mmap
Date: Tue, 12 Dec 2023 23:17:00 +0000
Message-ID: <20231212231706.2680890-7-jeffxu@chromium.org>
X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Allow mmap() to set the sealing type when creating a mapping. This is
useful for optimization because it avoids having to make two system
calls: one for mmap() and one for mseal(). With this change, mmap() can
take an input that specifies the sealing type, so only one system call
is needed.

This patch uses the "prot" field of mmap() to set the sealing.
Three sealing types are added to match MM_SEAL_xyz in mseal():
PROT_SEAL_SEAL
PROT_SEAL_BASE
PROT_SEAL_PROT_PKEY

We also considered adding MAP_SEAL_xyz bits to the "flags" argument of
mmap(). However, "flags" is more about the type of mapping, such as
MAP_FIXED_NOREPLACE, so the "prot" field seems more appropriate for our
case.

It's worth noting that even though the sealing type is set via the
"prot" field in mmap(), we don't require it to be set in the "prot"
field of a later mprotect() call. This is unlike the PROT_READ,
PROT_WRITE and PROT_EXEC bits: for example, if PROT_WRITE is not set in
mprotect(), the region is not writable. In other words, if you set
PROT_SEAL_PROT_PKEY in mmap(), you don't need to set it again in
mprotect(). In fact, with the current approach, mseal() is what sets
sealing on an existing VMA.

Signed-off-by: Jeff Xu
Suggested-by: Pedro Falcato
---
 arch/mips/kernel/vdso.c                | 10 +++-
 include/linux/mm.h                     | 63 +++++++++++++++++++++++++-
 include/uapi/asm-generic/mman-common.h | 13 ++++++
 mm/mmap.c                              | 25 ++++++++--
 4 files changed, 105 insertions(+), 6 deletions(-)

diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index f6d40e43f108..6d1103d36af1 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -98,11 +98,17 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 		return -EINTR;
 
 	if (IS_ENABLED(CONFIG_MIPS_FP_SUPPORT)) {
-		/* Map delay slot emulation page */
+		/*
+		 * Map delay slot emulation page.
+		 *
+		 * Note: passing vm_seals = 0
+		 * Don't support sealing for vdso for now.
+		 * This might change when we add sealing support for vdso.
+		 */
 		base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 				VM_READ | VM_EXEC |
 				VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
-				0, NULL);
+				0, NULL, 0);
 		if (IS_ERR_VALUE(base)) {
 			ret = base;
 			goto out;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2435acc1f44f..5d3ee79f1438 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -266,6 +266,15 @@ extern unsigned int kobjsize(const void *objp);
 	MM_SEAL_BASE | \
 	MM_SEAL_PROT_PKEY)
 
+/*
+ * PROT_SEAL_ALL is all supported flags in mmap().
+ * See include/uapi/asm-generic/mman-common.h.
+ */
+#define PROT_SEAL_ALL ( \
+	PROT_SEAL_SEAL | \
+	PROT_SEAL_BASE | \
+	PROT_SEAL_PROT_PKEY)
+
 /*
  * vm_flags in vm_area_struct, see mm_types.h.
  * When changing, update also include/trace/events/mmflags.h.
@@ -3290,7 +3299,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf);
+	struct list_head *uf, unsigned long vm_seals);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
@@ -3339,12 +3348,47 @@ static inline unsigned long vma_seals(struct vm_area_struct *vma)
 	return (vma->vm_seals & MM_SEAL_ALL);
 }
 
+static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm_seals)
+{
+	vma->vm_seals |= vm_seals;
+}
+
 extern bool can_modify_mm(struct mm_struct *mm, unsigned long start,
 		unsigned long end, unsigned long checkSeals);
 
 extern bool can_modify_vma(struct vm_area_struct *vma,
 		unsigned long checkSeals);
 
+/*
+ * Convert prot field of mmap to vm_seals type.
+ */
+static inline unsigned long convert_mmap_seals(unsigned long prot)
+{
+	unsigned long seals = 0;
+
+	/*
+	 * set SEAL_PROT_PKEY implies SEAL_BASE.
+	 */
+	if (prot & PROT_SEAL_PROT_PKEY)
+		prot |= PROT_SEAL_BASE;
+
+	/*
+	 * The seal bits start from bit 26 of the "prot" field of mmap.
+	 * see comments in include/uapi/asm-generic/mman-common.h.
+	 */
+	seals = (prot & PROT_SEAL_ALL) >> PROT_SEAL_BIT_BEGIN;
+	return seals;
+}
+
+/*
+ * check input sealing type from the "prot" field of mmap().
+ * for CONFIG_MSEAL case, this always return 0 (successful).
+ */
+static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals)
+{
+	*vm_seals = convert_mmap_seals(prot);
+	return 0;
+}
 #else
 static inline bool check_vma_seals_mergeable(unsigned long vm_seals1)
 {
@@ -3367,6 +3411,23 @@ static inline bool can_modify_vma(struct vm_area_struct *vma,
 {
 	return true;
 }
+
+static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm_seals)
+{
+}
+
+/*
+ * check input sealing type from the "prot" field of mmap().
+ * For not CONFIG_MSEAL, if SEAL flag is set, it will return failure.
+ */
+static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals)
+{
+	if (prot & PROT_SEAL_ALL)
+		return -EINVAL;
+
+	return 0;
+}
+
 #endif
 
 /* These take the mm semaphore themselves */
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 6ce1f1ceb432..f07ad9e70b3a 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -17,6 +17,19 @@
 #define PROT_GROWSDOWN	0x01000000	/* mprotect flag: extend change to start of growsdown vma */
 #define PROT_GROWSUP	0x02000000	/* mprotect flag: extend change to end of growsup vma */
 
+/*
+ * The PROT_SEAL_XX defines memory sealings flags in the prot argument
+ * of mmap(). The bits currently take consecutive bits and match
+ * the same sequence as MM_SEAL_XX type, this allows convert_mmap_seals()
+ * to convert prot to MM_SEAL_XX type using bit operations.
+ * The include/uapi/linux/mman.h header file defines the MM_SEAL_XX type,
+ * which is used by the mseal() system call.
+ */
+#define PROT_SEAL_BIT_BEGIN	26
+#define PROT_SEAL_SEAL		_BITUL(PROT_SEAL_BIT_BEGIN) /* 0x04000000 seal seal */
+#define PROT_SEAL_BASE		_BITUL(PROT_SEAL_BIT_BEGIN + 1) /* 0x08000000 base for all sealing types */
+#define PROT_SEAL_PROT_PKEY	_BITUL(PROT_SEAL_BIT_BEGIN + 2) /* 0x10000000 seal prot and pkey */
+
 /* 0x01 - 0x03 are defined in linux/mman.h */
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
diff --git a/mm/mmap.c b/mm/mmap.c
index dbc557bd460c..3e1bf5a131b0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1211,6 +1211,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	int pkey = 0;
+	unsigned long vm_seals = 0;
 
 	*populate = 0;
 
@@ -1231,6 +1232,9 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (flags & MAP_FIXED_NOREPLACE)
 		flags |= MAP_FIXED;
 
+	if (check_mmap_seals(prot, &vm_seals) < 0)
+		return -EINVAL;
+
 	if (!(flags & MAP_FIXED))
 		addr = round_hint_to_min(addr);
 
@@ -1381,7 +1385,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, vm_seals);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -2679,7 +2683,7 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long vm_seals)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma = NULL;
@@ -2723,7 +2727,13 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 
 	next = vma_next(&vmi);
 	prev = vma_prev(&vmi);
-	if (vm_flags & VM_SPECIAL) {
+	/*
+	 * For now, sealed VMA doesn't merge with other VMA,
+	 * Will change this in later commit when we make sealed VMA
+	 * also mergeable.
+	 */
+	if ((vm_flags & VM_SPECIAL) ||
+		(vm_seals & MM_SEAL_ALL)) {
 		if (prev)
 			vma_iter_next_range(&vmi);
 		goto cannot_expand;
@@ -2781,6 +2791,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	vma->vm_page_prot = vm_get_page_prot(vm_flags);
 	vma->vm_pgoff = pgoff;
 
+	update_vma_seals(vma, vm_seals);
+
 	if (file) {
 		if (vm_flags & VM_SHARED) {
 			error = mapping_map_writable(file->f_mapping);
@@ -2992,6 +3004,13 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	if (pgoff + (size >> PAGE_SHIFT) < pgoff)
 		return ret;
 
+	/*
+	 * Do not support sealing in remap_file_page.
+	 * sealing is set via mmap() and mseal().
+	 */
+	if (prot & PROT_SEAL_ALL)
+		return ret;
+
 	if (mmap_write_lock_killable(mm))
 		return -EINTR;
 
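
For anyone who wants to check the bit arithmetic, the prot-to-seal
conversion in this patch can be mirrored in a small userspace sketch.
It is illustrative only, not part of the patch. The PROT_SEAL_* values
are copied from the mman-common.h hunk. The MM_SEAL_* values (bits 0
to 2) are an assumption, based on the comment that both sets use the
same bit sequence. prot_to_seals() and check_prot_seals() are
hypothetical stand-ins for convert_mmap_seals() and check_mmap_seals(),
and the mseal_enabled argument stands in for CONFIG_MSEAL.

```c
#include <assert.h>
#include <errno.h>

/* Values copied from the mman-common.h hunk of this patch. */
#define PROT_SEAL_BIT_BEGIN	26
#define PROT_SEAL_SEAL		(1UL << PROT_SEAL_BIT_BEGIN)		/* 0x04000000 */
#define PROT_SEAL_BASE		(1UL << (PROT_SEAL_BIT_BEGIN + 1))	/* 0x08000000 */
#define PROT_SEAL_PROT_PKEY	(1UL << (PROT_SEAL_BIT_BEGIN + 2))	/* 0x10000000 */
#define PROT_SEAL_ALL \
	(PROT_SEAL_SEAL | PROT_SEAL_BASE | PROT_SEAL_PROT_PKEY)

/*
 * ASSUMPTION: the MM_SEAL_* types occupy bits 0..2 in the same order.
 * The real definitions live elsewhere in the series and are not shown
 * in this patch.
 */
#define MM_SEAL_SEAL		(1UL << 0)
#define MM_SEAL_BASE		(1UL << 1)
#define MM_SEAL_PROT_PKEY	(1UL << 2)

/* Hypothetical userspace mirror of convert_mmap_seals(). */
static unsigned long prot_to_seals(unsigned long prot)
{
	/* Setting PROT_SEAL_PROT_PKEY implies PROT_SEAL_BASE. */
	if (prot & PROT_SEAL_PROT_PKEY)
		prot |= PROT_SEAL_BASE;

	/* Drop the non-seal bits, then shift the seal bits down to bit 0. */
	return (prot & PROT_SEAL_ALL) >> PROT_SEAL_BIT_BEGIN;
}

/*
 * Hypothetical mirror of check_mmap_seals(). mseal_enabled selects
 * between the CONFIG_MSEAL and !CONFIG_MSEAL variants at run time.
 */
static int check_prot_seals(unsigned long prot, unsigned long *seals,
			    int mseal_enabled)
{
	if (!mseal_enabled) {
		/* Without sealing support, any seal bit is rejected. */
		if (prot & PROT_SEAL_ALL)
			return -EINVAL;
		return 0;
	}

	*seals = prot_to_seals(prot);
	return 0;
}
```

Under these assumptions, passing PROT_READ | PROT_SEAL_PROT_PKEY gives
MM_SEAL_BASE | MM_SEAL_PROT_PKEY, because the base seal is implied and
the PROT_READ bit is masked out before the shift.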