From patchwork Mon Mar 3 05:09:14 2025
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 13998188
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com,
    torvalds@linux-foundation.org, vbabka@suse.cz,
    lorenzo.stoakes@oracle.com, Liam.Howlett@Oracle.com,
    adhemerval.zanella@linaro.org, oleg@redhat.com, avagin@gmail.com,
    benjamin@sipsolutions.net
Cc: linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
    linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
    jorgelo@chromium.org, sroettger@google.com, hch@lst.de,
    ojeda@kernel.org, thomas.weissschuh@linutronix.de,
    adobriyan@gmail.com, johannes@sipsolutions.net,
    pedro.falcato@gmail.com, hca@linux.ibm.com, willy@infradead.org,
    anna-maria@linutronix.de, mark.rutland@arm.com,
    linus.walleij@linaro.org, Jason@zx2c4.com, deller@gmx.de,
    rdunlap@infradead.org, davem@davemloft.net, peterx@redhat.com,
    f.fainelli@gmail.com, gerg@kernel.org, dave.hansen@linux.intel.com,
    mingo@kernel.org, ardb@kernel.org, mhocko@suse.com,
    42.hyeyoo@gmail.com, peterz@infradead.org, ardb@google.com,
    enh@google.com, rientjes@google.com, groeck@chromium.org,
    mpe@ellerman.id.au, aleksandr.mikhalitsyn@canonical.com,
    mike.rapoport@gmail.com, Jeff Xu
Subject: [PATCH v8 0/7] mseal system mappings
Date: Mon, 3 Mar 2025 05:09:14 +0000
Message-ID: <20250303050921.3033083-1-jeffxu@google.com>
X-Mailer: git-send-email 2.48.1.711.g2feabab25a-goog
MIME-Version: 1.0

From: Jeff Xu

This is the V8 version, addressing comments from V7 without changing
the code logic.

-------------------------------------------------------------------
As discussed during the mseal() upstream process [1], mseal() protects
the VMAs of a given virtual memory range against modifications, such
as changes to the read/write (RW) and no-execute (NX) bits.
For complete descriptions of memory sealing, please see mseal.rst [2].

mseal() is useful for mitigating memory corruption issues where a
corrupted pointer is passed to a memory management system. For
example, such an attacker primitive can break control-flow integrity
guarantees, since read-only memory that is supposed to be trusted can
become writable or .text pages can get remapped.

The system mappings are read-only; memory sealing can protect them
from ever being changed to writable, or from being unmapped and
remapped with different attributes.

System mappings such as vdso, vvar, vvar_vclock,
vectors (arm compat-mode), and sigpage (arm compat-mode) are created
by the kernel during program initialization, and can be sealed after
creation.

Unlike the aforementioned mappings, the uprobe mapping is not
established during program startup. However, its lifetime is the same
as the process's lifetime [3], so it can be sealed at creation.

The vsyscall on x86-64 uses a special address (0xffffffffff600000),
which is outside the mm managed range. This means mprotect, munmap,
and mremap won't work on the vsyscall. Since sealing doesn't enhance
the vsyscall's security, it is skipped in this patch. If we ever seal
the vsyscall, it will probably be for decorative purposes only, i.e.
showing the 'sl' flag in /proc/pid/smaps. For this patch, it is
ignored.

It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
alter the system mappings during restore operations. UML (User Mode
Linux), gVisor, and rr are also known to change the vdso/vvar
mappings. Consequently, this feature cannot be universally enabled
across all systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled
by default.

To support mseal of system mappings, architectures must define
CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special
mapping calls to pass the mseal flag. Additionally, architectures must
confirm they do not unmap/remap system mappings during the process
lifetime.
The existence of this flag for an architecture implies that it does
not require the remapping of these system mappings during process
lifetime, so sealing these mappings is safe from a kernel perspective.

This version covers the x86-64 and arm64 architectures as a minimum
viable feature.

While no specific CPU hardware features are required to enable this
feature on an architecture, memory sealing requires a 64-bit kernel.
Other architectures can choose whether or not to adopt this feature.
Currently, I'm not aware of any instances in the kernel code that
actively munmap/mremap a system mapping without a request from
userspace. The PPC does call munmap when _install_special_mapping
fails for vdso; however, it's uncertain if this will ever fail for
PPC - this needs to be investigated by PPC in the future [4]. The UML
kernel can add this support when KUnit tests require it [5].

In this version, we've improved the handling of system mapping sealing
compared with previous versions. Instead of modifying the
_install_special_mapping function itself, which would affect all
architectures, we now call _install_special_mapping with a sealing
flag only within the specific architecture that requires it. This
targeted approach offers two key advantages: 1) It limits the code
change's impact to the necessary architectures, and 2) It aligns with
the software architecture by keeping the core memory management within
the mm layer, while delegating the decision of sealing system mappings
to the individual architecture, which is particularly relevant since
32-bit architectures never require sealing.

Prior to this patch series, we explored sealing special mappings from
userspace using glibc's dynamic linker. This approach revealed several
issues:
- The PT_LOAD header may report an incorrect length for vdso (smaller
  than its actual size). The dynamic linker, which relies on PT_LOAD
  information to determine mapping size, would then split and
  partially seal the vdso mapping.
  Since each architecture has its own vdso/vvar code, fixing this in
  the kernel would require going through each architecture. Our
  initial goal was to enable sealing read-only mappings, e.g. .text,
  across all architectures; sealing vdso from the kernel at creation
  appears to be simpler than sealing vdso in glibc.
- The [vvar] mapping header only contains address information, not
  length information. Similar issues might exist for other special
  mappings.
- Mappings like uprobe are not covered by the dynamic linker, and
  there is no effective solution for them.

This feature's security enhancements will benefit ChromeOS, Android,
and other high security systems.

Testing:
This feature was tested on ChromeOS and Android for both x86-64 and
ARM64.
- Enable sealing and verify that vdso/vvar, sigpage, and vectors are
  sealed properly, i.e. "sl" is shown in smaps for those mappings, and
  mremap is blocked.
- Pass various automated tests (e.g. pre-checkin) on ChromeOS and
  Android to ensure the sealing doesn't affect the functionality of
  Chromebooks and Android phones.

I also tested the feature on Ubuntu on x86-64:
- With the config disabled, vdso/vvar is not sealed.
- With the config enabled, vdso/vvar is sealed, Ubuntu boots up OK,
  and normal operations such as browsing the web and opening/editing
  documents are OK.

Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1]
Link: Documentation/userspace-api/mseal.rst [2]
Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3]
Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQSt+HA@mail.gmail.com/ [4]
Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5]

-------------------------------------------
History:

V8:
  - Change ARCH_SUPPORTS_MSEAL_X to ARCH_SUPPORTS_MSEAL_X (Liam R. Howlett)
  - Update comments in Kconfig and mseal.rst (Lorenzo Stoakes, Liam R.
    Howlett)
  - Change patch header prefix to "mseal sysmap" (Lorenzo Stoakes)
  - Remove "vm_flags =" (Kees Cook, Liam R. Howlett, Oleg Nesterov)
  - Drop uml architecture (Lorenzo Stoakes, Kees Cook)
  - Add a selftest to verify system mappings are sealed (Lorenzo Stoakes)

V7:
  https://lore.kernel.org/all/20250224225246.3712295-1-jeffxu@google.com/
  - Remove cover letter from the first patch (Liam R. Howlett)
  - Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett)
  - logging and fclose() in selftest (Liam R. Howlett)

V6:
  https://lore.kernel.org/all/20250224174513.3600914-1-jeffxu@google.com/
  - mseal.rst: fix a typo (Randy Dunlap)
  - security/Kconfig: add rr into note (Liam R. Howlett)
  - remove mseal_system_mappings() and use macro instead (Liam R. Howlett)
  - mseal.rst: add incompatible userland software (Lorenzo Stoakes)
  - remove RFC from title (Kees Cook)

V5:
  https://lore.kernel.org/all/20250212032155.1276806-1-jeffxu@google.com/
  - Remove kernel cmd line (Lorenzo Stoakes)
  - Add test info (Lorenzo Stoakes)
  - Add threat model info (Lorenzo Stoakes)
  - Fix x86 selftest: test_mremap_vdso
  - Restrict code change to ARM64/x86-64/UM arch only.
  - Add userprocess.h to include seal_system_mapping().
  - Remove sealing vsyscall.
  - Split the patch.

V4:
  https://lore.kernel.org/all/20241125202021.3684919-1-jeffxu@google.com/
  - ARCH_HAS_SEAL_SYSTEM_MAPPINGS (Lorenzo Stoakes)
  - test info (Lorenzo Stoakes)
  - Update mseal.rst (Liam R. Howlett)
  - Update test_mremap_vdso.c (Liam R. Howlett)
  - Misc. style, comments, doc update (Liam R. Howlett)

V3:
  https://lore.kernel.org/all/20241113191602.3541870-1-jeffxu@google.com/
  - Revert uprobe to v1 logic (Oleg Nesterov)
  - use CONFIG_SEAL_SYSTEM_MAPPINGS instead of _ALWAYS/_NEVER (Kees Cook)
  - Move kernel cmd line from fs/exec.c to mm/mseal.c and misc. (Liam R.
    Howlett)

V2:
  https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@google.com/
  - Seal uprobe always (Oleg Nesterov)
  - Update comments and description (Randy Dunlap, Liam R. Howlett,
    Oleg Nesterov)
  - Rebase to linux_main

V1:
  https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@google.com/

--------------------------------------------------

Jeff Xu (7):
  mseal sysmap: kernel config and header change
  selftests: x86: test_mremap_vdso: skip if vdso is msealed
  mseal sysmap: enable x86-64
  mseal sysmap: enable arm64
  mseal sysmap: uprobe mapping
  mseal sysmap: update mseal.rst
  selftest: test system mappings are sealed.

 Documentation/userspace-api/mseal.rst        |  20 ++++
 arch/arm64/Kconfig                           |   1 +
 arch/arm64/kernel/vdso.c                     |  12 +-
 arch/x86/Kconfig                             |   1 +
 arch/x86/entry/vdso/vma.c                    |   7 +-
 include/linux/mm.h                           |  10 ++
 init/Kconfig                                 |  22 ++++
 kernel/events/uprobes.c                      |   3 +-
 security/Kconfig                             |  21 ++++
 .../mseal_system_mappings/.gitignore         |   2 +
 .../selftests/mseal_system_mappings/Makefile |   6 +
 .../selftests/mseal_system_mappings/config   |   1 +
 .../mseal_system_mappings/sysmap_is_sealed.c | 113 ++++++++++++++++++
 .../testing/selftests/x86/test_mremap_vdso.c |  43 +++++++
 14 files changed, 254 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/mseal_system_mappings/.gitignore
 create mode 100644 tools/testing/selftests/mseal_system_mappings/Makefile
 create mode 100644 tools/testing/selftests/mseal_system_mappings/config
 create mode 100644 tools/testing/selftests/mseal_system_mappings/sysmap_is_sealed.c
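
--------------------------------------------------
Illustration only (not part of the series): the manual check described
under "Testing", i.e. looking for "sl" in the VmFlags line of the system
mappings in smaps, could be sketched roughly as below. This is a
userspace sketch in Python, not the sysmap_is_sealed.c selftest from
this series. The function name, the set of mapping names, and the SAMPLE
text are assumptions made up for the example; the names that actually
appear in smaps may differ per architecture and kernel version.

```python
import re

# Assumed smaps names for the system mappings discussed in this letter.
SYSTEM_MAPPINGS = {
    "[vdso]", "[vvar]", "[vvar_vclock]",
    "[sigpage]", "[vectors]", "[uprobes]",
}

# A mapping header line: "start-end perms offset dev inode   name"
HEADER_RE = re.compile(
    r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$"
)

# Made-up sample in smaps layout, trimmed to the fields used here.
SAMPLE = """\
7ffe0a1f2000-7ffe0a1f6000 r--p 00000000 00:00 0                          [vvar]
Size:                 16 kB
VmFlags: rd mr pf io de dd sd sl
7ffe0a1f6000-7ffe0a1f8000 r-xp 00000000 00:00 0                          [vdso]
Size:                  8 kB
VmFlags: rd ex mr mw me de sd sl
7ffe0a1d0000-7ffe0a1f1000 rw-p 00000000 00:00 0                          [stack]
Size:                132 kB
VmFlags: rd wr mr mw me gd ac
"""


def sealed_system_mappings(smaps_text):
    """Map each system mapping name to True if all its VMAs carry 'sl'."""
    result = {}
    current = None  # name of the system mapping being parsed, if any
    for line in smaps_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            name = header.group(1).strip()
            current = name if name in SYSTEM_MAPPINGS else None
        elif current is not None and line.startswith("VmFlags:"):
            sealed = "sl" in line.split(":", 1)[1].split()
            # A name may cover several VMAs; require all to be sealed.
            result[current] = result.get(current, True) and sealed
    return result


report = sealed_system_mappings(SAMPLE)
for name in sorted(report):
    print(name, "sealed" if report[name] else "not sealed")
```

To check a live process, pass the contents of /proc/self/smaps (or
/proc/<pid>/smaps) instead of SAMPLE. With
CONFIG_MSEAL_SYSTEM_MAPPINGS disabled, the mappings would be expected
to be reported as not sealed.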