From patchwork Fri May 31 19:14:52 2024
X-Patchwork-Submitter: Kees Cook
X-Patchwork-Id: 13682014
From: Kees Cook <kees@kernel.org>
To: Vlastimil Babka
Cc: Kees Cook, "GONG, Ruiqi", Christoph Lameter, Pekka Enberg,
 David Rientjes, Joonsoo Kim, jvoisin, Andrew Morton, Roman Gushchin,
 Hyeonggon Yoo <42.hyeyoo@gmail.com>, Xiu Jianfeng, Suren Baghdasaryan,
 Kent Overstreet, Jann Horn, Matteo Rizzo, Thomas Graf, Herbert Xu,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-hardening@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH v4 0/6] slab: Introduce dedicated bucket allocator
Date: Fri, 31 May 2024 12:14:52 -0700
Message-Id: <20240531191304.it.853-kees@kernel.org>

Hi,

 v4:
  - Rebase to v6.10-rc1
  - Add CONFIG_SLAB_BUCKETS to turn off the feature
 v3: https://lore.kernel.org/lkml/20240424213019.make.366-kees@kernel.org/
 v2: https://lore.kernel.org/lkml/20240305100933.it.923-kees@kernel.org/
 v1: https://lore.kernel.org/lkml/20240304184252.work.496-kees@kernel.org/

For the cover letter, I'm repeating the commit log for patch 4 here, which
has additional clarifications and rationale since v2:

    Dedicated caches are available for fixed size allocations via
    kmem_cache_alloc(), but for dynamically sized allocations there is only
    the global kmalloc API's set of buckets available. This means it isn't
    possible to separate specific sets of dynamically sized allocations into
    a separate collection of caches.

    This leads to a use-after-free exploitation weakness in the Linux
    kernel since many heap memory spraying/grooming attacks depend on using
    userspace-controllable dynamically sized allocations to collide with
    fixed size allocations that end up in the same cache.

    While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense
    against these kinds of "type confusion" attacks, including for fixed
    same-size heap objects, we can create a complementary deterministic
    defense for dynamically sized allocations that are directly user
    controlled. Addressing these cases is limited in scope, so isolating
    these kinds of interfaces will not become an unbounded game of
    whack-a-mole. For example, many pass through memdup_user(), making
    isolation there very effective.

    In order to isolate user-controllable sized allocations from system
    allocations, introduce kmem_buckets_create(), which behaves like
    kmem_cache_create(). Introduce kmem_buckets_alloc(), which behaves like
    kmem_cache_alloc(). Introduce kmem_buckets_alloc_track_caller() for
    where caller tracking is needed. Introduce kmem_buckets_valloc() for
    cases where vmalloc fallback is needed.

    This allows for confining allocations to a dedicated set of sized
    caches (which have the same layout as the kmalloc caches).

    This can also be used in the future to extend codetag allocation
    annotations to implement per-caller allocation cache isolation[1] even
    for dynamic allocations.

    Memory allocation pinning[2] is still needed to plug the Use-After-Free
    cross-allocator weakness, but that is an existing and separate issue
    which is complementary to this improvement. Development continues for
    that feature via the SLAB_VIRTUAL[3] series (which could also provide
    guard pages -- another complementary improvement).

    Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1]
    Link: https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html [2]
    Link: https://lore.kernel.org/lkml/20230915105933.495735-1-matteorizzo@google.com/ [3]

After the core implementation are 2 patches that cover the most heavily
abused "repeat offenders" used in exploits. Repeating those details here:

    The msg subsystem is a common target for exploiting[1][2][3][4][5][6][7]
    use-after-free type confusion flaws in the kernel for both read and
    write primitives. Avoid having a user-controlled size cache share the
    global kmalloc allocator by using a separate set of kmalloc buckets.

    Link: https://blog.hacktivesecurity.com/index.php/2022/06/13/linux-kernel-exploit-development-1day-case-study/ [1]
    Link: https://hardenedvault.net/blog/2022-11-13-msg_msg-recon-mitigation-ved/ [2]
    Link: https://www.willsroot.io/2021/08/corctf-2021-fire-of-salvation-writeup.html [3]
    Link: https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html [4]
    Link: https://google.github.io/security-research/pocs/linux/cve-2021-22555/writeup.html [5]
    Link: https://zplin.me/papers/ELOISE.pdf [6]
    Link: https://syst3mfailure.io/wall-of-perdition/ [7]

    Both memdup_user() and vmemdup_user() handle allocations that are
    regularly used for exploiting use-after-free type confusion flaws in
    the kernel (e.g. prctl() PR_SET_VMA_ANON_NAME[1] and setxattr[2][3][4]
    respectively).

    Since both are designed for contents coming from userspace, they allow
    for userspace-controlled allocation sizes. Use a dedicated set of
    kmalloc buckets so these allocations do not share caches with the
    global kmalloc buckets.

    Link: https://starlabs.sg/blog/2023/07-prctl-anon_vma_name-an-amusing-heap-spray/ [1]
    Link: https://duasynt.com/blog/linux-kernel-heap-spray [2]
    Link: https://etenal.me/archives/1336 [3]
    Link: https://github.com/a13xp0p0v/kernel-hack-drill/blob/master/drill_exploit_uaf.c [4]

Thanks!

-Kees

Kees Cook (6):
  mm/slab: Introduce kmem_buckets typedef
  mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
  mm/slab: Introduce kvmalloc_buckets_node() that can take kmem_buckets
    argument
  mm/slab: Introduce kmem_buckets_create() and family
  ipc, msg: Use dedicated slab buckets for alloc_msg()
  mm/util: Use dedicated slab buckets for memdup_user()

 include/linux/slab.h | 70 ++++++++++++++++++++++++++++-------
 ipc/msgutil.c        | 13 ++++++-
 lib/rhashtable.c     |  2 +-
 mm/Kconfig           | 15 ++++++++
 mm/slab.h            |  6 ++-
 mm/slab_common.c     | 87 ++++++++++++++++++++++++++++++++++++++++++--
 mm/slub.c            | 34 ++++++++++++-----
 mm/util.c            | 29 +++++++++++----
 8 files changed, 217 insertions(+), 39 deletions(-)
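
For readers who want a concrete picture of the isolation idea in the
commit logs above, here is a rough userspace toy model. It is only a
sketch and not the kernel implementation: every toy_* name, the eight
power-of-two size classes, and the allocation counter are assumptions
made up for illustration. It shows that two bucket sets with the same
size-class layout hand out objects from different caches, so a
user-sized allocation cannot land in the cache used by the global set.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_BUCKETS 8   /* hypothetical size classes: 8, 16, ..., 1024 */
#define TOY_MIN_SHIFT   3   /* smallest class is 1 << 3 = 8 bytes */

/* A toy "cache": it only records its object size and a usage counter. */
struct toy_cache {
	size_t object_size;
	unsigned long allocations;
};

/* A toy stand-in for one set of size-bucketed caches. */
struct toy_buckets {
	const char *name;
	struct toy_cache caches[TOY_NUM_BUCKETS];
};

/* Give a bucket set the same power-of-two layout as every other set. */
void toy_buckets_init(struct toy_buckets *b, const char *name)
{
	assert(b != NULL);
	b->name = name;
	for (int i = 0; i < TOY_NUM_BUCKETS; i++) {
		b->caches[i].object_size = (size_t)1 << (TOY_MIN_SHIFT + i);
		b->caches[i].allocations = 0;
	}
}

/* Map a request size to the smallest class that fits, or -1 if too big. */
int toy_bucket_index(size_t size)
{
	for (int i = 0; i < TOY_NUM_BUCKETS; i++)
		if (size <= ((size_t)1 << (TOY_MIN_SHIFT + i)))
			return i;
	return -1;
}

/*
 * "Allocate" from a specific bucket set: pick the cache by size class
 * within that set only, and return it (NULL if the size is unsupported).
 */
struct toy_cache *toy_buckets_alloc(struct toy_buckets *b, size_t size)
{
	int idx = toy_bucket_index(size);

	if (idx < 0)
		return NULL;
	b->caches[idx].allocations++;
	return &b->caches[idx];
}
```

With one set standing in for the global kmalloc buckets and another for
a dedicated memdup_user() set, a 100-byte request resolves to the
128-byte class in both, but to two distinct toy_cache objects.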