From patchwork Wed Jun 19 19:33:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kees Cook
X-Patchwork-Id: 13704460
From: Kees Cook
To: Vlastimil Babka
Cc: Kees Cook, "GONG, Ruiqi", Christoph Lameter, Pekka Enberg,
 David Rientjes, Joonsoo Kim, jvoisin, Andrew Morton, Roman Gushchin,
 Hyeonggon Yoo <42.hyeyoo@gmail.com>, Xiu Jianfeng, Suren Baghdasaryan,
 Kent Overstreet, Jann Horn, Matteo Rizzo, Thomas Graf, Herbert Xu,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-hardening@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH v5 4/6] mm/slab: Introduce kmem_buckets_create() and family
Date: Wed, 19 Jun 2024 12:33:52 -0700
Message-Id: <20240619193357.1333772-4-kees@kernel.org>
In-Reply-To: <20240619192131.do.115-kees@kernel.org>
References: <20240619192131.do.115-kees@kernel.org>

Dedicated caches are available for fixed size allocations via
kmem_cache_alloc(), but for dynamically sized allocations there is only
the global kmalloc API's set of buckets available. This means it isn't
possible to separate specific sets of dynamically sized allocations into
a separate collection of caches.

This leads to a use-after-free exploitation weakness in the Linux
kernel since many heap memory spraying/grooming attacks depend on using
userspace-controllable dynamically sized allocations to collide with
fixed size allocations that end up in the same cache.

While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense
against these kinds of "type confusion" attacks, including for fixed
same-size heap objects, we can create a complementary deterministic
defense for dynamically sized allocations that are directly user
controlled. Addressing these cases is limited in scope, so isolating these
kinds of interfaces will not become an unbounded game of whack-a-mole. For
example, many pass through memdup_user(), making isolation there very
effective.

In order to isolate user-controllable dynamically-sized
allocations from the common system kmalloc allocations, introduce
kmem_buckets_create(), which behaves like kmem_cache_create(). Introduce
kmem_buckets_alloc(), which behaves like kmem_cache_alloc(). Introduce
kmem_buckets_alloc_track_caller() for where caller tracking is
needed. Introduce kmem_buckets_valloc() for cases where vmalloc fallback
is needed.

This can also be used in the future to extend allocation profiling's use
of code tagging to implement per-caller allocation cache isolation[1]
even for dynamic allocations.

Memory allocation pinning[2] is still needed to plug the Use-After-Free
cross-allocator weakness, but that is an existing and separate issue
which is complementary to this improvement. Development continues for
that feature via the SLAB_VIRTUAL[3] series (which could also provide
guard pages -- another complementary improvement).

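To illustrate the isolation idea described above, here is a minimal
user-space sketch (not kernel code). It models a set of buckets as one
cache per size class and shows that a dedicated set never shares a cache
with the global one, even for requests of the same size. Every name in it
(model_buckets, model_buckets_create(), model_buckets_pick(), the four
size classes, and the "memdup_user" set name used in the note below) is
hypothetical and chosen only for illustration; the real size classes
depend on the kernel configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical size classes, for illustration only. */
#define MODEL_NR_BUCKETS 4
static const size_t model_bucket_size[MODEL_NR_BUCKETS] = { 32, 64, 128, 256 };

/* Stand-in for a slab cache: only a name and an object size. */
struct model_cache {
	char name[32];
	size_t object_size;
};

/* Models the idea of a bucket set: one cache per size class. */
typedef struct model_cache model_buckets[MODEL_NR_BUCKETS];

/* Fill a bucket set, naming each cache "<name>-<size>". */
static void model_buckets_create(model_buckets *b, const char *name)
{
	for (int idx = 0; idx < MODEL_NR_BUCKETS; idx++) {
		(*b)[idx].object_size = model_bucket_size[idx];
		snprintf((*b)[idx].name, sizeof((*b)[idx].name),
			 "%s-%zu", name, model_bucket_size[idx]);
	}
}

/*
 * Return the smallest cache in this set that fits the request, or NULL
 * if the request is larger than the biggest modelled size class.
 */
static struct model_cache *model_buckets_pick(model_buckets *b, size_t size)
{
	assert(b != NULL);

	for (int idx = 0; idx < MODEL_NR_BUCKETS; idx++) {
		if (size <= (*b)[idx].object_size)
			return &(*b)[idx];
	}
	return NULL;
}
```

In this model, a 100-byte request picked from a set created as "kmalloc"
and from a set created as "memdup_user" lands in the 128-byte class in
both cases, yet the two caches are distinct objects. That separation is
the property the dedicated buckets are meant to provide.
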
Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1]
Link: https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html [2]
Link: https://lore.kernel.org/lkml/20230915105933.495735-1-matteorizzo@google.com/ [3]
Signed-off-by: Kees Cook
---
 include/linux/slab.h | 13 ++++++++
 mm/slab_common.c     | 78 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 91 insertions(+)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 8d0800c7579a..3698b15b6138 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -549,6 +549,11 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
 
 void kmem_cache_free(struct kmem_cache *s, void *objp);
 
+kmem_buckets *kmem_buckets_create(const char *name, unsigned int align,
+				  slab_flags_t flags,
+				  unsigned int useroffset, unsigned int usersize,
+				  void (*ctor)(void *));
+
 /*
  * Bulk allocation and freeing operations. These are accelerated in an
  * allocator specific way to avoid taking locks repeatedly or building
@@ -681,6 +686,12 @@ static __always_inline __alloc_size(1) void *kmalloc_noprof(size_t size, gfp_t f
 }
 #define kmalloc(...)				alloc_hooks(kmalloc_noprof(__VA_ARGS__))
 
+#define kmem_buckets_alloc(_b, _size, _flags)	\
+	alloc_hooks(__kmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, _b), _flags, NUMA_NO_NODE))
+
+#define kmem_buckets_alloc_track_caller(_b, _size, _flags)	\
+	alloc_hooks(__kmalloc_node_track_caller_noprof(PASS_BUCKET_PARAMS(_size, _b), _flags, NUMA_NO_NODE, _RET_IP_))
+
 static __always_inline __alloc_size(1) void *kmalloc_node_noprof(size_t size, gfp_t flags, int node)
 {
 	if (__builtin_constant_p(size) && size) {
@@ -808,6 +819,8 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
 #define kvzalloc(_size, _flags)			kvmalloc(_size, (_flags)|__GFP_ZERO)
 
 #define kvzalloc_node(_size, _flags, _node)	kvmalloc_node(_size, (_flags)|__GFP_ZERO, _node)
+#define kmem_buckets_valloc(_b, _size, _flags)	\
+	alloc_hooks(__kvmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, _b), _flags, NUMA_NO_NODE))
 
 static inline __alloc_size(1, 2) void *
 kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 9b0f2ef951f1..453bc4ec8b57 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -392,6 +392,80 @@ kmem_cache_create(const char *name, unsigned int size, unsigned int align,
 }
 EXPORT_SYMBOL(kmem_cache_create);
 
+static struct kmem_cache *kmem_buckets_cache __ro_after_init;
+
+kmem_buckets *kmem_buckets_create(const char *name, unsigned int align,
+				  slab_flags_t flags,
+				  unsigned int useroffset,
+				  unsigned int usersize,
+				  void (*ctor)(void *))
+{
+	kmem_buckets *b;
+	int idx;
+
+	/*
+	 * When the separate buckets API is not built in, just return
+	 * a non-NULL value for the kmem_buckets pointer, which will be
+	 * unused when performing allocations.
+	 */
+	if (!IS_ENABLED(CONFIG_SLAB_BUCKETS))
+		return ZERO_SIZE_PTR;
+
+	if (WARN_ON(!kmem_buckets_cache))
+		return NULL;
+
+	b = kmem_cache_alloc(kmem_buckets_cache, GFP_KERNEL|__GFP_ZERO);
+	if (WARN_ON(!b))
+		return NULL;
+
+	flags |= SLAB_NO_MERGE;
+
+	for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++) {
+		char *short_size, *cache_name;
+		unsigned int cache_useroffset, cache_usersize;
+		unsigned int size;
+
+		if (!kmalloc_caches[KMALLOC_NORMAL][idx])
+			continue;
+
+		size = kmalloc_caches[KMALLOC_NORMAL][idx]->object_size;
+		if (!size)
+			continue;
+
+		short_size = strchr(kmalloc_caches[KMALLOC_NORMAL][idx]->name, '-');
+		if (WARN_ON(!short_size))
+			goto fail;
+
+		cache_name = kasprintf(GFP_KERNEL, "%s-%s", name, short_size + 1);
+		if (WARN_ON(!cache_name))
+			goto fail;
+
+		if (useroffset >= size) {
+			cache_useroffset = 0;
+			cache_usersize = 0;
+		} else {
+			cache_useroffset = useroffset;
+			cache_usersize = min(size - cache_useroffset, usersize);
+		}
+		(*b)[idx] = kmem_cache_create_usercopy(cache_name, size,
+					align, flags, cache_useroffset,
+					cache_usersize, ctor);
+		kfree(cache_name);
+		if (WARN_ON(!(*b)[idx]))
+			goto fail;
+	}
+
+	return b;
+
+fail:
+	for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++)
+		kmem_cache_destroy((*b)[idx]);
+	kfree(b);
+
+	return NULL;
+}
+EXPORT_SYMBOL(kmem_buckets_create);
+
 #ifdef SLAB_SUPPORTS_SYSFS
 /*
  * For a given kmem_cache, kmem_cache_destroy() should only be called
@@ -931,6 +1005,10 @@ void __init create_kmalloc_caches(void)
 
 	/* Kmalloc array is now usable */
 	slab_state = UP;
+
+	kmem_buckets_cache = kmem_cache_create("kmalloc_buckets",
+					       sizeof(kmem_buckets),
+					       0, SLAB_NO_MERGE, NULL);
 }
 
 /**