From patchwork Wed Apr 24 21:41:01 2024
X-Patchwork-Submitter: Kees Cook
X-Patchwork-Id: 13642529
From: Kees Cook
To: Vlastimil Babka
Cc: Kees Cook, Christoph Lameter, Pekka Enberg, David Rientjes,
 Joonsoo Kim, Andrew Morton, Roman Gushchin,
 Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org, "GONG, Ruiqi",
 Xiu Jianfeng, Suren Baghdasaryan, Kent Overstreet, Jann Horn,
 Matteo Rizzo, Thomas Graf, Herbert Xu, julien.voisin@dustri.org,
 linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org
Subject: [PATCH v3 4/6] mm/slab: Introduce kmem_buckets_create() and family
Date: Wed, 24 Apr 2024 14:41:01 -0700
Message-Id: <20240424214104.3248214-4-keescook@chromium.org>
In-Reply-To: <20240424213019.make.366-kees@kernel.org>
References: <20240424213019.make.366-kees@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Dedicated caches are available for fixed size allocations via
kmem_cache_alloc(), but for dynamically sized allocations there is only
the global kmalloc API's set of buckets available. This means it isn't
possible to separate specific sets of dynamically sized allocations into
a separate collection of caches.

This leads to a use-after-free exploitation weakness in the Linux
kernel since many heap memory spraying/grooming attacks depend on using
userspace-controllable dynamically sized allocations to collide with
fixed size allocations that end up in the same cache.

While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense
against these kinds of "type confusion" attacks, including for fixed
same-size heap objects, we can create a complementary deterministic
defense for dynamically sized allocations that are directly user
controlled. Addressing these cases is limited in scope, so isolating
these kinds of interfaces will not become an unbounded game of
whack-a-mole. For example, many pass through memdup_user(), making
isolation there very effective.

In order to isolate user-controllable sized allocations from system
allocations, introduce kmem_buckets_create(), which behaves like
kmem_cache_create(). Introduce kmem_buckets_alloc(), which behaves like
kmem_cache_alloc(). Introduce kmem_buckets_alloc_track_caller() for
where caller tracking is needed. Introduce kmem_buckets_valloc() for
cases where a vmalloc fallback is needed.

This allows for confining allocations to a dedicated set of sized caches
(which have the same layout as the kmalloc caches).

This can also be used in the future to extend codetag allocation
annotations to implement per-caller allocation cache isolation[1] even
for dynamic allocations.

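As an illustration of the isolation argument above (this is not part of
the patch), here is a small userspace model. Everything in it is
hypothetical: the toy_* names, the list of size classes, and the
"smallest class that fits" lookup are assumptions chosen to mimic the
shape of a kmem_buckets array, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size classes, loosely modelled on the kmalloc-<size> caches. */
static const size_t toy_class_sizes[] = { 8, 16, 32, 64, 96, 128, 192, 256 };
#define TOY_NR_CLASSES (sizeof(toy_class_sizes) / sizeof(toy_class_sizes[0]))

/* One "cache" per size class; only its identity matters in this model. */
struct toy_cache {
	const char *owner;
	size_t object_size;
};

/* A bucket set holds one cache per size class, like a kmem_buckets array. */
typedef struct toy_cache toy_buckets[TOY_NR_CLASSES];

/* Give a bucket set its own private cache for every size class. */
static void toy_buckets_init(toy_buckets *b, const char *owner)
{
	assert(b && owner);
	for (size_t i = 0; i < TOY_NR_CLASSES; i++) {
		(*b)[i].owner = owner;
		(*b)[i].object_size = toy_class_sizes[i];
	}
}

/* Pick the smallest class that fits; NULL if the request is too large. */
static struct toy_cache *toy_pick_cache(toy_buckets *b, size_t size)
{
	for (size_t i = 0; i < TOY_NR_CLASSES; i++) {
		if (size <= toy_class_sizes[i])
			return &(*b)[i];
	}
	return NULL;
}
```

In this model, a 90-byte request against a dedicated bucket set and a
96-byte request against the global set both round up to a 96-byte
class, yet they resolve to different toy_cache objects. That separation
is the property the dedicated buckets are meant to provide.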
Memory allocation pinning[2] is still needed to plug the Use-After-Free
cross-allocator weakness, but that is an existing and separate issue
which is complementary to this improvement. Development continues for
that feature via the SLAB_VIRTUAL[3] series (which could also provide
guard pages -- another complementary improvement).

Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1]
Link: https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html [2]
Link: https://lore.kernel.org/lkml/20230915105933.495735-1-matteorizzo@google.com/ [3]
Signed-off-by: Kees Cook
---
Cc: Vlastimil Babka
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Cc: Andrew Morton
Cc: Roman Gushchin
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org
---
 include/linux/slab.h | 13 ++++++++
 mm/slab_common.c     | 72 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 23b13be0ac95..1f14a65741a6 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -552,6 +552,11 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
 
 void kmem_cache_free(struct kmem_cache *s, void *objp);
 
+kmem_buckets *kmem_buckets_create(const char *name, unsigned int align,
+				  slab_flags_t flags,
+				  unsigned int useroffset, unsigned int usersize,
+				  void (*ctor)(void *));
+
 /*
  * Bulk allocation and freeing operations. These are accelerated in an
  * allocator specific way to avoid taking locks repeatedly or building
@@ -666,6 +671,12 @@ static __always_inline __alloc_size(1) void *kmalloc_noprof(size_t size, gfp_t f
 }
 #define kmalloc(...)				alloc_hooks(kmalloc_noprof(__VA_ARGS__))
 
+#define kmem_buckets_alloc(_b, _size, _flags)	\
+	alloc_hooks(__kmalloc_node_noprof(_b, _size, _flags, NUMA_NO_NODE))
+
+#define kmem_buckets_alloc_track_caller(_b, _size, _flags)	\
+	alloc_hooks(kmalloc_node_track_caller_noprof(_b, _size, _flags, NUMA_NO_NODE, _RET_IP_))
+
 static __always_inline __alloc_size(1) void *kmalloc_node_noprof(size_t size, gfp_t flags, int node)
 {
 	if (__builtin_constant_p(size) && size) {
@@ -792,6 +803,8 @@ extern void *kvmalloc_node_noprof(kmem_buckets *b, size_t size, gfp_t flags, int
 
 #define kvzalloc_node(_size, _flags, _node)	kvmalloc_node(_size, _flags|__GFP_ZERO, _node)
 
+#define kmem_buckets_valloc(_b, _size, _flags)	__kvmalloc_node(_b, _size, _flags, NUMA_NO_NODE)
+
 static inline __alloc_size(1, 2) void *kvmalloc_array_noprof(size_t n, size_t size, gfp_t flags)
 {
 	size_t bytes;
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 7cb4e8fd1275..e36360e63ebd 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -392,6 +392,74 @@ kmem_cache_create(const char *name, unsigned int size, unsigned int align,
 }
 EXPORT_SYMBOL(kmem_cache_create);
 
+static struct kmem_cache *kmem_buckets_cache __ro_after_init;
+
+kmem_buckets *kmem_buckets_create(const char *name, unsigned int align,
+				  slab_flags_t flags,
+				  unsigned int useroffset,
+				  unsigned int usersize,
+				  void (*ctor)(void *))
+{
+	kmem_buckets *b;
+	int idx;
+
+	if (WARN_ON(!kmem_buckets_cache))
+		return NULL;
+
+	b = kmem_cache_alloc(kmem_buckets_cache, GFP_KERNEL|__GFP_ZERO);
+	if (WARN_ON(!b))
+		return NULL;
+
+	flags |= SLAB_NO_MERGE;
+
+	for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++) {
+		char *short_size, *cache_name;
+		unsigned int cache_useroffset, cache_usersize;
+		unsigned int size;
+
+		if (!kmalloc_caches[KMALLOC_NORMAL][idx])
+			continue;
+
+		size = kmalloc_caches[KMALLOC_NORMAL][idx]->object_size;
+		if (!size)
+			continue;
+
+		short_size = strchr(kmalloc_caches[KMALLOC_NORMAL][idx]->name, '-');
+		if (WARN_ON(!short_size))
+			goto fail;
+
+		cache_name = kasprintf(GFP_KERNEL, "%s-%s", name, short_size + 1);
+		if (WARN_ON(!cache_name))
+			goto fail;
+
+		if (useroffset >= size) {
+			cache_useroffset = 0;
+			cache_usersize = 0;
+		} else {
+			cache_useroffset = useroffset;
+			cache_usersize = min(size - cache_useroffset, usersize);
+		}
+		(*b)[idx] = kmem_cache_create_usercopy(cache_name, size,
+					align, flags, cache_useroffset,
+					cache_usersize, ctor);
+		kfree(cache_name);
+		if (WARN_ON(!(*b)[idx]))
+			goto fail;
+	}
+
+	return b;
+
+fail:
+	for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++) {
+		if ((*b)[idx])
+			kmem_cache_destroy((*b)[idx]);
+	}
+	kfree(b);
+
+	return NULL;
+}
+EXPORT_SYMBOL(kmem_buckets_create);
+
 #ifdef SLAB_SUPPORTS_SYSFS
 /*
  * For a given kmem_cache, kmem_cache_destroy() should only be called
@@ -938,6 +1006,10 @@ void __init create_kmalloc_caches(void)
 
 	/* Kmalloc array is now usable */
 	slab_state = UP;
+
+	kmem_buckets_cache = kmem_cache_create("kmalloc_buckets",
+					       sizeof(kmem_buckets),
+					       0, 0, NULL);
 }
 
 /**
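
For readers following the loop in kmem_buckets_create(), the two
per-cache computations (deriving the cache name from the kmalloc cache
name, and clamping the usercopy window to the object size) can be
sketched in userspace as below. This is a minimal model under stated
assumptions: bucket_cache_name(), clamp_usercopy(), struct
usercopy_window, and the "memdup_user" prefix in the usage note are
hypothetical and exist only for illustration; snprintf() into a
caller-supplied buffer stands in for kasprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the per-cache usercopy region. */
struct usercopy_window {
	unsigned int offset;
	unsigned int size;
};

/*
 * Mirror of the clamping in the patch: if the requested offset lies
 * outside the object, the window is empty; otherwise the window is cut
 * off at the end of the object.
 */
static struct usercopy_window clamp_usercopy(unsigned int object_size,
					     unsigned int useroffset,
					     unsigned int usersize)
{
	struct usercopy_window w = { 0, 0 };

	if (useroffset < object_size) {
		unsigned int room = object_size - useroffset;

		w.offset = useroffset;
		w.size = room < usersize ? room : usersize;
	}
	return w;
}

/*
 * Build "<prefix>-<suffix>" where <suffix> is whatever follows the first
 * '-' in the kmalloc cache name, e.g. "kmalloc-96" yields "<prefix>-96".
 * Returns 0 on success, -1 if there is no '-' or the buffer is too small.
 */
static int bucket_cache_name(char *buf, size_t len, const char *prefix,
			     const char *kmalloc_name)
{
	const char *dash = strchr(kmalloc_name, '-');
	int n;

	if (!dash)
		return -1;

	n = snprintf(buf, len, "%s-%s", prefix, dash + 1);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

With these helpers, a prefix of "memdup_user" and a source name of
"kmalloc-96" would produce "memdup_user-96", and a requested usercopy
window of offset 16, size 100 on a 64-byte object would be clamped to
offset 16, size 48.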