From patchwork Mon Jan 29 10:07:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marco Elver
X-Patchwork-Id: 13535351
Date: Mon, 29 Jan 2024 11:07:01 +0100
X-Mailer: git-send-email 2.43.0.429.g432eaa2c6b-goog
Message-ID: <20240129100708.39460-1-elver@google.com>
Subject: [PATCH v2 1/2] stackdepot: use variable size records for non-evictable entries
From: Marco Elver
To: elver@google.com, Andrew Morton
Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, Vlastimil Babka, Andrey Ryabinin, Vincenzo Frascino, linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com, linux-mm@kvack.org

With the introduction of stack depot evictions, each stack record is now fixed size, so that future
reuse after an eviction can safely store differently sized stack traces.
In all cases that do not make use of evictions, this wastes lots of space.

Fix it by re-introducing variable size stack records (up to the max
allowed size) for entries that will never be evicted. We know that an
entry will never be evicted if the flag STACK_DEPOT_FLAG_GET is not
provided, since a later stack_depot_put() attempt is undefined behavior.

With my current kernel config that enables KASAN and also SLUB owner
tracking, I observe (after a kernel boot) a whopping reduction of 296
stack depot pools, which translates into 4736 KiB saved. The savings here
are from SLUB owner tracking only, because KASAN generic mode still uses
refcounting.

Before:

  pools: 893
  allocations: 29841
  frees: 6524
  in_use: 23317
  freelist_size: 3454

After:

  pools: 597
  refcounted_allocations: 17547
  refcounted_frees: 6477
  refcounted_in_use: 11070
  freelist_size: 3497
  persistent_count: 12163
  persistent_bytes: 1717008

Fixes: 108be8def46e ("lib/stackdepot: allow users to evict stack traces")
Signed-off-by: Marco Elver
Reviewed-by: Andrey Konovalov
Cc: Alexander Potapenko
Cc: Dmitry Vyukov
---
v2:
* Also remove KMSAN-specific DEPOT_POOLS_CAP (revert bd9d9624b7136).
* Let counters distinguish refcounted and non-refcounted entries.
* Comments.

v1 (since RFC):
* Get rid of new_pool_required to simplify the code.
* Warn on attempts to switch a non-refcounted entry to refcounting.
* Typos.
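
Illustrative note (not part of the patch): the saving comes from sizing
each persistent record by its actual frame count rather than by the
maximum. The userspace sketch below mirrors the arithmetic of
depot_stack_record_size(). Everything in it is an assumption made for
illustration: the simplified struct layout, MAX_FRAMES = 64 standing in
for CONFIG_STACKDEPOT_MAX_FRAMES, STACK_ALIGN = 4 standing in for
DEPOT_STACK_ALIGN, and the helper names. The 16 KiB pool size used with
pools_to_kib() is inferred from the figures above (4736 KiB / 296 pools).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins for CONFIG_STACKDEPOT_MAX_FRAMES and DEPOT_STACK_ALIGN. */
#define MAX_FRAMES  64
#define STACK_ALIGN 4

/* Hypothetical, simplified record: fixed header plus a max-sized frame array. */
struct record {
	void *hash_list[2];
	unsigned int hash;
	unsigned int size;
	unsigned int handle;
	unsigned int count;
	unsigned long entries[MAX_FRAMES];
};

/* Round x up to a multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Bytes needed for a record that stores nr_entries frames. */
static size_t record_size(unsigned int nr_entries)
{
	const size_t used = nr_entries * sizeof(unsigned long);
	const size_t total = sizeof(((struct record *)0)->entries);

	/* Userspace analogue of the WARN_ON_ONCE() sanity check. */
	assert(used <= total);

	/* Drop the unused tail of the frame array, then align the result. */
	return align_up(sizeof(struct record) - (total - used),
			(size_t)1 << STACK_ALIGN);
}

/* KiB covered by `pools` pools of `pool_kib` KiB each. */
static unsigned long pools_to_kib(unsigned long pools, unsigned long pool_kib)
{
	return pools * pool_kib;
}
```

On an LP64 build of this sketch, record_size(8) is 96 bytes against 544
bytes for record_size(MAX_FRAMES), and pools_to_kib(893 - 597, 16)
reproduces the 4736 KiB quoted above. The real record header differs, so
the absolute byte counts here are not the kernel's.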
---
 include/linux/poison.h |   3 +
 lib/stackdepot.c       | 250 +++++++++++++++++++++--------------------
 2 files changed, 130 insertions(+), 123 deletions(-)

diff --git a/include/linux/poison.h b/include/linux/poison.h
index 27a7dad17eef..1f0ee2459f2a 100644
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -92,4 +92,7 @@
 /********** VFS **********/
 #define VFS_PTR_POISON ((void *)(0xF5 + POISON_POINTER_DELTA))
 
+/********** lib/stackdepot.c **********/
+#define STACK_DEPOT_POISON ((void *)(0xD390 + POISON_POINTER_DELTA))
+
 #endif
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 5caa1f566553..8f3b2c84ec2d 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -43,17 +44,7 @@
 #define DEPOT_OFFSET_BITS (DEPOT_POOL_ORDER + PAGE_SHIFT - DEPOT_STACK_ALIGN)
 #define DEPOT_POOL_INDEX_BITS (DEPOT_HANDLE_BITS - DEPOT_OFFSET_BITS - \
 			       STACK_DEPOT_EXTRA_BITS)
-#if IS_ENABLED(CONFIG_KMSAN) && CONFIG_STACKDEPOT_MAX_FRAMES >= 32
-/*
- * KMSAN is frequently used in fuzzing scenarios and thus saves a lot of stack
- * traces. As KMSAN does not support evicting stack traces from the stack
- * depot, the stack depot capacity might be reached quickly with large stack
- * records. Adjust the maximum number of stack depot pools for this case.
- */
-#define DEPOT_POOLS_CAP (8192 * (CONFIG_STACKDEPOT_MAX_FRAMES / 16))
-#else
 #define DEPOT_POOLS_CAP 8192
-#endif
 #define DEPOT_MAX_POOLS \
 	(((1LL << (DEPOT_POOL_INDEX_BITS)) < DEPOT_POOLS_CAP) ? \
 	 (1LL << (DEPOT_POOL_INDEX_BITS)) : DEPOT_POOLS_CAP)
@@ -93,9 +84,6 @@ struct stack_record {
 	};
 };
 
-#define DEPOT_STACK_RECORD_SIZE \
-	ALIGN(sizeof(struct stack_record), 1 << DEPOT_STACK_ALIGN)
-
 static bool stack_depot_disabled;
 static bool __stack_depot_early_init_requested __initdata = IS_ENABLED(CONFIG_STACKDEPOT_ALWAYS_INIT);
 static bool __stack_depot_early_init_passed __initdata;
@@ -121,32 +109,31 @@ static void *stack_pools[DEPOT_MAX_POOLS];
 static void *new_pool;
 /* Number of pools in stack_pools. */
 static int pools_num;
+/* Offset to the unused space in the currently used pool. */
+static size_t pool_offset = DEPOT_POOL_SIZE;
 /* Freelist of stack records within stack_pools. */
 static LIST_HEAD(free_stacks);
-/*
- * Stack depot tries to keep an extra pool allocated even before it runs out
- * of space in the currently used pool. This flag marks whether this extra pool
- * needs to be allocated. It has the value 0 when either an extra pool is not
- * yet allocated or if the limit on the number of pools is reached.
- */
-static bool new_pool_required = true;
 /* The lock must be held when performing pool or freelist modifications. */
 static DEFINE_RAW_SPINLOCK(pool_lock);
 
 /* Statistics counters for debugfs. */
 enum depot_counter_id {
-	DEPOT_COUNTER_ALLOCS,
-	DEPOT_COUNTER_FREES,
-	DEPOT_COUNTER_INUSE,
+	DEPOT_COUNTER_REFD_ALLOCS,
+	DEPOT_COUNTER_REFD_FREES,
+	DEPOT_COUNTER_REFD_INUSE,
 	DEPOT_COUNTER_FREELIST_SIZE,
+	DEPOT_COUNTER_PERSIST_COUNT,
+	DEPOT_COUNTER_PERSIST_BYTES,
 	DEPOT_COUNTER_COUNT,
 };
 static long counters[DEPOT_COUNTER_COUNT];
 static const char *const counter_names[] = {
-	[DEPOT_COUNTER_ALLOCS]		= "allocations",
-	[DEPOT_COUNTER_FREES]		= "frees",
-	[DEPOT_COUNTER_INUSE]		= "in_use",
+	[DEPOT_COUNTER_REFD_ALLOCS]	= "refcounted_allocations",
+	[DEPOT_COUNTER_REFD_FREES]	= "refcounted_frees",
+	[DEPOT_COUNTER_REFD_INUSE]	= "refcounted_in_use",
 	[DEPOT_COUNTER_FREELIST_SIZE]	= "freelist_size",
+	[DEPOT_COUNTER_PERSIST_COUNT]	= "persistent_count",
+	[DEPOT_COUNTER_PERSIST_BYTES]	= "persistent_bytes",
 };
 static_assert(ARRAY_SIZE(counter_names) == DEPOT_COUNTER_COUNT);
 
@@ -294,48 +281,52 @@ int stack_depot_init(void)
 EXPORT_SYMBOL_GPL(stack_depot_init);
 
 /*
- * Initializes new stack depot @pool, release all its entries to the freelist,
- * and update the list of pools.
+ * Initializes new stack pool, and updates the list of pools.
  */
-static void depot_init_pool(void *pool)
+static bool depot_init_pool(void **prealloc)
 {
-	int offset;
-
 	lockdep_assert_held(&pool_lock);
 
-	/* Initialize handles and link stack records into the freelist. */
-	for (offset = 0; offset <= DEPOT_POOL_SIZE - DEPOT_STACK_RECORD_SIZE;
-	     offset += DEPOT_STACK_RECORD_SIZE) {
-		struct stack_record *stack = pool + offset;
-
-		stack->handle.pool_index = pools_num;
-		stack->handle.offset = offset >> DEPOT_STACK_ALIGN;
-		stack->handle.extra = 0;
-
-		/*
-		 * Stack traces of size 0 are never saved, and we can simply use
-		 * the size field as an indicator if this is a new unused stack
-		 * record in the freelist.
-		 */
-		stack->size = 0;
+	if (unlikely(pools_num >= DEPOT_MAX_POOLS)) {
+		/* Bail out if we reached the pool limit. */
+		WARN_ON_ONCE(pools_num > DEPOT_MAX_POOLS); /* should never happen */
+		WARN_ON_ONCE(!new_pool); /* to avoid unnecessary pre-allocation */
+		WARN_ONCE(1, "Stack depot reached limit capacity");
+		return false;
+	}
 
-		INIT_LIST_HEAD(&stack->hash_list);
-		/*
-		 * Add to the freelist front to prioritize never-used entries:
-		 * required in case there are entries in the freelist, but their
-		 * RCU cookie still belongs to the current RCU grace period
-		 * (there can still be concurrent readers).
-		 */
-		list_add(&stack->free_list, &free_stacks);
-		counters[DEPOT_COUNTER_FREELIST_SIZE]++;
+	if (!new_pool && *prealloc) {
+		/* We have preallocated memory, use it. */
+		WRITE_ONCE(new_pool, *prealloc);
+		*prealloc = NULL;
 	}
 
+	if (!new_pool)
+		return false; /* new_pool and *prealloc are NULL */
+
 	/* Save reference to the pool to be used by depot_fetch_stack(). */
-	stack_pools[pools_num] = pool;
+	stack_pools[pools_num] = new_pool;
+
+	/*
+	 * Stack depot tries to keep an extra pool allocated even before it runs
+	 * out of space in the currently used pool.
+	 *
+	 * To indicate that a new preallocation is needed new_pool is reset to
+	 * NULL; do not reset to NULL if we have reached the maximum number of
+	 * pools.
+	 */
+	if (pools_num < DEPOT_MAX_POOLS)
+		WRITE_ONCE(new_pool, NULL);
+	else
+		WRITE_ONCE(new_pool, STACK_DEPOT_POISON);
 
 	/* Pairs with concurrent READ_ONCE() in depot_fetch_stack(). */
 	WRITE_ONCE(pools_num, pools_num + 1);
 	ASSERT_EXCLUSIVE_WRITER(pools_num);
+
+	pool_offset = 0;
+
+	return true;
 }
 
 /* Keeps the preallocated memory to be used for a new stack depot pool. */
@@ -347,63 +338,51 @@ static void depot_keep_new_pool(void **prealloc)
 	 * If a new pool is already saved or the maximum number of
 	 * pools is reached, do not use the preallocated memory.
 	 */
-	if (!new_pool_required)
+	if (new_pool)
 		return;
 
-	/*
-	 * Use the preallocated memory for the new pool
-	 * as long as we do not exceed the maximum number of pools.
-	 */
-	if (pools_num < DEPOT_MAX_POOLS) {
-		new_pool = *prealloc;
-		*prealloc = NULL;
-	}
-
-	/*
-	 * At this point, either a new pool is kept or the maximum
-	 * number of pools is reached. In either case, take note that
-	 * keeping another pool is not required.
-	 */
-	WRITE_ONCE(new_pool_required, false);
+	WRITE_ONCE(new_pool, *prealloc);
+	*prealloc = NULL;
 }
 
 /*
- * Try to initialize a new stack depot pool from either a previous or the
- * current pre-allocation, and release all its entries to the freelist.
+ * Try to initialize a new stack record from the current pool, a cached pool, or
+ * the current pre-allocation.
  */
-static bool depot_try_init_pool(void **prealloc)
+static struct stack_record *depot_pop_free_pool(void **prealloc, size_t size)
 {
+	struct stack_record *stack;
+	void *current_pool;
+	u32 pool_index;
+
 	lockdep_assert_held(&pool_lock);
 
-	/* Check if we have a new pool saved and use it. */
-	if (new_pool) {
-		depot_init_pool(new_pool);
-		new_pool = NULL;
+	if (pool_offset + size > DEPOT_POOL_SIZE) {
+		if (!depot_init_pool(prealloc))
+			return NULL;
+	}
 
-		/* Take note that we might need a new new_pool. */
-		if (pools_num < DEPOT_MAX_POOLS)
-			WRITE_ONCE(new_pool_required, true);
+	if (WARN_ON_ONCE(pools_num < 1))
+		return NULL;
+	pool_index = pools_num - 1;
+	current_pool = stack_pools[pool_index];
+	if (WARN_ON_ONCE(!current_pool))
+		return NULL;
 
-		return true;
-	}
+	stack = current_pool + pool_offset;
 
-	/* Bail out if we reached the pool limit. */
-	if (unlikely(pools_num >= DEPOT_MAX_POOLS)) {
-		WARN_ONCE(1, "Stack depot reached limit capacity");
-		return false;
-	}
+	/* Pre-initialize handle once. */
+	stack->handle.pool_index = pool_index;
+	stack->handle.offset = pool_offset >> DEPOT_STACK_ALIGN;
+	stack->handle.extra = 0;
+	INIT_LIST_HEAD(&stack->hash_list);
 
-	/* Check if we have preallocated memory and use it. */
-	if (*prealloc) {
-		depot_init_pool(*prealloc);
-		*prealloc = NULL;
-		return true;
-	}
+	pool_offset += size;
 
-	return false;
+	return stack;
 }
 
-/* Try to find next free usable entry. */
+/* Try to find next free usable entry from the freelist. */
 static struct stack_record *depot_pop_free(void)
 {
 	struct stack_record *stack;
@@ -420,7 +399,7 @@ static struct stack_record *depot_pop_free(void)
 	 * check the first entry.
 	 */
 	stack = list_first_entry(&free_stacks, struct stack_record, free_list);
-	if (stack->size && !poll_state_synchronize_rcu(stack->rcu_state))
+	if (!poll_state_synchronize_rcu(stack->rcu_state))
 		return NULL;
 
 	list_del(&stack->free_list);
@@ -429,48 +408,73 @@ static struct stack_record *depot_pop_free(void)
 	return stack;
 }
 
+static inline size_t depot_stack_record_size(struct stack_record *s, unsigned int nr_entries)
+{
+	const size_t used = flex_array_size(s, entries, nr_entries);
+	const size_t unused = sizeof(s->entries) - used;
+
+	WARN_ON_ONCE(sizeof(s->entries) < used);
+
+	return ALIGN(sizeof(struct stack_record) - unused, 1 << DEPOT_STACK_ALIGN);
+}
+
 /* Allocates a new stack in a stack depot pool. */
 static struct stack_record *
-depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
+depot_alloc_stack(unsigned long *entries, int nr_entries, u32 hash, depot_flags_t flags, void **prealloc)
 {
-	struct stack_record *stack;
+	struct stack_record *stack = NULL;
+	size_t record_size;
 
 	lockdep_assert_held(&pool_lock);
 
 	/* This should already be checked by public API entry points. */
-	if (WARN_ON_ONCE(!size))
+	if (WARN_ON_ONCE(!nr_entries))
 		return NULL;
 
-	/* Check if we have a stack record to save the stack trace. */
-	stack = depot_pop_free();
-	if (!stack) {
-		/* No usable entries on the freelist - try to refill the freelist. */
-		if (!depot_try_init_pool(prealloc))
-			return NULL;
+	/* Limit number of saved frames to CONFIG_STACKDEPOT_MAX_FRAMES. */
+	if (nr_entries > CONFIG_STACKDEPOT_MAX_FRAMES)
+		nr_entries = CONFIG_STACKDEPOT_MAX_FRAMES;
+
+	if (flags & STACK_DEPOT_FLAG_GET) {
+		/*
+		 * Evictable entries have to allocate the max. size so they may
+		 * safely be re-used by differently sized allocations.
+		 */
+		record_size = depot_stack_record_size(stack, CONFIG_STACKDEPOT_MAX_FRAMES);
 		stack = depot_pop_free();
-		if (WARN_ON(!stack))
-			return NULL;
+	} else {
+		record_size = depot_stack_record_size(stack, nr_entries);
 	}
 
-	/* Limit number of saved frames to CONFIG_STACKDEPOT_MAX_FRAMES. */
-	if (size > CONFIG_STACKDEPOT_MAX_FRAMES)
-		size = CONFIG_STACKDEPOT_MAX_FRAMES;
+	if (!stack) {
+		stack = depot_pop_free_pool(prealloc, record_size);
+		if (!stack)
+			return NULL;
+	}
 
 	/* Save the stack trace. */
 	stack->hash = hash;
-	stack->size = size;
-	/* stack->handle is already filled in by depot_init_pool(). */
-	refcount_set(&stack->count, 1);
-	memcpy(stack->entries, entries, flex_array_size(stack, entries, size));
+	stack->size = nr_entries;
+	/* stack->handle is already filled in by depot_pop_free_pool(). */
+	memcpy(stack->entries, entries, flex_array_size(stack, entries, nr_entries));
+
+	if (flags & STACK_DEPOT_FLAG_GET) {
+		refcount_set(&stack->count, 1);
+		counters[DEPOT_COUNTER_REFD_ALLOCS]++;
+		counters[DEPOT_COUNTER_REFD_INUSE]++;
+	} else {
+		/* Warn on attempts to switch to refcounting this entry. */
+		refcount_set(&stack->count, REFCOUNT_SATURATED);
+		counters[DEPOT_COUNTER_PERSIST_COUNT]++;
+		counters[DEPOT_COUNTER_PERSIST_BYTES] += record_size;
+	}
 
 	/*
 	 * Let KMSAN know the stored stack record is initialized. This shall
 	 * prevent false positive reports if instrumented code accesses it.
 	 */
-	kmsan_unpoison_memory(stack, DEPOT_STACK_RECORD_SIZE);
+	kmsan_unpoison_memory(stack, record_size);
 
-	counters[DEPOT_COUNTER_ALLOCS]++;
-	counters[DEPOT_COUNTER_INUSE]++;
 	return stack;
 }
 
@@ -538,8 +542,8 @@ static void depot_free_stack(struct stack_record *stack)
 	list_add_tail(&stack->free_list, &free_stacks);
 
 	counters[DEPOT_COUNTER_FREELIST_SIZE]++;
-	counters[DEPOT_COUNTER_FREES]++;
-	counters[DEPOT_COUNTER_INUSE]--;
+	counters[DEPOT_COUNTER_REFD_FREES]++;
+	counters[DEPOT_COUNTER_REFD_INUSE]--;
 
 	printk_deferred_exit();
 	raw_spin_unlock_irqrestore(&pool_lock, flags);
@@ -660,7 +664,7 @@ depot_stack_handle_t stack_depot_save_flags(unsigned long *entries,
 	 * Allocate memory for a new pool if required now:
 	 * we won't be able to do that under the lock.
 	 */
-	if (unlikely(can_alloc && READ_ONCE(new_pool_required))) {
+	if (unlikely(can_alloc && !READ_ONCE(new_pool))) {
 		/*
 		 * Zero out zone modifiers, as we don't have specific zone
 		 * requirements. Keep the flags related to allocation in atomic
@@ -681,7 +685,7 @@ depot_stack_handle_t stack_depot_save_flags(unsigned long *entries,
 	found = find_stack(bucket, entries, nr_entries, hash, depot_flags);
 	if (!found) {
 		struct stack_record *new =
-			depot_alloc_stack(entries, nr_entries, hash, &prealloc);
+			depot_alloc_stack(entries, nr_entries, hash, depot_flags, &prealloc);
 
 		if (new) {
 			/*