From patchwork Mon Apr 7 18:40:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sidhartha Kumar X-Patchwork-Id: 14041525 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93AB3C36010 for ; Mon, 7 Apr 2025 18:41:27 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3CF326B0012; Mon, 7 Apr 2025 14:41:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 358D16B0022; Mon, 7 Apr 2025 14:41:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 112B96B0024; Mon, 7 Apr 2025 14:41:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id C5C896B0012 for ; Mon, 7 Apr 2025 14:41:20 -0400 (EDT) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 4C395B2545 for ; Mon, 7 Apr 2025 18:41:21 +0000 (UTC) X-FDA: 83308115562.03.39B2F34 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by imf24.hostedemail.com (Postfix) with ESMTP id 50C71180009 for ; Mon, 7 Apr 2025 18:41:19 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-11-20 header.b=FmgqflRL; dmarc=pass (policy=reject) header.from=oracle.com; spf=pass (imf24.hostedemail.com: domain of sidhartha.kumar@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=sidhartha.kumar@oracle.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1744051279; a=rsa-sha256; cv=none; b=IVhB/u44SKvnx+ZqROO5lnLXbefQG/+NbYoyOdu4jc7L5M0O0Z9pNIVTcILBlxOaPnKQLK ZilOTHtO9kj5FpB8DS5kF/UQf0q/s3ehK2ma5DwOchSm6/BnOO4OChhbypHyOsVyoxEb/c dsLNa8lTLtxio4KNxy7LgWZW9FMLQwM= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-11-20 header.b=FmgqflRL; dmarc=pass (policy=reject) header.from=oracle.com; spf=pass (imf24.hostedemail.com: domain of sidhartha.kumar@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=sidhartha.kumar@oracle.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1744051279; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=zPgIHOrW9aCWrw2poxjwCKw+rbRXEoMh5Fvuqu6qsfk=; b=7c2Ug+m5hEzpZLRscUt+afKYs0+A5Jx+JZPNexv8CKScgChfL4nyhQsq4VItIADXIQYi1R J/vR7ETNqttibvvvmeQ0A6Ee9qpOmkH7AMe0jYJmb6V5nKTYbs3O5iED6rYBkm1+dvCwBa G6CSJtmlpTNWxUFuO0r9FJMTub1uNbI= Received: from pps.filterd (m0333520.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 537H0hOE016137; Mon, 7 Apr 2025 18:41:09 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2023-11-20; bh=zPgIH OrW9aCWrw2poxjwCKw+rbRXEoMh5Fvuqu6qsfk=; b=FmgqflRLGZ01AF1WQxeCw 6nOrQn69vTM6/PlfYpTGMY0GR1PqwV81HHaxF967xp4x8m3RHFqjaD2qJh7839cb rReMg0OND20tn5mHZ9/cueRZZ1Aef9gGhck++AVU4bau6tzMzEEKkAe8W4xQAiqX x9QQJaxcYMp5pQ0SLewr2+9HEbR3OdQlfQ+ac1Za8qtkHsTaflG2k5jb4kNqZpgS KSO8m8gHXOcVDVNk7wj3NdKIVvKrERTolNqLWmBYOzOH6v6uVbxYyCGd9aLb7qcY OIfs8tqOA3wEU3MLOBnQW/3B4fTgLPWsOZ/Q6vWg2mjvRKR6cfDhBmYUYG7AOwiv A== Received: from iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta01.appoci.oracle.com [130.35.100.223]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 45tvjcua7d-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 07 Apr 2025 18:41:08 +0000 (GMT) Received: from pps.filterd (iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 537I5Qfn023786; Mon, 7 Apr 2025 18:41:08 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 45ttyefwsm-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 07 Apr 2025 18:41:07 +0000 Received: from iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 537IY5DV038909; Mon, 7 Apr 2025 18:41:07 GMT Received: from sidhakum-ubuntu.osdevelopmeniad.oraclevcn.com (sidhakum-ubuntu.allregionaliads.osdevelopmeniad.oraclevcn.com [100.100.250.108]) by iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 45ttyefwqr-4; Mon, 07 Apr 2025 18:41:07 +0000 From: Sidhartha Kumar To: linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org Cc: linux-mm@kvack.org, akpm@linux-foundation.org, liam.howlett@oracle.com, willy@infradead.org, Sidhartha Kumar Subject: [PATCH v4 3/6] maple_tree: use vacant nodes to reduce worst case allocations Date: Mon, 7 Apr 2025 18:40:59 +0000 Message-ID: <20250407184102.2155415-4-sidhartha.kumar@oracle.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250407184102.2155415-1-sidhartha.kumar@oracle.com> References: <20250407184102.2155415-1-sidhartha.kumar@oracle.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1095,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-04-07_05,2025-04-07_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 malwarescore=0 bulkscore=0 mlxscore=0 mlxlogscore=999 suspectscore=0 adultscore=0 spamscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2502280000 definitions=main-2504070130 X-Proofpoint-ORIG-GUID: adnJPjtgQpvg9TEHIQY65ZKPPW9--LxP X-Proofpoint-GUID: adnJPjtgQpvg9TEHIQY65ZKPPW9--LxP X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 50C71180009 X-Stat-Signature: ajzg9h94qwawq31q44feosmh99su5waj X-Rspam-User: X-HE-Tag: 1744051279-644702 X-HE-Meta: U2FsdGVkX18yaF5fPHVXFBW31n2nTlcDo0uLlv7Em/TrYGMyzqtloLACQKIwJVALkApaPwUTd+DGRvHYxWzzzHO4KbNZKbEdcymiWiK0oCTvPmaRSDOJcTyWlj7393x1K1ZNEMsB3IByPe1TI/SKo8HDihLsR1ZB/94lRMb1inERZdQUx2XZPplcGnQZAbjgLMuOXPfOwrNudxU2IaP/hF5HHuRL6xU9hnBth1guDxeVhkPajnGGVChwgsbUJXh2Nhn4N7+alQpGweEQJlU64SgQ4c8lislShwKjarmQZbdefNHm0/oMsqApLhfXxCQl1ieRZT12xivIbL3QSuNn/bbnIc+Sdu1kmdQNHcZvPzDH/iRHfw0Sx1CSvt9tf7Vv6XCRd4rnfC2RWsziLrcZMVZF6L5pO5d2AGh9q8J8V36BE/HzL13Hef9CNq5DP/wKc4Xi+CwLVfx3PXpzMwXA4LSUOLf9DsV0W2VZyJ2ouHq/3ooNzKgsg6R6rvsSOV4XIQ7ShLGbpKgmYzt46pm/9kln9MNkhY27ck6vAoznJAAMr2MDaUSMGE8eWTWSpOp5ZE8Trvi5yDU0FdiQQxJgdhpRqeC1YomhLAQhnCOQoQ0JfRz6xGr+rfBKHquX/xm2NqwZTS+pqJnh22qMQF+tmFEtv9ZEYC3UlxbBrG1jWdpZTHqchfC21OYlPqhWuHR6uL3Yh6/pN8sOfv7FRoZykPOrRA7A29lgnq7Um4E2gQv+OD4RmNU9FVALQp7QpaB2h39rzK0G9OwC61ZVkr4BBEtS2mUQejGr/Mk3wvYBQpWc1sX9Qk4uwYRRLjSnER/epL2jnYGdDJioDLeqvEiFCcOZLl0PNDky2NXtSkeMhwzzOhoufjCzfacfZLFdUHJULLg+yfuqkVlAQ33co4UJvz1WukFZFk/4xzhBVomiUgA0i9HeBx9wPgBRfHfXQGF77+0cgkfy1PRxI4yW7ym WM78W20A 27SiRUsNv1sC/l9wb+/FV4X7Tzy3V/3ryIyvb0ImXdiBk8jJY969SRiJ6Eod+yYR6jWQ8JpAESY7NsJ0qvv91n5Zr1eRvGwi+ErNI5GH9bxKa2CC9ATMKFyRfmuUxSfMDLS3G3vb7XxS4A7+6tQKfjHEhmBI1W09uJrU2HRYsDRpoK9YQFsi8rFaLxI1KgzwcV/HLL2JuemiR4iMQtSpQ+BfibbE/Ni5FnRqy7AoNkswE9VzPgwzI3QkZvwTRabDHz7WAvhSx7e/51spN5dvNUADZ432i8BsEiXsA X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: In order to determine the store type for a maple tree operation, a walk of the tree is done through mas_wr_walk(). This function descends the tree until a spanning write is detected or we reach a leaf node. While descending, keep track of the height at which we encounter a node with available space. This is done by checking if mas->end is less than the number of slots a given node type can fit. Now that the height of the vacant node is tracked, we can use the difference between the height of the tree and the height of the vacant node to know how many levels we will have to propagate creating new nodes. Update mas_prealloc_calc() to consider the vacant height and reduce the number of worst-case allocations. Rebalancing and spanning stores are not supported and fall back to using the full height of the tree for allocations. Update preallocation testing assertions to take into account vacant height. Signed-off-by: Sidhartha Kumar Reviewed-by: Liam R. Howlett --- include/linux/maple_tree.h | 2 + lib/maple_tree.c | 13 ++++-- tools/testing/radix-tree/maple.c | 79 ++++++++++++++++++++++++++++---- 3 files changed, 82 insertions(+), 12 deletions(-) diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h index cbbcd18d4186..7d777aa2d9ed 100644 --- a/include/linux/maple_tree.h +++ b/include/linux/maple_tree.h @@ -463,6 +463,7 @@ struct ma_wr_state { void __rcu **slots; /* mas->node->slots pointer */ void *entry; /* The entry to write */ void *content; /* The existing entry that is being overwritten */ + unsigned char vacant_height; /* Depth of lowest node with free space */ }; #define mas_lock(mas) spin_lock(&((mas)->tree->ma_lock)) @@ -498,6 +499,7 @@ struct ma_wr_state { .mas = ma_state, \ .content = NULL, \ .entry = wr_entry, \ + .vacant_height = 0 \ } #define MA_TOPIARY(name, tree) \ diff --git a/lib/maple_tree.c b/lib/maple_tree.c index 236f0579ca53..203a1a529884 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -3539,6 +3539,9 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas) if (ma_is_leaf(wr_mas->type)) return true; + if (mas->end < mt_slots[wr_mas->type] - 1) + wr_mas->vacant_height = mas->depth + 1; + mas_wr_walk_traverse(wr_mas); } @@ -4154,7 +4157,9 @@ static inline void mas_wr_prealloc_setup(struct ma_wr_state *wr_mas) static inline int mas_prealloc_calc(struct ma_wr_state *wr_mas, void *entry) { struct ma_state *mas = wr_mas->mas; - int ret = mas_mt_height(mas) * 3 + 1; + unsigned char height = mas_mt_height(mas); + int ret = height * 3 + 1; + unsigned char delta = height - wr_mas->vacant_height; switch (mas->store_type) { case wr_invalid: @@ -4172,13 +4177,13 @@ static inline int mas_prealloc_calc(struct ma_wr_state *wr_mas, void *entry) ret = 0; break; case wr_spanning_store: - ret = mas_mt_height(mas) * 3 + 1; + WARN_ON_ONCE(ret != height * 3 + 1); break; case wr_split_store: - ret = mas_mt_height(mas) * 2 + 1; + ret = delta * 2 + 1; break; case wr_rebalance: - ret = mas_mt_height(mas) * 2 - 1; + ret = height * 2 + 1; break; case wr_node_store: ret = mt_in_rcu(mas->tree) ? 1 : 0; diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c index e0f8fabe8821..e37a3ab2e921 100644 --- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -35475,15 +35475,65 @@ static void check_dfs_preorder(struct maple_tree *mt) } /* End of depth first search tests */ +/* get height of the lowest non-leaf node with free space */ +static unsigned char get_vacant_height(struct ma_wr_state *wr_mas, void *entry) +{ + struct ma_state *mas = wr_mas->mas; + char vacant_height = 0; + enum maple_type type; + unsigned long *pivots; + unsigned long min = 0; + unsigned long max = ULONG_MAX; + unsigned char offset; + + /* start traversal */ + mas_reset(mas); + mas_start(mas); + if (!xa_is_node(mas_root(mas))) + return 0; + + type = mte_node_type(mas->node); + wr_mas->type = type; + while (!ma_is_leaf(type)) { + mas_node_walk(mas, mte_to_node(mas->node), type, &min, &max); + offset = mas->offset; + mas->end = mas_data_end(mas); + pivots = ma_pivots(mte_to_node(mas->node), type); + + if (pivots) { + if (offset) + min = pivots[mas->offset - 1]; + if (offset < mas->end) + max = pivots[mas->offset]; + } + wr_mas->r_max = offset < mas->end ? pivots[offset] : mas->max; + + /* detect spanning write */ + if (mas_is_span_wr(wr_mas)) + break; + + if (mas->end < mt_slot_count(mas->node) - 1) + vacant_height = mas->depth + 1; + + mas_descend(mas); + type = mte_node_type(mas->node); + mas->depth++; + } + + return vacant_height; +} + /* Preallocation testing */ static noinline void __init check_prealloc(struct maple_tree *mt) { unsigned long i, max = 100; unsigned long allocated; unsigned char height; + unsigned char vacant_height; struct maple_node *mn; void *ptr = check_prealloc; MA_STATE(mas, mt, 10, 20); + MA_WR_STATE(wr_mas, &mas, ptr); mt_set_non_kernel(1000); for (i = 0; i <= max; i++) @@ -35494,8 +35544,9 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); + vacant_height = get_vacant_height(&wr_mas, ptr); MT_BUG_ON(mt, allocated == 0); - MT_BUG_ON(mt, allocated != 1 + height * 3); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mas_destroy(&mas); allocated = mas_allocated(&mas); MT_BUG_ON(mt, allocated != 0); @@ -35503,8 +35554,9 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); + vacant_height = get_vacant_height(&wr_mas, ptr); MT_BUG_ON(mt, allocated == 0); - MT_BUG_ON(mt, allocated != 1 + height * 3); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); mas_destroy(&mas); allocated = mas_allocated(&mas); @@ -35514,7 +35566,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 3); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mn = mas_pop_node(&mas); MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1); mn->parent = ma_parent_ptr(mn); @@ -35527,7 +35580,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 3); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mn = mas_pop_node(&mas); MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1); MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); @@ -35540,7 +35594,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 3); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mn = mas_pop_node(&mas); MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1); mas_push_node(&mas, mn); @@ -35553,7 +35608,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 3); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3); mas_store_prealloc(&mas, ptr); MT_BUG_ON(mt, mas_allocated(&mas) != 0); @@ -35578,7 +35634,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); - MT_BUG_ON(mt, allocated != 1 + height * 2); + vacant_height = get_vacant_height(&wr_mas, ptr); + MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 2); mas_store_prealloc(&mas, ptr); MT_BUG_ON(mt, mas_allocated(&mas) != 0); mt_set_non_kernel(1); @@ -35595,8 +35652,14 @@ static noinline void __init check_prealloc(struct maple_tree *mt) MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0); allocated = mas_allocated(&mas); height = mas_mt_height(&mas); + vacant_height = get_vacant_height(&wr_mas, ptr); MT_BUG_ON(mt, allocated == 0); - MT_BUG_ON(mt, allocated != 1 + height * 3); + /* + * vacant height cannot be used to compute the number of nodes needed + * as the root contains two entries which means it is on the verge of + * insufficiency. The worst case full height of the tree is needed. + */ + MT_BUG_ON(mt, allocated != height * 3 + 1); mas_store_prealloc(&mas, ptr); MT_BUG_ON(mt, mas_allocated(&mas) != 0); mas_set_range(&mas, 0, 200);