From patchwork Thu Nov 14 17:05:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sidhartha Kumar
X-Patchwork-Id: 13875506
From: Sidhartha Kumar
To: linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, liam.howlett@oracle.com,
 Sidhartha Kumar
Subject: [PATCH 3/5] maple_tree: use vacant nodes to reduce worst case allocations
Date: Thu, 14 Nov 2024 12:05:22 -0500
Message-ID: <20241114170524.64391-4-sidhartha.kumar@oracle.com>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20241114170524.64391-1-sidhartha.kumar@oracle.com>
References: <20241114170524.64391-1-sidhartha.kumar@oracle.com>

In order to determine the store type for a maple tree operation, a walk of
the tree is done through mas_wr_walk(). This function descends the tree
until a spanning write is detected or we reach a leaf node.
While descending, keep track of the height at which we encounter a node with
available space. This is done by checking if mas->end is less than the
number of slots a given node type can fit.

Now that the height of the vacant node is tracked, we can use the
difference between the height of the tree and the height of the vacant
node to know how many levels we will have to propagate creating new nodes.
Update mas_prealloc_calc() to consider the vacant height and reduce the
number of worst-case allocations.

Rebalancing stores are not supported and fall back to using the full
height of the tree for allocations.

Update preallocation testing assertions to take into account vacant
height.

Signed-off-by: Sidhartha
---
 include/linux/maple_tree.h       |  2 +
 lib/maple_tree.c                 | 13 +++--
 tools/testing/radix-tree/maple.c | 97 +++++++++++++++++++++++++++++---
 3 files changed, 100 insertions(+), 12 deletions(-)

diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index cbbcd18d4186..7d777aa2d9ed 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -463,6 +463,7 @@ struct ma_wr_state {
 	void __rcu **slots; /* mas->node->slots pointer */
 	void *entry; /* The entry to write */
 	void *content; /* The existing entry that is being overwritten */
+	unsigned char vacant_height; /* Depth of lowest node with free space */
 };
 
 #define mas_lock(mas) spin_lock(&((mas)->tree->ma_lock))
@@ -498,6 +499,7 @@ struct ma_wr_state {
 		.mas = ma_state, \
 		.content = NULL, \
 		.entry = wr_entry, \
+		.vacant_height = 0 \
 	}
 
 #define MA_TOPIARY(name, tree) \
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 21289e350382..f14d70c171c2 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3545,6 +3545,9 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas)
 		if (ma_is_leaf(wr_mas->type))
 			return true;
 
+		if (mas->end < mt_slots[wr_mas->type] - 1)
+			wr_mas->vacant_height = mas->depth + 1;
+
 		mas_wr_walk_traverse(wr_mas);
 	}
 
@@ -4159,7 +4162,9 @@ static inline void mas_wr_prealloc_setup(struct ma_wr_state *wr_mas)
 static inline int mas_prealloc_calc(struct ma_wr_state *wr_mas, void *entry)
 {
 	struct ma_state *mas = wr_mas->mas;
-	int ret = mas_mt_height(mas) * 3 + 1;
+	unsigned char height = mas_mt_height(mas);
+	int ret = height * 3 + 1;
+	unsigned char delta = height - wr_mas->vacant_height;
 
 	switch (mas->store_type) {
 	case wr_invalid:
@@ -4177,13 +4182,13 @@ static inline int mas_prealloc_calc(struct ma_wr_state *wr_mas, void *entry)
 			ret = 0;
 		break;
 	case wr_spanning_store:
-		ret = mas_mt_height(mas) * 3 + 1;
+		ret = delta * 3 + 1;
 		break;
 	case wr_split_store:
-		ret = mas_mt_height(mas) * 2 + 1;
+		ret = delta * 2 + 1;
 		break;
 	case wr_rebalance:
-		ret = mas_mt_height(mas) * 2 - 1;
+		ret = height * 2 + 1;
 		break;
 	case wr_node_store:
 		ret = mt_in_rcu(mas->tree) ? 1 : 0;
diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c
index bc30050227fd..bc8b107e0177 100644
--- a/tools/testing/radix-tree/maple.c
+++ b/tools/testing/radix-tree/maple.c
@@ -35475,12 +35475,85 @@ static void check_dfs_preorder(struct maple_tree *mt)
 }
 /* End of depth first search tests */
 
+/* same implementation as mas_is_span_wr() in lib/maple_tree.c */
+static bool is_span_wr(struct ma_state *mas, unsigned long r_max,
+				enum maple_type type, void *entry)
+{
+	unsigned long max = r_max;
+	unsigned long last = mas->last;
+
+	/* Contained in this pivot, fast path */
+	if (last < max)
+		return false;
+
+	if (ma_is_leaf(type)) {
+		max = mas->max;
+		if (last < max)
+			return false;
+	}
+
+	if (last == max) {
+		/*
+		 * The last entry of leaf node cannot be NULL unless it is the
+		 * rightmost node (writing ULONG_MAX), otherwise it spans slots.
+		 */
+		if (entry || last == ULONG_MAX)
+			return false;
+	}
+
+	return true;
+}
+
+/* get height of the lowest non-leaf node with free space */
+static unsigned char get_vacant_height(struct ma_state *mas, void *entry)
+{
+	char vacant_height = 0;
+	enum maple_type type;
+	unsigned long *pivots;
+	unsigned long min = 0;
+	unsigned long max = ULONG_MAX;
+
+	/* start traversal */
+	mas_reset(mas);
+	mas_start(mas);
+	if (!xa_is_node(mas_root(mas)))
+		return 0;
+
+	type = mte_node_type(mas->node);
+	while (!ma_is_leaf(type)) {
+		mas_node_walk(mas, mte_to_node(mas->node), type, &min, &max);
+		mas->end = mas_data_end(mas);
+		pivots = ma_pivots(mte_to_node(mas->node), type);
+
+		if (pivots) {
+			if (mas->offset)
+				min = pivots[mas->offset - 1];
+			if (mas->offset < mas->end)
+				max = pivots[mas->offset];
+		}
+
+		/* detect spanning write */
+		if (is_span_wr(mas, max, type, entry))
+			break;
+
+		if (mas->end < mt_slot_count(mas->node) - 1)
+			vacant_height = mas->depth + 1;
+
+		mas_descend(mas);
+		type = mte_node_type(mas->node);
+		mas->depth++;
+	}
+
+	return vacant_height;
+}
+
 /* Preallocation testing */
 static noinline void __init check_prealloc(struct maple_tree *mt)
 {
 	unsigned long i, max = 100;
 	unsigned long allocated;
 	unsigned char height;
+	unsigned char vacant_height;
 	struct maple_node *mn;
 	void *ptr = check_prealloc;
 	MA_STATE(mas, mt, 10, 20);
@@ -35494,8 +35567,9 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
+	vacant_height = get_vacant_height(&mas, ptr);
 	MT_BUG_ON(mt, allocated == 0);
-	MT_BUG_ON(mt, allocated != 1 + height * 3);
+	MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3);
 	mas_destroy(&mas);
 	allocated = mas_allocated(&mas);
 	MT_BUG_ON(mt, allocated != 0);
@@ -35503,8 +35577,9 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
+	vacant_height = get_vacant_height(&mas, ptr);
 	MT_BUG_ON(mt, allocated == 0);
-	MT_BUG_ON(mt, allocated != 1 + height * 3);
+	MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3);
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	mas_destroy(&mas);
 	allocated = mas_allocated(&mas);
@@ -35514,7 +35589,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
-	MT_BUG_ON(mt, allocated != 1 + height * 3);
+	vacant_height = get_vacant_height(&mas, ptr);
+	MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3);
 	mn = mas_pop_node(&mas);
 	MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1);
 	mn->parent = ma_parent_ptr(mn);
@@ -35527,7 +35603,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
-	MT_BUG_ON(mt, allocated != 1 + height * 3);
+	vacant_height = get_vacant_height(&mas, ptr);
+	MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3);
 	mn = mas_pop_node(&mas);
 	MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1);
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
@@ -35540,7 +35617,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
-	MT_BUG_ON(mt, allocated != 1 + height * 3);
+	vacant_height = get_vacant_height(&mas, ptr);
+	MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3);
 	mn = mas_pop_node(&mas);
 	MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1);
 	mas_push_node(&mas, mn);
@@ -35553,7 +35631,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
-	MT_BUG_ON(mt, allocated != 1 + height * 3);
+	vacant_height = get_vacant_height(&mas, ptr);
+	MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3);
 	mas_store_prealloc(&mas, ptr);
 	MT_BUG_ON(mt, mas_allocated(&mas) != 0);
 
@@ -35578,7 +35657,8 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
-	MT_BUG_ON(mt, allocated != 1 + height * 2);
+	vacant_height = get_vacant_height(&mas, ptr);
+	MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 2);
 	mas_store_prealloc(&mas, ptr);
 	MT_BUG_ON(mt, mas_allocated(&mas) != 0);
 	mt_set_non_kernel(1);
@@ -35595,8 +35675,9 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
+	vacant_height = get_vacant_height(&mas, ptr);
 	MT_BUG_ON(mt, allocated == 0);
-	MT_BUG_ON(mt, allocated != 1 + height * 3);
+	MT_BUG_ON(mt, allocated != 1 + (height - vacant_height) * 3);
 	mas_store_prealloc(&mas, ptr);
 	MT_BUG_ON(mt, mas_allocated(&mas) != 0);
 	mas_set_range(&mas, 0, 200);
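
As a rough illustration of the arithmetic the patched mas_prealloc_calc()
performs, the standalone sketch below mirrors only the three cases changed
in the diff. It is not kernel code: enum store_type and prealloc_nodes()
are hypothetical names introduced here, and the height and vacant_height
values are passed in directly rather than read from a maple state.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's store types; only the cases the
 * patch changes are modelled. */
enum store_type { SPANNING_STORE, SPLIT_STORE, REBALANCE };

/*
 * Worst-case node estimate, following the expressions in the diff:
 * spanning and split stores scale with the levels between the vacant
 * node and the leaves, while rebalance still uses the full tree height.
 */
static int prealloc_nodes(enum store_type type, unsigned char height,
			  unsigned char vacant_height)
{
	unsigned char delta;

	/* Assumption of this sketch: a vacant node cannot sit below the leaves. */
	assert(vacant_height <= height);
	delta = height - vacant_height;

	switch (type) {
	case SPANNING_STORE:
		return delta * 3 + 1;
	case SPLIT_STORE:
		return delta * 2 + 1;
	case REBALANCE:
		return height * 2 + 1;
	}
	return 0;
}
```

For a tree of height 4 with vacant_height 3, prealloc_nodes(SPLIT_STORE, 4, 3)
evaluates to 3 and prealloc_nodes(SPANNING_STORE, 4, 3) to 4, against 9 and
13 when vacant_height is 0, which is the pre-patch estimate.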