From patchwork Mon Apr 7 18:41:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sidhartha Kumar X-Patchwork-Id: 14041521 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id EDD2CC36018 for ; Mon, 7 Apr 2025 18:41:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0D8A26B0008; Mon, 7 Apr 2025 14:41:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 08B466B000A; Mon, 7 Apr 2025 14:41:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E6C0A6B000D; Mon, 7 Apr 2025 14:41:15 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id C32506B000A for ; Mon, 7 Apr 2025 14:41:15 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 3D2C71CD511 for ; Mon, 7 Apr 2025 18:41:16 +0000 (UTC) X-FDA: 83308115352.19.302BB41 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by imf18.hostedemail.com (Postfix) with ESMTP id 3A1B01C0006 for ; Mon, 7 Apr 2025 18:41:14 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-11-20 header.b=Le3wEsMy; dmarc=pass (policy=reject) header.from=oracle.com; spf=pass (imf18.hostedemail.com: domain of sidhartha.kumar@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=sidhartha.kumar@oracle.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1744051274; a=rsa-sha256; cv=none; b=Ot2vYS86GhewTItrShGagXF1j9pMkAA++v2PSKwH73MjoU99smEM0f3iFoZvTd2NQjU9NX B1tY/nh0A9nj4xTkjyQsUrRa187FfwkEmAdPj2TXcgZkqy+niwDEUglosSfm690dqT9Yys 9SHLpZllwZvB90fZVLf/1+azDG3lB28= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-11-20 header.b=Le3wEsMy; dmarc=pass (policy=reject) header.from=oracle.com; spf=pass (imf18.hostedemail.com: domain of sidhartha.kumar@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=sidhartha.kumar@oracle.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1744051274; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=BEyayAD0n5OcCAz/yUgOyP7rLNtsr3FbNg29L/u+Tug=; b=ZLU1CqOQf1ZAYyvWTbWCU7aecpT20l9Q/K4nC1et/yxXmSTVo/UsTWy443XJ82iGjw+Euv Q3gG9bp7zgmndJ8p43oHoOMfTiXcQwutO7ZIHom4ELlFweEGxXuzrLqo6Y4ZAVXI46XgYL E8MvJIucYT/jgHZnYP9yBbH6g6NwwjE= Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 537H0iZX003422; Mon, 7 Apr 2025 18:41:09 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2023-11-20; bh=BEyay AD0n5OcCAz/yUgOyP7rLNtsr3FbNg29L/u+Tug=; b=Le3wEsMy5235U8G+otW6r CMSpM0WxFMzUXv+eQ9tDaGvKVSJmJYfWh9ew+qDmUU1DP3gRK1xkWcx8wOgo9zf2 yPHJhNFIh9IVIxm2NLajqXSsSWLVePC9ecTC95t6+1q/wFLqoW4P1i9h/3LMonWr uYGdcjY+BXGzSie8vNADxUb53gDYfQIKZg/LpLGei7GffzOUsyYezVtmpUl7n2Bq 06N0uepefw16NizkdirvQYtSVd14TA4kZjp4FuPcfRIMXtBnimM+dsmcMi+6PY5K aI8CXYLoWGi8meWxQP8rwktTUp4j3GkrJJFl/LokC8JyF/UC4YVGiP3qLxWlWuvH w== Received: from iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta01.appoci.oracle.com [130.35.100.223]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 45tv4su9ux-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 07 Apr 2025 18:41:09 +0000 (GMT) Received: from pps.filterd (iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 537I0VB1023792; Mon, 7 Apr 2025 18:41:08 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 45ttyefwt8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 07 Apr 2025 18:41:08 +0000 Received: from iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 537IY5DZ038909; Mon, 7 Apr 2025 18:41:08 GMT Received: from sidhakum-ubuntu.osdevelopmeniad.oraclevcn.com (sidhakum-ubuntu.allregionaliads.osdevelopmeniad.oraclevcn.com [100.100.250.108]) by iadpaimrmta01.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 45ttyefwqr-6; Mon, 07 Apr 2025 18:41:08 +0000 From: Sidhartha Kumar To: linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org Cc: linux-mm@kvack.org, akpm@linux-foundation.org, liam.howlett@oracle.com, willy@infradead.org, Sidhartha Kumar Subject: [PATCH v4 5/6] maple_tree: add sufficient height Date: Mon, 7 Apr 2025 18:41:01 +0000 Message-ID: <20250407184102.2155415-6-sidhartha.kumar@oracle.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250407184102.2155415-1-sidhartha.kumar@oracle.com> References: <20250407184102.2155415-1-sidhartha.kumar@oracle.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1095,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-04-07_05,2025-04-07_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 malwarescore=0 bulkscore=0 mlxscore=0 mlxlogscore=999 suspectscore=0 adultscore=0 spamscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2502280000 definitions=main-2504070130 X-Proofpoint-ORIG-GUID: auY2qp1JnqR0C1Y8Ld-Vefhdvcmn2wNM X-Proofpoint-GUID: auY2qp1JnqR0C1Y8Ld-Vefhdvcmn2wNM X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 3A1B01C0006 X-Stat-Signature: 9hn4qrrc8xsaybtnrwpmsc5ttchehodp X-Rspam-User: X-HE-Tag: 1744051274-275150 X-HE-Meta: U2FsdGVkX1/AftVn79G8xlt5SZgw8agx80e9fv69e26S5d8LPdJI5hXdF0fQ6rE28IsjhR4DbfYcEYwBwK//KxFDxzVYGb/p0XnlzULme2gAqYU+BpwJw5aqkoWkK2VW8+rvLQ0pHVOBsU3OFy38hwL9UdZHb2yiDUNfOkvcf7p3m4mMOrunFTvH3jLHEm4D4scxUKDSrUkYlOWi8l9uQyrx8P7b5/FGnlgaCDIihZ9nF5FssJx1e1BExWUYA8qbrUXR2uUS+psroJOm4f71ypYOMCSSGOunXoDaUWmc6HZj0VxFP32hUhGJpOAbfG4C8RsMc3CloxYolbd4axxHXq99EMfcuq07t7qGnUbYstDj3eXdQpTHEImqvkawfmwpBP66xxi6fj9XchBwld/l3/5QG7V9mVXyzbcXaZ1vvbox8NhbBdwPDWsi17ACSvr8A975EXMan0CFdcWtJjjYUa/msFYqpoALw8tIb54dX2FOAqvNzwrjDB/W03d9lY+BT8DjWVM/GeEl9rgILAnOqzc2YiBgxWCm63iyueGcpKcSEskeOJEtpSyxG604t996aEaK9VdIaGn+U40JAZJvK8k4yitcEQFvqojD+R61N315Tt+ftRxBpLro9G3Sfp9EOlbPKjjy44svFpJLZzGIG6Uj/9WxuXYkyHoOfnoxC6D43yY6fKUesAaODuEenHshR3VTlBB5ck2XyXAymWiWG0uLGxsRBzz/9gsMD0PLeKaPkdviIskumDcKdrYPFEX4Cg1IKNA+J9bSJX75qzYPpVjwsD5JFpZ07ADu40OW/TFVv10QHvA9za+VI0MQznCZL946oUmhrTFWfLwUxwehN5buq57VzqiEAd3/oFIflsasm/6xxXih3f7pbs37d+OA4Ds0hNje6qiOtDwOJsDv0RykLO9PTcTNZiCxZjYEp91xlDBSWs0tQP1QRKNt7uGLLCCv8iS7MgEzdUora8q EYhGVngN 2gtsWE0hJOChGh5eKTGwUS0Fe4CnFOYws0UPZcVvjk2g+GKGSte1dmFZ+BPIxJ6dzMNmpnvqcjBAL/RKnjL6AbLuYaRHCbKrTwDjxcQKS0eZQxz/h0u2gOV2BfZAVgIgATmDJ1QDtjHyZ0iUyJSCOUYr9DdvKVZidCyEaYDoA7nmajeTnbkaijm/sTZ2VwGj8twmlQZvhubBk8F6qlWoMeivxup6kYS4tvplel2c05fAMu/LkgV+wtVgm4CDjRvqg82nj0G2tEdf/ZFLcx47m8j/VcU9csocig0vT X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: In order to support rebalancing and spanning stores using less than the worst case number of nodes, we need to track more than just the vacant height. Using only vacant height to reduce the worst case maple node allocation count can lead to a shortcoming of nodes in the following scenarios. For rebalancing writes, when a leaf node becomes insufficient, it may be combined with a sibling into a single node. This means that the parent node which has entries for this children will lose one entry. If this parent node was just meeting the minimum entries, losing one entry will now cause this parent node to be insufficient. This leads to a cascading operation of rebalancing at different levels and can lead to more node allocations than simply using vacant height can return. For spanning writes, a similar situation occurs. At the location at which a spanning write is detected, the number of ancestor nodes may similarly need to rebalanced into a smaller number of nodes and the same cascading situation could occur. To use less than the full height of the tree for the number of allocations, we also need to track the height at which a non-leaf node cannot become insufficient. This means even if a rebalance occurs to a child of this node, it currently has enough entries that it can lose one without any further action. This field is stored in the maple write state as sufficient height. In mas_prealloc_calc() when figuring out how many nodes to allocate, we check if the vacant node is lower in the tree than a sufficient node (has a larger value). If it is, we cannot use the vacant height and must use the difference in the height and sufficient height as the basis for the number of nodes needed. An off by one bug was also discovered in mast_overflow() where it is using >= rather than >. This caused extra iterations of the mas_spanning_rebalance() loop and lead to unneeded allocations. A test is also added to check the number of allocations is correct. Signed-off-by: Sidhartha Kumar --- include/linux/maple_tree.h | 4 +++- lib/maple_tree.c | 19 ++++++++++++++++--- tools/testing/radix-tree/maple.c | 28 ++++++++++++++++++++++++++++ 3 files changed, 47 insertions(+), 4 deletions(-) diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h index 7d777aa2d9ed..37dc9525dff6 100644 --- a/include/linux/maple_tree.h +++ b/include/linux/maple_tree.h @@ -464,6 +464,7 @@ struct ma_wr_state { void *entry; /* The entry to write */ void *content; /* The existing entry that is being overwritten */ unsigned char vacant_height; /* Depth of lowest node with free space */ + unsigned char sufficient_height;/* Depth of lowest node with min sufficiency + 1 nodes */ }; #define mas_lock(mas) spin_lock(&((mas)->tree->ma_lock)) @@ -499,7 +500,8 @@ struct ma_wr_state { .mas = ma_state, \ .content = NULL, \ .entry = wr_entry, \ - .vacant_height = 0 \ + .vacant_height = 0, \ + .sufficient_height = 0 \ } #define MA_TOPIARY(name, tree) \ diff --git a/lib/maple_tree.c b/lib/maple_tree.c index acecd4e8a6a0..98b06709d864 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -2741,7 +2741,7 @@ static inline bool mast_sufficient(struct maple_subtree_state *mast) */ static inline bool mast_overflow(struct maple_subtree_state *mast) { - if (mast->bn->b_end >= mt_slot_count(mast->orig_l->node)) + if (mast->bn->b_end > mt_slot_count(mast->orig_l->node)) return true; return false; @@ -3552,6 +3552,13 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas) if (mas->end < mt_slots[wr_mas->type] - 1) wr_mas->vacant_height = mas->depth + 1; + if (ma_is_root(mas_mn(mas))) { + /* root needs more than 2 entries to be sufficient + 1 */ + if (mas->end > 2) + wr_mas->sufficient_height = 1; + } else if (mas->end > mt_min_slots[wr_mas->type] + 1) + wr_mas->sufficient_height = mas->depth + 1; + mas_wr_walk_traverse(wr_mas); } @@ -4187,13 +4194,19 @@ static inline int mas_prealloc_calc(struct ma_wr_state *wr_mas, void *entry) ret = 0; break; case wr_spanning_store: - WARN_ON_ONCE(ret != height * 3 + 1); + if (wr_mas->sufficient_height < wr_mas->vacant_height) + ret = (height - wr_mas->sufficient_height) * 3 + 1; + else + ret = delta * 3 + 1; break; case wr_split_store: ret = delta * 2 + 1; break; case wr_rebalance: - ret = height * 2 + 1; + if (wr_mas->sufficient_height < wr_mas->vacant_height) + ret = (height - wr_mas->sufficient_height) * 2 + 1; + else + ret = delta * 2 + 1; break; case wr_node_store: ret = mt_in_rcu(mas->tree) ? 1 : 0; diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c index e37a3ab2e921..2c0b38301253 100644 --- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -36326,6 +36326,30 @@ static inline void check_spanning_store_height(struct maple_tree *mt) mas_unlock(&mas); } +/* + * Test to check the path of a spanning rebalance which results in + * a collapse where the rebalancing of the child node leads to + * insufficieny in the parent node. + */ +static void check_collapsing_rebalance(struct maple_tree *mt) +{ + int i = 0; + MA_STATE(mas, mt, ULONG_MAX, ULONG_MAX); + + /* create a height 6 tree */ + while (mt_height(mt) < 6) { + mtree_store_range(mt, i, i + 10, xa_mk_value(i), GFP_KERNEL); + i += 9; + } + + /* delete all entries one at a time, starting from the right */ + do { + mas_erase(&mas); + } while (mas_prev(&mas, 0) != NULL); + + mtree_unlock(mt); +} + /* callback function used for check_nomem_writer_race() */ static void writer2(void *maple_tree) { @@ -36496,6 +36520,10 @@ void farmer_tests(void) check_spanning_store_height(&tree); mtree_destroy(&tree); + mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE); + check_collapsing_rebalance(&tree); + mtree_destroy(&tree); + mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE); check_null_expand(&tree); mtree_destroy(&tree);