From patchwork Thu Jun 15 08:43:00 2023
X-Patchwork-Submitter: Peng Zhang
X-Patchwork-Id: 13280890
From: Peng Zhang
To: Liam.Howlett@oracle.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
 Peng Zhang
Subject: [PATCH v3 3/4] maple_tree: optimize mas_wr_append(), also improve
 duplicating VMAs
Date: Thu, 15 Jun 2023 16:43:00 +0800
Message-Id: <20230615084301.97701-4-zhangpeng.00@bytedance.com>
In-Reply-To: <20230615084301.97701-1-zhangpeng.00@bytedance.com>
References: <20230615084301.97701-1-zhangpeng.00@bytedance.com>

When the new range lies strictly inside the original last range, touching
neither of its boundaries, two new entries can be appended to the end of
the node as a fast path. The original last pivot is updated last, and the
two newly appended entries cannot be accessed before that update, so this
is also safe in RCU mode.

This is useful for sequential insertion, which is what we do in
dup_mmap(). Enabling BENCH_FORK in test_maple_tree and running only
bench_forking() gives the following timings:

before:               after:
17,874.83 msec        15,738.38 msec

This is about a 12% performance improvement for duplicating VMAs.

Signed-off-by: Peng Zhang
Reviewed-by: Liam R. Howlett
---
 lib/maple_tree.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index d2799c69a669..da4af6743b30 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4202,10 +4202,10 @@ static inline unsigned char mas_wr_new_end(struct ma_wr_state *wr_mas)
  *
  * Return: True if appended, false otherwise
  */
-static inline bool mas_wr_append(struct ma_wr_state *wr_mas)
+static inline bool mas_wr_append(struct ma_wr_state *wr_mas,
+				 unsigned char new_end)
 {
 	unsigned char end = wr_mas->node_end;
-	unsigned char new_end = end + 1;
 	struct ma_state *mas = wr_mas->mas;
 	unsigned char node_pivots = mt_pivots[wr_mas->type];
 
@@ -4217,16 +4217,27 @@ static inline bool mas_wr_append(struct ma_wr_state *wr_mas)
 		ma_set_meta(wr_mas->node, maple_leaf_64, 0, new_end);
 	}
 
-	if (mas->last == wr_mas->r_max) {
-		/* Append to end of range */
-		rcu_assign_pointer(wr_mas->slots[new_end], wr_mas->entry);
-		wr_mas->pivots[end] = mas->index - 1;
-		mas->offset = new_end;
+	if (new_end == wr_mas->node_end + 1) {
+		if (mas->last == wr_mas->r_max) {
+			/* Append to end of range */
+			rcu_assign_pointer(wr_mas->slots[new_end],
+					   wr_mas->entry);
+			wr_mas->pivots[end] = mas->index - 1;
+			mas->offset = new_end;
+		} else {
+			/* Append to start of range */
+			rcu_assign_pointer(wr_mas->slots[new_end],
+					   wr_mas->content);
+			wr_mas->pivots[end] = mas->last;
+			rcu_assign_pointer(wr_mas->slots[end], wr_mas->entry);
+		}
 	} else {
-		/* Append to start of range */
+		/* Append to the range without touching any boundaries. */
 		rcu_assign_pointer(wr_mas->slots[new_end], wr_mas->content);
-		wr_mas->pivots[end] = mas->last;
-		rcu_assign_pointer(wr_mas->slots[end], wr_mas->entry);
+		wr_mas->pivots[end + 1] = mas->last;
+		rcu_assign_pointer(wr_mas->slots[end + 1], wr_mas->entry);
+		wr_mas->pivots[end] = mas->index - 1;
+		mas->offset = end + 1;
 	}
 
 	if (!wr_mas->content || !wr_mas->entry)
@@ -4273,7 +4284,7 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas)
 		goto slow_path;
 
 	/* Attempt to append */
-	if (new_end == wr_mas->node_end + 1 && mas_wr_append(wr_mas))
+	if (mas_wr_append(wr_mas, new_end))
 		return;
 
 	if (new_end == wr_mas->node_end && mas_wr_slot_store(wr_mas))
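
For readers who want to see the write ordering of the new "no boundaries
touched" case in isolation, here is a minimal userspace sketch. It is not
kernel code: struct toy_leaf, toy_lookup(), toy_append_middle() and
toy_demo() are hypothetical names, the toy leaf is assumed to keep an
explicit pivot for every slot, and it is single-threaded with no RCU
primitives. It only illustrates why shrinking the original last pivot as
the final step keeps the two new slots unreachable until then.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 16

/* Toy leaf: slot[i] covers [pivot[i - 1] + 1, pivot[i]]; slot[0] starts at min. */
struct toy_leaf {
	unsigned long min;
	unsigned long pivot[TOY_SLOTS];
	void *slot[TOY_SLOTS];
	unsigned char end;		/* offset of the last slot in use */
};

/* Reader-style lookup: return the first slot whose pivot is >= index. */
void *toy_lookup(const struct toy_leaf *n, unsigned long index)
{
	unsigned char i;

	for (i = 0; i <= n->end; i++)
		if (index <= n->pivot[i])
			return n->slot[i];
	return NULL;
}

/*
 * Store entry over [index, last] when that range lies strictly inside the
 * last range of the leaf. Returns 1 on success, 0 if this fast path does
 * not apply (a boundary is touched or the toy leaf is full).
 */
int toy_append_middle(struct toy_leaf *n, unsigned long index,
		      unsigned long last, void *entry)
{
	unsigned char end = n->end;
	unsigned long r_min = end ? n->pivot[end - 1] + 1 : n->min;
	unsigned long r_max = n->pivot[end];

	if (end + 2 >= TOY_SLOTS || index > last ||
	    index <= r_min || last >= r_max)
		return 0;

	/* Fill the two new slots; no lookup can reach them yet. */
	n->pivot[end + 2] = r_max;
	n->slot[end + 2] = n->slot[end];	/* tail keeps the old content */
	n->pivot[end + 1] = last;
	n->slot[end + 1] = entry;
	n->end = end + 2;

	/* Publish last: shrinking the old pivot exposes both new slots. */
	n->pivot[end] = index - 1;
	return 1;
}

/* Example: split [10, 99] by storing a new entry over [20, 29]. */
void toy_demo(void)
{
	static int a, b, c;
	struct toy_leaf n = { .min = 0, .pivot = { 9, 99 },
			      .slot = { &a, &b }, .end = 1 };

	assert(toy_append_middle(&n, 20, 29, &c));
	assert(toy_lookup(&n, 19) == &b);
	assert(toy_lookup(&n, 20) == &c);
	assert(toy_lookup(&n, 30) == &b);
}
```

Until the final pivot store, every index in the old last range still
resolves to the old slot, which is the property the commit message relies
on. The real code additionally uses rcu_assign_pointer() for the slot
stores, which this sketch leaves out.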