From patchwork Fri Jun 9 12:03:46 2023
X-Patchwork-Submitter: Peng Zhang
X-Patchwork-Id: 13273777
From: Peng Zhang <zhangpeng.00@bytedance.com>
To: Liam.Howlett@oracle.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
    Peng Zhang <zhangpeng.00@bytedance.com>
Subject: [PATCH v2 2/3] maple_tree: optimize mas_wr_append(), also improve duplicating VMAs
Date: Fri, 9 Jun 2023 20:03:46 +0800
Message-Id: <20230609120347.63936-3-zhangpeng.00@bytedance.com>
X-Mailer: git-send-email 2.37.0 (Apple Git-136)
In-Reply-To: <20230609120347.63936-1-zhangpeng.00@bytedance.com>
References: <20230609120347.63936-1-zhangpeng.00@bytedance.com>
MIME-Version: 1.0
When the new range can be completely covered by the original last range
without touching the boundaries on either side, two new entries can be
appended to the end of the node as a fast path. The original last pivot
is only updated at the very end, and the two newly appended entries are
not reachable before that update, so the append is also safe in RCU
mode. This is useful for sequential insertion, which is what dup_mmap()
does when duplicating VMAs. (A simplified userspace sketch of this new
case is included after the diff.)

Enabling BENCH_FORK in test_maple_tree and just running bench_forking()
gives the following timings:

before:            after:
17,874.83 msec     15,738.38 msec

That is about a 12% performance improvement for duplicating VMAs.

Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 lib/maple_tree.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 5ea211c3f186..a96eb646e839 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4269,10 +4269,10 @@ static inline unsigned char mas_wr_new_end(struct ma_wr_state *wr_mas)
  *
  * Return: True if appended, false otherwise
  */
-static inline bool mas_wr_append(struct ma_wr_state *wr_mas)
+static inline bool mas_wr_append(struct ma_wr_state *wr_mas,
+		unsigned char new_end)
 {
 	unsigned char end = wr_mas->node_end;
-	unsigned char new_end = end + 1;
 	struct ma_state *mas = wr_mas->mas;
 	unsigned char node_pivots = mt_pivots[wr_mas->type];
 
@@ -4284,16 +4284,27 @@ static inline bool mas_wr_append(struct ma_wr_state *wr_mas)
 		ma_set_meta(wr_mas->node, maple_leaf_64, 0, new_end);
 	}
 
-	if (mas->last == wr_mas->r_max) {
-		/* Append to end of range */
-		rcu_assign_pointer(wr_mas->slots[new_end], wr_mas->entry);
-		wr_mas->pivots[end] = mas->index - 1;
-		mas->offset = new_end;
+	if (new_end == wr_mas->node_end + 1) {
+		if (mas->last == wr_mas->r_max) {
+			/* Append to end of range */
+			rcu_assign_pointer(wr_mas->slots[new_end],
+					   wr_mas->entry);
+			wr_mas->pivots[end] = mas->index - 1;
+			mas->offset = new_end;
+		} else {
+			/* Append to start of range */
+			rcu_assign_pointer(wr_mas->slots[new_end],
+					   wr_mas->content);
+			wr_mas->pivots[end] = mas->last;
+			rcu_assign_pointer(wr_mas->slots[end], wr_mas->entry);
+		}
 	} else {
-		/* Append to start of range */
+		/* Append to the range without touching any boundaries. */
 		rcu_assign_pointer(wr_mas->slots[new_end], wr_mas->content);
-		wr_mas->pivots[end] = mas->last;
-		rcu_assign_pointer(wr_mas->slots[end], wr_mas->entry);
+		wr_mas->pivots[end + 1] = mas->last;
+		rcu_assign_pointer(wr_mas->slots[end + 1], wr_mas->entry);
+		wr_mas->pivots[end] = mas->index - 1;
+		mas->offset = end + 1;
 	}
 
 	if (!wr_mas->content || !wr_mas->entry)
@@ -4340,7 +4351,7 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas)
 		goto slow_path;
 
 	/* Attempt to append */
-	if (new_end == wr_mas->node_end + 1 && mas_wr_append(wr_mas))
+	if (mas_wr_append(wr_mas, new_end))
 		return;
 
 	if (new_end == wr_mas->node_end && mas_wr_slot_store(wr_mas))
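
For readers who want to play with the idea outside the kernel, below is a
small userspace sketch of the new third case (carving a new range out of
the middle of the node's last range). It is only an illustration, not the
kernel code: the names toy_leaf, append_middle and lookup are made up,
plain stores stand in for rcu_assign_pointer(), and all of the maple
tree's metadata handling is omitted. What it demonstrates is the
ordering: the two appended slots are fully written before the original
last pivot is shrunk, so a lockless reader sees either the old layout or
the complete new one.

/*
 * Toy model of the "append without touching any boundaries" fast path.
 * Not kernel code; names and layout are simplified for illustration.
 */
#include <stdio.h>

#define SLOTS 16

struct toy_leaf {
	unsigned long pivots[SLOTS];	/* pivots[i] is the last key covered by slots[i] */
	void *slots[SLOTS];
	unsigned char end;		/* offset of the last used slot */
};

/* Linear leaf lookup: return the value covering @key. */
static void *lookup(struct toy_leaf *leaf, unsigned long key)
{
	unsigned char i;

	for (i = 0; i <= leaf->end; i++)
		if (key <= leaf->pivots[i])
			return leaf->slots[i];
	return NULL;
}

/*
 * Write @entry over [index, last], both strictly inside the current
 * last range, by appending two slots.  The pivot of the original last
 * slot is only shrunk at the very end; until then a concurrent reader
 * resolves every key of the old range to the old value.  In the kernel
 * the slot stores would be rcu_assign_pointer() calls.
 */
static void append_middle(struct toy_leaf *leaf, unsigned long index,
			  unsigned long last, void *entry)
{
	unsigned char end = leaf->end;
	void *content = leaf->slots[end];	/* old value of the last range */

	/* Tail piece: old content from last + 1 up to the old maximum. */
	leaf->pivots[end + 2] = leaf->pivots[end];
	leaf->slots[end + 2] = content;

	/* Middle piece: the new entry over [index, last]. */
	leaf->pivots[end + 1] = last;
	leaf->slots[end + 1] = entry;

	/* Publish: the original range now ends at index - 1. */
	leaf->pivots[end] = index - 1;
	leaf->end = end + 2;
}

int main(void)
{
	struct toy_leaf leaf = { .end = 0 };
	static char old_val[] = "old", new_val[] = "new";

	/* One range: [0, 99] -> "old", playing the role of wr_mas->content. */
	leaf.pivots[0] = 99;
	leaf.slots[0] = old_val;

	/* Store "new" over [40, 59]: strictly inside, so append two slots. */
	append_middle(&leaf, 40, 59, new_val);

	printf("10 -> %s, 50 -> %s, 80 -> %s\n",
	       (char *)lookup(&leaf, 10), (char *)lookup(&leaf, 50),
	       (char *)lookup(&leaf, 80));
	return 0;
}

Compiled with a plain cc and run, this prints "10 -> old, 50 -> new,
80 -> old": the middle of the old range now resolves to the new entry
while both ends still resolve to the old one.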