From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Matthew Wilcox, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v4 0/4] mm/filemap: optimize folio adding and splitting
Date: Tue, 16 Apr 2024 15:17:18 +0800
Message-ID: <20240416071722.45997-1-ryncsn@gmail.com>

Currently,
at least 3 tree walks are needed to add a folio to the filemap if the
folio was previously evicted: one to get the order of the current slot,
one for the ranged conflict check, and one more to retrieve the order
again. If a split is needed, even more walks are required.

This series merges these walks to speed up filemap_add_folio. I see a
7.5% - 12.5% performance gain in a fio stress test.

So instead of doing multiple tree walks, do one optimistic range check
with the lock held, and exit if raced with another insertion. If a
shadow exists, check it with a new xas_get_order helper before
releasing the lock to avoid redundant tree walks for getting its order.
Drop the lock and do the allocation only if a split is needed.

In the best case, it only needs to walk the tree once. If it needs to
allocate and split, 3 walks are issued (one for the first ranged
conflict check and order retrieval, one for the second check after
allocation, and one for the insert after the split).

Testing with 4K pages, in an 8G cgroup, with 16G brd as block device:

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
    --buffered=1 --ioengine=mmap --rw=randread --time_based \
    --ramp_time=30s --runtime=5m --group_reporting

Before:
bw (  MiB/s): min= 1027, max= 3520, per=100.00%, avg=2445.02, stdev=18.90, samples=8691
iops        : min=263001, max=901288, avg=625924.36, stdev=4837.28, samples=8691

After (+7.3%):
bw (  MiB/s): min=  493, max= 3947, per=100.00%, avg=2625.56, stdev=25.74, samples=8651
iops        : min=126454, max=1010681, avg=672142.61, stdev=6590.48, samples=8651

Test result with THP (do a THP randread, then switch to 4K pages in the
hope that it issues a lot of splitting):

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
    --buffered=1 --ioengine=mmap -thp=1 --readonly \
    --rw=randread --time_based --ramp_time=30s --runtime=10m \
    --group_reporting

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
    --buffered=1 --ioengine=mmap \
    --rw=randread --time_based --runtime=5s --group_reporting

Before:
bw (  KiB/s): min= 4141, max=14202, per=100.00%, avg=7935.51, stdev=96.85, samples=18976
iops        : min= 1029, max= 3548, avg=1979.52, stdev=24.23, samples=18976
READ: bw=4545B/s (4545B/s), 4545B/s-4545B/s (4545B/s-4545B/s), io=64.0KiB (65.5kB), run=14419-14419msec

After (+10.4%):
bw (  KiB/s): min= 4611, max=15370, per=100.00%, avg=8928.74, stdev=105.17, samples=19146
iops        : min= 1151, max= 3842, avg=2231.27, stdev=26.29, samples=19146
READ: bw=4635B/s (4635B/s), 4635B/s-4635B/s (4635B/s-4635B/s), io=64.0KiB (65.5kB), run=14137-14137msec

The performance is better for both 4K (+7.5%) and THP (+12.5%) cached
read.

V3: https://lore.kernel.org/linux-mm/20240415171857.19244-1-ryncsn@gmail.com/
Updates from V3:
- Simplify comment, and fold in a sparse warning fix, as suggested by
  Matthew Wilcox.

V2: https://lore.kernel.org/lkml/20240325171405.99971-1-ryncsn@gmail.com/
Updates from V2:
- Fix the misuse of locks in the test module:
  https://lore.kernel.org/oe-lkp/202404151046.448e2d6e-lkp@intel.com

V1: https://lore.kernel.org/lkml/20240319092733.4501-1-ryncsn@gmail.com/
Updates from V1:
- Collect Acks.
- Add tests for the new xas_get_order and for the combined usage of
  xas_get_order with xas_for_each_conflict.
- Fix a memleak in patch 4/4 and modify the function in place instead
  of adding a new helper.
- Update benchmark: I forgot to drop cache and disable THP for the
  previous test, so the result was for mixed usage of split and add.
  The result is even better now.

Kairui Song (4):
  mm/filemap: return early if failed to allocate memory for split
  mm/filemap: clean up hugetlb exclusion code
  lib/xarray: introduce a new helper xas_get_order
  mm/filemap: optimize filemap folio adding

 include/linux/xarray.h |  6 +++
 lib/test_xarray.c      | 93 ++++++++++++++++++++++++++++++++++++++++++
 lib/xarray.c           | 49 ++++++++++++++--------
 mm/filemap.c           | 74 +++++++++++++++++++++------------
 4 files changed, 179 insertions(+), 43 deletions(-)
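
For anyone who wants to sanity-check the quoted gains, the relative
change can be recomputed from the avg bw values in the fio output above.
The snippet below is only a minimal sketch and not part of the series;
the function name relative_gain_percent and the dictionary labels are
made up for illustration, and the numbers are copied from the Before
and After blocks.

```python
def relative_gain_percent(before: float, after: float) -> float:
    """Return the relative change from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (after - before) / before * 100.0


# Average bandwidth figures copied from the fio output in this cover letter.
# Keys are illustrative labels only.
runs = {
    "4K  (avg bw, MiB/s)": (2445.02, 2625.56),
    "THP (avg bw, KiB/s)": (7935.51, 8928.74),
}

for label, (before, after) in runs.items():
    print(f"{label}: {relative_gain_percent(before, after):+.2f}%")
```

With the averages above this works out to roughly +7.4% for the 4K run
and +12.5% for the THP run, close to the figures in the summary
sentence.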