From patchwork Tue Mar 26 18:50:23 2024
X-Patchwork-Submitter: Kairui Song
X-Patchwork-Id: 13604894
From: Kairui Song
To: linux-mm@kvack.org
Cc: "Huang, Ying", Chris Li, Minchan Kim, Barry Song, Ryan Roberts,
 Yu Zhao, SeongJae Park, David Hildenbrand, Yosry Ahmed, Johannes Weiner,
 Matthew Wilcox, Nhat Pham, Chengming Zhou, Andrew Morton,
 linux-kernel@vger.kernel.org, Kairui Song
Subject: [RFC PATCH 01/10] mm/filemap: split filemap storing logic into a standalone helper
Date: Wed, 27 Mar 2024 02:50:23 +0800
Message-ID: <20240326185032.72159-2-ryncsn@gmail.com>
In-Reply-To: <20240326185032.72159-1-ryncsn@gmail.com>
References: <20240326185032.72159-1-ryncsn@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Swapcache can reuse this part for multi index support, no change of
performance from page cache side except noise:

Test in 8G memory cgroup and 16G
brd ramdisk.

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
    --buffered=1 --ioengine=mmap --rw=randread --time_based \
    --ramp_time=30s --runtime=5m --group_reporting

Before:
bw (  MiB/s): min=  493, max= 3947, per=100.00%, avg=2625.56, stdev=25.74, samples=8651
iops        : min=126454, max=1010681, avg=672142.61, stdev=6590.48, samples=8651

After:
bw (  MiB/s): min=  298, max= 3840, per=100.00%, avg=2614.34, stdev=23.77, samples=8689
iops        : min=76464, max=983045, avg=669270.35, stdev=6084.31, samples=8689

Test result with THP (do a THP randread then switch to 4K page in hope it
issues a lot of splitting):

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
    --buffered=1 --ioengine=mmap -thp=1 --readonly \
    --rw=randread --time_based --ramp_time=30s --runtime=10m \
    --group_reporting

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
    --buffered=1 --ioengine=mmap \
    --rw=randread --time_based --runtime=5s --group_reporting

Before:
bw (  KiB/s): min= 4611, max=15370, per=100.00%, avg=8928.74, stdev=105.17, samples=19146
iops        : min= 1151, max= 3842, avg=2231.27, stdev=26.29, samples=19146
READ: bw=4635B/s (4635B/s), 4635B/s-4635B/s (4635B/s-4635B/s), io=64.0KiB (65.5kB), run=14137-14137msec

After:
bw (  KiB/s): min= 4691, max=15666, per=100.00%, avg=8890.30, stdev=104.53, samples=19056
iops        : min= 1167, max= 3913, avg=2218.68, stdev=26.15, samples=19056
READ: bw=4590B/s (4590B/s), 4590B/s-4590B/s (4590B/s-4590B/s), io=64.0KiB (65.5kB), run=14275-14275msec

Signed-off-by: Kairui Song
---
 mm/filemap.c | 124 +++++++++++++++++++++++++++------------------------
 1 file changed, 65 insertions(+), 59 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 90b86f22a9df..0ccdc9e92764 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -848,38 +848,23 @@ void replace_page_cache_folio(struct folio *old, struct folio *new)
 }
 EXPORT_SYMBOL_GPL(replace_page_cache_folio);
 
-noinline int __filemap_add_folio(struct address_space *mapping,
-		struct folio *folio, pgoff_t index, gfp_t gfp, void **shadowp)
+static int __filemap_lock_store(struct xa_state *xas, struct folio *folio,
+				  pgoff_t index, gfp_t gfp, void **shadowp)
 {
-	XA_STATE(xas, &mapping->i_pages, index);
-	void *alloced_shadow = NULL;
-	int alloced_order = 0;
-	bool huge;
-	long nr;
-
-	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
-	VM_BUG_ON_FOLIO(folio_test_swapbacked(folio), folio);
-	mapping_set_update(&xas, mapping);
-
-	VM_BUG_ON_FOLIO(index & (folio_nr_pages(folio) - 1), folio);
-	xas_set_order(&xas, index, folio_order(folio));
-	huge = folio_test_hugetlb(folio);
-	nr = folio_nr_pages(folio);
-
+	void *entry, *old, *alloced_shadow = NULL;
+	int order, split_order, alloced_order = 0;
 	gfp &= GFP_RECLAIM_MASK;
-	folio_ref_add(folio, nr);
-	folio->mapping = mapping;
-	folio->index = xas.xa_index;
 
 	for (;;) {
-		int order = -1, split_order = 0;
-		void *entry, *old = NULL;
+		order = -1;
+		split_order = 0;
+		old = NULL;
 
-		xas_lock_irq(&xas);
-		xas_for_each_conflict(&xas, entry) {
+		xas_lock_irq(xas);
+		xas_for_each_conflict(xas, entry) {
 			old = entry;
 			if (!xa_is_value(entry)) {
-				xas_set_err(&xas, -EEXIST);
+				xas_set_err(xas, -EEXIST);
 				goto unlock;
 			}
 			/*
@@ -887,72 +872,93 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 			 * it will be the first and only entry iterated.
 			 */
 			if (order == -1)
-				order = xas_get_order(&xas);
+				order = xas_get_order(xas);
 		}
 
 		/* entry may have changed before we re-acquire the lock */
 		if (alloced_order && (old != alloced_shadow || order != alloced_order)) {
-			xas_destroy(&xas);
+			xas_destroy(xas);
 			alloced_order = 0;
 		}
 
 		if (old) {
 			if (order > 0 && order > folio_order(folio)) {
-				/* How to handle large swap entries? */
-				BUG_ON(shmem_mapping(mapping));
 				if (!alloced_order) {
 					split_order = order;
 					goto unlock;
 				}
-				xas_split(&xas, old, order);
-				xas_reset(&xas);
+				xas_split(xas, old, order);
+				xas_reset(xas);
 			}
 			if (shadowp)
 				*shadowp = old;
 		}
 
-		xas_store(&xas, folio);
-		if (xas_error(&xas))
-			goto unlock;
-
-		mapping->nrpages += nr;
-
-		/* hugetlb pages do not participate in page cache accounting */
-		if (!huge) {
-			__lruvec_stat_mod_folio(folio, NR_FILE_PAGES, nr);
-			if (folio_test_pmd_mappable(folio))
-				__lruvec_stat_mod_folio(folio,
-						NR_FILE_THPS, nr);
-		}
-
+		xas_store(xas, folio);
+		if (!xas_error(xas))
+			return 0;
 unlock:
-		xas_unlock_irq(&xas);
+		xas_unlock_irq(xas);
 
 		/* split needed, alloc here and retry. */
 		if (split_order) {
-			xas_split_alloc(&xas, old, split_order, gfp);
-			if (xas_error(&xas))
+			xas_split_alloc(xas, old, split_order, gfp);
+			if (xas_error(xas))
 				goto error;
 			alloced_shadow = old;
 			alloced_order = split_order;
-			xas_reset(&xas);
+			xas_reset(xas);
 			continue;
 		}
 
-		if (!xas_nomem(&xas, gfp))
+		if (!xas_nomem(xas, gfp))
 			break;
 	}
 
-	if (xas_error(&xas))
-		goto error;
-
-	trace_mm_filemap_add_to_page_cache(folio);
-	return 0;
 error:
-	folio->mapping = NULL;
-	/* Leave page->index set: truncation relies upon it */
-	folio_put_refs(folio, nr);
-	return xas_error(&xas);
+	return xas_error(xas);
+}
+
+noinline int __filemap_add_folio(struct address_space *mapping,
+		struct folio *folio, pgoff_t index, gfp_t gfp, void **shadowp)
+{
+	XA_STATE(xas, &mapping->i_pages, index);
+	bool huge;
+	long nr;
+	int ret;
+
+	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
+	VM_BUG_ON_FOLIO(folio_test_swapbacked(folio), folio);
+	mapping_set_update(&xas, mapping);
+
+	VM_BUG_ON_FOLIO(index & (folio_nr_pages(folio) - 1), folio);
+	xas_set_order(&xas, index, folio_order(folio));
+	huge = folio_test_hugetlb(folio);
+	nr = folio_nr_pages(folio);
+
+	folio_ref_add(folio, nr);
+	folio->mapping = mapping;
+	folio->index = xas.xa_index;
+
+	ret = __filemap_lock_store(&xas, folio, index, gfp, shadowp);
+	if (!ret) {
+		mapping->nrpages += nr;
+		/* hugetlb pages do not participate in page cache accounting */
+		if (!huge) {
+			__lruvec_stat_mod_folio(folio, NR_FILE_PAGES, nr);
+			if (folio_test_pmd_mappable(folio))
+				__lruvec_stat_mod_folio(folio,
+						NR_FILE_THPS, nr);
+		}
+		xas_unlock_irq(&xas);
+		trace_mm_filemap_add_to_page_cache(folio);
+	} else {
+		folio->mapping = NULL;
+		/* Leave page->index set: truncation relies upon it */
+		folio_put_refs(folio, nr);
+	}
+
+	return ret;
 }
 ALLOW_ERROR_INJECTION(__filemap_add_folio, ERRNO);
 
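
Note on the locking contract: as the diff reads, __filemap_lock_store() returns 0 with the xarray lock still held, so that __filemap_add_folio() can update nrpages and the lruvec stats under the same lock before calling xas_unlock_irq(); on error the helper has already dropped the lock. The userspace sketch below only models that calling convention. Every name in it (toy_lock, toy_mapping, toy_lock_store, toy_add) is hypothetical, the "lock" is a single-threaded flag rather than a real spinlock, and none of it is kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded model of a lock: it only tracks state. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* catch double acquisition in the model */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);	/* catch release without acquisition */
	l->held = false;
}

/* Hypothetical stand-in for a mapping: one slot plus an accounting counter. */
struct toy_mapping {
	struct toy_lock lock;
	void *slot;	/* stands in for the xarray entry */
	long nrpages;	/* must only change while lock is held */
};

/*
 * Store item into the slot.
 * Returns 0 with the lock HELD on success, so the caller can finish its
 * accounting under the same critical section.
 * Returns -EEXIST with the lock RELEASED if the slot is occupied.
 */
static int toy_lock_store(struct toy_mapping *m, void *item)
{
	toy_lock_acquire(&m->lock);
	if (m->slot) {
		toy_lock_release(&m->lock);
		return -EEXIST;
	}
	m->slot = item;
	return 0;	/* lock intentionally left held */
}

/* Caller: does the accounting and owns the unlock on the success path. */
static int toy_add(struct toy_mapping *m, void *item, long nr)
{
	int ret = toy_lock_store(m, item);

	if (!ret) {
		assert(m->lock.held);	/* contract: still locked here */
		m->nrpages += nr;
		toy_lock_release(&m->lock);
	}
	return ret;
}
```

The asymmetry (locked on success, unlocked on failure) is what lets the accounting stay in the caller without a second lock round trip; it also means any new caller of the helper has to release the lock itself on the success path.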