From: wang wei
To: linux-mm@kvack.org, labbott@redhat.com, patchwork-bot@kernel.org
Cc: axboe@kernel.dk, wang wei
Subject: [RFC v1] mm/page cache: A method to drop one-read page cache
Date: Sun, 2 Mar 2025 14:14:27 +0800
Message-Id: <20250302061427.33455-1-a929244872@163.com>

Jens Axboe added a new flag, RWF_DONTCACHE, which is provided to the preadv2() and pwritev2()
system calls [1]. This flag is particularly suitable for pages that are
read only once: such pages still benefit from the fast data access that
the page cache provides, yet they can be dropped from memory quickly
without going through kswapd.

However, in most cases programmers cannot tell in advance which pages
will be read only once. This made me think of the refault distance
feature. If a page is accessed again shortly after being evicted, the
refault distance feature moves it to the active LRU list. Conversely, if
a page is being added to the page cache for the first time, or has not
been accessed again for an extended period after being evicted, it is
placed on the inactive LRU list. IMO, pages that remain unaccessed for a
long time after being evicted can be considered "one-time read" pages.
By setting the DONTCACHE flag on such pages, the system can remove them
from memory quickly.

After the patch, the refault distance feature categorizes file pages
into three types:

1. Pages accessed again shortly after being evicted: these pages are
   moved to the active LRU list so that they stay resident and remain
   fast to access.
2. Pages added to the page cache for the first time: these pages are
   placed directly on the inactive LRU list, where subsequent accesses
   determine their activity.
3. Pages that remain unaccessed for a long time after being evicted:
   these pages are marked with the DONTCACHE flag, so the system can
   remove them from memory quickly without involving kswapd in complex
   eviction work.

However, this patch may introduce a new issue. If the system runs for a
long time, every page will have been accessed at least once, so no page
is treated as a first access any more. Any page added to the page cache
is then either moved to the active LRU list or marked with the DONTCACHE
flag, which may degrade system performance. Addressing this would
require readjusting the thresholds that distinguish the three types of
pages.
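
The three-way classification described above can be modelled in a few
lines of userspace C. This is only a sketch under stated assumptions,
not kernel code: the enum, classify_fault() and its parameters are
hypothetical names chosen for illustration, and the model assumes that a
refault counts as "recent" when its refault distance does not exceed the
current workingset size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the placement decision; not kernel code. */
enum placement {
	PLACE_INACTIVE,   /* first-time fault: no eviction record exists */
	PLACE_ACTIVE,     /* refault shortly after eviction: activate */
	PLACE_DROPBEHIND, /* refault after a long absence: drop after use */
};

/*
 * has_shadow:       an eviction record (shadow entry) exists for the page
 * refault_distance: evictions that occurred since the page was evicted
 * workingset_size:  number of pages the workingset can currently hold
 */
static enum placement classify_fault(bool has_shadow,
				     unsigned long refault_distance,
				     unsigned long workingset_size)
{
	/* Type 2: never seen before, start on the inactive list. */
	if (!has_shadow)
		return PLACE_INACTIVE;

	/* Type 1: came back quickly, so it belongs to the workingset. */
	if (refault_distance <= workingset_size)
		return PLACE_ACTIVE;

	/* Type 3: came back only after a long time, treat as one-time read. */
	return PLACE_DROPBEHIND;
}
```

For example, classify_fault(true, 101, 100) yields PLACE_DROPBEHIND,
while a page without a shadow entry always lands on the inactive list
regardless of the other two arguments.
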
Therefore, I would like to ask the community what new problems this
modification might introduce and how to address them.

[1]: https://lore.kernel.org/all/20241220154831.1086649-1-axboe@kernel.dk/

Signed-off-by: wang wei
---
 mm/filemap.c    | 18 ++++++++++++------
 mm/internal.h   |  2 +-
 mm/truncate.c   |  6 +++---
 mm/workingset.c |  5 ++++-
 4 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 804d73656..ee7afff9f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1606,7 +1606,7 @@ static void folio_end_dropbehind_write(struct folio *folio)
 	 */
 	if (in_task() && folio_trylock(folio)) {
 		if (folio->mapping)
-			folio_unmap_invalidate(folio->mapping, folio, 0);
+			folio_unmap_invalidate(folio->mapping, folio, 0, NULL);
 		folio_unlock(folio);
 	}
 }
@@ -2625,15 +2625,21 @@ static inline bool pos_same_folio(loff_t pos1, loff_t pos2, struct folio *folio)
 }
 
 static void filemap_end_dropbehind_read(struct address_space *mapping,
-					struct folio *folio)
+					struct folio *folio, int ki_flags)
 {
+	void *shadow = NULL;
+
 	if (!folio_test_dropbehind(folio))
 		return;
 	if (folio_test_writeback(folio) || folio_test_dirty(folio))
 		return;
 	if (folio_trylock(folio)) {
-		if (folio_test_clear_dropbehind(folio))
-			folio_unmap_invalidate(mapping, folio, 0);
+		if (folio_test_clear_dropbehind(folio)) {
+			/* If this folio is dropped by preadv2(), do not record eviction */
+			if (!(ki_flags & IOCB_DONTCACHE))
+				shadow = workingset_eviction(folio, folio_memcg(folio));
+			folio_unmap_invalidate(mapping, folio, 0, shadow);
+		}
 		folio_unlock(folio);
 	}
 }
@@ -2754,7 +2760,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 		for (i = 0; i < folio_batch_count(&fbatch); i++) {
 			struct folio *folio = fbatch.folios[i];
 
-			filemap_end_dropbehind_read(mapping, folio);
+			filemap_end_dropbehind_read(mapping, folio, iocb->ki_flags);
 			folio_put(folio);
 		}
 		folio_batch_init(&fbatch);
@@ -3455,7 +3461,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 			mapping_locked = true;
 		}
 		folio = __filemap_get_folio(mapping, index,
-					  FGP_CREAT|FGP_FOR_MMAP,
+					  FGP_CREAT|FGP_FOR_MMAP|FGP_DONTCACHE,
 					  vmf->gfp_mask);
 		if (IS_ERR(folio)) {
 			if (fpin)
diff --git a/mm/internal.h b/mm/internal.h
index 109ef30fe..5f9a5b6c4 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -393,7 +393,7 @@ void unmap_page_range(struct mmu_gather *tlb,
 			     unsigned long addr, unsigned long end,
 			     struct zap_details *details);
 int folio_unmap_invalidate(struct address_space *mapping, struct folio *folio,
-			   gfp_t gfp);
+			   gfp_t gfp, void *shadow);
 
 void page_cache_ra_order(struct readahead_control *, struct file_ra_state *,
 		unsigned int order);
diff --git a/mm/truncate.c b/mm/truncate.c
index e2e115adf..204006a9d 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -542,7 +542,7 @@ static int folio_launder(struct address_space *mapping, struct folio *folio)
  * sitting in the folio_add_lru() caches.
  */
 int folio_unmap_invalidate(struct address_space *mapping, struct folio *folio,
-			   gfp_t gfp)
+			   gfp_t gfp, void *shadow)
 {
 	int ret;
 
@@ -568,7 +568,7 @@ int folio_unmap_invalidate(struct address_space *mapping, struct folio *folio,
 		goto failed;
 
 	BUG_ON(folio_has_private(folio));
-	__filemap_remove_folio(folio, NULL);
+	__filemap_remove_folio(folio, shadow);
 	xa_unlock_irq(&mapping->i_pages);
 	if (mapping_shrinkable(mapping))
 		inode_add_lru(mapping->host);
@@ -643,7 +643,7 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
 			}
 			VM_BUG_ON_FOLIO(!folio_contains(folio, indices[i]), folio);
 			folio_wait_writeback(folio);
-			ret2 = folio_unmap_invalidate(mapping, folio, GFP_KERNEL);
+			ret2 = folio_unmap_invalidate(mapping, folio, GFP_KERNEL, NULL);
 			if (ret2 < 0)
 				ret = ret2;
 			folio_unlock(folio);
diff --git a/mm/workingset.c b/mm/workingset.c
index 4841ae8af..e606ce0c5 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -563,7 +563,10 @@ void workingset_refault(struct folio *folio, void *shadow)
 
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
 
-	if (!workingset_test_recent(shadow, file, &workingset, true))
+	if (!workingset_test_recent(shadow, file, &workingset, true)) {
+		if (file && folio->mapping)
+			__folio_set_dropbehind(folio);
 		return;
+	}
 
 	folio_set_active(folio);