From patchwork Fri Apr 11 08:28:57 2025
X-Patchwork-Submitter: Jinjiang Tu <tujinjiang@huawei.com>
X-Patchwork-Id: 14047822
From: Jinjiang Tu <tujinjiang@huawei.com>
Subject: [PATCH] mm/swap: set active flag after adding the folio to activate fbatch
Date: Fri, 11 Apr 2025 16:28:57 +0800
Message-ID: <20250411082857.2426539-1-tujinjiang@huawei.com>

We noticed a 12.3% performance regression in the LibMicro pwrite testcase
caused by commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page
before adding to LRU batch"). The testcase is executed as follows, with
the file on tmpfs:

  pwrite -E -C 200 -L -S -W -N "pwrite_t1k" -s 1k -I 500 -f $TFILE

It writes 1KB (a single page) to the tmpfs file and repeats that step
many times. The flame graph shows the regression comes from
folio_mark_accessed() and workingset_activation():
folio_mark_accessed() is called many times for the same page.

Before the commit, each call added the page to the activate fbatch. Once
the fbatch was full, it was drained and the page was promoted to the
active list; after that, later folio_mark_accessed() calls did nothing.

After the commit, the folio's LRU flag is cleared when it is added to
the activate fbatch. From then on, folio_mark_accessed() never calls
folio_activate() again because the folio has no LRU flag, so the fbatch
never fills up and the folio is never marked active; every later
folio_mark_accessed() call ends up in workingset_activation(), causing
the regression. In addition, the repeated workingset_age_nonresident()
calls made before the folio is drained from the activate fbatch leave
lruvec->nonresident_age with an unreasonable value.

To fix this, set the active flag right after clearing the LRU flag when
adding the folio to the activate fbatch.
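As an illustration (not part of the patch), the regression mechanism can
be modeled in plain userspace C. This is a sketch under simplifying
assumptions: struct folio is reduced to three bools, FBATCH_SIZE stands
in for the kernel's folio_batch capacity, and folio_mark_accessed() is
condensed to the two branches relevant here; folio_activate_broken() and
fbatch_drain() are hypothetical stand-ins for the kernel paths.

/*
 * Userspace model of the regression: one hot page, folio_mark_accessed()
 * called in a loop, activate fbatch drained only when full.
 */
#include <stdbool.h>
#include <stdio.h>

#define FBATCH_SIZE 31			/* stand-in for the kernel batch capacity */

struct folio {
	bool lru, active, referenced;
};

static struct folio *fbatch[FBATCH_SIZE];
static int fbatch_count;
static long slow_path_runs;		/* how often workingset_activation() fires */

/* drain: promote every batched folio and put it back on the LRU */
static void fbatch_drain(void)
{
	for (int i = 0; i < fbatch_count; i++) {
		fbatch[i]->active = true;	/* lru_activate() */
		fbatch[i]->lru = true;
	}
	fbatch_count = 0;
}

/* folio_activate() after 33dfe9204f29: LRU flag cleared when batching */
static void folio_activate_broken(struct folio *folio)
{
	if (!folio->lru)	/* folio_test_clear_lru() fails from the   */
		return;		/* second call on, so the batch never fills */
	folio->lru = false;
	fbatch[fbatch_count++] = folio;
	if (fbatch_count == FBATCH_SIZE)
		fbatch_drain();
}

/* condensed folio_mark_accessed(): only the branches that matter here */
static void folio_mark_accessed(struct folio *folio)
{
	if (!folio->referenced) {
		folio->referenced = true;
	} else if (!folio->active) {	/* stays true forever: never drained */
		folio_activate_broken(folio);
		folio->referenced = false;
		slow_path_runs++;	/* workingset_activation() */
	}
}

int main(void)
{
	struct folio page = { .lru = true };

	for (int i = 0; i < 1000; i++)	/* the pwrite loop hits one page */
		folio_mark_accessed(&page);

	printf("slow path ran %ld times\n", slow_path_runs);
	return 0;
}

Compiled and run, the model reports the slow path firing on every other
access (500 times for 1000 accesses), matching the flame-graph
observation that workingset_activation() dominates.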
Fixes: 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding to LRU batch")
Suggested-by: David Hildenbrand
Signed-off-by: Jinjiang Tu
---
 mm/swap.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 77b2d5997873..f0de837988b4 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -175,6 +175,8 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 	folios_put(fbatch);
 }
 
+static void lru_activate(struct lruvec *lruvec, struct folio *folio);
+
 static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
 		struct folio *folio, move_fn_t move_fn, bool on_lru,
 		bool disable_irq)
@@ -184,6 +186,14 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
 	if (on_lru && !folio_test_clear_lru(folio))
 		return;
 
+	if (move_fn == lru_activate) {
+		if (folio_test_unevictable(folio)) {
+			folio_set_lru(folio);
+			return;
+		}
+		folio_set_active(folio);
+	}
+
 	folio_get(folio);
 
 	if (disable_irq)
@@ -299,12 +309,15 @@ static void lru_activate(struct lruvec *lruvec, struct folio *folio)
 {
 	long nr_pages = folio_nr_pages(folio);
 
-	if (folio_test_active(folio) || folio_test_unevictable(folio))
-		return;
-
+	/*
+	 * We check unevictable flag isn't set and set active flag
+	 * after we clear lru flag. Unevictable and active flag
+	 * couldn't be modified before we set lru flag again.
+	 */
+	VM_WARN_ON_ONCE(!folio_test_active(folio));
+	VM_WARN_ON_ONCE(folio_test_unevictable(folio));
 	lruvec_del_folio(lruvec, folio);
-	folio_set_active(folio);
 	lruvec_add_folio(lruvec, folio);
 	trace_mm_lru_activate(folio);
 
 	__count_vm_events(PGACTIVATE, nr_pages);
@@ -341,6 +354,11 @@ void folio_activate(struct folio *folio)
 	if (!folio_test_clear_lru(folio))
 		return;
 
+	if (folio_test_unevictable(folio) || folio_test_active(folio)) {
+		folio_set_lru(folio);
+		return;
+	}
+	folio_set_active(folio);
 	lruvec = folio_lruvec_lock_irq(folio);
 	lru_activate(lruvec, folio);
 	unlock_page_lruvec_irq(lruvec);
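Again purely illustrative: swapping the activate path of the earlier
userspace model for one that mirrors the patch shows the fast path being
restored. folio_activate_fixed() is a hypothetical stand-in that
condenses the patch's changes to folio_activate() and
__folio_batch_add_and_move().

/* Drop-in replacement for folio_activate_broken() in the model above,
 * mirroring the patch: set the active flag as soon as the LRU flag is
 * successfully cleared. */
static void folio_activate_fixed(struct folio *folio)
{
	if (!folio->lru)		/* folio_test_clear_lru() */
		return;
	folio->lru = false;
	if (folio->active) {		/* already active: just restore LRU */
		folio->lru = true;
		return;
	}
	folio->active = true;		/* folio_mark_accessed() sees it now */
	fbatch[fbatch_count++] = folio;
	if (fbatch_count == FBATCH_SIZE)
		fbatch_drain();
}

With this substituted into the model's folio_mark_accessed(), the counter
drops from 500 to 1: the second access marks the folio active, and every
later access sees the active flag and bails out before reaching
workingset_activation(). The VM_WARN_ON_ONCE() assertions the patch adds
to lru_activate() rest on the same reasoning as the patch's new comment:
once folio_test_clear_lru() succeeds, the active and unevictable flags
cannot be modified by others until the LRU flag is set again.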