From patchwork Mon Mar 24 22:03:01 2025
X-Patchwork-Submitter: Kinsey Ho
X-Patchwork-Id: 14027925
Date: Mon, 24 Mar 2025 15:03:01 -0700
In-Reply-To: <20250324220301.1273038-1-kinseyho@google.com>
References: <20250324220301.1273038-1-kinseyho@google.com>
Message-ID: <20250324220301.1273038-3-kinseyho@google.com>
Subject: [RFC PATCH v1 2/2] mm: klruscand: use mglru scanning for page promotion
From: Kinsey Ho
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuanchu@google.com, AneeshKumar.KizhakeVeetil@arm.com,
 Hasan.Maruf@amd.com, Jonathan.Cameron@huawei.com, Michael.Day@amd.com,
 akpm@linux-foundation.org, dave.hansen@intel.com, david@redhat.com,
 feng.tang@intel.com, gourry@gourry.net, hannes@cmpxchg.org,
 honggyu.kim@sk.com, hughd@google.com, jhubbard@nvidia.com,
 k.shutemov@gmail.com, kbusch@meta.com, kmanaouil.dev@gmail.com,
 leesuyeon0506@gmail.com, leillc@google.com, liam.howlett@oracle.com,
 mgorman@techsingularity.net, mingo@redhat.com, nadav.amit@gmail.com,
 nphamcs@gmail.com, peterz@infradead.org, raghavendra.kt@amd.com,
 riel@surriel.com, rientjes@google.com, rppt@kernel.org, shivankg@amd.com,
 shy828301@gmail.com, sj@kernel.org, vbabka@suse.cz, weixugc@google.com,
 willy@infradead.org, ying.huang@linux.alibaba.com, ziy@nvidia.com,
 dave@stgolabs.net, hyeonggon.yoo@sk.com, bharata@amd.com, Kinsey Ho

Introduce a new kernel daemon, klruscand, that periodically invokes the
MGLRU page table walk. It leverages the new callbacks to gather access
information and forwards it to the kpromoted daemon for promotion
decisions.

This benefits from reusing the existing MGLRU page table walk
infrastructure, which is optimized with features such as hierarchical
scanning and bloom filters to reduce CPU overhead.

As an additional optimization to be added in the future, we can tune
the scan intervals for each memcg.

Signed-off-by: Kinsey Ho
Signed-off-by: Yuanchu Xie
---
 mm/Kconfig     |   8 ++++
 mm/Makefile    |   1 +
 mm/klruscand.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 127 insertions(+)
 create mode 100644 mm/klruscand.c

diff --git a/mm/Kconfig b/mm/Kconfig
index ceaa462a0ce6..ed0fa8f2551e 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1366,6 +1366,14 @@ config KPROMOTED
 	  Promote hot pages from lower tier to top tier by using the
 	  memory access information provided by various sources.
 
+config KLRUSCAND
+	bool "Kernel lower tier access scan daemon"
+	default y
+	depends on KPROMOTED && LRU_GEN_WALKS_MMU
+	help
+	  Scan for accesses from lower tiers by invoking MGLRU to perform
+	  page table walks.
+
 source "mm/damon/Kconfig"
 
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index bf4f5f18f1f9..eb7b76db3b33 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -148,3 +148,4 @@ obj-$(CONFIG_EXECMEM) += execmem.o
 obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
 obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o
 obj-$(CONFIG_KPROMOTED) += kpromoted.o
+obj-$(CONFIG_KLRUSCAND) += klruscand.o
diff --git a/mm/klruscand.c b/mm/klruscand.c
new file mode 100644
index 000000000000..a53d43c60155
--- /dev/null
+++ b/mm/klruscand.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "internal.h"
+
+#define KLRUSCAND_INTERVAL_MS 4000
+#define BATCH_SIZE (2 << 16)
+
+static struct task_struct *scan_thread;
+static pfn_t pfn_batch[BATCH_SIZE];
+static int batch_index;
+
+static void flush_cb(void)
+{
+	int i = 0;
+
+	for (; i < batch_index; i++) {
+		u64 pfn = pfn_batch[i].val;
+
+		kpromoted_record_access((unsigned long)pfn, NUMA_NO_NODE,
+			KPROMOTED_PGTABLE_SCAN, jiffies);
+
+		if (i % 16 == 0)
+			cond_resched();
+	}
+	batch_index = 0;
+}
+
+static int accessed_cb(pfn_t pfn)
+{
+	if (batch_index >= BATCH_SIZE)
+		return -EAGAIN;
+
+	pfn_batch[batch_index++] = pfn;
+	return 0;
+}
+
+static int klruscand_run(void *unused)
+{
+	struct lru_gen_mm_walk *walk;
+
+	walk = kzalloc(sizeof(*walk),
+		__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN);
+	if (!walk)
+		return -ENOMEM;
+
+	while (!kthread_should_stop()) {
+		unsigned long next_wake_time;
+		long sleep_time;
+		struct mem_cgroup *memcg;
+		int flags;
+		int nid;
+
+		next_wake_time = jiffies + msecs_to_jiffies(KLRUSCAND_INTERVAL_MS);
+
+		for_each_node_state(nid, N_MEMORY) {
+			pg_data_t *pgdat = NODE_DATA(nid);
+			struct reclaim_state rs = { 0 };
+
+			if (node_is_toptier(nid))
+				continue;
+
+			rs.mm_walk = walk;
+			set_task_reclaim_state(current, &rs);
+			flags = memalloc_noreclaim_save();
+
+			memcg = mem_cgroup_iter(NULL, NULL, NULL);
+			do {
+				struct lruvec *lruvec =
+					mem_cgroup_lruvec(memcg, pgdat);
+				unsigned long max_seq =
+					READ_ONCE((lruvec)->lrugen.max_seq);
+
+				lru_gen_scan_lruvec(lruvec, max_seq,
+					accessed_cb, flush_cb);
+				cond_resched();
+			} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)));
+
+			memalloc_noreclaim_restore(flags);
+			set_task_reclaim_state(current, NULL);
+			memset(walk, 0, sizeof(*walk));
+		}
+
+		sleep_time = next_wake_time - jiffies;
+		if (sleep_time > 0 && sleep_time != MAX_SCHEDULE_TIMEOUT)
+			schedule_timeout_idle(sleep_time);
+	}
+	kfree(walk);
+	return 0;
+}
+
+static int __init klruscand_init(void)
+{
+	struct task_struct *task;
+
+	task = kthread_run(klruscand_run, NULL, "klruscand");
+
+	if (IS_ERR(task)) {
+		pr_err("Failed to create klruscand kthread\n");
+		return PTR_ERR(task);
+	}
+
+	scan_thread = task;
+	return 0;
+}
+module_init(klruscand_init);
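
For readers without patch 1/2 at hand, the interplay between accessed_cb()
and flush_cb() can be approximated in user space. The sketch below is only
an illustration under stated assumptions: walk_pfns() and record_access()
are hypothetical stand-ins for the MGLRU walker and for
kpromoted_record_access(), BATCH_SIZE is shrunk to 4, plain unsigned long
replaces pfn_t, and the flush-then-retry reaction to -EAGAIN is assumed
rather than taken from lru_gen_scan_lruvec(), whose body is not part of
this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define BATCH_SIZE 4	/* tiny for illustration; the patch uses (2 << 16) */

static unsigned long pfn_batch[BATCH_SIZE];
static int batch_index;
static unsigned long recorded;	/* PFNs handed to the consumer so far */

/* Hypothetical stand-in for kpromoted_record_access(); it only counts. */
static void record_access(unsigned long pfn)
{
	(void)pfn;
	recorded++;
}

/* Drain every queued PFN to the consumer, then reset the batch. */
static void flush_cb(void)
{
	for (int i = 0; i < batch_index; i++)
		record_access(pfn_batch[i]);
	batch_index = 0;
}

/* Queue one PFN; return -EAGAIN when the batch has no room left. */
static int accessed_cb(unsigned long pfn)
{
	if (batch_index >= BATCH_SIZE)
		return -EAGAIN;

	pfn_batch[batch_index++] = pfn;
	return 0;
}

/*
 * Hypothetical walker (an assumption, not the MGLRU code): when the
 * callback reports -EAGAIN, flush and retry the same PFN, then flush
 * whatever remains once the walk is done.
 */
static void walk_pfns(const unsigned long *pfns, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (accessed_cb(pfns[i]) == -EAGAIN) {
			flush_cb();
			int ret = accessed_cb(pfns[i]);

			/* An empty batch always has room for one entry. */
			assert(ret == 0);
			(void)ret;
		}
	}
	flush_cb();
}
```

Walking ten PFNs with a batch of four flushes twice mid-walk and once at
the end, so all ten reach record_access() and the batch ends up empty.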