From patchwork Sun Apr 2 10:42:38 2023
X-Patchwork-Submitter: "Aneesh Kumar K.V"
X-Patchwork-Id: 13197367
From: "Aneesh Kumar K.V"
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: Dave Hansen, Johannes Weiner, Matthew Wilcox, Mel Gorman, Yu Zhao,
    Wei Xu, Guru Anbalagane, "Aneesh Kumar K.V"
Subject: [RFC PATCH v1 5/7] powerpc/mm: Add page access count support
Date: Sun, 2 Apr 2023 16:12:38 +0530
Message-Id: <20230402104240.1734931-6-aneesh.kumar@linux.ibm.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230402104240.1734931-1-aneesh.kumar@linux.ibm.com>
References: <20230402104240.1734931-1-aneesh.kumar@linux.ibm.com>
MIME-Version: 1.0
The Hot-Cold Affinity (HCA) engine is a POWER10 facility that counts each
access to a page and decays the count when the page is not accessed within a
time window. Each page has a 32-bit counter.

This patch uses the HCA engine to provide page access counts on POWER10 and
feeds them into the multi-gen LRU to classify each page into the correct LRU
generation. A simple classification mechanism samples pages from the youngest
and the oldest generations to find the maximum and minimum page hotness in the
lruvec; that range is later used to sort every page into the right generation.
The max/min hotness range is established during aging, when new generations
are created.
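As an illustration of the classification described above (editor's sketch, not
part of the patch itself), the stand-alone C fragment below buckets a hotness
value inside the sampled min/max window into a generation sequence number, the
same way hca_map_lru_seq() does further down. The helper name
map_hotness_to_seq() and all of the numbers are invented for the example.

#include <stdio.h>

static unsigned long map_hotness_to_seq(unsigned long hotness,
                                        unsigned long min_hotness,
                                        unsigned long max_hotness,
                                        unsigned long min_seq,
                                        unsigned long max_seq)
{
        unsigned long nr_gens = max_seq - min_seq + 1;
        /* split the inclusive hotness window into one bucket per generation */
        unsigned long bucket = (max_hotness - min_hotness + 1 + nr_gens - 1) / nr_gens;

        if (hotness >= max_hotness)
                return max_seq;        /* hottest pages -> youngest generation */
        if (hotness <= min_hotness)
                return min_seq;        /* coldest pages -> oldest generation */
        return min_seq + (hotness - min_hotness) / bucket;
}

int main(void)
{
        /* assumed window: hotness 10..200 mapped onto generations 4..7 */
        unsigned long samples[] = { 5, 40, 120, 250 };

        for (int i = 0; i < 4; i++)
                printf("hotness %lu -> generation seq %lu\n", samples[i],
                       map_hotness_to_seq(samples[i], 10, 200, 4, 7));
        return 0;
}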
Not-yet-Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/Kconfig                  |  10 +
 arch/powerpc/include/asm/hca.h        |  49 +++++
 arch/powerpc/include/asm/page_aging.h |  35 ++++
 arch/powerpc/mm/Makefile              |   1 +
 arch/powerpc/mm/hca.c                 | 275 ++++++++++++++++++++++++++
 include/linux/mmzone.h                |   5 +
 include/linux/page_aging.h            |   5 +
 mm/Kconfig                            |   4 +
 mm/vmscan.c                           |   5 +-
 9 files changed, 387 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/include/asm/hca.h
 create mode 100644 arch/powerpc/include/asm/page_aging.h
 create mode 100644 arch/powerpc/mm/hca.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 7a5f8dbfbdd0..71e8f23d9a96 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -1045,6 +1045,16 @@ config PPC_SECVAR_SYSFS
           read/write operations on these variables. Say Y if you have
           secure boot enabled and want to expose variables to userspace.
 
+config PPC_HCA_HOTNESS
+        prompt "PowerPC HCA engine based page hotness"
+        def_bool y
+        select ARCH_HAS_PAGE_AGING
+        depends on PPC_BOOK3S_64
+        help
+          Use the HCA engine to find page hotness.
+
+          If unsure, say N.
+
 endmenu
 
 config ISA_DMA_API
diff --git a/arch/powerpc/include/asm/hca.h b/arch/powerpc/include/asm/hca.h
new file mode 100644
index 000000000000..c0ed380594ca
--- /dev/null
+++ b/arch/powerpc/include/asm/hca.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+/*
+ * Configuration helpers for the Hot-Cold Affinity engine
+ */
+
+#ifndef _ASM_POWERPC_HCA_H
+#define _ASM_POWERPC_HCA_H
+
+#include
+
+struct hca_entry {
+        unsigned long count;
+        unsigned long prev_count;
+        uint8_t age;
+};
+
+static inline unsigned long hotness_score(struct hca_entry *entry)
+{
+        unsigned long hotness;
+
+#if 0
+        /*
+         * Give more weight to prev_count because it carries the
+         * historical value. Take a smaller part of count as the entry
+         * ages, because prev_count becomes the better approximation.
+         * We still need to consider count to accommodate spikes in
+         * access. The "+ 1" on age handles age == 0.
+         */
+        hotness = entry->prev_count + (entry->count / (entry->age + 1));
+#else
+        /*
+         * Since real workloads are not showing pages with very high
+         * hotness, a decay essentially moves the count value to
+         * prev_count. Decay can then be viewed as a periodic zeroing
+         * of the counter. The hotness score below gives better
+         * results with real workloads.
+         */
+        hotness = entry->prev_count + entry->count;
+#endif
+
+        return hotness;
+}
+
+extern void (*hca_backend_node_debugfs_init)(int numa_node, struct dentry *node_dentry);
+extern void (*hca_backend_debugfs_init)(struct dentry *root_dentry);
+extern int (*hca_pfn_entry)(unsigned long pfn, struct hca_entry *entry);
+extern bool (*hca_node_enabled)(int numa_node);
+extern int (*hca_clear_entry)(unsigned long pfn);
+
+#endif /* _ASM_POWERPC_HCA_H */
diff --git a/arch/powerpc/include/asm/page_aging.h b/arch/powerpc/include/asm/page_aging.h
new file mode 100644
index 000000000000..0d98cd877308
--- /dev/null
+++ b/arch/powerpc/include/asm/page_aging.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef _ASM_POWERPC_PAGE_AGING_H_
+#define _ASM_POWERPC_PAGE_AGING_H_
+
+#ifdef CONFIG_LRU_GEN
+extern bool hca_lru_age;
+unsigned long hca_map_lru_seq(struct lruvec *lruvec, struct folio *folio);
+bool hca_try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq,
+                            int scan_priority, bool can_swap, bool force_scan);
+
+#define arch_supports_page_access_count arch_supports_page_access_count
+static inline bool arch_supports_page_access_count(void)
+{
+        return hca_lru_age;
+}
+
+#define arch_try_to_inc_max_seq arch_try_to_inc_max_seq
+static inline bool arch_try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq,
+                                           int scan_priority, bool can_swap,
+                                           bool force_scan)
+{
+        return hca_try_to_inc_max_seq(lruvec, max_seq, scan_priority,
+                                      can_swap, force_scan);
+
+}
+
+#define arch_get_lru_gen_seq arch_get_lru_gen_seq
+static inline unsigned long arch_get_lru_gen_seq(struct lruvec *lruvec, struct folio *folio)
+{
+        return hca_map_lru_seq(lruvec, folio);
+}
+
+#endif /* CONFIG_LRU_GEN */
+#endif
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index 503a6e249940..30bd4ad4aff0 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -19,3 +19,4 @@ obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o
 obj-$(CONFIG_PPC_COPRO_BASE)    += copro_fault.o
 obj-$(CONFIG_PTDUMP_CORE)       += ptdump/
 obj-$(CONFIG_KASAN)             += kasan/
+obj-$(CONFIG_PPC_HCA_HOTNESS)   += hca.o
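As a quick, user-space illustration of the scoring defined in asm/hca.h above
(not part of the patch), the fragment below redeclares the hca_entry layout
locally and compares the age-weighted variant kept under "#if 0" with the plain
sum the patch actually uses; the counter values are invented for the example.

#include <stdint.h>
#include <stdio.h>

/* local copy of the patch's hca_entry layout, for illustration only */
struct hca_entry {
        unsigned long count;      /* accesses seen since the last decay */
        unsigned long prev_count; /* decayed/historical access count */
        uint8_t age;              /* number of decay intervals seen */
};

/* age-weighted variant kept under "#if 0" in the header */
static unsigned long hotness_weighted(struct hca_entry *e)
{
        return e->prev_count + e->count / (e->age + 1);
}

/* variant the patch actually uses */
static unsigned long hotness_sum(struct hca_entry *e)
{
        return e->prev_count + e->count;
}

int main(void)
{
        /* assumed sample: a page with some history and a recent burst */
        struct hca_entry e = { .count = 40, .prev_count = 100, .age = 3 };

        printf("weighted: %lu, sum: %lu\n",
               hotness_weighted(&e), hotness_sum(&e));
        return 0;
}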
diff --git a/arch/powerpc/mm/hca.c b/arch/powerpc/mm/hca.c
new file mode 100644
index 000000000000..af6de4492ead
--- /dev/null
+++ b/arch/powerpc/mm/hca.c
@@ -0,0 +1,275 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+bool hca_lru_age;
+static struct dentry *hca_debugfs_root;
+/*
+ * Percentage of pfns to scan from each lruvec list to determine
+ * max/min hotness.
+ */
+static ulong scan_pfn_ratio __read_mostly = 20;
+/*
+ * Milliseconds to wait/skip before starting another random scan.
+ */
+static ulong scan_skip_msec __read_mostly = 60;
+
+/* backend callbacks */
+void (*hca_backend_node_debugfs_init)(int numa_node, struct dentry *node_dentry);
+void (*hca_backend_debugfs_init)(struct dentry *root_dentry);
+int (*hca_pfn_entry)(unsigned long pfn, struct hca_entry *entry);
+bool (*hca_node_enabled)(int numa_node);
+int (*hca_clear_entry)(unsigned long pfn);
+
+static int parse_hca_age(char *arg)
+{
+        return strtobool(arg, &hca_lru_age);
+}
+early_param("hca_age", parse_hca_age);
+
+static inline int folio_hca_entry(struct folio *folio, struct hca_entry *entry)
+{
+        return hca_pfn_entry(folio_pfn(folio), entry);
+}
+
+#ifdef CONFIG_LRU_GEN
+static inline int get_nr_gens(struct lruvec *lruvec, int type)
+{
+        return lruvec->lrugen.max_seq - lruvec->lrugen.min_seq[type] + 1;
+}
+
+/* FIXME!! */
+static inline bool folio_evictable(struct folio *folio)
+{
+        bool ret;
+
+        /* Prevent address_space of inode and swap cache from being freed */
+        rcu_read_lock();
+        ret = !mapping_unevictable(folio_mapping(folio)) &&
+              !folio_test_mlocked(folio);
+        rcu_read_unlock();
+        return ret;
+}
+
+static void restablish_hotness_range(struct lruvec *lruvec)
+{
+        bool youngest = true;
+        int gen, nr_pages;
+        unsigned long seq;
+        int new_scan_pfn_count;
+        struct lru_gen_struct *lrugen = &lruvec->lrugen;
+        unsigned long current_hotness, max_hotness = 0, min_hotness = 0;
+
+        if (time_is_after_jiffies64(lrugen->next_span_scan))
+                return;
+
+        spin_lock_irq(&lruvec->lru_lock);
+
+retry:
+        for (int type = 0; type < ANON_AND_FILE; type++) {
+                for (int zone = 0; zone < MAX_NR_ZONES; zone++) {
+                        int index = 0;
+                        struct list_head *head;
+                        struct folio *folio;
+                        struct hca_entry entry;
+
+                        if (youngest)
+                                seq = lrugen->max_seq;
+                        else
+                                seq = lrugen->min_seq[type];
+                        gen = lru_gen_from_seq(seq);
+                        nr_pages = lrugen->nr_pages[gen][type][zone];
+
+                        new_scan_pfn_count = nr_pages * scan_pfn_ratio / 100;
+                        if (!new_scan_pfn_count)
+                                new_scan_pfn_count = nr_pages;
+
+                        head = &lrugen->lists[gen][type][zone];
+                        list_for_each_entry(folio, head, lru) {
+
+                                if (unlikely(!folio_evictable(folio)))
+                                        continue;
+
+                                if (folio_hca_entry(folio, &entry))
+                                        continue;
+
+                                if (index++ > new_scan_pfn_count)
+                                        break;
+
+                                current_hotness = hotness_score(&entry);
+                                /* If the page didn't see any access, skip it */
+                                if (!current_hotness)
+                                        continue;
+                                /*
+                                 * Make sure we wait for at least one decay
+                                 * update before looking at this pfn for the
+                                 * max/min computation.
+                                 */
+                                if (entry.age < 1)
+                                        continue;
+
+                                if (current_hotness > max_hotness)
+                                        max_hotness = (current_hotness + max_hotness) / 2;
+                                else if ((current_hotness < min_hotness) || !min_hotness)
+                                        min_hotness = (current_hotness + min_hotness) / 2;
+                                else if ((current_hotness - min_hotness) < (max_hotness - min_hotness) / 2)
+                                        min_hotness = (current_hotness + min_hotness) / 2;
+                                else
+                                        max_hotness = (current_hotness + max_hotness) / 2;
+
+                        }
+
+                }
+        }
+        if (youngest) {
+                /* now compute with the oldest generation */
+                youngest = false;
+                goto retry;
+        }
+        lrugen->next_span_scan = get_jiffies_64() + msecs_to_jiffies(scan_skip_msec);
+        if (min_hotness) {
+                lrugen->max_hotness = max_hotness;
+                lrugen->min_hotness = min_hotness;
+        }
+
+        spin_unlock_irq(&lruvec->lru_lock);
+}
+
+/* Return the multi-gen LRU generation based on folio hotness */
+unsigned long hca_map_lru_seq(struct lruvec *lruvec, struct folio *folio)
+{
+        unsigned long seq;
+        int type, nr_gens;
+        struct lru_gen_struct *lrugen = &lruvec->lrugen;
+        struct hca_entry folio_entry;
+        unsigned long hotness, seq_range;
+
+        type = folio_is_file_lru(folio);
+        if (!hca_lru_age || folio_hca_entry(folio, &folio_entry))
+                /* return youngest generation ? */
+                return lrugen->min_seq[type];
+
+        hotness = hotness_score(&folio_entry);
+        /* The page didn't see any access, return the oldest generation */
+        if (!hotness)
+                return lrugen->min_seq[type];
+
+        /* Also adjust the range based on the current value. */
+        if (hotness > lrugen->max_hotness) {
+                lrugen->max_hotness = (hotness + lrugen->max_hotness) / 2;
+                return lrugen->max_seq;
+        } else if (hotness < lrugen->min_hotness) {
+                lrugen->min_hotness = (hotness + lrugen->min_hotness) / 2;
+                return lrugen->min_seq[type];
+        }
+
+        /*
+         * Split the max..min hotness window into one range per
+         * generation, then place the current hotness into one of those
+         * ranges. The range number is used as an increment over the
+         * oldest generation.
+         */
+        /* inclusive range of min and max hotness */
+        seq_range = lrugen->max_hotness - lrugen->min_hotness + 1;
+        nr_gens = get_nr_gens(lruvec, type);
+        seq_range = (seq_range + nr_gens - 1) / nr_gens;
+
+        /* the higher the hotness, the younger the generation */
+        seq = lrugen->min_seq[type] + ((hotness - lrugen->min_hotness) / seq_range);
+
+        return seq;
+}
+
+bool hca_try_to_inc_max_seq(struct lruvec *lruvec,
+                            unsigned long max_seq, int scan_priority,
+                            bool can_swap, bool force_scan)
+
+{
+        bool success = false;
+        struct lru_gen_struct *lrugen = &lruvec->lrugen;
+
+        VM_WARN_ON_ONCE(max_seq > READ_ONCE(lrugen->max_seq));
+
+        /* see the comment in iterate_mm_list() */
+        if (lruvec->seq_update_progress)
+                success = false;
+        else {
+                spin_lock_irq(&lruvec->lru_lock);
+
+                if (max_seq != lrugen->max_seq)
+                        goto done;
+
+                if (lruvec->seq_update_progress)
+                        goto done;
+
+                success = true;
+                lruvec->seq_update_progress = true;
done:
+                spin_unlock_irq(&lruvec->lru_lock);
+        }
+        if (!success) {
+                if (scan_priority <= DEF_PRIORITY - 2)
+                        wait_event_killable(lruvec->seq_update_wait,
+                                            max_seq < READ_ONCE(lrugen->max_seq));
+
+                return max_seq < READ_ONCE(lrugen->max_seq);
+        }
+
+        /*
+         * With hardware aging, use the counters to update the
+         * lruvec max and min hotness.
+         */
+        restablish_hotness_range(lruvec);
+
+        VM_WARN_ON_ONCE(max_seq != READ_ONCE(lrugen->max_seq));
+        inc_max_seq(lruvec, can_swap, force_scan);
+        /* either this sees any waiters or they will see the updated max_seq */
+        if (wq_has_sleeper(&lruvec->seq_update_wait))
+                wake_up_all(&lruvec->seq_update_wait);
+
+        return success;
+}
+#endif /* CONFIG_LRU_GEN */
+
+static void hca_debugfs_init(void)
+{
+        int node;
+        char name[32];
+        struct dentry *node_dentry;
+
+        hca_debugfs_root = debugfs_create_dir("hca", arch_debugfs_dir);
+
+        for_each_online_node(node) {
+                snprintf(name, sizeof(name), "node%u", node);
+                node_dentry = debugfs_create_dir(name, hca_debugfs_root);
+
+                hca_backend_node_debugfs_init(node, node_dentry);
+        }
+
+        debugfs_create_ulong("scan-pfn-ratio", 0600, hca_debugfs_root,
+                             &scan_pfn_ratio);
+        debugfs_create_ulong("scan-skip-msec", 0600, hca_debugfs_root,
+                             &scan_skip_msec);
+        debugfs_create_bool("hca_lru_age", 0600, hca_debugfs_root,
+                            &hca_lru_age);
+
+        /* Now create the backend debugfs entries */
+        hca_backend_debugfs_init(hca_debugfs_root);
+}
+
+static int __init hca_init(void)
+{
+        if (!hca_backend_debugfs_init) {
+                pr_info("No HCA device registered. Disabling hca lru gen\n");
+                hca_lru_age = false;
+                /* no backend callbacks to call, skip debugfs setup */
+                return 0;
+        }
+
+        hca_debugfs_init();
+        return 0;
+}
+
+late_initcall(hca_init);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0bcc5d88239a..934ad587a558 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -425,6 +425,11 @@ struct lru_gen_struct {
         atomic_long_t evicted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS];
         atomic_long_t refaulted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS];
         /* whether the multi-gen LRU is enabled */
+#ifndef CONFIG_LRU_TASK_PAGE_AGING
+        unsigned long max_hotness;
+        unsigned long min_hotness;
+        u64 next_span_scan;
+#endif
         bool enabled;
 };
diff --git a/include/linux/page_aging.h b/include/linux/page_aging.h
index d7c63ce0d824..074c876f17e1 100644
--- a/include/linux/page_aging.h
+++ b/include/linux/page_aging.h
@@ -3,6 +3,10 @@
 #ifndef _LINUX_PAGE_AGING_H
 #define _LINUX_PAGE_AGING_H
 
+#ifdef CONFIG_ARCH_HAS_PAGE_AGING
+#include
+#endif
+
 #ifndef arch_supports_page_access_count
 static inline bool arch_supports_page_access_count(void)
 {
@@ -14,6 +18,7 @@ static inline bool arch_supports_page_access_count(void)
 bool __try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq,
                           int scan_priority, bool can_swap, bool force_scan);
+void inc_max_seq(struct lruvec *lruvec, bool can_swap, bool force_scan);
 
 #ifndef arch_get_lru_gen_seq
 static inline unsigned long arch_get_lru_gen_seq(struct lruvec *lruvec, struct folio *folio)
diff --git a/mm/Kconfig b/mm/Kconfig
index ff7b209dec05..493709ac758e 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1181,6 +1181,10 @@ config LRU_GEN_STATS
           from evicted generations for debugging purpose.
 
           This option has a per-memcg and per-node memory overhead.
+
+config ARCH_HAS_PAGE_AGING
+        bool
+
 # }
 
 source "mm/damon/Kconfig"
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c8b98201f0b0..a5f6238b3926 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4362,7 +4362,7 @@ static bool try_to_inc_min_seq(struct lruvec *lruvec, bool can_swap)
         return success;
 }
 
-static void inc_max_seq(struct lruvec *lruvec, bool can_swap, bool force_scan)
+void inc_max_seq(struct lruvec *lruvec, bool can_swap, bool force_scan)
 {
         int prev, next;
         int type, zone;
@@ -4420,6 +4420,7 @@ static void inc_max_seq(struct lruvec *lruvec, bool can_swap, bool force_scan)
 #endif
         spin_unlock_irq(&lruvec->lru_lock);
 }
+
 #ifdef CONFIG_LRU_TASK_PAGE_AGING
 static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq,
                                int scan_priority, bool can_swap, bool force_scan)
@@ -5861,7 +5862,7 @@ static int lru_gen_seq_show(struct seq_file *m, void *v)
                 seq_printf(m, "memcg %5hu %s\n", mem_cgroup_id(memcg), path);
         }
 
-        seq_printf(m, " node %5d\n", nid);
+        seq_printf(m, " node %5d max_hotness %ld min_hotness %ld\n", nid, lrugen->max_hotness, lrugen->min_hotness);
 
         if (!full)
                 seq = min_seq[LRU_GEN_ANON];
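To round this out, here is a compressed user-space rendering (illustrative
only, not part of the patch) of the folding step that restablish_hotness_range()
in hca.c applies to each sampled page: every sample is averaged toward either
the running maximum or the running minimum, so the estimated hotness window
tracks what the sampled youngest and oldest generations actually contain. The
kernel locking, list walking and per-type/per-zone loops are omitted, and the
sample values are invented.

#include <stdio.h>

/*
 * Fold one sampled hotness value into the running max/min, the way the
 * patch does while walking the youngest and oldest generations.
 */
static void fold_sample(unsigned long hotness,
                        unsigned long *max_h, unsigned long *min_h)
{
        if (hotness > *max_h)
                *max_h = (hotness + *max_h) / 2;
        else if (hotness < *min_h || !*min_h)
                *min_h = (hotness + *min_h) / 2;
        else if (hotness - *min_h < (*max_h - *min_h) / 2)
                *min_h = (hotness + *min_h) / 2;
        else
                *max_h = (hotness + *max_h) / 2;
}

int main(void)
{
        unsigned long max_h = 0, min_h = 0;
        /* assumed samples taken from the youngest and oldest generations */
        unsigned long samples[] = { 30, 180, 12, 95, 210, 8 };

        for (unsigned int i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
                fold_sample(samples[i], &max_h, &min_h);

        printf("estimated hotness range: %lu..%lu\n", min_h, max_h);
        return 0;
}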