From patchwork Wed Mar 19 19:30:24 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Raghavendra K T
X-Patchwork-Id: 14023077
From: Raghavendra K T
Subject: [RFC PATCH V1 09/13] mm: Add heuristic to calculate target node
Date: Wed, 19 Mar 2025 19:30:24 +0000
Message-ID: <20250319193028.29514-10-raghavendra.kt@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250319193028.29514-1-raghavendra.kt@amd.com>
References: <20250319193028.29514-1-raghavendra.kt@amd.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

One of the key challenges in PTE A-bit based scanning is finding the
right target node to promote to.

Here is a simple heuristic-based approach:
While scanning the pages of any mm, we also scan the toptier pages that
belong to that mm. This gives insight into how the mm's pages are
distributed across toptier nodes and how recently they were accessed.

The current logic walks all the toptier nodes and picks the one with
the highest accesses.

Signed-off-by: Raghavendra K T
---
PS: There are many potential ideas possible here.

1. We can do a quick sort on the toptier nodes' scan and access info and
maintain a list of preferred nodes/fallback nodes in case the current
target_node is getting filled up.

2. We can also keep the history of access/scan information from the last
scan and use its decayed value to get a stable view, etc.

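To make idea 2 concrete, here is a minimal user-space sketch of a decayed
per-node access history. It is not part of this patch: struct node_history,
history_update(), history_pick_node() and SKETCH_MAX_NODES are hypothetical
names, and halving the old value on every scan is an assumed decay factor,
not something measured.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 8	/* hypothetical bound, stands in for MAX_NUMNODES */

/* Hypothetical per-node history; not a structure from this patch. */
struct node_history {
	unsigned long decayed_accessed;	/* running, decayed access count */
};

/*
 * Fold the access count of the scan that just finished into the history.
 * Older samples lose half of their weight on every scan (assumed factor).
 */
static void history_update(struct node_history *h, unsigned long nr_accessed)
{
	h->decayed_accessed = (h->decayed_accessed >> 1) + nr_accessed;
}

/*
 * Return the index of the node with the largest decayed count, or -1 when
 * no node has any recorded access.
 */
static int history_pick_node(const struct node_history *h, size_t nr_nodes)
{
	unsigned long best = 0;
	int target = -1;
	size_t i;

	assert(nr_nodes <= SKETCH_MAX_NODES);
	for (i = 0; i < nr_nodes; i++) {
		if (h[i].decayed_accessed > best) {
			best = h[i].decayed_accessed;
			target = (int)i;
		}
	}
	return target;
}
```

With this shape, a node that was hot in the previous scan keeps some weight
in the next one, so a single noisy scan is less likely to flip target_node.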
 include/linux/mm_types.h |   4 +
 mm/kmmscand.c            | 174 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 174 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 0234f14f2aa6..eeaedc7473b1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1015,6 +1015,10 @@ struct mm_struct {
 		/* numa_scan_seq prevents two threads remapping PTEs. */
 		int numa_scan_seq;
 #endif
+#ifdef CONFIG_KMMSCAND
+		/* Tracks promotion node. XXX: use nodemask */
+		int target_node;
+#endif
 		/*
 		 * An operation with batched TLB flushing is going on. Anything
 		 * that can move process memory needs to flush the TLB when
diff --git a/mm/kmmscand.c b/mm/kmmscand.c
index 84140b9e8ce2..c2924b2e8a6d 100644
--- a/mm/kmmscand.c
+++ b/mm/kmmscand.c
@@ -88,6 +88,14 @@ static DEFINE_READ_MOSTLY_HASHTABLE(kmmscand_slots_hash, KMMSCAND_SLOT_HASH_BITS
 
 static struct kmem_cache *kmmscand_slot_cache __read_mostly;
 
+/* Per memory node information used to calculate target_node for migration */
+struct kmmscand_nodeinfo {
+	unsigned long nr_scanned;
+	unsigned long nr_accessed;
+	int node;
+	bool is_toptier;
+};
+
 /* Per mm information collected to control VMA scanning */
 struct kmmscand_mm_slot {
 	struct mm_slot slot;
@@ -100,6 +108,7 @@ struct kmmscand_mm_slot {
 	unsigned long scan_size;
 	long address;
 	bool is_scanned;
+	int target_node;
 };
 
 /* Data structure to keep track of current mm under scan */
@@ -118,7 +127,9 @@ struct kmmscand_scan kmmscand_scan = {
  */
 struct kmmscand_scanctrl {
 	struct list_head scan_list;
+	struct kmmscand_nodeinfo *nodeinfo[MAX_NUMNODES];
 	unsigned long address;
+	unsigned long nr_to_scan;
 };
 
 struct kmmscand_scanctrl kmmscand_scanctrl;
@@ -208,6 +219,98 @@ static void kmmmigrated_wait_work(void)
 			migrate_sleep_jiffies);
 }
 
+static unsigned long get_slowtier_accesed(struct kmmscand_scanctrl *scanctrl)
+{
+	int node;
+	unsigned long accessed = 0;
+
+	for_each_node_state(node, N_MEMORY) {
+		if (!node_is_toptier(node) && scanctrl->nodeinfo[node])
+			accessed += scanctrl->nodeinfo[node]->nr_accessed;
+	}
+	return accessed;
+}
+
+static inline void set_nodeinfo_nr_accessed(struct kmmscand_nodeinfo *ni, unsigned long val)
+{
+	ni->nr_accessed = val;
+}
+static inline unsigned long get_nodeinfo_nr_scanned(struct kmmscand_nodeinfo *ni)
+{
+	return ni->nr_scanned;
+}
+
+static inline void set_nodeinfo_nr_scanned(struct kmmscand_nodeinfo *ni, unsigned long val)
+{
+	ni->nr_scanned = val;
+}
+
+static inline void reset_nodeinfo_nr_scanned(struct kmmscand_nodeinfo *ni)
+{
+	set_nodeinfo_nr_scanned(ni, 0);
+}
+
+static inline void reset_nodeinfo(struct kmmscand_nodeinfo *ni)
+{
+	set_nodeinfo_nr_scanned(ni, 0);
+	set_nodeinfo_nr_accessed(ni, 0);
+}
+
+static void init_one_nodeinfo(struct kmmscand_nodeinfo *ni, int node)
+{
+	ni->nr_scanned = 0;
+	ni->nr_accessed = 0;
+	ni->node = node;
+	ni->is_toptier = node_is_toptier(node) ? true : false;
+}
+
+static struct kmmscand_nodeinfo *alloc_one_nodeinfo(int node)
+{
+	struct kmmscand_nodeinfo *ni;
+
+	ni = kzalloc(sizeof(*ni), GFP_KERNEL);
+
+	if (!ni)
+		return NULL;
+
+	init_one_nodeinfo(ni, node);
+
+	return ni;
+}
+
+/* TBD: Handle errors */
+static void init_scanctrl(struct kmmscand_scanctrl *scanctrl)
+{
+	struct kmmscand_nodeinfo *ni;
+	int node;
+
+	for_each_node(node) {
+		ni = alloc_one_nodeinfo(node);
+		/* Allocation failure only warns for now; the slot stays NULL */
+		WARN_ON_ONCE(!ni);
+		scanctrl->nodeinfo[node] = ni;
+	}
+}
+
+static void reset_scanctrl(struct kmmscand_scanctrl *scanctrl)
+{
+	int node;
+
+	for_each_node_state(node, N_MEMORY)
+		reset_nodeinfo(scanctrl->nodeinfo[node]);
+
+	/* XXX: Not really required? */
+	scanctrl->nr_to_scan = kmmscand_scan_size;
+}
+
+static void free_scanctrl(struct kmmscand_scanctrl *scanctrl)
+{
+	int node;
+
+	for_each_node(node)
+		kfree(scanctrl->nodeinfo[node]);
+}
+
 /*
  * Do not know what info to pass in the future to make
  * decision on taget node. Keep it void * now.
@@ -217,6 +320,24 @@ static int kmmscand_get_target_node(void *data)
 	return kmmscand_target_node;
 }
 
+static int get_target_node(struct kmmscand_scanctrl *scanctrl)
+{
+	int node, target_node = NUMA_NO_NODE;
+	unsigned long prev = 0;
+
+	for_each_node(node) {
+		if (node_is_toptier(node) && scanctrl->nodeinfo[node] &&
+				get_nodeinfo_nr_scanned(scanctrl->nodeinfo[node]) > prev) {
+			prev = get_nodeinfo_nr_scanned(scanctrl->nodeinfo[node]);
+			target_node = node;
+		}
+	}
+	if (target_node == NUMA_NO_NODE)
+		target_node = kmmscand_get_target_node(NULL);
+
+	return target_node;
+}
+
 extern bool migrate_balanced_pgdat(struct pglist_data *pgdat,
 					unsigned long nr_migrate_pages);
 
@@ -469,6 +590,14 @@ static int hot_vma_idle_pte_entry(pte_t *pte,
 	}
 
 	srcnid = folio_nid(folio);
+	scanctrl->nodeinfo[srcnid]->nr_scanned++;
+	if (scanctrl->nr_to_scan)
+		scanctrl->nr_to_scan--;
+
+	if (!scanctrl->nr_to_scan) {
+		folio_put(folio);
+		return 1;
+	}
 
 	if (!folio_test_lru(folio)) {
 		folio_put(folio);
@@ -479,11 +608,14 @@ static int hot_vma_idle_pte_entry(pte_t *pte,
 			mmu_notifier_test_young(mm, addr) ||
 			folio_test_referenced(folio) || pte_young(pteval)) {
 
+		scanctrl->nodeinfo[srcnid]->nr_accessed++;
+
 		/* Do not try to promote pages from regular nodes */
 		if (!kmmscand_eligible_srcnid(srcnid)) {
 			folio_put(folio);
 			return 0;
 		}
+
 		info = kzalloc(sizeof(struct kmmscand_migrate_info), GFP_NOWAIT);
 		if (info && scanctrl) {
 
@@ -571,6 +703,7 @@ static void kmmscand_collect_mm_slot(struct kmmscand_mm_slot *mm_slot)
 static void kmmscand_migrate_folio(void)
 {
 	int ret = 0, dest = -1;
+	struct mm_struct *oldmm = NULL;
 	struct kmmscand_migrate_info *info, *tmp;
 
 	spin_lock(&kmmscand_migrate_lock);
@@ -596,7 +729,16 @@ static void kmmscand_migrate_folio(void)
 		spin_unlock(&kmmscand_migrate_lock);
 
 		if (info->mm) {
-			dest = kmmscand_get_target_node(NULL);
+			if (oldmm != info->mm) {
+				if (!mmap_read_trylock(info->mm)) {
+					dest = kmmscand_get_target_node(NULL);
+				} else {
+					dest = READ_ONCE(info->mm->target_node);
+					mmap_read_unlock(info->mm);
+				}
+				oldmm = info->mm;
+			}
+
 			ret = kmmscand_promote_folio(info, dest);
 		}
 
@@ -658,7 +800,7 @@ static void kmmscand_migrate_folio(void)
  * Increase scan_size by (1 << SCAN_SIZE_CHANGE_SHIFT).
  */
 static inline void kmmscand_update_mmslot_info(struct kmmscand_mm_slot *mm_slot,
-				unsigned long total)
+				unsigned long total, int target_node)
 {
 	unsigned int scan_period;
 	unsigned long now;
@@ -706,6 +848,7 @@ static inline void kmmscand_update_mmslot_info(struct kmmscand_mm_slot *mm_slot,
 	mm_slot->scan_period = scan_period;
 	mm_slot->scan_size = scan_size;
 	mm_slot->scan_delta = total;
+	mm_slot->target_node = target_node;
 }
 
 static unsigned long kmmscand_scan_mm_slot(void)
@@ -714,6 +857,7 @@ static unsigned long kmmscand_scan_mm_slot(void)
 	bool update_mmslot_info = false;
 
 	unsigned int mm_slot_scan_period;
+	int target_node, mm_slot_target_node, mm_target_node;
 	unsigned long now;
 	unsigned long mm_slot_next_scan;
 	unsigned long mm_slot_scan_size;
@@ -746,6 +890,7 @@ static unsigned long kmmscand_scan_mm_slot(void)
 	mm_slot_next_scan = mm_slot->next_scan;
 	mm_slot_scan_period = mm_slot->scan_period;
 	mm_slot_scan_size = mm_slot->scan_size;
+	mm_slot_target_node = mm_slot->target_node;
 	spin_unlock(&kmmscand_mm_lock);
 
 	if (unlikely(!mmap_read_trylock(mm)))
@@ -756,6 +901,9 @@ static unsigned long kmmscand_scan_mm_slot(void)
 		goto outerloop;
 	}
 
+	mm_target_node = READ_ONCE(mm->target_node);
+	if (mm_target_node != mm_slot_target_node)
+		WRITE_ONCE(mm->target_node, mm_slot_target_node);
 	now = jiffies;
 
 	if (mm_slot_next_scan && time_before(now, mm_slot_next_scan))
@@ -763,11 +911,17 @@ static unsigned long kmmscand_scan_mm_slot(void)
 
 	VMA_ITERATOR(vmi, mm, address);
 
+	/* Either Scan 25% of scan_size or cover vma size of scan_size */
+	kmmscand_scanctrl.nr_to_scan = mm_slot_scan_size >> PAGE_SHIFT;
+	/* Reduce actual amount of pages scanned */
+	kmmscand_scanctrl.nr_to_scan = mm_slot_scan_size >> 1;
+
 	for_each_vma(vmi, vma) {
 		kmmscand_walk_page_vma(vma, &kmmscand_scanctrl);
 		vma_scanned_size += vma->vm_end - vma->vm_start;
 
-		if (vma_scanned_size >= kmmscand_scan_size) {
+		if (vma_scanned_size >= mm_slot_scan_size ||
+					!kmmscand_scanctrl.nr_to_scan) {
 			next_mm = true;
 			/* Add scanned folios to migration list */
 			spin_lock(&kmmscand_migrate_lock);
@@ -789,9 +943,19 @@ static unsigned long kmmscand_scan_mm_slot(void)
 
 	update_mmslot_info = true;
 
+	total = get_slowtier_accesed(&kmmscand_scanctrl);
+	target_node = get_target_node(&kmmscand_scanctrl);
+
+	mm_target_node = READ_ONCE(mm->target_node);
+
+	/* XXX: Do we need write lock? */
+	if (mm_target_node != target_node)
+		WRITE_ONCE(mm->target_node, target_node);
+	reset_scanctrl(&kmmscand_scanctrl);
+
 	if (update_mmslot_info) {
 		mm_slot->address = address;
-		kmmscand_update_mmslot_info(mm_slot, total);
+		kmmscand_update_mmslot_info(mm_slot, total, target_node);
 	}
 
 outerloop:
@@ -988,6 +1152,7 @@ static int stop_kmmscand(void)
 		kthread_stop(kmmscand_thread);
 		kmmscand_thread = NULL;
 	}
+	free_scanctrl(&kmmscand_scanctrl);
 
 	return err;
 }
@@ -1044,6 +1209,7 @@ static void init_list(void)
 	spin_lock_init(&kmmscand_migrate_lock);
 	init_waitqueue_head(&kmmscand_wait);
 	init_waitqueue_head(&kmmmigrated_wait);
+	init_scanctrl(&kmmscand_scanctrl);
 }
 
 static int __init kmmscand_init(void)
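
For experimenting outside the kernel, the accounting done by
get_slowtier_accesed() and get_target_node() in this patch can be modelled
roughly as below. This is only a user-space sketch: the sketch_* names and
SKETCH_NO_NODE are hypothetical stand-ins, the is_toptier flag replaces the
node_is_toptier() call, and the fallback argument stands in for
kmmscand_get_target_node(NULL). Like the patch, the selection compares
nr_scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NO_NODE (-1)	/* stands in for NUMA_NO_NODE */

/* User-space mirror of struct kmmscand_nodeinfo (hypothetical name). */
struct sketch_nodeinfo {
	unsigned long nr_scanned;
	unsigned long nr_accessed;
	int node;
	bool is_toptier;
};

/* Total accesses observed on slow-tier nodes during one scan. */
static unsigned long sketch_slowtier_accessed(const struct sketch_nodeinfo *ni,
					      size_t nr)
{
	unsigned long accessed = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		if (!ni[i].is_toptier)
			accessed += ni[i].nr_accessed;
	return accessed;
}

/*
 * Pick the top-tier node with the most scanned pages. When no top-tier
 * node was scanned at all, return the caller-supplied fallback node.
 */
static int sketch_target_node(const struct sketch_nodeinfo *ni, size_t nr,
			      int fallback)
{
	unsigned long prev = 0;
	int target = SKETCH_NO_NODE;
	size_t i;

	assert(ni != NULL);
	for (i = 0; i < nr; i++) {
		if (ni[i].is_toptier && ni[i].nr_scanned > prev) {
			prev = ni[i].nr_scanned;
			target = ni[i].node;
		}
	}
	return target == SKETCH_NO_NODE ? fallback : target;
}
```

Feeding it two top-tier nodes with 10 and 30 scanned pages selects the
second one; zeroing both counts returns the fallback instead.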