From patchwork Tue Feb 25 14:00:01 2025
X-Patchwork-Submitter: Chen Yu
X-Patchwork-Id: 13990061
From: Chen Yu <yu.c.chen@intel.com>
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Andrew Morton
Cc: Rik van Riel, Mel Gorman, Johannes Weiner, Michal Hocko,
 Roman Gushchin, Shakeel Butt, Muchun Song, "Liam R. Howlett",
 Lorenzo Stoakes, "Huang, Ying", Tim Chen, Aubrey Li, Michael Wang,
 Kaiyang Zhao, David Rientjes, Raghavendra K T,
 cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Chen Yu
Subject: [RFC PATCH 1/3] sched/numa: Introduce numa balance task migration and swap in schedstats
Date: Tue, 25 Feb 2025 22:00:01 +0800
Message-Id: <1847c5ef828ad4835a35e3a54b88d2e13bce0eea.1740483690.git.yu.c.chen@intel.com>
X-Mailer: git-send-email 2.25.1

There is a requirement to track task activities during NUMA balancing.
NUMA balancing has two mechanisms for task migration: one is to migrate
the task to an idle CPU in its preferred node, and the other is to swap
tasks on different nodes if they are on each other's preferred node.

The kernel already has NUMA page migration statistics. Add the task
migration and swap count described above in the per-task/cgroup scope.
The data will be displayed at /sys/fs/cgroup/mytest/memory.stat and
/proc/{PID}/sched.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 include/linux/sched.h         |  4 ++++
 include/linux/vm_event_item.h |  2 ++
 kernel/sched/core.c           | 10 ++++++++--
 kernel/sched/debug.c          |  4 ++++
 mm/memcontrol.c               |  2 ++
 mm/vmstat.c                   |  2 ++
 6 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9632e3318e0d..01faa608ed7c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -527,6 +527,10 @@ struct sched_statistics {
 	u64				nr_failed_migrations_running;
 	u64				nr_failed_migrations_hot;
 	u64				nr_forced_migrations;
+#ifdef CONFIG_NUMA_BALANCING
+	u64				nr_numa_migrations;
+	u64				nr_numa_swap;
+#endif
 
 	u64				nr_wakeups;
 	u64				nr_wakeups_sync;
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index f70d0958095c..aef817474781 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -64,6 +64,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		NUMA_HINT_FAULTS,
 		NUMA_HINT_FAULTS_LOCAL,
 		NUMA_PAGE_MIGRATE,
+		NUMA_TASK_MIGRATE,
+		NUMA_TASK_SWAP,
 #endif
 #ifdef CONFIG_MIGRATION
 		PGMIGRATE_SUCCESS, PGMIGRATE_FAIL,
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 165c90ba64ea..44efc725054a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3348,6 +3348,11 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
 #ifdef CONFIG_NUMA_BALANCING
 static void __migrate_swap_task(struct task_struct *p, int cpu)
 {
+	__schedstat_inc(p->stats.nr_numa_swap);
+
+	if (p->mm)
+		count_memcg_events_mm(p->mm, NUMA_TASK_SWAP, 1);
+
 	if (task_on_rq_queued(p)) {
 		struct rq *src_rq, *dst_rq;
 		struct rq_flags srf, drf;
@@ -7901,8 +7906,9 @@ int migrate_task_to(struct task_struct *p, int target_cpu)
 	if (!cpumask_test_cpu(target_cpu, p->cpus_ptr))
 		return -EINVAL;
 
-	/* TODO: This is not properly updating schedstats */
-
+	__schedstat_inc(p->stats.nr_numa_migrations);
+	if (p->mm)
+		count_memcg_events_mm(p->mm, NUMA_TASK_MIGRATE, 1);
 	trace_sched_move_numa(p, curr_cpu, target_cpu);
 	return stop_one_cpu(curr_cpu, migration_cpu_stop, &arg);
 }
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index ef047add7f9e..ed801cc00bf1 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -1204,6 +1204,10 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 		P_SCHEDSTAT(nr_failed_migrations_running);
 		P_SCHEDSTAT(nr_failed_migrations_hot);
 		P_SCHEDSTAT(nr_forced_migrations);
+#ifdef CONFIG_NUMA_BALANCING
+		P_SCHEDSTAT(nr_numa_migrations);
+		P_SCHEDSTAT(nr_numa_swap);
+#endif
 		P_SCHEDSTAT(nr_wakeups);
 		P_SCHEDSTAT(nr_wakeups_sync);
 		P_SCHEDSTAT(nr_wakeups_migrate);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 46f8b372d212..496b5edc3db6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -460,6 +460,8 @@ static const unsigned int memcg_vm_event_stat[] = {
 	NUMA_PAGE_MIGRATE,
 	NUMA_PTE_UPDATES,
 	NUMA_HINT_FAULTS,
+	NUMA_TASK_MIGRATE,
+	NUMA_TASK_SWAP,
 #endif
 };
 
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 16bfe1c694dd..d6651778e4bf 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1339,6 +1339,8 @@ const char * const vmstat_text[] = {
 	"numa_hint_faults",
 	"numa_hint_faults_local",
 	"numa_pages_migrated",
+	"numa_task_migrated",
+	"numa_task_swapped",
 #endif
 #ifdef CONFIG_MIGRATION
 	"pgmigrate_success",
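
For anyone testing this, the new counters can be read back from user space. The following is a minimal sketch, not part of the patch: the helper names stat_lookup() and stat_read() are hypothetical, and it assumes the usual text layouts, "name value" per line in memory.stat and "name : value" per line in /proc/{PID}/sched. The key to look up is passed in by the caller, so the sketch does not depend on the exact counter names.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find `key` in a buffer made of "key value" or
 * "key : value" lines. Returns 0 and stores the number in *out on
 * success, -1 if the key is absent or has no numeric value.
 */
int stat_lookup(const char *text, const char *key, unsigned long long *out)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		const char *p = line;
		const char *next = strchr(line, '\n');

		/* Skip leading blanks on the line. */
		while (*p == ' ' || *p == '\t')
			p++;

		if (strncmp(p, key, klen) == 0) {
			const char *v = p + klen;

			/* Require an exact key match, not just a prefix. */
			if (*v == ' ' || *v == '\t' || *v == ':') {
				while (*v == ' ' || *v == '\t' || *v == ':')
					v++;
				if (!isdigit((unsigned char)*v))
					return -1;
				*out = strtoull(v, NULL, 10);
				return 0;
			}
		}
		line = next ? next + 1 : NULL;
	}
	return -1;
}

/*
 * Hypothetical helper: read up to 16 KiB of `path` and look up `key`.
 * Larger files are truncated, which is an assumption of this sketch.
 */
int stat_read(const char *path, const char *key, unsigned long long *out)
{
	char buf[16384];
	size_t n;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return stat_lookup(buf, key, out);
}
```

With the patch applied, CONFIG_NUMA_BALANCING set and schedstats enabled, stat_read("/proc/self/sched", "nr_numa_migrations", &v) would be expected to return the per-task migration count, and the same call against a cgroup's memory.stat with one of the numa_task_* keys would return the per-cgroup count. On a kernel without the patch both calls simply return -1.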