From patchwork Fri Jun 10 13:52:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Aneesh Kumar K.V" X-Patchwork-Id: 12877627 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75A3FC433EF for ; Fri, 10 Jun 2022 13:54:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 107898D00AA; Fri, 10 Jun 2022 09:54:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0B5E08D009C; Fri, 10 Jun 2022 09:54:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E71EC8D00AA; Fri, 10 Jun 2022 09:54:18 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id D7F7A8D009C for ; Fri, 10 Jun 2022 09:54:18 -0400 (EDT) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay11.hostedemail.com (Postfix) with ESMTP id AAA92811E4 for ; Fri, 10 Jun 2022 13:54:18 +0000 (UTC) X-FDA: 79562470596.23.8572D0A Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by imf16.hostedemail.com (Postfix) with ESMTP id 114FC180057 for ; Fri, 10 Jun 2022 13:54:17 +0000 (UTC) Received: from pps.filterd (m0098396.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 25AD7kOW030120; Fri, 10 Jun 2022 13:53:56 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=mCkQY9eq8ljfIN+4+Uneh6vfxHdvwScC98u2nmNKZr0=; b=UwlQHqVnyqOTTD5Oj94N7FHv21odR8jCf02WhNc+cXWr8fvxQn2WbYtXYCqHuyc/2l1J /fSiVSDq0D+pkH9UDVZPEaoAQ0R/JK4j7coa6uqcN39JQLODQErTIG4abhEA7f5tPz7l /2z70vMPf1htUbs8PNDR57NfgkZZBLHPGZuj+FxX9Aw9gZHanyFd0ZCeYmjzfOSWnYFs fPlgGy6lpbarokUNEfhQ7RNVYGNGqLo3azc6fi8LN/k3uyRfI8Dn/RgHcC+EpcaQ9HyO IFpXwUxkSaOFwYmqxn18GYEtFP/BSukFVfg68FB5wrrL6uffOqdnJr6KGVR/dz1IIgVk gg== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3gm4vaaxxv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 10 Jun 2022 13:53:56 +0000 Received: from m0098396.ppops.net (m0098396.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 25ADPKNW032120; Fri, 10 Jun 2022 13:53:55 GMT Received: from ppma02dal.us.ibm.com (a.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.10]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3gm4vaaxxj-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 10 Jun 2022 13:53:55 +0000 Received: from pps.filterd (ppma02dal.us.ibm.com [127.0.0.1]) by ppma02dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 25ADZS9o014172; Fri, 10 Jun 2022 13:53:54 GMT Received: from b03cxnp08026.gho.boulder.ibm.com (b03cxnp08026.gho.boulder.ibm.com [9.17.130.18]) by ppma02dal.us.ibm.com with ESMTP id 3gfy1bb4kq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 10 Jun 2022 13:53:54 +0000 Received: from b03ledav003.gho.boulder.ibm.com (b03ledav003.gho.boulder.ibm.com [9.17.130.234]) by b03cxnp08026.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 25ADrrun32702788 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jun 2022 13:53:53 GMT Received: from b03ledav003.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1643B6A057; Fri, 10 Jun 2022 13:53:53 +0000 (GMT) Received: from b03ledav003.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 09A7C6A04D; Fri, 10 Jun 2022 13:53:45 +0000 (GMT) Received: from skywalker.ibmuc.com (unknown [9.43.90.151]) by b03ledav003.gho.boulder.ibm.com (Postfix) with ESMTP; Fri, 10 Jun 2022 13:53:44 +0000 (GMT) From: "Aneesh Kumar K.V" To: linux-mm@kvack.org, akpm@linux-foundation.org Cc: Wei Xu , Huang Ying , Greg Thelen , Yang Shi , Davidlohr Bueso , Tim C Chen , Brice Goglin , Michal Hocko , Linux Kernel Mailing List , Hesham Almatary , Dave Hansen , Jonathan Cameron , Alistair Popple , Dan Williams , Feng Tang , Jagdish Gediya , Baolin Wang , David Rientjes , "Aneesh Kumar K.V" Subject: [PATCH v6 07/13] mm/demotion: Add per node memory tier attribute to sysfs Date: Fri, 10 Jun 2022 19:22:23 +0530 Message-Id: <20220610135229.182859-8-aneesh.kumar@linux.ibm.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220610135229.182859-1-aneesh.kumar@linux.ibm.com> References: <20220610135229.182859-1-aneesh.kumar@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: lpIccryA9ZlRNWpkgFKmSLojMXjMgkGq X-Proofpoint-GUID: 4ZtrkFZg1m4-wH4m24WbekH28Sqg_6ZZ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.517,FMLib:17.11.64.514 definitions=2022-06-10_06,2022-06-09_02,2022-02-23_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 phishscore=0 priorityscore=1501 bulkscore=0 spamscore=0 malwarescore=0 adultscore=0 impostorscore=0 suspectscore=0 mlxlogscore=999 clxscore=1015 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2204290000 definitions=main-2206100056 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1654869258; a=rsa-sha256; cv=none; b=G6Fzh3a6ngSwffD9JNUxC/Qt3V23F/crOUJf8YCxrtp86ByAT2zDtFfX8hCY96kzFz3pfH g86RTakhu1HrGJ9pvb9M1sH91ZynuZGqMYUNLPYAkYQTEugCSRiK9YgR6xq7EtHpiuOC1S 9wdqgYVZOsaudxGf9Hii09QIdxoPB8U= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=ibm.com header.s=pp1 header.b=UwlQHqVn; spf=pass (imf16.hostedemail.com: domain of aneesh.kumar@linux.ibm.com designates 148.163.156.1 as permitted sender) smtp.mailfrom=aneesh.kumar@linux.ibm.com; dmarc=pass (policy=none) header.from=ibm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1654869258; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=mCkQY9eq8ljfIN+4+Uneh6vfxHdvwScC98u2nmNKZr0=; b=Do1yDjSqmi7LJHNyCMrfqMaNLQsSFe2EDGEvtrfSDOFx1YWlDAi2kncC2NaAFwEYkX7bZP CbDe4WiBYYoKDJt44kEo8ZALOPEg2CJadvrGlyDaFYurwUq6GldX1MIfdx/UUJJPXbWfCP F3OgROjnoklJN2T1jn62LC09hAzihd8= X-Rspamd-Queue-Id: 114FC180057 Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=ibm.com header.s=pp1 header.b=UwlQHqVn; spf=pass (imf16.hostedemail.com: domain of aneesh.kumar@linux.ibm.com designates 148.163.156.1 as permitted sender) smtp.mailfrom=aneesh.kumar@linux.ibm.com; dmarc=pass (policy=none) header.from=ibm.com X-Rspam-User: X-Rspamd-Server: rspam06 X-Stat-Signature: 7yh3xrs47pjwim8ubgchifwa9oy5rgyn X-HE-Tag: 1654869257-23497 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add support to modify the memory tier for a NUMA node. /sys/devices/system/node/nodeN/memtier where N = node id When read, It list the memory tier that the node belongs to. When written, the kernel moves the node into the specified memory tier, the tier assignment of all other nodes are not affected. If the memory tier does not exist an error is returned. Suggested-by: Wei Xu Signed-off-by: Jagdish Gediya Signed-off-by: Aneesh Kumar K.V --- drivers/base/node.c | 39 ++++++++++++++++++++++++++++++++ include/linux/memory-tiers.h | 3 +++ mm/memory-tiers.c | 44 ++++++++++++++++++++++++++++++++++++ 3 files changed, 86 insertions(+) diff --git a/drivers/base/node.c b/drivers/base/node.c index 0ac6376ef7a1..599ed64d910f 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -20,6 +20,7 @@ #include #include #include +#include static struct bus_type node_subsys = { .name = "node", @@ -560,11 +561,49 @@ static ssize_t node_read_distance(struct device *dev, } static DEVICE_ATTR(distance, 0444, node_read_distance, NULL); +#ifdef CONFIG_TIERED_MEMORY +static ssize_t memtier_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + int node = dev->id; + int tier_index = node_get_memory_tier_id(node); + + if (tier_index != -1) + return sysfs_emit(buf, "%d\n", tier_index); + return 0; +} + +static ssize_t memtier_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + unsigned long tier; + int node = dev->id; + int ret; + + ret = kstrtoul(buf, 10, &tier); + if (ret) + return ret; + + ret = node_reset_memory_tier(node, tier); + if (ret) + return ret; + + return count; +} + +static DEVICE_ATTR_RW(memtier); +#endif + static struct attribute *node_dev_attrs[] = { &dev_attr_meminfo.attr, &dev_attr_numastat.attr, &dev_attr_distance.attr, &dev_attr_vmstat.attr, +#ifdef CONFIG_TIERED_MEMORY + &dev_attr_memtier.attr, +#endif NULL }; diff --git a/include/linux/memory-tiers.h b/include/linux/memory-tiers.h index 18dd1ab7b96e..e70f0040d845 100644 --- a/include/linux/memory-tiers.h +++ b/include/linux/memory-tiers.h @@ -20,6 +20,9 @@ extern bool numa_demotion_enabled; int node_create_and_set_memory_tier(int node, int tier); int next_demotion_node(int node); +int node_set_memory_tier(int node, int tier); +int node_get_memory_tier_id(int node); +int node_reset_memory_tier(int node, int tier); #else #define numa_demotion_enabled false static inline int next_demotion_node(int node) diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c index 51210f5efc1f..7bfdfac4d43e 100644 --- a/mm/memory-tiers.c +++ b/mm/memory-tiers.c @@ -313,6 +313,50 @@ int node_set_memory_tier(int node, int tier) return ret; } +int node_get_memory_tier_id(int node) +{ + int tier = -1; + struct memory_tier *memtier; + /* + * Make sure memory tier is not unregistered + * while it is being read. + */ + mutex_lock(&memory_tier_lock); + memtier = __node_get_memory_tier(node); + if (memtier) + tier = memtier->dev.id; + mutex_unlock(&memory_tier_lock); + + return tier; +} + +int node_reset_memory_tier(int node, int tier) +{ + struct memory_tier *current_tier; + int ret = 0; + + mutex_lock(&memory_tier_lock); + + current_tier = __node_get_memory_tier(node); + if (!current_tier || current_tier->dev.id == tier) + goto out; + + node_clear(node, current_tier->nodelist); + + ret = __node_set_memory_tier(node, tier); + if (ret) { + /* reset it back to older tier */ + node_set(node, current_tier->nodelist); + goto out; + } + + establish_migration_targets(); +out: + mutex_unlock(&memory_tier_lock); + + return ret; +} + /** * next_demotion_node() - Get the next node in the demotion path * @node: The starting node to lookup the next node