From patchwork Wed Sep 14 07:26:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12975713
From: Muchun Song
To: gregkh@linuxfoundation.org, rafael@kernel.org, mike.kravetz@oracle.com,
 songmuchun@bytedance.com, akpm@linux-foundation.org, osalvador@suse.de,
 david@redhat.com, ying.huang@intel.com, aneesh.kumar@linux.ibm.com,
 rientjes@google.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, muchun.song@linux.dev,
 Andi Kleen
Subject: [PATCH v4 1/2] mm: hugetlb: simplify per-node sysfs creation and removal
Date: Wed, 14 Sep 2022 15:26:02 +0800
Message-Id: <20220914072603.60293-2-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.37.0 (Apple Git-136)
In-Reply-To: <20220914072603.60293-1-songmuchun@bytedance.com>
References: <20220914072603.60293-1-songmuchun@bytedance.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

The following commit offloaded per-node sysfs creation and removal to a
kworker without saying why that was needed. It also said "I don't know
that this is absolutely required", so the author does not seem to have
been sure either. Since the offloading only complicates the code, revert
it.

  39da08cb074c ("hugetlb: offload per node attribute registrations")

We can use a memory hotplug notifier to do per-node sysfs creation and
removal instead of hooking those operations into node registration and
unregistration. This reduces the coupling between node.c and hugetlb.c
and simplifies the code.

Signed-off-by: Muchun Song
Acked-by: Mike Kravetz
Cc: Andi Kleen
Cc: David Hildenbrand
Cc: Greg Kroah-Hartman
Cc: Muchun Song
Cc: Oscar Salvador
Cc: Rafael J. Wysocki
Signed-off-by: Andrew Morton
Acked-by: David Hildenbrand
---
 drivers/base/node.c  | 139 ++-------------------------------------------------
 include/linux/node.h |  24 ++-------
 mm/hugetlb.c         |  35 ++++++++-----
 3 files changed, 30 insertions(+), 168 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index eb0f43784c2b..ed391cb09999 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -587,64 +587,9 @@ static const struct attribute_group *node_dev_groups[] = {
 	NULL
 };
 
-#ifdef CONFIG_HUGETLBFS
-/*
- * hugetlbfs per node attributes registration interface:
- * When/if hugetlb[fs] subsystem initializes [sometime after this module],
- * it will register its per node attributes for all online nodes with
- * memory. It will also call register_hugetlbfs_with_node(), below, to
- * register its attribute registration functions with this node driver.
- * Once these hooks have been initialized, the node driver will call into
- * the hugetlb module to [un]register attributes for hot-plugged nodes.
- */
-static node_registration_func_t __hugetlb_register_node;
-static node_registration_func_t __hugetlb_unregister_node;
-
-static inline bool hugetlb_register_node(struct node *node)
-{
-	if (__hugetlb_register_node &&
-			node_state(node->dev.id, N_MEMORY)) {
-		__hugetlb_register_node(node);
-		return true;
-	}
-	return false;
-}
-
-static inline void hugetlb_unregister_node(struct node *node)
-{
-	if (__hugetlb_unregister_node)
-		__hugetlb_unregister_node(node);
-}
-
-void register_hugetlbfs_with_node(node_registration_func_t doregister,
-				  node_registration_func_t unregister)
-{
-	__hugetlb_register_node = doregister;
-	__hugetlb_unregister_node = unregister;
-}
-#else
-static inline void hugetlb_register_node(struct node *node) {}
-
-static inline void hugetlb_unregister_node(struct node *node) {}
-#endif
-
 static void node_device_release(struct device *dev)
 {
-	struct node *node = to_node(dev);
-
-#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_HUGETLBFS)
-	/*
-	 * We schedule the work only when a memory section is
-	 * onlined/offlined on this node. When we come here,
-	 * all the memory on this node has been offlined,
-	 * so we won't enqueue new work to this work.
-	 *
-	 * The work is using node->node_work, so we should
-	 * flush work before freeing the memory.
-	 */
-	flush_work(&node->node_work);
-#endif
-	kfree(node);
+	kfree(to_node(dev));
 }
 
 /*
@@ -665,11 +610,9 @@ static int register_node(struct node *node, int num)
 
 	if (error)
 		put_device(&node->dev);
-	else {
-		hugetlb_register_node(node);
-
+	else
 		compaction_register_node(node);
-	}
+
 	return error;
 }
 
@@ -683,7 +626,6 @@ static int register_node(struct node *node, int num)
 void unregister_node(struct node *node)
 {
 	compaction_unregister_node(node);
-	hugetlb_unregister_node(node);		/* no-op, if memoryless node */
 	node_remove_accesses(node);
 	node_remove_caches(node);
 	device_unregister(&node->dev);
@@ -905,74 +847,8 @@ void register_memory_blocks_under_node(int nid, unsigned long start_pfn,
 			   (void *)&nid, func);
 	return;
 }
-
-#ifdef CONFIG_HUGETLBFS
-/*
- * Handle per node hstate attribute [un]registration on transistions
- * to/from memoryless state.
- */
-static void node_hugetlb_work(struct work_struct *work)
-{
-	struct node *node = container_of(work, struct node, node_work);
-
-	/*
-	 * We only get here when a node transitions to/from memoryless state.
-	 * We can detect which transition occurred by examining whether the
-	 * node has memory now. hugetlb_register_node() already check this
-	 * so we try to register the attributes. If that fails, then the
-	 * node has transitioned to memoryless, try to unregister the
-	 * attributes.
-	 */
-	if (!hugetlb_register_node(node))
-		hugetlb_unregister_node(node);
-}
-
-static void init_node_hugetlb_work(int nid)
-{
-	INIT_WORK(&node_devices[nid]->node_work, node_hugetlb_work);
-}
-
-static int node_memory_callback(struct notifier_block *self,
-				unsigned long action, void *arg)
-{
-	struct memory_notify *mnb = arg;
-	int nid = mnb->status_change_nid;
-
-	switch (action) {
-	case MEM_ONLINE:
-	case MEM_OFFLINE:
-		/*
-		 * offload per node hstate [un]registration to a work thread
-		 * when transitioning to/from memoryless state.
-		 */
-		if (nid != NUMA_NO_NODE)
-			schedule_work(&node_devices[nid]->node_work);
-		break;
-
-	case MEM_GOING_ONLINE:
-	case MEM_GOING_OFFLINE:
-	case MEM_CANCEL_ONLINE:
-	case MEM_CANCEL_OFFLINE:
-	default:
-		break;
-	}
-
-	return NOTIFY_OK;
-}
-#endif	/* CONFIG_HUGETLBFS */
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
-#if !defined(CONFIG_MEMORY_HOTPLUG) || !defined(CONFIG_HUGETLBFS)
-static inline int node_memory_callback(struct notifier_block *self,
-				unsigned long action, void *arg)
-{
-	return NOTIFY_OK;
-}
-
-static void init_node_hugetlb_work(int nid) { }
-
-#endif
-
 int __register_one_node(int nid)
 {
 	int error;
@@ -991,8 +867,6 @@ int __register_one_node(int nid)
 	}
 
 	INIT_LIST_HEAD(&node_devices[nid]->access_list);
-	/* initialize work queue for memory hot plug */
-	init_node_hugetlb_work(nid);
 	node_init_caches(nid);
 
 	return error;
@@ -1063,13 +937,8 @@ static const struct attribute_group *cpu_root_attr_groups[] = {
 	NULL,
 };
 
-#define NODE_CALLBACK_PRI	2	/* lower than SLAB */
 void __init node_dev_init(void)
 {
-	static struct notifier_block node_memory_callback_nb = {
-		.notifier_call = node_memory_callback,
-		.priority = NODE_CALLBACK_PRI,
-	};
 	int ret, i;
 
 	BUILD_BUG_ON(ARRAY_SIZE(node_state_attr) != NR_NODE_STATES);
@@ -1079,8 +948,6 @@ void __init node_dev_init(void)
 	if (ret)
 		panic("%s() failed to register subsystem: %d\n", __func__, ret);
 
-	register_hotmemory_notifier(&node_memory_callback_nb);
-
 	/*
 	 * Create all node devices, which will properly link the node
 	 * to applicable memory block devices and already created cpu devices.
diff --git a/include/linux/node.h b/include/linux/node.h
index 9ec680dd607f..427a5975cf40 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -2,15 +2,15 @@
 /*
  * include/linux/node.h - generic node definition
  *
- * This is mainly for topological representation. We define the
- * basic 'struct node' here, which can be embedded in per-arch
+ * This is mainly for topological representation. We define the
+ * basic 'struct node' here, which can be embedded in per-arch
  * definitions of processors.
  *
  * Basic handling of the devices is done in drivers/base/node.c
- * and system devices are handled in drivers/base/sys.c.
+ * and system devices are handled in drivers/base/sys.c.
  *
  * Nodes are exported via driverfs in the class/node/devices/
- * directory.
+ * directory.
  */
 #ifndef _LINUX_NODE_H_
 #define _LINUX_NODE_H_
@@ -18,7 +18,6 @@
 #include
 #include
 #include
-#include
 
 /**
  * struct node_hmem_attrs - heterogeneous memory performance attributes
@@ -84,10 +83,6 @@ static inline void node_set_perf_attrs(unsigned int nid,
 struct node {
 	struct device	dev;
 	struct list_head access_list;
-
-#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_HUGETLBFS)
-	struct work_struct	node_work;
-#endif
 #ifdef CONFIG_HMEM_REPORTING
 	struct list_head cache_attrs;
 	struct device *cache_dev;
@@ -96,7 +91,6 @@ struct node {
 
 struct memory_block;
 extern struct node *node_devices[];
-typedef  void (*node_registration_func_t)(struct node *);
 
 #if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_NUMA)
 void register_memory_blocks_under_node(int nid, unsigned long start_pfn,
@@ -144,11 +138,6 @@ extern void unregister_memory_block_under_nodes(struct memory_block *mem_blk);
 extern int register_memory_node_under_compute_node(unsigned int mem_nid,
 						   unsigned int cpu_nid,
 						   unsigned access);
-
-#ifdef CONFIG_HUGETLBFS
-extern void register_hugetlbfs_with_node(node_registration_func_t doregister,
-					 node_registration_func_t unregister);
-#endif
 #else
 static inline void node_dev_init(void)
 {
@@ -176,11 +165,6 @@ static inline int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
 static inline void unregister_memory_block_under_nodes(struct memory_block *mem_blk)
 {
 }
-
-static inline void register_hugetlbfs_with_node(node_registration_func_t reg,
-						node_registration_func_t unreg)
-{
-}
 #endif
 
 #define to_node(device) container_of(device, struct node, dev)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c6b53bcf823d..0a37e80730b7 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -33,6 +33,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -3998,6 +3999,23 @@ static void hugetlb_register_node(struct node *node)
 	}
 }
 
+static int __meminit hugetlb_memory_callback(struct notifier_block *self,
+					     unsigned long action, void *arg)
+{
+	struct memory_notify *mnb = arg;
+	int nid = mnb->status_change_nid;
+
+	if (nid == NUMA_NO_NODE)
+		return NOTIFY_DONE;
+
+	if (action == MEM_GOING_ONLINE)
+		hugetlb_register_node(node_devices[nid]);
+	else if (action == MEM_CANCEL_ONLINE || action == MEM_OFFLINE)
+		hugetlb_unregister_node(node_devices[nid]);
+
+	return NOTIFY_OK;
+}
+
 /*
  * hugetlb init time: register hstate attributes for all registered node
  * devices of nodes that have memory. All on-line nodes should have
@@ -4007,18 +4025,11 @@ static void __init hugetlb_register_all_nodes(void)
 {
 	int nid;
 
-	for_each_node_state(nid, N_MEMORY) {
-		struct node *node = node_devices[nid];
-		if (node->dev.id == nid)
-			hugetlb_register_node(node);
-	}
-
-	/*
-	 * Let the node device driver know we're here so it can
-	 * [un]register hstate attributes on node hotplug.
-	 */
-	register_hugetlbfs_with_node(hugetlb_register_node,
-				     hugetlb_unregister_node);
+	get_online_mems();
+	hotplug_memory_notifier(hugetlb_memory_callback, 0);
+	for_each_node_state(nid, N_MEMORY)
+		hugetlb_register_node(node_devices[nid]);
+	put_online_mems();
 }
 
 #else /* !CONFIG_NUMA */

From patchwork Wed Sep 14 07:26:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12975714
From: Muchun Song
To: gregkh@linuxfoundation.org, rafael@kernel.org, mike.kravetz@oracle.com,
 songmuchun@bytedance.com, akpm@linux-foundation.org, osalvador@suse.de,
 david@redhat.com, ying.huang@intel.com, aneesh.kumar@linux.ibm.com,
 rientjes@google.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, muchun.song@linux.dev
Subject: [PATCH v4 2/2] mm: hugetlb: eliminate memory-less nodes handling
Date: Wed, 14 Sep 2022 15:26:03 +0800
Message-Id: <20220914072603.60293-3-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.37.0 (Apple Git-136)
In-Reply-To: <20220914072603.60293-1-songmuchun@bytedance.com>
References: <20220914072603.60293-1-songmuchun@bytedance.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

The memory-notifier-based approach aims to handle memory-less nodes.
However, it only adds code complexity, as David pointed out in thread
[1]. The handling of memory-less nodes was introduced by commit
4faf8d950ec4 ("hugetlb: handle memory hot-plug events"), whose commit
message gives no reason why this case needs to be handled. So we can
simply register/unregister the sysfs entries in
register_node()/unregister_node() to simplify the code.

BTW, the hotplug callback was added because hugetlb_register_all_nodes()
registers sysfs nodes only for N_MEMORY nodes; see commit 9b5e5d0fdc91,
which said it was a preparation for handling memory-less nodes via
memory hotplug. Since we want to remove the memory hotplug handling,
make sure hugetlb_register_all_nodes() registers per-node sysfs for
online (N_ONLINE) nodes.

https://lore.kernel.org/linux-mm/60933ffc-b850-976c-78a0-0ee6e0ea9ef0@redhat.com/ [1]

Suggested-by: David Hildenbrand
Signed-off-by: Muchun Song
Acked-by: David Hildenbrand
---
v4:
 - Remove hugetlb_mark_sysfs_initialized() helper per David.

v3:
 - Fix the 'struct node' is not declared error reported by LTP.

v2:
 - Move declaration of function related to hugetlb to hugetlb.h (David).
 - Introduce hugetlb_sysfs_initialized() and call it from
   hugetlb_sysfs_init() (David).
 - Move hugetlb_register_all_nodes() into hugetlb_sysfs_init() (David).
 - Fix implicit-function-declaration reported by LKP.
 - Register per-node sysfs for online (N_ONLINE) nodes instead of
   N_MEMORY (Aneesh).

 drivers/base/node.c     |  8 ++++--
 include/linux/hugetlb.h | 14 ++++++++++
 mm/hugetlb.c            | 70 ++++++++++++++++++++-----------------------------
 3 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index ed391cb09999..80b1e91b9608 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 
 static struct bus_type node_subsys = {
 	.name = "node",
@@ -608,10 +609,12 @@ static int register_node(struct node *node, int num)
 	node->dev.groups = node_dev_groups;
 	error = device_register(&node->dev);
 
-	if (error)
+	if (error) {
 		put_device(&node->dev);
-	else
+	} else {
+		hugetlb_register_node(node);
 		compaction_register_node(node);
+	}
 
 	return error;
 }
@@ -625,6 +628,7 @@ static int register_node(struct node *node, int num)
  */
 void unregister_node(struct node *node)
 {
+	hugetlb_unregister_node(node);
 	compaction_unregister_node(node);
 	node_remove_accesses(node);
 	node_remove_caches(node);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 890f7b6a2eff..a6fc49db0ce0 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -16,6 +16,7 @@
 struct ctl_table;
 struct user_struct;
 struct mmu_gather;
+struct node;
 
 #ifndef is_hugepd
 typedef struct { unsigned long pd; } hugepd_t;
@@ -935,6 +936,11 @@ static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 }
 #endif
 
+#ifdef CONFIG_NUMA
+void hugetlb_register_node(struct node *node);
+void hugetlb_unregister_node(struct node *node);
+#endif
+
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
@@ -1109,6 +1115,14 @@ static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 				   pte_t *ptep, pte_t pte)
 {
 }
+
+static inline void hugetlb_register_node(struct node *node)
+{
+}
+
+static inline void hugetlb_unregister_node(struct node *node)
+{
+}
 #endif	/* CONFIG_HUGETLB_PAGE */
 
 static inline spinlock_t *huge_pte_lock(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0a37e80730b7..d776d55a97b2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3869,24 +3869,8 @@ static int hugetlb_sysfs_add_hstate(struct hstate *h, struct kobject *parent,
 	return 0;
 }
 
-static void __init hugetlb_sysfs_init(void)
-{
-	struct hstate *h;
-	int err;
-
-	hugepages_kobj = kobject_create_and_add("hugepages", mm_kobj);
-	if (!hugepages_kobj)
-		return;
-
-	for_each_hstate(h) {
-		err = hugetlb_sysfs_add_hstate(h, hugepages_kobj,
-					 hstate_kobjs, &hstate_attr_group);
-		if (err)
-			pr_err("HugeTLB: Unable to add hstate %s", h->name);
-	}
-}
-
 #ifdef CONFIG_NUMA
+static bool hugetlb_sysfs_initialized __ro_after_init;
 
 /*
  * node_hstate/s - associate per node hstate attributes, via their kobjects,
@@ -3942,7 +3926,7 @@ static struct hstate *kobj_to_node_hstate(struct kobject *kobj, int *nidp)
  * Unregister hstate attributes from a single node device.
  * No-op if no hstate attributes attached.
  */
-static void hugetlb_unregister_node(struct node *node)
+void hugetlb_unregister_node(struct node *node)
 {
 	struct hstate *h;
 	struct node_hstate *nhs = &node_hstates[node->dev.id];
@@ -3972,12 +3956,15 @@ static void hugetlb_unregister_node(struct node *node)
  * Register hstate attributes for a single node device.
  * No-op if attributes already registered.
  */
-static void hugetlb_register_node(struct node *node)
+void hugetlb_register_node(struct node *node)
 {
 	struct hstate *h;
 	struct node_hstate *nhs = &node_hstates[node->dev.id];
 	int err;
 
+	if (!hugetlb_sysfs_initialized)
+		return;
+
 	if (nhs->hugepages_kobj)
 		return;		/* already allocated */
 
@@ -3999,23 +3986,6 @@ static void hugetlb_register_node(struct node *node)
 	}
 }
 
-static int __meminit hugetlb_memory_callback(struct notifier_block *self,
-					     unsigned long action, void *arg)
-{
-	struct memory_notify *mnb = arg;
-	int nid = mnb->status_change_nid;
-
-	if (nid == NUMA_NO_NODE)
-		return NOTIFY_DONE;
-
-	if (action == MEM_GOING_ONLINE)
-		hugetlb_register_node(node_devices[nid]);
-	else if (action == MEM_CANCEL_ONLINE || action == MEM_OFFLINE)
-		hugetlb_unregister_node(node_devices[nid]);
-
-	return NOTIFY_OK;
-}
-
 /*
  * hugetlb init time: register hstate attributes for all registered node
  * devices of nodes that have memory. All on-line nodes should have
@@ -4025,11 +3995,8 @@ static void __init hugetlb_register_all_nodes(void)
 {
 	int nid;
 
-	get_online_mems();
-	hotplug_memory_notifier(hugetlb_memory_callback, 0);
-	for_each_node_state(nid, N_MEMORY)
+	for_each_online_node(nid)
 		hugetlb_register_node(node_devices[nid]);
-	put_online_mems();
 }
 
 #else /* !CONFIG_NUMA */
@@ -4053,6 +4020,28 @@ static inline __init void hugetlb_cma_check(void)
 }
 #endif
 
+static void __init hugetlb_sysfs_init(void)
+{
+	struct hstate *h;
+	int err;
+
+	hugepages_kobj = kobject_create_and_add("hugepages", mm_kobj);
+	if (!hugepages_kobj)
+		return;
+
+	for_each_hstate(h) {
+		err = hugetlb_sysfs_add_hstate(h, hugepages_kobj,
+					 hstate_kobjs, &hstate_attr_group);
+		if (err)
+			pr_err("HugeTLB: Unable to add hstate %s", h->name);
+	}
+
+#ifdef CONFIG_NUMA
+	hugetlb_sysfs_initialized = true;
+#endif
+	hugetlb_register_all_nodes();
+}
+
 static int __init hugetlb_init(void)
 {
 	int i;
@@ -4107,7 +4096,6 @@ static int __init hugetlb_init(void)
 	report_hugepages();
 
 	hugetlb_sysfs_init();
-	hugetlb_register_all_nodes();
 	hugetlb_cgroup_file_init();
 
 #ifdef CONFIG_SMP