From patchwork Fri Jul 26 08:44:56 2024
X-Patchwork-Submitter: Li Zhijian
X-Patchwork-Id: 13742529
From: Li Zhijian
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Oscar Salvador,
 linux-kernel@vger.kernel.org, Yasunori Gotou, Li Zhijian
Subject: [PATCH RFC] mm: Avoid triggering oom-killer during memory hot-remove operations
Date: Fri, 26 Jul 2024 16:44:56 +0800
Message-Id: <20240726084456.1309928-1-lizhijian@fujitsu.com>

When a process is bound to a node that is being hot-removed, any memory
allocation attempts from that node should fail gracefully without
triggering the OOM-killer. However, the current behavior can cause the
oom-killer to be invoked, leading to the termination of processes on other
nodes, even when there is sufficient memory available in the system.

Prevent the oom-killer from being triggered by processes bound to a node
undergoing hot-remove operations.
Instead, the allocation attempts from the offlining node will simply fail,
allowing the process to handle the failure appropriately without causing
disruption to the system.

Signed-off-by: Li Zhijian
---
 include/linux/memory_hotplug.h |  6 ++++++
 mm/memory_hotplug.c            | 21 +++++++++++++++++++++
 mm/page_alloc.c                |  6 ++++++
 3 files changed, 33 insertions(+)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7a9ff464608d..0ca804215e11 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -332,6 +332,7 @@ extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 extern int remove_memory(u64 start, u64 size);
 extern void __remove_memory(u64 start, u64 size);
 extern int offline_and_remove_memory(u64 start, u64 size);
+bool is_offlining_node(nodemask_t nodes);
 
 #else
 static inline void try_offline_node(int nid) {}
@@ -348,6 +349,11 @@ static inline int remove_memory(u64 start, u64 size)
 }
 
 static inline void __remove_memory(u64 start, u64 size) {}
+
+static inline bool is_offlining_node(nodemask_t nodes)
+{
+	return false;
+}
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 #ifdef CONFIG_MEMORY_HOTPLUG
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 431b1f6753c0..da3982751ba9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1938,6 +1938,22 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
 	return 0;
 }
 
+static nodemask_t offlining_node = NODE_MASK_NONE;
+
+bool is_offlining_node(nodemask_t nodes)
+{
+	return nodes_equal(offlining_node, nodes);
+}
+
+static void offline_pages_start(int node)
+{
+	node_set(node, offlining_node);
+}
+
+static void offline_pages_end(void)
+{
+	offlining_node = NODE_MASK_NONE;
+}
 /*
  * Must be called with mem_hotplug_lock in write mode.
  */
@@ -1991,6 +2007,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 		goto failed_removal;
 	}
 
+	offline_pages_start(node);
 	/*
 	 * Disable pcplists so that page isolation cannot race with freeing
 	 * in a way that pages from isolated pageblock are left on pcplists.
@@ -2107,6 +2124,8 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 
 	memory_notify(MEM_OFFLINE, &arg);
 	remove_pfn_range_from_zone(zone, start_pfn, nr_pages);
+	offline_pages_end();
+
 	return 0;
 
 failed_removal_isolated:
@@ -2121,6 +2140,8 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 		 (unsigned long long) start_pfn << PAGE_SHIFT,
 		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
 		 reason);
+
+	offline_pages_end();
 	return ret;
 }
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1780df31d5f5..acdab6b114a5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3563,6 +3563,12 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto out;
 
+	/* hot-remove is on-going, it generally fails to allocate memory from
+	 * the being removed memory node. Leave it alone.
+	 */
+	if (is_offlining_node(*ac->nodemask))
+		goto out;
+
 	/* Coredumps can quickly deplete all memory reserves */
 	if (current->flags & PF_DUMPCORE)
 		goto out;
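
Illustration only, not part of the patch: a minimal user-space sketch of the
check described in the changelog, which may help when reasoning about which
allocations skip the oom-killer. It assumes a 64-bit integer as a stand-in
for nodemask_t. Every name ending in _model, and skip_oom_for_offlining(),
is hypothetical. The NULL guard is an assumption of this sketch (an
allocation context may carry no nodemask) and does not appear in the diff
above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's nodemask_t: one bit per node. */
typedef uint64_t nodemask_model_t;

#define NODE_MASK_NONE_MODEL ((nodemask_model_t)0)

/* Nodes currently being offlined, mirroring the patch's file-scope mask. */
static nodemask_model_t offlining_nodes = NODE_MASK_NONE_MODEL;

/* Mark a node as being offlined (models offline_pages_start()). */
static void offline_pages_start_model(int node)
{
	offlining_nodes |= (nodemask_model_t)1 << node;
}

/* Clear the whole mask again (models offline_pages_end()). */
static void offline_pages_end_model(void)
{
	offlining_nodes = NODE_MASK_NONE_MODEL;
}

/*
 * Return true when the OOM path should be skipped for this allocation.
 * A NULL nodemask means the allocation is not restricted to given nodes;
 * treating that as "do not skip" is an assumption of this sketch.
 * The comparison is exact equality, as with nodes_equal() in the patch.
 */
static bool skip_oom_for_offlining(const nodemask_model_t *nodemask)
{
	if (!nodemask)
		return false;
	return *nodemask == offlining_nodes;
}
```

Because the comparison is exact equality, this model skips the oom-killer
only when the allocation's nodemask names precisely the offlining node. A
task bound to nodes 0 and 1 while node 1 is being offlined still reaches the
OOM path, as does an allocation with no nodemask at all.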