From patchwork Thu Jul 27 08:02:29 2023
From: "Aneesh Kumar K.V"
To: linux-mm@kvack.org, akpm@linux-foundation.org, mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Oscar Salvador, David Hildenbrand, Michal Hocko, Vishal Verma, "Aneesh Kumar K.V"
Subject: [PATCH v6 4/7] mm/memory_hotplug: Support memmap_on_memory when memmap is not aligned to pageblocks
Date: Thu, 27 Jul 2023 13:32:29 +0530
Message-ID: <20230727080232.667439-5-aneesh.kumar@linux.ibm.com>
In-Reply-To: <20230727080232.667439-1-aneesh.kumar@linux.ibm.com>
References: <20230727080232.667439-1-aneesh.kumar@linux.ibm.com>
Currently, the memmap_on_memory feature is only supported with memory block sizes that result in vmemmap pages covering full pageblocks. This is because the memory onlining/offlining code requires the applicable ranges to be pageblock-aligned, for example, to set the migratetypes properly.

This patch lifts that restriction by reserving more pages than required for the vmemmap space, so that the start address of the hotplugged memory is pageblock-aligned for different memory block sizes. Using this facility implies that the kernel will reserve some pages in every memory block.
This makes the memmap on memory feature more widely useful across different memory block sizes. For example, with a 64K page size and a 256MiB memory block size, we require 4 pages to map the vmemmap pages; to align things correctly, we end up adding a reserve of 28 pages. That is, for every 4096 pages, 28 pages get reserved.

Acked-by: David Hildenbrand
Signed-off-by: Aneesh Kumar K.V
---
 .../admin-guide/mm/memory-hotplug.rst |  12 ++
 mm/memory_hotplug.c                   | 120 +++++++++++++++---
 2 files changed, 113 insertions(+), 19 deletions(-)

diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index bd77841041af..2994958c7ce8 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -433,6 +433,18 @@ The following module parameters are currently defined:
 				 memory in a way that huge pages in bigger
 				 granularity cannot be formed on hotplugged
 				 memory.
+
+				 With value "force" it could result in memory
+				 wastage due to memmap size limitations. For
+				 example, if the memmap for a memory block
+				 requires 1 MiB, but the pageblock size is 2
+				 MiB, 1 MiB of hotplugged memory will be wasted.
+				 Note that there are still cases where the
+				 feature cannot be enforced: for example, if the
+				 memmap is smaller than a single page, or if the
+				 architecture does not support the forced mode
+				 in all configurations.
+
 ``online_policy``		 read-write: Set the basic policy used for
 				 automatic zone selection when onlining memory
 				 blocks without specifying a target zone.
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 746cb7c08c64..fe94feb32d71 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -41,17 +41,83 @@
 #include "internal.h"
 #include "shuffle.h"
 
+enum {
+	MEMMAP_ON_MEMORY_DISABLE = 0,
+	MEMMAP_ON_MEMORY_ENABLE,
+	MEMMAP_ON_MEMORY_FORCE,
+};
+
+static int memmap_mode __read_mostly = MEMMAP_ON_MEMORY_DISABLE;
+
+static inline unsigned long memory_block_memmap_size(void)
+{
+	return PHYS_PFN(memory_block_size_bytes()) * sizeof(struct page);
+}
+
+static inline unsigned long memory_block_memmap_on_memory_pages(void)
+{
+	unsigned long nr_pages = PFN_UP(memory_block_memmap_size());
+
+	/*
+	 * In "forced" memmap_on_memory mode, we add extra pages to align the
+	 * vmemmap size to cover full pageblocks. That way, we can add memory
+	 * even if the vmemmap size is not properly aligned, however, we might
+	 * waste memory.
+	 */
+	if (memmap_mode == MEMMAP_ON_MEMORY_FORCE)
+		return pageblock_align(nr_pages);
+	return nr_pages;
+}
+
 #ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
 /*
  * memory_hotplug.memmap_on_memory parameter
  */
-static bool memmap_on_memory __ro_after_init;
-module_param(memmap_on_memory, bool, 0444);
-MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
+static int set_memmap_mode(const char *val, const struct kernel_param *kp)
+{
+	int ret, mode;
+	bool enabled;
+
+	if (sysfs_streq(val, "force") || sysfs_streq(val, "FORCE")) {
+		mode = MEMMAP_ON_MEMORY_FORCE;
+	} else {
+		ret = kstrtobool(val, &enabled);
+		if (ret < 0)
+			return ret;
+		if (enabled)
+			mode = MEMMAP_ON_MEMORY_ENABLE;
+		else
+			mode = MEMMAP_ON_MEMORY_DISABLE;
+	}
+	*((int *)kp->arg) = mode;
+	if (mode == MEMMAP_ON_MEMORY_FORCE) {
+		unsigned long memmap_pages = memory_block_memmap_on_memory_pages();
+
+		pr_info_once("Memory hotplug will reserve %ld pages in each memory block\n",
+			     memmap_pages - PFN_UP(memory_block_memmap_size()));
+	}
+	return 0;
+}
+
+static int get_memmap_mode(char *buffer, const struct kernel_param *kp)
+{
+	if (*((int *)kp->arg) == MEMMAP_ON_MEMORY_FORCE)
+		return sprintf(buffer, "force\n");
+	return param_get_bool(buffer, kp);
+}
+
+static const struct kernel_param_ops memmap_mode_ops = {
+	.set = set_memmap_mode,
+	.get = get_memmap_mode,
+};
+module_param_cb(memmap_on_memory, &memmap_mode_ops, &memmap_mode, 0444);
+MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug\n"
+		 "With value \"force\" it could result in memory wastage due "
+		 "to memmap size limitations (Y/N/force)");
 
 static inline bool mhp_memmap_on_memory(void)
 {
-	return memmap_on_memory;
+	return memmap_mode != MEMMAP_ON_MEMORY_DISABLE;
 }
 #else
 static inline bool mhp_memmap_on_memory(void)
@@ -1247,11 +1313,6 @@ static int online_memory_block(struct memory_block *mem, void *arg)
 	return device_online(&mem->dev);
 }
 
-static inline unsigned long memory_block_memmap_size(void)
-{
-	return PHYS_PFN(memory_block_size_bytes()) * sizeof(struct page);
-}
-
 #ifndef arch_supports_memmap_on_memory
 static inline bool arch_supports_memmap_on_memory(unsigned long vmemmap_size)
 {
@@ -1267,7 +1328,7 @@ static inline bool arch_supports_memmap_on_memory(unsigned long vmemmap_size)
 static bool mhp_supports_memmap_on_memory(unsigned long size)
 {
 	unsigned long vmemmap_size = memory_block_memmap_size();
-	unsigned long remaining_size = size - vmemmap_size;
+	unsigned long memmap_pages = memory_block_memmap_on_memory_pages();
 
 	/*
 	 * Besides having arch support and the feature enabled at runtime, we
@@ -1295,10 +1356,28 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)
 	 * altmap as an alternative source of memory, and we do not exactly
 	 * populate a single PMD.
 	 */
-	return mhp_memmap_on_memory() &&
-	       size == memory_block_size_bytes() &&
-	       IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT)) &&
-	       arch_supports_memmap_on_memory(vmemmap_size);
+	if (!mhp_memmap_on_memory() || size != memory_block_size_bytes())
+		return false;
+
+	/*
+	 * Make sure the vmemmap allocation is fully contained
+	 * so that we always allocate vmemmap memory from altmap area.
+	 */
+	if (!IS_ALIGNED(vmemmap_size, PAGE_SIZE))
+		return false;
+
+	/*
+	 * start pfn should be pageblock_nr_pages aligned for correctly
+	 * setting migrate types
+	 */
+	if (!pageblock_aligned(memmap_pages))
+		return false;
+
+	if (memmap_pages == PHYS_PFN(memory_block_size_bytes()))
+		/* No effective hotplugged memory doesn't make sense. */
+		return false;
+
+	return arch_supports_memmap_on_memory(vmemmap_size);
 }
 
 /*
@@ -1311,7 +1390,10 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 {
 	struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };
 	enum memblock_flags memblock_flags = MEMBLOCK_NONE;
-	struct vmem_altmap mhp_altmap = {};
+	struct vmem_altmap mhp_altmap = {
+		.base_pfn = PHYS_PFN(res->start),
+		.end_pfn = PHYS_PFN(res->end),
+	};
 	struct memory_group *group = NULL;
 	u64 start, size;
 	bool new_node = false;
@@ -1356,8 +1438,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	 */
 	if (mhp_flags & MHP_MEMMAP_ON_MEMORY) {
 		if (mhp_supports_memmap_on_memory(size)) {
-			mhp_altmap.free = PHYS_PFN(size);
-			mhp_altmap.base_pfn = PHYS_PFN(start);
+			mhp_altmap.free = memory_block_memmap_on_memory_pages();
 			params.altmap = &mhp_altmap;
 		}
 		/* fallback to not using altmap */
@@ -1369,8 +1450,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 		goto error;
 
 	/* create memory block devices after memory was added */
-	ret = create_memory_block_devices(start, size, mhp_altmap.alloc,
-					  group);
+	ret = create_memory_block_devices(start, size, mhp_altmap.free, group);
 	if (ret) {
 		arch_remove_memory(start, size, NULL);
 		goto error;
@@ -2096,6 +2176,8 @@ static int __ref try_remove_memory(u64 start, u64 size)
 			 * right thing if we used vmem_altmap when hot-adding
 			 * the range.
 			 */
+			mhp_altmap.base_pfn = PHYS_PFN(start);
+			mhp_altmap.free = nr_vmemmap_pages;
 			mhp_altmap.alloc = nr_vmemmap_pages;
 			altmap = &mhp_altmap;
 		}

From patchwork Thu Jul 27 08:02:30 2023
From: "Aneesh Kumar K.V"
To: linux-mm@kvack.org, akpm@linux-foundation.org, mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Oscar Salvador, David Hildenbrand, Michal Hocko, Vishal Verma, "Aneesh Kumar K.V"
Subject: [PATCH v6 5/7] powerpc/book3s64/memhotplug: Enable memmap on memory for radix
Date: Thu, 27 Jul 2023 13:32:30 +0530
Message-ID: <20230727080232.667439-6-aneesh.kumar@linux.ibm.com>
In-Reply-To: <20230727080232.667439-1-aneesh.kumar@linux.ibm.com>
References: <20230727080232.667439-1-aneesh.kumar@linux.ibm.com>
Radix vmemmap mapping can map things correctly at the PMD level or PTE level based on different device boundary checks. Hence we skip the restriction that the vmemmap size must be a multiple of PMD_SIZE. This also makes the feature more widely useful, because using a PMD_SIZE vmemmap area would require a memory block size of 2GiB.

We can also use MHP_RESERVE_PAGES_MEMMAP_ON_MEMORY so that the feature can work with a memory block size of 256MB. The altmap.reserve feature is used to align things correctly at pageblock granularity; we can end up losing some pages in memory with this.
For ex: with a 256MiB memory block size, we require 4 pages to map vmemmap pages, In order to align things correctly we end up adding a reserve of 28 pages. ie, for every 4096 pages 28 pages get reserved. Reviewed-by: David Hildenbrand Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/pgtable.h | 21 +++++++++++++++++++ .../platforms/pseries/hotplug-memory.c | 2 +- 3 files changed, 23 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index d0497d13f5b4..938294c996dc 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -157,6 +157,7 @@ config PPC select ARCH_HAS_UBSAN_SANITIZE_ALL select ARCH_HAVE_NMI_SAFE_CMPXCHG select ARCH_KEEP_MEMBLOCK + select ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE if PPC_RADIX_MMU select ARCH_MIGHT_HAVE_PC_PARPORT select ARCH_MIGHT_HAVE_PC_SERIO select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index a4893b17705a..33464e6d6431 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -161,6 +161,27 @@ static inline pgtable_t pmd_pgtable(pmd_t pmd) int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_map_size); bool altmap_cross_boundary(struct vmem_altmap *altmap, unsigned long start, unsigned long page_size); +/* + * mm/memory_hotplug.c:mhp_supports_memmap_on_memory goes into details + * some of the restrictions. We don't check for PMD_SIZE because our + * vmemmap allocation code can fallback correctly. The pageblock + * alignment requirement is met using altmap->reserve blocks. + */ +#define arch_supports_memmap_on_memory arch_supports_memmap_on_memory +static inline bool arch_supports_memmap_on_memory(unsigned long vmemmap_size) +{ + if (!radix_enabled()) + return false; + /* + * With 4K page size and 2M PMD_SIZE, we can align + * things better with memory block size value + * starting from 128MB. 
Hence align things with PMD_SIZE. + */ + if (IS_ENABLED(CONFIG_PPC_4K_PAGES)) + return IS_ALIGNED(vmemmap_size, PMD_SIZE); + return true; +} + #endif /* CONFIG_PPC64 */ #endif /* __ASSEMBLY__ */ diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 9c62c2c3b3d0..4f3d6a2f9065 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -637,7 +637,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb) nid = first_online_node; /* Add the memory */ - rc = __add_memory(nid, lmb->base_addr, block_sz, MHP_NONE); + rc = __add_memory(nid, lmb->base_addr, block_sz, MHP_MEMMAP_ON_MEMORY); if (rc) { invalidate_lmb_associativity_index(lmb); return rc; From patchwork Thu Jul 27 08:02:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Aneesh Kumar K.V" X-Patchwork-Id: 13329618 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D794C001DC for ; Thu, 27 Jul 2023 08:26:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id CAD4A8D000F; Thu, 27 Jul 2023 04:26:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C374E8D0001; Thu, 27 Jul 2023 04:26:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AB03A8D000F; Thu, 27 Jul 2023 04:26:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 945268D0001 for ; Thu, 27 Jul 2023 04:26:44 -0400 (EDT) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 58BC14050F for ; Thu, 27 Jul 2023 08:26:44 
+0000 (UTC) X-FDA: 81056710728.12.AE3C7C4 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by imf10.hostedemail.com (Postfix) with ESMTP id 95DA6C000C for ; Thu, 27 Jul 2023 08:26:41 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=ibm.com header.s=pp1 header.b=ZNmu7e3E; spf=pass (imf10.hostedemail.com: domain of aneesh.kumar@linux.ibm.com designates 148.163.156.1 as permitted sender) smtp.mailfrom=aneesh.kumar@linux.ibm.com; dmarc=pass (policy=none) header.from=ibm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1690446402; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=yGUbPuvFqM/wiCBxHfgkj+nAWDDl2gtRf+klnrJT/v0=; b=tTiM5MhN29kSW/ZdDv9/fcit/Q4XDg/HEK4zfSi26HgqBDIjSrFJQn4OlV+N/N/lmtix7U TBEFTNsGiNzEi4QP1N3PKXhXOy/X07R3/nGe8tiV78Di4qaZO3GXE4u1FPBPkxKoHwqP4d VB7E890QQ5SxiW2/rlLV6VBJal9cl5A= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1690446402; a=rsa-sha256; cv=none; b=pPr2NUUYWbwHSLvsKTePeGIskhvBrGRzNDTy3iK1oprrJKVnzYXNGTuyuRCcybejzeATPV ggmWj3qAiVkftxlNH1cEyjDzg0xtdCgYU2T3eBB8Hc3S3TgRx/Jae8W8Vd1M0sNLwZqQsL hg9AHueGpXG7rCtvtQmzFlZq8BmcWUA= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=ibm.com header.s=pp1 header.b=ZNmu7e3E; spf=pass (imf10.hostedemail.com: domain of aneesh.kumar@linux.ibm.com designates 148.163.156.1 as permitted sender) smtp.mailfrom=aneesh.kumar@linux.ibm.com; dmarc=pass (policy=none) header.from=ibm.com Received: from pps.filterd (m0356517.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 36R8AKUx023378; Thu, 27 Jul 2023 08:26:33 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : 
in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=yGUbPuvFqM/wiCBxHfgkj+nAWDDl2gtRf+klnrJT/v0=; b=ZNmu7e3EsAcMJ+MCxN03U4oQTG0kLiUMFk0/GZ+J1QOc/wj/jGrdTu10AyoNJAFtPxJA 4lyc+qRCVaAh12jKzMl8pmanmyk8YG3QkRs4CV3fKNl8wt0PTyX7+cpgYbwUD9N69yyK GEyKJDnVkRbZn2ofdVo0Ol1W4PGT5DjdMLP6C55kVrPwrvDbW36QYHVqoR6JfhD1afBq aSAXAmR/efbCFFPrax8/MHhaNJ6bF24c0iP1CuaIMSkySVqJZMFz5OtujJCUmvLQHdfN e24G9wh3CTjEaVvV7m0rcn7hv+cKAXaTswkx49LtUnJGVsyKqw/UOUzc+suHe7a/bj6i GQ== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3s3m52hh2g-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 27 Jul 2023 08:26:32 +0000 Received: from m0356517.ppops.net (m0356517.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 36R8ApSD024823; Thu, 27 Jul 2023 08:26:32 GMT Received: from ppma23.wdc07v.mail.ibm.com (5d.69.3da9.ip4.static.sl-reverse.com [169.61.105.93]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3s3m52hh1k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 27 Jul 2023 08:26:32 +0000 Received: from pps.filterd (ppma23.wdc07v.mail.ibm.com [127.0.0.1]) by ppma23.wdc07v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 36R700ho003634; Thu, 27 Jul 2023 08:03:11 GMT Received: from smtprelay02.dal12v.mail.ibm.com ([172.16.1.4]) by ppma23.wdc07v.mail.ibm.com (PPS) with ESMTPS id 3s0txkbm5s-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 27 Jul 2023 08:03:11 +0000 Received: from smtpav01.wdc07v.mail.ibm.com (smtpav01.wdc07v.mail.ibm.com [10.39.53.228]) by smtprelay02.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 36R83AhM44827132 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 27 Jul 2023 08:03:11 GMT Received: from smtpav01.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id CA83D58065; Thu, 27 Jul 2023 08:03:10 +0000 (GMT) Received: from 
From: "Aneesh Kumar K.V"
To: linux-mm@kvack.org, akpm@linux-foundation.org, mpe@ellerman.id.au,
 linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Oscar Salvador, David Hildenbrand, Michal Hocko, Vishal Verma,
 "Aneesh Kumar K.V"
Subject: [PATCH v6 6/7] mm/memory_hotplug: Embed vmem_altmap details in memory block
Date: Thu, 27 Jul 2023 13:32:31 +0530
Message-ID: <20230727080232.667439-7-aneesh.kumar@linux.ibm.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230727080232.667439-1-aneesh.kumar@linux.ibm.com>
References: <20230727080232.667439-1-aneesh.kumar@linux.ibm.com>
MIME-Version: 1.0
With memmap on memory, some architectures need more details w.r.t. the
altmap, such as base_pfn and end_pfn, to unmap the vmemmap memory. Instead
of computing them again when we remove a memory block, embed the
vmem_altmap details in struct memory_block when the memmap-on-memory
feature is in use.
No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V
Acked-by: David Hildenbrand
---
 drivers/base/memory.c  | 25 +++++++++++-------
 include/linux/memory.h |  8 ++----
 mm/memory_hotplug.c    | 58 +++++++++++++++++++++++++++---------------
 3 files changed, 55 insertions(+), 36 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index b456ac213610..57ed61212277 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -106,6 +106,7 @@ static void memory_block_release(struct device *dev)
 {
 	struct memory_block *mem = to_memory_block(dev);
 
+	WARN_ON(mem->altmap);
 	kfree(mem);
 }
 
@@ -183,7 +184,7 @@ static int memory_block_online(struct memory_block *mem)
 {
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
-	unsigned long nr_vmemmap_pages = mem->nr_vmemmap_pages;
+	unsigned long nr_vmemmap_pages = 0;
 	struct zone *zone;
 	int ret;
 
@@ -200,6 +201,9 @@ static int memory_block_online(struct memory_block *mem)
 	 * stage helps to keep accounting easier to follow - e.g vmemmaps
 	 * belong to the same zone as the memory they backed.
 	 */
+	if (mem->altmap)
+		nr_vmemmap_pages = mem->altmap->free;
+
 	if (nr_vmemmap_pages) {
 		ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages, zone);
 		if (ret)
@@ -230,7 +234,7 @@ static int memory_block_offline(struct memory_block *mem)
 {
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
-	unsigned long nr_vmemmap_pages = mem->nr_vmemmap_pages;
+	unsigned long nr_vmemmap_pages = 0;
 	int ret;
 
 	if (!mem->zone)
@@ -240,6 +244,9 @@ static int memory_block_offline(struct memory_block *mem)
 	 * Unaccount before offlining, such that unpopulated zone and kthreads
 	 * can properly be torn down in offline_pages().
 	 */
+	if (mem->altmap)
+		nr_vmemmap_pages = mem->altmap->free;
+
 	if (nr_vmemmap_pages)
 		adjust_present_page_count(pfn_to_page(start_pfn), mem->group,
 					  -nr_vmemmap_pages);
@@ -726,7 +733,7 @@ void memory_block_add_nid(struct memory_block *mem, int nid,
 #endif
 
 static int add_memory_block(unsigned long block_id, unsigned long state,
-			    unsigned long nr_vmemmap_pages,
+			    struct vmem_altmap *altmap,
 			    struct memory_group *group)
 {
 	struct memory_block *mem;
@@ -744,7 +751,7 @@ static int add_memory_block(unsigned long block_id, unsigned long state,
 	mem->start_section_nr = block_id * sections_per_block;
 	mem->state = state;
 	mem->nid = NUMA_NO_NODE;
-	mem->nr_vmemmap_pages = nr_vmemmap_pages;
+	mem->altmap = altmap;
 	INIT_LIST_HEAD(&mem->group_next);
 
 #ifndef CONFIG_NUMA
@@ -783,14 +790,14 @@ static int __init add_boot_memory_block(unsigned long base_section_nr)
 	if (section_count == 0)
 		return 0;
 	return add_memory_block(memory_block_id(base_section_nr),
-				MEM_ONLINE, 0, NULL);
+				MEM_ONLINE, NULL, NULL);
 }
 
 static int add_hotplug_memory_block(unsigned long block_id,
-				    unsigned long nr_vmemmap_pages,
+				    struct vmem_altmap *altmap,
 				    struct memory_group *group)
 {
-	return add_memory_block(block_id, MEM_OFFLINE, nr_vmemmap_pages, group);
+	return add_memory_block(block_id, MEM_OFFLINE, altmap, group);
 }
 
 static void remove_memory_block(struct memory_block *memory)
@@ -818,7 +825,7 @@ static void remove_memory_block(struct memory_block *memory)
  * Called under device_hotplug_lock.
  */
 int create_memory_block_devices(unsigned long start, unsigned long size,
-				unsigned long vmemmap_pages,
+				struct vmem_altmap *altmap,
 				struct memory_group *group)
 {
 	const unsigned long start_block_id = pfn_to_block_id(PFN_DOWN(start));
@@ -832,7 +839,7 @@ int create_memory_block_devices(unsigned long start, unsigned long size,
 		return -EINVAL;
 
 	for (block_id = start_block_id; block_id != end_block_id; block_id++) {
-		ret = add_hotplug_memory_block(block_id, vmemmap_pages, group);
+		ret = add_hotplug_memory_block(block_id, altmap, group);
 		if (ret)
 			break;
 	}
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 31343566c221..f53cfdaaaa41 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -77,11 +77,7 @@ struct memory_block {
 	 */
 	struct zone *zone;
 	struct device dev;
-	/*
-	 * Number of vmemmap pages. These pages
-	 * lay at the beginning of the memory block.
-	 */
-	unsigned long nr_vmemmap_pages;
+	struct vmem_altmap *altmap;
 	struct memory_group *group;	/* group (if any) for this block */
 	struct list_head group_next;	/* next block inside memory group */
 #if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_MEMORY_HOTPLUG)
@@ -147,7 +143,7 @@ static inline int hotplug_memory_notifier(notifier_fn_t fn, int pri)
 extern int register_memory_notifier(struct notifier_block *nb);
 extern void unregister_memory_notifier(struct notifier_block *nb);
 int create_memory_block_devices(unsigned long start, unsigned long size,
-				unsigned long vmemmap_pages,
+				struct vmem_altmap *altmap,
 				struct memory_group *group);
 void remove_memory_block_devices(unsigned long start, unsigned long size);
 extern void memory_dev_init(void);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index fe94feb32d71..aa8724bd1d53 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1439,7 +1439,11 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	if (mhp_flags & MHP_MEMMAP_ON_MEMORY) {
 		if (mhp_supports_memmap_on_memory(size)) {
 			mhp_altmap.free = memory_block_memmap_on_memory_pages();
-			params.altmap = &mhp_altmap;
+			params.altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL);
+			if (!params.altmap)
+				goto error;
+
+			memcpy(params.altmap, &mhp_altmap, sizeof(mhp_altmap));
 		}
 		/* fallback to not using altmap */
 	}
@@ -1447,13 +1451,13 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	/* call arch's memory hotadd */
 	ret = arch_add_memory(nid, start, size, &params);
 	if (ret < 0)
-		goto error;
+		goto error_free;
 
 	/* create memory block devices after memory was added */
-	ret = create_memory_block_devices(start, size, mhp_altmap.free, group);
+	ret = create_memory_block_devices(start, size, params.altmap, group);
 	if (ret) {
 		arch_remove_memory(start, size, NULL);
-		goto error;
+		goto error_free;
 	}
 
 	if (new_node) {
@@ -1490,6 +1494,8 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	walk_memory_blocks(start, size, NULL, online_memory_block);
 
 	return ret;
+error_free:
+	kfree(params.altmap);
 error:
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
 		memblock_remove(start, size);
@@ -2056,12 +2062,18 @@ static int check_memblock_offlined_cb(struct memory_block *mem, void *arg)
 	return 0;
 }
 
-static int get_nr_vmemmap_pages_cb(struct memory_block *mem, void *arg)
+static int test_has_altmap_cb(struct memory_block *mem, void *arg)
 {
+	struct memory_block **mem_ptr = (struct memory_block **)arg;
 	/*
-	 * If not set, continue with the next block.
+	 * return the memblock if we have altmap
+	 * and break callback.
 	 */
-	return mem->nr_vmemmap_pages;
+	if (mem->altmap) {
+		*mem_ptr = mem;
+		return 1;
+	}
+	return 0;
 }
 
 static int check_cpu_on_node(int nid)
@@ -2136,10 +2148,10 @@ EXPORT_SYMBOL(try_offline_node);
 
 static int __ref try_remove_memory(u64 start, u64 size)
 {
-	struct vmem_altmap mhp_altmap = {};
-	struct vmem_altmap *altmap = NULL;
-	unsigned long nr_vmemmap_pages;
+	int ret;
+	struct memory_block *mem;
 	int rc = 0, nid = NUMA_NO_NODE;
+	struct vmem_altmap *altmap = NULL;
 
 	BUG_ON(check_hotplug_memory_range(start, size));
 
@@ -2161,25 +2173,20 @@ static int __ref try_remove_memory(u64 start, u64 size)
 	 * the same granularity it was added - a single memory block.
 	 */
 	if (mhp_memmap_on_memory()) {
-		nr_vmemmap_pages = walk_memory_blocks(start, size, NULL,
-						      get_nr_vmemmap_pages_cb);
-		if (nr_vmemmap_pages) {
+		ret = walk_memory_blocks(start, size, &mem, test_has_altmap_cb);
+		if (ret) {
 			if (size != memory_block_size_bytes()) {
 				pr_warn("Refuse to remove %#llx - %#llx,"
 					"wrong granularity\n",
 					start, start + size);
 				return -EINVAL;
 			}
-
+			altmap = mem->altmap;
 			/*
-			 * Let remove_pmd_table->free_hugepage_table do the
-			 * right thing if we used vmem_altmap when hot-adding
-			 * the range.
+			 * Mark altmap NULL so that we can add a debug
+			 * check on memblock free.
 			 */
-			mhp_altmap.base_pfn = PHYS_PFN(start);
-			mhp_altmap.free = nr_vmemmap_pages;
-			mhp_altmap.alloc = nr_vmemmap_pages;
-			altmap = &mhp_altmap;
+			mem->altmap = NULL;
 		}
 	}
 
@@ -2196,6 +2203,15 @@ static int __ref try_remove_memory(u64 start, u64 size)
 
 	arch_remove_memory(start, size, altmap);
 
+	/*
+	 * Now that we are tracking alloc and free correctly
+	 * we can add check to verify altmap free pages.
+	 */
+	if (altmap) {
+		WARN(altmap->alloc, "Altmap not fully unmapped");
+		kfree(altmap);
+	}
+
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
 		memblock_phys_free(start, size);
 		memblock_remove(start, size);
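
[Editor's note, not part of the patch: the WARN(altmap->alloc, ...) added in
try_remove_memory() relies on vmem_altmap keeping a running count of pages
handed out (alloc) against the reservation (free). A minimal standalone
sketch of that bookkeeping, using a simplified stand-in for the kernel's
struct vmem_altmap and hypothetical helper names, not the kernel's actual
allocator:]

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct vmem_altmap accounting
 * fields (illustrative only, not the real kernel structure). */
struct vmem_altmap {
	unsigned long base_pfn;	/* first pfn of the reserved region */
	unsigned long free;	/* pages reserved for the memmap */
	unsigned long alloc;	/* pages currently handed out */
};

/* Hand out one page from the altmap reservation; returns the pfn,
 * or 0 when the reservation is exhausted (hypothetical helper). */
static unsigned long altmap_alloc_pfn(struct vmem_altmap *altmap)
{
	if (altmap->alloc >= altmap->free)
		return 0;
	return altmap->base_pfn + altmap->alloc++;
}

/* Return one page, as the vmemmap teardown path would. */
static void altmap_free_pfn(struct vmem_altmap *altmap)
{
	if (altmap->alloc)
		altmap->alloc--;
}

/* The invariant the patch's WARN checks after arch_remove_memory():
 * every page allocated from the altmap must have been returned. */
static int altmap_fully_unmapped(const struct vmem_altmap *altmap)
{
	return altmap->alloc == 0;
}
```

[Because the patch now stores the altmap in struct memory_block for the
block's whole lifetime, this alloc/free balance can be verified at removal
time instead of being recomputed from nr_vmemmap_pages.]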