From patchwork Wed Jun 5 11:40:49 2024
X-Patchwork-Submitter: Björn Töpel
X-Patchwork-Id: 13686705
From: Björn Töpel
To: Alexandre Ghiti, Albert Ou, David Hildenbrand, Palmer Dabbelt,
 Paul Walmsley, linux-riscv@lists.infradead.org, Oscar Salvador
Cc: Björn Töpel, Andrew Bresticker, Chethan Seshadri, Lorenzo Stoakes,
 Santosh Mamila, Sivakumar Munnangi, Sunil V L,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 virtualization@lists.linux-foundation.org
Subject: [PATCH v4 06/11] riscv: mm: Add memory hotplugging support
Date: Wed, 5 Jun 2024 13:40:49 +0200
Message-ID: <20240605114100.315918-7-bjorn@kernel.org>
In-Reply-To: <20240605114100.315918-1-bjorn@kernel.org>
References: <20240605114100.315918-1-bjorn@kernel.org>

From: Björn Töpel

For an architecture to support memory hotplugging, a couple of
callbacks need to be implemented:

 arch_add_memory()
  This callback is responsible for adding the physical memory into the
  direct map, and for calling into the generic memory hotplugging code
  via __add_pages(), which adds the corresponding struct page entries
  and updates the vmemmap mapping.

 arch_remove_memory()
  This is the inverse of the callback above.

 vmemmap_free()
  This function tears down the vmemmap mappings (if
  CONFIG_SPARSEMEM_VMEMMAP is enabled), and also deallocates the
  backing vmemmap pages. Note that for persistent memory, an
  alternative allocator for the backing pages can be used: the
  vmem_altmap. This means that when the backing pages are cleared,
  extra care is needed so that the correct deallocation method is used.

 arch_get_mappable_range()
  This function returns the PA range that the direct map can map.
  Used by the MHP internals for sanity checks.

The page table unmap/teardown functions are heavily based on code from
the x86 tree. The same remove_pgd_mapping() function is used in both
vmemmap_free() and arch_remove_memory(), but in the latter function the
backing pages are not removed.

Signed-off-by: Björn Töpel
Reviewed-by: Alexandre Ghiti
---
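As a rough illustration of the sanity check mentioned above: the
generic hotplug code (mhp_range_allowed() in mm/memory_hotplug.c)
validates a requested [start, start + size) region against the range
reported by arch_get_mappable_range(). The stand-alone sketch below
only models that check in userspace; the 0x80000000 base and the
16 GiB window are made-up placeholder values, not the real kernel
bounds.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct range { uint64_t start, end; };

/* Placeholder for the RISC-V implementation in this patch, which
 * reports __pa(PAGE_OFFSET)..__pa(PAGE_END - 1); values are made up. */
static struct range arch_get_mappable_range(void)
{
        return (struct range){ .start = 0x80000000ULL,
                               .end = 0x80000000ULL + (16ULL << 30) - 1 };
}

/* Simplified model of mhp_range_allowed(): the hot-plugged region must
 * fall entirely inside the directly mappable physical address range. */
static bool range_allowed(uint64_t start, uint64_t size)
{
        struct range r = arch_get_mappable_range();
        uint64_t end = start + size;

        return start < end && start >= r.start && end - 1 <= r.end;
}

int main(void)
{
        /* 1 GiB at 4 GiB: inside the placeholder window, so allowed. */
        printf("4G+1G allowed: %d\n", range_allowed(4ULL << 30, 1ULL << 30));
        /* 1 GiB at 1 GiB: starts below the window, so rejected. */
        printf("1G+1G allowed: %d\n", range_allowed(1ULL << 30, 1ULL << 30));
        return 0;
}

The real kernel check additionally clamps the window against the
maximum addressable physical memory, and only applies when a linear
mapping is actually required.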
 arch/riscv/mm/init.c | 267 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 267 insertions(+)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 1f7e7c223bec..bfa2dea95354 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1534,3 +1534,270 @@ struct execmem_info __init *execmem_arch_setup(void)
 }
 #endif /* CONFIG_MMU */
 #endif /* CONFIG_EXECMEM */
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd)
+{
+        struct page *page = pmd_page(*pmd);
+        struct ptdesc *ptdesc = page_ptdesc(page);
+        pte_t *pte;
+        int i;
+
+        for (i = 0; i < PTRS_PER_PTE; i++) {
+                pte = pte_start + i;
+                if (!pte_none(*pte))
+                        return;
+        }
+
+        pagetable_pte_dtor(ptdesc);
+        if (PageReserved(page))
+                free_reserved_page(page);
+        else
+                pagetable_free(ptdesc);
+        pmd_clear(pmd);
+}
+
+static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud)
+{
+        struct page *page = pud_page(*pud);
+        struct ptdesc *ptdesc = page_ptdesc(page);
+        pmd_t *pmd;
+        int i;
+
+        for (i = 0; i < PTRS_PER_PMD; i++) {
+                pmd = pmd_start + i;
+                if (!pmd_none(*pmd))
+                        return;
+        }
+
+        pagetable_pmd_dtor(ptdesc);
+        if (PageReserved(page))
+                free_reserved_page(page);
+        else
+                pagetable_free(ptdesc);
+        pud_clear(pud);
+}
+
+static void __meminit free_pud_table(pud_t *pud_start, p4d_t *p4d)
+{
+        struct page *page = p4d_page(*p4d);
+        pud_t *pud;
+        int i;
+
+        for (i = 0; i < PTRS_PER_PUD; i++) {
+                pud = pud_start + i;
+                if (!pud_none(*pud))
+                        return;
+        }
+
+        if (PageReserved(page))
+                free_reserved_page(page);
+        else
+                free_pages((unsigned long)page_address(page), 0);
+        p4d_clear(p4d);
+}
+
+static void __meminit free_vmemmap_storage(struct page *page, size_t size,
+                                           struct vmem_altmap *altmap)
+{
+        int order = get_order(size);
+
+        if (altmap) {
+                vmem_altmap_free(altmap, size >> PAGE_SHIFT);
+                return;
+        }
+
+        if (PageReserved(page)) {
+                unsigned int nr_pages = 1 << order;
+
+                while (nr_pages--)
+                        free_reserved_page(page++);
+                return;
+        }
+
+        free_pages((unsigned long)page_address(page), order);
+}
+
+static void __meminit remove_pte_mapping(pte_t *pte_base, unsigned long addr, unsigned long end,
+                                         bool is_vmemmap, struct vmem_altmap *altmap)
+{
+        unsigned long next;
+        pte_t *ptep, pte;
+
+        for (; addr < end; addr = next) {
+                next = (addr + PAGE_SIZE) & PAGE_MASK;
+                if (next > end)
+                        next = end;
+
+                ptep = pte_base + pte_index(addr);
+                pte = ptep_get(ptep);
+                if (!pte_present(*ptep))
+                        continue;
+
+                pte_clear(&init_mm, addr, ptep);
+                if (is_vmemmap)
+                        free_vmemmap_storage(pte_page(pte), PAGE_SIZE, altmap);
+        }
+}
+
+static void __meminit remove_pmd_mapping(pmd_t *pmd_base, unsigned long addr, unsigned long end,
+                                         bool is_vmemmap, struct vmem_altmap *altmap)
+{
+        unsigned long next;
+        pte_t *pte_base;
+        pmd_t *pmdp, pmd;
+
+        for (; addr < end; addr = next) {
+                next = pmd_addr_end(addr, end);
+                pmdp = pmd_base + pmd_index(addr);
+                pmd = pmdp_get(pmdp);
+                if (!pmd_present(pmd))
+                        continue;
+
+                if (pmd_leaf(pmd)) {
+                        pmd_clear(pmdp);
+                        if (is_vmemmap)
+                                free_vmemmap_storage(pmd_page(pmd), PMD_SIZE, altmap);
+                        continue;
+                }
+
+                pte_base = (pte_t *)pmd_page_vaddr(*pmdp);
+                remove_pte_mapping(pte_base, addr, next, is_vmemmap, altmap);
+                free_pte_table(pte_base, pmdp);
+        }
+}
+
+static void __meminit remove_pud_mapping(pud_t *pud_base, unsigned long addr, unsigned long end,
+                                         bool is_vmemmap, struct vmem_altmap *altmap)
+{
+        unsigned long next;
+        pud_t *pudp, pud;
+        pmd_t *pmd_base;
+
+        for (; addr < end; addr = next) {
+                next = pud_addr_end(addr, end);
+                pudp = pud_base + pud_index(addr);
+                pud = pudp_get(pudp);
+                if (!pud_present(pud))
+                        continue;
+
+                if (pud_leaf(pud)) {
+                        if (pgtable_l4_enabled) {
+                                pud_clear(pudp);
+                                if (is_vmemmap)
+                                        free_vmemmap_storage(pud_page(pud), PUD_SIZE, altmap);
+                        }
+                        continue;
+                }
+
+                pmd_base = pmd_offset(pudp, 0);
+                remove_pmd_mapping(pmd_base, addr, next, is_vmemmap, altmap);
+
+                if (pgtable_l4_enabled)
+                        free_pmd_table(pmd_base, pudp);
+        }
+}
+
+static void __meminit remove_p4d_mapping(p4d_t *p4d_base, unsigned long addr, unsigned long end,
+                                         bool is_vmemmap, struct vmem_altmap *altmap)
+{
+        unsigned long next;
+        p4d_t *p4dp, p4d;
+        pud_t *pud_base;
+
+        for (; addr < end; addr = next) {
+                next = p4d_addr_end(addr, end);
+                p4dp = p4d_base + p4d_index(addr);
+                p4d = p4dp_get(p4dp);
+                if (!p4d_present(p4d))
+                        continue;
+
+                if (p4d_leaf(p4d)) {
+                        if (pgtable_l5_enabled) {
+                                p4d_clear(p4dp);
+                                if (is_vmemmap)
+                                        free_vmemmap_storage(p4d_page(p4d), P4D_SIZE, altmap);
+                        }
+                        continue;
+                }
+
+                pud_base = pud_offset(p4dp, 0);
+                remove_pud_mapping(pud_base, addr, next, is_vmemmap, altmap);
+
+                if (pgtable_l5_enabled)
+                        free_pud_table(pud_base, p4dp);
+        }
+}
+
+static void __meminit remove_pgd_mapping(unsigned long va, unsigned long end, bool is_vmemmap,
+                                         struct vmem_altmap *altmap)
+{
+        unsigned long addr, next;
+        p4d_t *p4d_base;
+        pgd_t *pgd;
+
+        for (addr = va; addr < end; addr = next) {
+                next = pgd_addr_end(addr, end);
+                pgd = pgd_offset_k(addr);
+
+                if (!pgd_present(*pgd))
+                        continue;
+
+                if (pgd_leaf(*pgd))
+                        continue;
+
+                p4d_base = p4d_offset(pgd, 0);
+                remove_p4d_mapping(p4d_base, addr, next, is_vmemmap, altmap);
+        }
+
+        flush_tlb_all();
+}
+
+static void __meminit remove_linear_mapping(phys_addr_t start, u64 size)
+{
+        unsigned long va = (unsigned long)__va(start);
+        unsigned long end = (unsigned long)__va(start + size);
+
+        remove_pgd_mapping(va, end, false, NULL);
+}
+
+struct range arch_get_mappable_range(void)
+{
+        struct range mhp_range;
+
+        mhp_range.start = __pa(PAGE_OFFSET);
+        mhp_range.end = __pa(PAGE_END - 1);
+        return mhp_range;
+}
+
+int __ref arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *params)
+{
+        int ret = 0;
+
+        create_linear_mapping_range(start, start + size, 0, &params->pgprot);
+        ret = __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT, params);
+        if (ret) {
+                remove_linear_mapping(start, size);
+                goto out;
+        }
+
+        max_pfn = PFN_UP(start + size);
+        max_low_pfn = max_pfn;
+
+out:
+        flush_tlb_all();
+        return ret;
+}
+
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+{
+        __remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap);
+        remove_linear_mapping(start, size);
+        flush_tlb_all();
+}
+
+void __ref vmemmap_free(unsigned long start, unsigned long end, struct vmem_altmap *altmap)
+{
+        remove_pgd_mapping(start, end, true, altmap);
+}
+#endif /* CONFIG_MEMORY_HOTPLUG */
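Not part of this patch, but for context: after arch_add_memory() has
run, the added memory is exposed as memory blocks under
/sys/devices/system/memory/ and still has to be onlined before the
page allocator will use it (unless the kernel or udev auto-onlines
it). A minimal userspace sketch of that last step follows; the block
id "memory512" is hypothetical and depends on the memory block size
and on the address at which the memory was plugged.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        /* Hypothetical block id; the real one depends on the system's
         * memory block size and the physical address of the new memory. */
        const char *state = "/sys/devices/system/memory/memory512/state";
        int fd = open(state, O_WRONLY);

        if (fd < 0) {
                perror("open");
                return 1;
        }
        /* "online" lets the kernel pick a zone; "online_movable" keeps the
         * block in ZONE_MOVABLE so it can be offlined again later. */
        if (write(fd, "online", strlen("online")) < 0)
                perror("write");
        close(fd);
        return 0;
}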