From patchwork Tue Mar 9 21:40:47 2021
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12126601
From: Oscar Salvador
To: Andrew Morton
Cc: David Hildenbrand, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
    Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org,
    "H. Peter Anvin", Michal Hocko, Zi Yan, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v6 1/4] x86/vmemmap: Drop handling of 4K unaligned vmemmap range
Date: Tue, 9 Mar 2021 22:40:47 +0100
Message-Id: <20210309214050.4674-2-osalvador@suse.de>
In-Reply-To: <20210309214050.4674-1-osalvador@suse.de>
References: <20210309214050.4674-1-osalvador@suse.de>

remove_pte_table() is prepared to handle the case where either the start
or the end of the range is not PAGE aligned. This cannot actually happen:
__populate_section_memmap enforces the range to be PMD aligned, so as long
as the size of struct page remains a multiple of 8, the vmemmap range will
be aligned to PAGE_SIZE.

Drop the dead code and place a VM_BUG_ON in vmemmap_{populate,free} to
catch nasty cases. Note that the VM_BUG_ONs are placed there because
vmemmap_{populate,free} are the entry points of all the page-table
removing and freeing logic.

Signed-off-by: Oscar Salvador
Suggested-by: David Hildenbrand
Reviewed-by: David Hildenbrand
Acked-by: Dave Hansen
---
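A quick sanity check of the alignment argument above, as a standalone
userspace sketch (not part of the patch; the 512 pages per subsection is
the x86_64 value with 4K pages, and the struct page sizes are assumptions):

/* Any sizeof(struct page) that is a multiple of 8 keeps the memmap chunk
 * of a subsection a multiple of PAGE_SIZE: 512 * 8 == 4096. */
#include <assert.h>
#include <stdio.h>

#define PAGE_SIZE               4096UL
#define PAGES_PER_SUBSECTION    512UL

int main(void)
{
        unsigned long sz;

        for (sz = 8; sz <= 80; sz += 8) {
                unsigned long memmap_bytes = PAGES_PER_SUBSECTION * sz;

                assert(memmap_bytes % PAGE_SIZE == 0);
                printf("sizeof(struct page) = %2lu -> memmap chunk = %5lu bytes\n",
                       sz, memmap_bytes);
        }
        return 0;
}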
 arch/x86/mm/init_64.c | 48 +++++++++++++-----------------------------------
 1 file changed, 13 insertions(+), 35 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b5a3fa4033d3..b0e1d215c83e 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -962,7 +962,6 @@ remove_pte_table(pte_t *pte_start, unsigned long addr, unsigned long end,
 {
        unsigned long next, pages = 0;
        pte_t *pte;
-       void *page_addr;
        phys_addr_t phys_addr;
 
        pte = pte_start + pte_index(addr);
@@ -983,42 +982,15 @@ remove_pte_table(pte_t *pte_start, unsigned long addr, unsigned long end,
                if (phys_addr < (phys_addr_t)0x40000000)
                        return;
 
-               if (PAGE_ALIGNED(addr) && PAGE_ALIGNED(next)) {
-                       /*
-                        * Do not free direct mapping pages since they were
-                        * freed when offlining, or simplely not in use.
-                        */
-                       if (!direct)
-                               free_pagetable(pte_page(*pte), 0);
-
-                       spin_lock(&init_mm.page_table_lock);
-                       pte_clear(&init_mm, addr, pte);
-                       spin_unlock(&init_mm.page_table_lock);
+               if (!direct)
+                       free_pagetable(pte_page(*pte), 0);
 
-                       /* For non-direct mapping, pages means nothing. */
-                       pages++;
-               } else {
-                       /*
-                        * If we are here, we are freeing vmemmap pages since
-                        * direct mapped memory ranges to be freed are aligned.
-                        *
-                        * If we are not removing the whole page, it means
-                        * other page structs in this page are being used and
-                        * we canot remove them. So fill the unused page_structs
-                        * with 0xFD, and remove the page when it is wholly
-                        * filled with 0xFD.
-                        */
-                       memset((void *)addr, PAGE_INUSE, next - addr);
-
-                       page_addr = page_address(pte_page(*pte));
-                       if (!memchr_inv(page_addr, PAGE_INUSE, PAGE_SIZE)) {
-                               free_pagetable(pte_page(*pte), 0);
+               spin_lock(&init_mm.page_table_lock);
+               pte_clear(&init_mm, addr, pte);
+               spin_unlock(&init_mm.page_table_lock);
 
-                               spin_lock(&init_mm.page_table_lock);
-                               pte_clear(&init_mm, addr, pte);
-                               spin_unlock(&init_mm.page_table_lock);
-                       }
-               }
+               /* For non-direct mapping, pages means nothing. */
+               pages++;
        }
 
        /* Call free_pte_table() in remove_pmd_table(). */
@@ -1197,6 +1169,9 @@ remove_pagetable(unsigned long start, unsigned long end, bool direct,
 void __ref vmemmap_free(unsigned long start, unsigned long end,
                struct vmem_altmap *altmap)
 {
+       VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
+       VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE));
+
        remove_pagetable(start, end, false, altmap);
 }
 
@@ -1556,6 +1531,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 {
        int err;
 
+       VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
+       VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE));
+
        if (end - start < PAGES_PER_SECTION * sizeof(struct page))
                err = vmemmap_populate_basepages(start, end, node, NULL);
        else if (boot_cpu_has(X86_FEATURE_PSE))
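
To make the new guards concrete, here is a minimal userspace model
(VM_BUG_ON is approximated with assert() and the kernel internals are
stubbed out; the addresses are made up):

#include <assert.h>

#define PAGE_SIZE        4096UL
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)
#define VM_BUG_ON(cond)  assert(!(cond))  /* stand-in for the kernel macro */

/* Mirrors the shape of the patched vmemmap_free(): misaligned input now
 * trips the guard instead of being handled by dead code. */
static void vmemmap_free_model(unsigned long start, unsigned long end)
{
        VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
        VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE));
        /* remove_pagetable(start, end, false, altmap) would follow here */
}

int main(void)
{
        vmemmap_free_model(0x100000, 0x140000); /* page aligned: passes */
        /* vmemmap_free_model(0x100010, 0x140000); would abort */
        return 0;
}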
Peter Anvin" , Michal Hocko , Zi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador Subject: [PATCH v6 2/4] x86/vmemmap: Drop handling of 1GB vmemmap ranges Date: Tue, 9 Mar 2021 22:40:48 +0100 Message-Id: <20210309214050.4674-3-osalvador@suse.de> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20210309214050.4674-1-osalvador@suse.de> References: <20210309214050.4674-1-osalvador@suse.de> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 6E4733C3 X-Stat-Signature: 615f35xfu5w9wtakyf6y7snwueenhz5r Received-SPF: none (suse.de>: No applicable sender policy available) receiver=imf04; identity=mailfrom; envelope-from=""; helo=mx2.suse.de; client-ip=195.135.220.15 X-HE-DKIM-Result: none/none X-HE-Tag: 1615326073-210626 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: There is no code to allocate 1GB pages when mapping the vmemmap range as this might waste some memory and requires more complexity which is not really worth. Drop the dead code both for the aligned and unaligned cases and leave only the direct map handling. Signed-off-by: Oscar Salvador Suggested-by: David Hildenbrand Reviewed-by: David Hildenbrand Acked-by: Dave Hansen --- arch/x86/mm/init_64.c | 35 +++++++---------------------------- 1 file changed, 7 insertions(+), 28 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index b0e1d215c83e..9ecb3c488ac8 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1062,7 +1062,6 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end, unsigned long next, pages = 0; pmd_t *pmd_base; pud_t *pud; - void *page_addr; pud = pud_start + pud_index(addr); for (; addr < end; addr = next, pud++) { @@ -1071,33 +1070,13 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end, if (!pud_present(*pud)) continue; - if (pud_large(*pud)) { - if (IS_ALIGNED(addr, PUD_SIZE) && - IS_ALIGNED(next, PUD_SIZE)) { - if (!direct) - free_pagetable(pud_page(*pud), - get_order(PUD_SIZE)); - - spin_lock(&init_mm.page_table_lock); - pud_clear(pud); - spin_unlock(&init_mm.page_table_lock); - pages++; - } else { - /* If here, we are freeing vmemmap pages. 
 arch/x86/mm/init_64.c | 35 +++++++----------------------------
 1 file changed, 7 insertions(+), 28 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b0e1d215c83e..9ecb3c488ac8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1062,7 +1062,6 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
        unsigned long next, pages = 0;
        pmd_t *pmd_base;
        pud_t *pud;
-       void *page_addr;
 
        pud = pud_start + pud_index(addr);
        for (; addr < end; addr = next, pud++) {
@@ -1071,33 +1070,13 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
                if (!pud_present(*pud))
                        continue;
 
-               if (pud_large(*pud)) {
-                       if (IS_ALIGNED(addr, PUD_SIZE) &&
-                           IS_ALIGNED(next, PUD_SIZE)) {
-                               if (!direct)
-                                       free_pagetable(pud_page(*pud),
-                                                      get_order(PUD_SIZE));
-
-                               spin_lock(&init_mm.page_table_lock);
-                               pud_clear(pud);
-                               spin_unlock(&init_mm.page_table_lock);
-                               pages++;
-                       } else {
-                               /* If here, we are freeing vmemmap pages. */
-                               memset((void *)addr, PAGE_INUSE, next - addr);
-
-                               page_addr = page_address(pud_page(*pud));
-                               if (!memchr_inv(page_addr, PAGE_INUSE,
-                                               PUD_SIZE)) {
-                                       free_pagetable(pud_page(*pud),
-                                                      get_order(PUD_SIZE));
-
-                                       spin_lock(&init_mm.page_table_lock);
-                                       pud_clear(pud);
-                                       spin_unlock(&init_mm.page_table_lock);
-                               }
-                       }
-
+               if (pud_large(*pud) &&
+                   IS_ALIGNED(addr, PUD_SIZE) &&
+                   IS_ALIGNED(next, PUD_SIZE)) {
+                       spin_lock(&init_mm.page_table_lock);
+                       pud_clear(pud);
+                       spin_unlock(&init_mm.page_table_lock);
+                       pages++;
                        continue;
                }
Peter Anvin" , Michal Hocko , Zi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador Subject: [PATCH v6 3/4] x86/vmemmap: Handle unpopulated sub-pmd ranges Date: Tue, 9 Mar 2021 22:40:49 +0100 Message-Id: <20210309214050.4674-4-osalvador@suse.de> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20210309214050.4674-1-osalvador@suse.de> References: <20210309214050.4674-1-osalvador@suse.de> MIME-Version: 1.0 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 31B2EC0007D4 X-Stat-Signature: hnxtk9j1f4zwtg1p3wcwpkzstd5zespm Received-SPF: none (suse.de>: No applicable sender policy available) receiver=imf06; identity=mailfrom; envelope-from=""; helo=mx2.suse.de; client-ip=195.135.220.15 X-HE-DKIM-Result: none/none X-HE-Tag: 1615326075-21123 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When sizeof(struct page) is not a power of 2, sections do not span a PMD anymore and so when populating them some parts of the PMD will remain unused. Because of this, PMDs will be left behind when depopulating sections since remove_pmd_table() thinks that those unused parts are still in use. Fix this by marking the unused parts with PAGE_UNUSED, so memchr_inv() will do the right thing and will let us free the PMD when the last user of it is gone. This patch is based on a similar patch by David Hildenbrand: https://lore.kernel.org/linux-mm/20200722094558.9828-9-david@redhat.com/ Signed-off-by: Oscar Salvador Reviewed-by: David Hildenbrand Acked-by: Dave Hansen Reported-by: Naresh Kamboju --- arch/x86/mm/init_64.c | 65 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 53 insertions(+), 12 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 9ecb3c488ac8..d93b36856ed3 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -826,6 +826,51 @@ void __init paging_init(void) zone_sizes_init(); } +#ifdef CONFIG_SPARSEMEM_VMEMMAP +#define PAGE_UNUSED 0xFD + +/* Returns true if the PMD is completely unused and thus it can be freed */ +static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end) +{ + unsigned long start = ALIGN_DOWN(addr, PMD_SIZE); + + memset((void *)addr, PAGE_UNUSED, end - addr); + + return !memchr_inv((void *)start, PAGE_UNUSED, PMD_SIZE); +} + +static void __meminit vmemmap_use_sub_pmd(unsigned long start) +{ + /* + * As we expect to add in the same granularity as we remove, it's + * sufficient to mark only some piece used to block the memmap page from + * getting removed when removing some other adjacent memmap (just in + * case the first memmap never gets initialized e.g., because the memory + * block never gets onlined). + */ + memset((void *)start, 0, sizeof(struct page)); +} + +static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end) +{ + /* + * Could be our memmap page is filled with PAGE_UNUSED already from a + * previous remove. Make sure to reset it. 
 arch/x86/mm/init_64.c | 65 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 53 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 9ecb3c488ac8..d93b36856ed3 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -826,6 +826,51 @@ void __init paging_init(void)
        zone_sizes_init();
 }
 
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+#define PAGE_UNUSED 0xFD
+
+/* Returns true if the PMD is completely unused and thus it can be freed */
+static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end)
+{
+       unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
+
+       memset((void *)addr, PAGE_UNUSED, end - addr);
+
+       return !memchr_inv((void *)start, PAGE_UNUSED, PMD_SIZE);
+}
+
+static void __meminit vmemmap_use_sub_pmd(unsigned long start)
+{
+       /*
+        * As we expect to add in the same granularity as we remove, it's
+        * sufficient to mark only some piece used to block the memmap page from
+        * getting removed when removing some other adjacent memmap (just in
+        * case the first memmap never gets initialized e.g., because the memory
+        * block never gets onlined).
+        */
+       memset((void *)start, 0, sizeof(struct page));
+}
+
+static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
+{
+       /*
+        * Could be our memmap page is filled with PAGE_UNUSED already from a
+        * previous remove. Make sure to reset it.
+        */
+       vmemmap_use_sub_pmd(start);
+
+       /*
+        * Mark with PAGE_UNUSED the unused parts of the new memmap range
+        */
+       if (!IS_ALIGNED(start, PMD_SIZE))
+               memset((void *)start, PAGE_UNUSED,
+                      start - ALIGN_DOWN(start, PMD_SIZE));
+       if (!IS_ALIGNED(end, PMD_SIZE))
+               memset((void *)end, PAGE_UNUSED,
+                      ALIGN(end, PMD_SIZE) - end);
+}
+#endif
+
 /*
  * Memory hotplug specific functions
  */
@@ -871,8 +916,6 @@ int arch_add_memory(int nid, u64 start, u64 size,
        return add_pages(nid, start_pfn, nr_pages, params);
 }
 
-#define PAGE_INUSE 0xFD
-
 static void __meminit free_pagetable(struct page *page, int order)
 {
        unsigned long magic;
@@ -1006,7 +1049,6 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
        unsigned long next, pages = 0;
        pte_t *pte_base;
        pmd_t *pmd;
-       void *page_addr;
 
        pmd = pmd_start + pmd_index(addr);
        for (; addr < end; addr = next, pmd++) {
@@ -1026,20 +1068,13 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
                                pmd_clear(pmd);
                                spin_unlock(&init_mm.page_table_lock);
                                pages++;
-                       } else {
-                               /* If here, we are freeing vmemmap pages. */
-                               memset((void *)addr, PAGE_INUSE, next - addr);
-
-                               page_addr = page_address(pmd_page(*pmd));
-                               if (!memchr_inv(page_addr, PAGE_INUSE,
-                                               PMD_SIZE)) {
+                       } else if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) &&
+                                  vmemmap_pmd_is_unused(addr, next)) {
                                free_hugepage_table(pmd_page(*pmd),
                                                    altmap);
-
                                spin_lock(&init_mm.page_table_lock);
                                pmd_clear(pmd);
                                spin_unlock(&init_mm.page_table_lock);
-                               }
                        }
 
                        continue;
@@ -1492,11 +1527,17 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
 
                                addr_end = addr + PMD_SIZE;
                                p_end = p + PMD_SIZE;
+
+                               if (!IS_ALIGNED(addr, PMD_SIZE) ||
+                                   !IS_ALIGNED(next, PMD_SIZE))
+                                       vmemmap_use_new_sub_pmd(addr, next);
+
                                continue;
                        } else if (altmap)
                                return -ENOMEM; /* no fallback */
                } else if (pmd_large(*pmd)) {
                        vmemmap_verify((pte_t *)pmd, node, addr, next);
+                       vmemmap_use_sub_pmd(addr);
                        continue;
                }
                if (vmemmap_populate_basepages(addr, next, node, NULL))
From patchwork Tue Mar 9 21:40:50 2021
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12126607
From: Oscar Salvador
To: Andrew Morton
Cc: David Hildenbrand, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
    Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org,
    "H. Peter Anvin", Michal Hocko, Zi Yan, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v6 4/4] x86/vmemmap: Optimize for consecutive sections in partial populated PMDs
Date: Tue, 9 Mar 2021 22:40:50 +0100
Message-Id: <20210309214050.4674-5-osalvador@suse.de>
In-Reply-To: <20210309214050.4674-1-osalvador@suse.de>
References: <20210309214050.4674-1-osalvador@suse.de>

We can optimize the case where consecutive sections are being added, since
no memset(PAGE_UNUSED) is needed then. To detect it, keep track of where
the unused range of the previously added memory range begins, and compare
that against the start of the range about to be added. If the two are
equal, the sections are being added consecutively.

For that purpose, introduce 'unused_pmd_start', which always holds the
beginning of the unused memory range. Whenever a section does not
contiguously follow the previous one, the range [unused_pmd_start,
PMD boundary) can be memset with PAGE_UNUSED.

This patch is based on a similar patch by David Hildenbrand:

https://lore.kernel.org/linux-mm/20200722094558.9828-10-david@redhat.com/

Signed-off-by: Oscar Salvador
Acked-by: Dave Hansen
---
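The bookkeeping is easiest to follow in isolation. This userspace sketch
mirrors the patch's logic, but replaces the deferred memset with a printf
so the control flow is visible (the addresses are made up):

#include <stdio.h>

#define PMD_SIZE         (2UL * 1024 * 1024)
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)
#define ALIGN(x, a)      (((x) + (a) - 1) & ~((a) - 1))

static unsigned long unused_pmd_start;

static void flush_unused_pmd(void)
{
        if (!unused_pmd_start)
                return;
        /* In the kernel this is the memset(PAGE_UNUSED) being deferred. */
        printf("memset(PAGE_UNUSED) over [%#lx, %#lx)\n",
               unused_pmd_start, ALIGN(unused_pmd_start, PMD_SIZE));
        unused_pmd_start = 0;
}

static void use_sub_pmd(unsigned long start, unsigned long end)
{
        if (unused_pmd_start == start) {
                /* Consecutive section: skip the memset entirely. */
                unused_pmd_start = IS_ALIGNED(end, PMD_SIZE) ? 0 : end;
                return;
        }
        /* Non-consecutive: settle the previous unused range first. */
        flush_unused_pmd();
}

int main(void)
{
        unused_pmd_start = 0x8000;      /* previous add left this unused */
        use_sub_pmd(0x8000, 0x10000);   /* consecutive: no memset issued */
        use_sub_pmd(0x40000, 0x48000);  /* gap: flush memsets [0x10000, 2MiB) */
        return 0;
}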
 arch/x86/mm/init_64.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 60 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index d93b36856ed3..13187a3debe9 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -829,17 +829,42 @@ void __init paging_init(void)
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 #define PAGE_UNUSED 0xFD
 
+/*
+ * The unused vmemmap range, which was not yet memset(PAGE_UNUSED), ranges
+ * from unused_pmd_start to the next PMD_SIZE boundary.
+ */
+static unsigned long unused_pmd_start __meminitdata;
+
+static void __meminit vmemmap_flush_unused_pmd(void)
+{
+       if (!unused_pmd_start)
+               return;
+       /*
+        * Clears (unused_pmd_start, PMD_END]
+        */
+       memset((void *)unused_pmd_start, PAGE_UNUSED,
+              ALIGN(unused_pmd_start, PMD_SIZE) - unused_pmd_start);
+       unused_pmd_start = 0;
+}
+
+#ifdef CONFIG_MEMORY_HOTPLUG
 /* Returns true if the PMD is completely unused and thus it can be freed */
 static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end)
 {
        unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
 
+       /*
+        * Flush the unused range cache to ensure that memchr_inv() will work
+        * for the whole range.
+        */
+       vmemmap_flush_unused_pmd();
        memset((void *)addr, PAGE_UNUSED, end - addr);
 
        return !memchr_inv((void *)start, PAGE_UNUSED, PMD_SIZE);
 }
+#endif
 
-static void __meminit vmemmap_use_sub_pmd(unsigned long start)
+static void __meminit __vmemmap_use_sub_pmd(unsigned long start)
 {
        /*
         * As we expect to add in the same granularity as we remove, it's
@@ -851,13 +876,38 @@ static void __meminit vmemmap_use_sub_pmd(unsigned long start)
        memset((void *)start, 0, sizeof(struct page));
 }
 
+static void __meminit vmemmap_use_sub_pmd(unsigned long start, unsigned long end)
+{
+       /*
+        * We only optimize if the new used range directly follows the
+        * previously unused range (esp., when populating consecutive sections).
+        */
+       if (unused_pmd_start == start) {
+               if (likely(IS_ALIGNED(end, PMD_SIZE)))
+                       unused_pmd_start = 0;
+               else
+                       unused_pmd_start = end;
+               return;
+       }
+
+       /*
+        * If the range does not contiguously follow the previous one, make
+        * sure to mark the unused range of the previous one so it can be
+        * removed.
+        */
+       vmemmap_flush_unused_pmd();
+       __vmemmap_use_sub_pmd(start);
+}
+
 static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
 {
+       vmemmap_flush_unused_pmd();
+
        /*
         * Could be our memmap page is filled with PAGE_UNUSED already from a
         * previous remove. Make sure to reset it.
         */
-       vmemmap_use_sub_pmd(start);
+       __vmemmap_use_sub_pmd(start);
 
        /*
         * Mark with PAGE_UNUSED the unused parts of the new memmap range
@@ -865,9 +915,14 @@ static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long
        if (!IS_ALIGNED(start, PMD_SIZE))
                memset((void *)start, PAGE_UNUSED,
                       start - ALIGN_DOWN(start, PMD_SIZE));
+
+       /*
+        * We want to avoid memset(PAGE_UNUSED) when populating the vmemmap of
+        * consecutive sections. Remember for the last added PMD where the
+        * unused range begins.
+        */
        if (!IS_ALIGNED(end, PMD_SIZE))
-               memset((void *)end, PAGE_UNUSED,
-                      ALIGN(end, PMD_SIZE) - end);
+               unused_pmd_start = end;
 }
 #endif
 
@@ -1537,7 +1592,7 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
                        return -ENOMEM; /* no fallback */
                } else if (pmd_large(*pmd)) {
                        vmemmap_verify((pte_t *)pmd, node, addr, next);
-                       vmemmap_use_sub_pmd(addr);
+                       vmemmap_use_sub_pmd(addr, next);
                        continue;
                }
                if (vmemmap_populate_basepages(addr, next, node, NULL))