[v2,2/3] x86/vmemmap: Drop handling of 1GB vmemmap ranges

Message ID 20210203104750.23405-3-osalvador@suse.de (mailing list archive)
State New, archived
Series: Cleanup and fixups for vmemmap handling

Commit Message

Oscar Salvador Feb. 3, 2021, 10:47 a.m. UTC
We never get to allocate 1GB pages when mapping the vmemmap range.
Drop the dead code both for the aligned and unaligned cases and leave
only the direct map handling.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Suggested-by: David Hildenbrand <david@redhat.com>
---
 arch/x86/mm/init_64.c | 31 ++++---------------------------
 1 file changed, 4 insertions(+), 27 deletions(-)
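
For context on why this path is dead code: with the usual x86_64
constants (128MiB memory sections, 4KiB base pages, a 64-byte struct
page), the vmemmap backing a single hotplugged section is only 2MiB,
i.e. one PMD-sized mapping, so the 1GiB PUD case is never reached. The
following back-of-the-envelope check is an illustration only (plain
userspace C, not code from the patch):

/*
 * Illustration only: vmemmap size for one 128MiB section, assuming
 * 4KiB pages and a 64-byte struct page as on x86_64.
 */
#include <stdio.h>

#define SZ_4K         (4UL << 10)
#define SZ_128M       (128UL << 20)
#define PMD_SIZE      (2UL << 20)   /* 2MiB */
#define PUD_SIZE      (1UL << 30)   /* 1GiB */
#define STRUCT_PAGE   64UL          /* assumed sizeof(struct page) */

int main(void)
{
	unsigned long pages_per_section = SZ_128M / SZ_4K;
	unsigned long vmemmap_bytes = pages_per_section * STRUCT_PAGE;

	printf("vmemmap per 128MiB section: %lu MiB\n", vmemmap_bytes >> 20);
	printf("fits in a single PMD:       %s\n",
	       vmemmap_bytes <= PMD_SIZE ? "yes" : "no");
	printf("ever needs a 1GiB PUD:      %s\n",
	       vmemmap_bytes >= PUD_SIZE ? "yes" : "no");
	return 0;
}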

Comments

David Hildenbrand Feb. 3, 2021, 1:33 p.m. UTC | #1
On 03.02.21 11:47, Oscar Salvador wrote:
> We never get to allocate 1GB pages when mapping the vmemmap range.
> Drop the dead code both for the aligned and unaligned cases and leave
> only the direct map handling.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Suggested-by: David Hildenbrand <david@redhat.com>
> ---
>   arch/x86/mm/init_64.c | 31 ++++---------------------------
>   1 file changed, 4 insertions(+), 27 deletions(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index b0e1d215c83e..28729c6b9775 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1062,7 +1062,6 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
>   	unsigned long next, pages = 0;
>   	pmd_t *pmd_base;
>   	pud_t *pud;
> -	void *page_addr;
>   
>   	pud = pud_start + pud_index(addr);
>   	for (; addr < end; addr = next, pud++) {
> @@ -1072,32 +1071,10 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
>   			continue;
>   
>   		if (pud_large(*pud)) {
> -			if (IS_ALIGNED(addr, PUD_SIZE) &&
> -			    IS_ALIGNED(next, PUD_SIZE)) {
> -				if (!direct)
> -					free_pagetable(pud_page(*pud),
> -						       get_order(PUD_SIZE));
> -
> -				spin_lock(&init_mm.page_table_lock);
> -				pud_clear(pud);
> -				spin_unlock(&init_mm.page_table_lock);
> -				pages++;
> -			} else {
> -				/* If here, we are freeing vmemmap pages. */
> -				memset((void *)addr, PAGE_INUSE, next - addr);
> -
> -				page_addr = page_address(pud_page(*pud));
> -				if (!memchr_inv(page_addr, PAGE_INUSE,
> -						PUD_SIZE)) {
> -					free_pagetable(pud_page(*pud),
> -						       get_order(PUD_SIZE));
> -
> -					spin_lock(&init_mm.page_table_lock);
> -					pud_clear(pud);
> -					spin_unlock(&init_mm.page_table_lock);
> -				}
> -			}
> -
> +			spin_lock(&init_mm.page_table_lock);
> +			pud_clear(pud);
> +			spin_unlock(&init_mm.page_table_lock);
> +			pages++;
>   			continue;
>   		}

One problem I see with the existing code, and which this change makes
more obvious, is that when we try to remove memory at a different
granularity than it was added (e.g., unplugging a 128MB DIMM that was
available during boot), we remove the direct map of unrelated DIMMs.

I think we should keep the

if (IS_ALIGNED(addr, PUD_SIZE) &&
     IS_ALIGNED(next, PUD_SIZE)) {
...
}

bits. Thoughts?

Apart from that looks good.
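
To make the concern concrete, here is a small userspace sketch of the
alignment guard referred to above (the addresses are made up for
illustration): a 128MiB unplug range that sits inside a 1GiB PUD
mapping is not PUD-aligned on both ends, and that failing check is what
keeps the shared 1GiB direct mapping from being cleared.

/*
 * Illustration only: models the IS_ALIGNED(addr/next, PUD_SIZE) guard.
 * A 128MiB removal range inside a 1GiB mapping fails the check, so the
 * PUD that still maps neighbouring memory must not be cleared.
 */
#include <stdio.h>

#define PUD_SIZE          (1UL << 30)                    /* 1GiB */
#define IS_ALIGNED(x, a)  (((x) & ((a) - 1)) == 0)

int main(void)
{
	unsigned long addr = (4UL << 30) + (512UL << 20); /* made-up: 4GiB + 512MiB */
	unsigned long next = addr + (128UL << 20);        /* a 128MB DIMM */

	if (IS_ALIGNED(addr, PUD_SIZE) && IS_ALIGNED(next, PUD_SIZE))
		printf("range covers the whole PUD: safe to clear it\n");
	else
		printf("partial PUD: clearing it would unmap unrelated memory\n");

	return 0;
}
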
Oscar Salvador Feb. 3, 2021, 2:10 p.m. UTC | #2
On Wed, Feb 03, 2021 at 02:33:56PM +0100, David Hildenbrand wrote:
> One problem I see with the existing code, and which this change makes more
> obvious, is that when we try to remove memory at a different granularity
> than it was added (e.g., unplugging a 128MB DIMM that was available during
> boot), we remove the direct map of unrelated DIMMs.

So, let me see if I understand your concern.

We have a range that was mapped with a 1GB page, and we try to remove
a 128MB chunk from it.
Yes, in that case we would clear the pud, and that is bad, so we should
keep the PAGE_ALIGNED checks.

Now, let us assume that scenario.
If you have a 1GB mapped range and you remove it in smaller chunks bit by bit
(e.g., 128M), the direct mapping of that range will never be cleared unless
I am missing something (and the page tables won't be freed)?
David Hildenbrand Feb. 3, 2021, 2:12 p.m. UTC | #3
On 03.02.21 15:10, Oscar Salvador wrote:
> On Wed, Feb 03, 2021 at 02:33:56PM +0100, David Hildenbrand wrote:
>> One problem I see with the existing code, and which this change makes more
>> obvious, is that when we try to remove memory at a different granularity
>> than it was added (e.g., unplugging a 128MB DIMM that was available during
>> boot), we remove the direct map of unrelated DIMMs.
> 
> So, let me see if I understand your concern.
> 
> We have a range that was mapped with a 1GB page, and we try to remove
> a 128MB chunk from it.
> Yes, in that case we would clear the pud, and that is bad, so we should
> keep the PAGE_ALIGNED checks.
> 
> Now, let us assume that scenario.
> If you have a 1GB mapped range and you remove it in smaller chunks bit by bit
> (e.g., 128M), the direct mapping of that range will never be cleared unless

No, you are not missing anything; that's exactly what's happening. The
good thing is that it barely ever happens, so I assume leaving behind
some direct mapping / page tables is not that bad.
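
As a quick illustration of the scenario discussed above (values made up
for illustration): walking a PUD-mapped 1GiB range in 128MiB chunks, no
single chunk ever covers the whole PUD, so with the alignment check in
place the PUD is never cleared and the page tables behind it are never
freed. That matches the "leftover direct mapping / page tables"
behaviour described in this thread.

/*
 * Illustration only: remove a 1GiB PUD-mapped range in 128MiB steps and
 * count how many per-chunk calls would satisfy the alignment guard.
 * None do, so the PUD (and its page tables) would be left behind.
 */
#include <stdio.h>

#define PUD_SIZE          (1UL << 30)   /* 1GiB */
#define CHUNK             (128UL << 20) /* 128MiB */
#define IS_ALIGNED(x, a)  (((x) & ((a) - 1)) == 0)

int main(void)
{
	unsigned long start = 4UL << 30;  /* made-up, PUD-aligned base */
	unsigned long addr, cleared = 0;

	for (addr = start; addr < start + PUD_SIZE; addr += CHUNK) {
		unsigned long next = addr + CHUNK;

		/* mirrors the guarded path in remove_pud_table() */
		if (IS_ALIGNED(addr, PUD_SIZE) && IS_ALIGNED(next, PUD_SIZE))
			cleared++;
	}

	printf("chunks that would clear the PUD: %lu (expected 0)\n", cleared);
	return 0;
}
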
Oscar Salvador Feb. 3, 2021, 2:15 p.m. UTC | #4
On Wed, Feb 03, 2021 at 03:12:05PM +0100, David Hildenbrand wrote:
> On 03.02.21 15:10, Oscar Salvador wrote:
> > On Wed, Feb 03, 2021 at 02:33:56PM +0100, David Hildenbrand wrote:
> > > One problem I see with the existing code, and which this change makes more
> > > obvious, is that when we try to remove memory at a different granularity
> > > than it was added (e.g., unplugging a 128MB DIMM that was available during
> > > boot), we remove the direct map of unrelated DIMMs.
> > 
> > So, let me see if I understand your concern.
> > 
> > We have a range that was mapped with a 1GB page, and we try to remove
> > a 128MB chunk from it.
> > Yes, in that case we would clear the pud, and that is bad, so we should
> > keep the PAGE_ALIGNED checks.
> > 
> > Now, let us assume that scenario.
> > If you have a 1GB mapped range and you remove it in smaller chunks bit by bit
> > (e.g., 128M), the direct mapping of that range will never be cleared unless
> 
> No, you are not missing anything; that's exactly what's happening. The good
> thing is that it barely ever happens, so I assume leaving behind some direct
> mapping / page tables is not that bad.

Sorry, I meant that this is already the current situation.

Then let us keep the PAGE_ALIGNED stuff.

I shall resend a v3 later today.


thanks for the review ;-)

Patch

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b0e1d215c83e..28729c6b9775 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1062,7 +1062,6 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
 	unsigned long next, pages = 0;
 	pmd_t *pmd_base;
 	pud_t *pud;
-	void *page_addr;
 
 	pud = pud_start + pud_index(addr);
 	for (; addr < end; addr = next, pud++) {
@@ -1072,32 +1071,10 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
 			continue;
 
 		if (pud_large(*pud)) {
-			if (IS_ALIGNED(addr, PUD_SIZE) &&
-			    IS_ALIGNED(next, PUD_SIZE)) {
-				if (!direct)
-					free_pagetable(pud_page(*pud),
-						       get_order(PUD_SIZE));
-
-				spin_lock(&init_mm.page_table_lock);
-				pud_clear(pud);
-				spin_unlock(&init_mm.page_table_lock);
-				pages++;
-			} else {
-				/* If here, we are freeing vmemmap pages. */
-				memset((void *)addr, PAGE_INUSE, next - addr);
-
-				page_addr = page_address(pud_page(*pud));
-				if (!memchr_inv(page_addr, PAGE_INUSE,
-						PUD_SIZE)) {
-					free_pagetable(pud_page(*pud),
-						       get_order(PUD_SIZE));
-
-					spin_lock(&init_mm.page_table_lock);
-					pud_clear(pud);
-					spin_unlock(&init_mm.page_table_lock);
-				}
-			}
-
+			spin_lock(&init_mm.page_table_lock);
+			pud_clear(pud);
+			spin_unlock(&init_mm.page_table_lock);
+			pages++;
 			continue;
 		}