Message ID | 20210301083230.30924-3-osalvador@suse.de (mailing list archive) |
---|---|
State | New, archived |
Series | Cleanup and fixups for vmemmap handling |
On 3/1/21 12:32 AM, Oscar Salvador wrote:
> We never get to allocate 1GB pages when mapping the vmemmap range.
> Drop the dead code both for the aligned and unaligned cases and leave
> only the direct map handling.

Could you elaborate a bit on why 1GB pages are never used? Is it just
unlikely to have a 64GB contiguous area of memory that needs 1GB of
contiguous vmemmap? Or does the fact that sections are smaller than
64GB keep this from happening?
On Thu, Mar 04, 2021 at 10:42:59AM -0800, Dave Hansen wrote:
> On 3/1/21 12:32 AM, Oscar Salvador wrote:
> > We never get to allocate 1GB pages when mapping the vmemmap range.
> > Drop the dead code both for the aligned and unaligned cases and leave
> > only the direct map handling.
>
> Could you elaborate a bit on why 1GB pages are never used? Is it just
> unlikely to have a 64GB contiguous area of memory that needs 1GB of
> contiguous vmemmap? Or does the fact that sections are smaller than
> 64GB keep this from happening?

AFAIK, the biggest pages we populate the vmemmap range with are 2MB.
Plus, as you pointed out, memory sections on x86_64 are 128M, which is
far smaller than the 64GB of memory it would take to need a 1GB vmemmap
allocation.

Am I missing something?
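To put numbers on that answer, here is a minimal sketch of the arithmetic.
The 128MB section size, 4KB base page, and 64-byte struct page are the
usual x86_64 values, stated here as assumptions rather than taken from the
thread: each section only ever needs 2MB of vmemmap, while a 1GB vmemmap
mapping would only pay off for 64GB of contiguous memory.

	#include <stdio.h>

	/*
	 * Usual x86_64 values; assumptions for illustration, not
	 * values quoted from the thread.
	 */
	#define SECTION_SIZE	(128UL << 20)	/* one memory section: 128MB */
	#define PAGE_SIZE	(4UL << 10)	/* base page size: 4KB */
	#define PAGE_STRUCT	64UL		/* sizeof(struct page) */
	#define PUD_SIZE	(1UL << 30)	/* 1GB huge page */

	int main(void)
	{
		/* struct pages needed to describe one 128MB section: 32768 */
		unsigned long nr_pages = SECTION_SIZE / PAGE_SIZE;

		/* vmemmap bytes per section: 32768 * 64 = 2MB, one PMD mapping */
		unsigned long vmemmap_bytes = nr_pages * PAGE_STRUCT;

		/* memory a full 1GB vmemmap chunk would describe: 64GB */
		unsigned long mem_per_pud = PUD_SIZE / PAGE_STRUCT * PAGE_SIZE;

		printf("vmemmap per section: %luMB\n", vmemmap_bytes >> 20);
		printf("memory per 1GB of vmemmap: %luGB\n", mem_per_pud >> 30);
		return 0;
	}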
On 08.03.21 19:48, Oscar Salvador wrote:
> On Thu, Mar 04, 2021 at 10:42:59AM -0800, Dave Hansen wrote:
>> On 3/1/21 12:32 AM, Oscar Salvador wrote:
>>> We never get to allocate 1GB pages when mapping the vmemmap range.
>>> Drop the dead code both for the aligned and unaligned cases and leave
>>> only the direct map handling.
>>
>> Could you elaborate a bit on why 1GB pages are never used? Is it just
>> unlikely to have a 64GB contiguous area of memory that needs 1GB of
>> contiguous vmemmap? Or does the fact that sections are smaller than
>> 64GB keep this from happening?
>
> AFAIK, the biggest pages we populate the vmemmap range with are 2MB.
> Plus, as you pointed out, memory sections on x86_64 are 128M, which is
> far smaller than the 64GB of memory it would take to need a 1GB vmemmap
> allocation.
>
> Am I missing something?

Right now, it is dead code that you are removing. Just like for 2MB
vmemmap pages, we would have to proactively populate 1GB pages when
adding individual sections, and that can easily waste a lot of memory.

Of course, one could also make a final pass over the page tables to see
where it makes sense to form 1GB pages. But then we would need quite
some logic when removing individual sections (e.g., a 128MB DIMM), and I
remember there are corner cases where we might have to remove boot
memory ...

Long story short, I don't think 1GB vmemmap pages are really worth the
trouble.
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b0e1d215c83e..9ecb3c488ac8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1062,7 +1062,6 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
 	unsigned long next, pages = 0;
 	pmd_t *pmd_base;
 	pud_t *pud;
-	void *page_addr;
 
 	pud = pud_start + pud_index(addr);
 	for (; addr < end; addr = next, pud++) {
@@ -1071,33 +1070,13 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
 		if (!pud_present(*pud))
 			continue;
 
-		if (pud_large(*pud)) {
-			if (IS_ALIGNED(addr, PUD_SIZE) &&
-			    IS_ALIGNED(next, PUD_SIZE)) {
-				if (!direct)
-					free_pagetable(pud_page(*pud),
-						       get_order(PUD_SIZE));
-
-				spin_lock(&init_mm.page_table_lock);
-				pud_clear(pud);
-				spin_unlock(&init_mm.page_table_lock);
-				pages++;
-			} else {
-				/* If here, we are freeing vmemmap pages. */
-				memset((void *)addr, PAGE_INUSE, next - addr);
-
-				page_addr = page_address(pud_page(*pud));
-				if (!memchr_inv(page_addr, PAGE_INUSE,
-						PUD_SIZE)) {
-					free_pagetable(pud_page(*pud),
-						       get_order(PUD_SIZE));
-
-					spin_lock(&init_mm.page_table_lock);
-					pud_clear(pud);
-					spin_unlock(&init_mm.page_table_lock);
-				}
-			}
-
+		if (pud_large(*pud) &&
+		    IS_ALIGNED(addr, PUD_SIZE) &&
+		    IS_ALIGNED(next, PUD_SIZE)) {
+			spin_lock(&init_mm.page_table_lock);
+			pud_clear(pud);
+			spin_unlock(&init_mm.page_table_lock);
+			pages++;
 			continue;
 		}
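For reference, the surviving PUD branch as it reads with the patch
applied, reconstructed from the hunk above with the surrounding context
abbreviated:

	pud = pud_start + pud_index(addr);
	for (; addr < end; addr = next, pud++) {
		...
		if (!pud_present(*pud))
			continue;

		/*
		 * Only the direct map ever has 1GB mappings here, so a
		 * PUD is torn down only when fully covered and nothing
		 * needs freeing; the unaligned vmemmap path is gone.
		 */
		if (pud_large(*pud) &&
		    IS_ALIGNED(addr, PUD_SIZE) &&
		    IS_ALIGNED(next, PUD_SIZE)) {
			spin_lock(&init_mm.page_table_lock);
			pud_clear(pud);
			spin_unlock(&init_mm.page_table_lock);
			pages++;
			continue;
		}
		...
	}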