Message ID | 20200626093450.27741-1-joro@8bytes.org (mailing list archive) |
---|---|
State | New, archived |
Series | x86/mm: Pre-allocate p4d/pud pages for vmalloc area |
On Fri, Jun 26, 2020 at 11:34:50AM +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
>
> Pre-allocate the page-table pages for the vmalloc area at the level
> which needs synchronization on x86. This is P4D for 5-level and PUD
> for 4-level paging.
>
> Doing this at boot makes sure all page-tables in the system have these
> pages already and do not need to be synchronized at runtime. The
> runtime synchronization takes the pgd_lock and iterates over all
> page-tables in the system, so it can take quite long and is better
> avoided.
>
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---

Can't we now remove arch_sync_kernel_mappings() from this same file?
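The argument in the commit message can be illustrated with a toy user-space model. This is only a sketch and not kernel code: it assumes that a new top-level table copies its kernel slots from a reference table when it is created, and every name in it (`toy_pgd`, `toy_pgd_create`, `toy_tables_needing_sync`, `toy_sync_slot`) and the 8-slot table size are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define TOY_SLOTS        8	/* toy size; a real x86-64 top-level table has 512 entries */
#define TOY_KERNEL_FIRST 4	/* toy split: slots at or above this index are kernel slots */

/* Hypothetical stand-in for one top-level page table. */
struct toy_pgd {
	void *slot[TOY_SLOTS];
};

/* A new table starts empty and copies the kernel slots from the reference table. */
static void toy_pgd_create(struct toy_pgd *pgd, const struct toy_pgd *ref)
{
	memset(pgd, 0, sizeof(*pgd));
	for (int i = TOY_KERNEL_FIRST; i < TOY_SLOTS; i++)
		pgd->slot[i] = ref->slot[i];
}

/* Count the tables whose slot idx differs from the reference table. */
static int toy_tables_needing_sync(const struct toy_pgd *ref,
				   const struct toy_pgd *tables, int n, int idx)
{
	int stale = 0;

	for (int i = 0; i < n; i++)
		if (tables[i].slot[idx] != ref->slot[idx])
			stale++;
	return stale;
}

/* Runtime sync: visit every table and copy one slot; returns the tables visited. */
static int toy_sync_slot(const struct toy_pgd *ref, struct toy_pgd *tables,
			 int n, int idx)
{
	for (int i = 0; i < n; i++)
		tables[i].slot[idx] = ref->slot[idx];
	return n;
}
```

In this model, a slot filled in the reference table before any other table is created is inherited by every later table, so `toy_tables_needing_sync()` reports 0 for it. A slot filled afterwards is missing everywhere until `toy_sync_slot()` has walked all tables, which corresponds to the walk under pgd_lock that the patch tries to avoid.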
On Fri, Jun 26, 2020 at 01:07:31PM +0200, Peter Zijlstra wrote:
> Can't we now remove arch_sync_kernel_mappings() from this same file?
Only if we panic on allocation failure.
Joerg
On Fri, Jun 26, 2020 at 01:17:11PM +0200, Joerg Roedel wrote:
> On Fri, Jun 26, 2020 at 01:07:31PM +0200, Peter Zijlstra wrote:
> > Can't we now remove arch_sync_kernel_mappings() from this same file?
>
> Only if we panic on allocation failure.

I think we do that in plenty places already, so sure ;-)
On Fri, Jun 26, 2020 at 01:31:01PM +0200, Peter Zijlstra wrote:
> On Fri, Jun 26, 2020 at 01:17:11PM +0200, Joerg Roedel wrote:
> > On Fri, Jun 26, 2020 at 01:07:31PM +0200, Peter Zijlstra wrote:
> > > Can't we now remove arch_sync_kernel_mappings() from this same file?
> >
> > Only if we panic on allocation failure.
>
> I think we do that in plenty places already, so sure ;-)

That is, this is boot time only, right? clone() would return -ENOMEM, as
it's part of the normal page-table copy.
On Fri, Jun 26, 2020 at 01:32:15PM +0200, Peter Zijlstra wrote:
> That is, this is boot time only, right? clone() would return -ENOMEM, as
> it's part of the normal page-table copy.

Yes, the pre-allocation happens shortly after the buddy allocator took
over from bootmem. I don't quite get what clone() has to do with it.

Joerg
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index dbae185511cd..475a4008445b 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1238,6 +1238,59 @@ static void __init register_page_bootmem_info(void)
 #endif
 }
 
+/*
+ * Pre-allocates page-table pages for the vmalloc area in the kernel page-table.
+ * Only the level which needs to be synchronized between all page-tables is
+ * allocated because the synchronization can be expensive.
+ */
+static void __init preallocate_vmalloc_pages(void)
+{
+	unsigned long addr;
+	const char *lvl;
+	int count = 0;
+
+	for (addr = VMALLOC_START; addr <= VMALLOC_END; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
+		pgd_t *pgd = pgd_offset_k(addr);
+		p4d_t *p4d;
+		pud_t *pud;
+
+		p4d = p4d_offset(pgd, addr);
+		if (p4d_none(*p4d)) {
+			/* Can only happen with 5-level paging */
+			p4d = p4d_alloc(&init_mm, pgd, addr);
+			if (!p4d) {
+				lvl = "p4d";
+				goto failed;
+			}
+			count += 1;
+		}
+
+		if (pgtable_l5_enabled())
+			continue;
+
+		pud = pud_offset(p4d, addr);
+		if (pud_none(*pud)) {
+			/* Ends up here only with 4-level paging */
+			pud = pud_alloc(&init_mm, p4d, addr);
+			if (!pud) {
+				lvl = "pud";
+				goto failed;
+			}
+			count += 1;
+		}
+	}
+
+	return;
+
+failed:
+
+	/*
+	 * A failure here is not fatal - If the pages can be allocated later it
+	 * will be synchronized to other page-tables.
+	 */
+	pr_err("Failed to pre-allocate %s pages for vmalloc area\n", lvl);
+}
+
 void __init mem_init(void)
 {
 	pci_iommu_alloc();
@@ -1261,6 +1314,8 @@ void __init mem_init(void)
 	if (get_gate_vma(&init_mm))
 		kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR, PAGE_SIZE, KCORE_USER);
 
+	preallocate_vmalloc_pages();
+
 	mem_init_print_info(NULL);
 }
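
To get a feel for how many iterations the loop in preallocate_vmalloc_pages() performs, its stepping can be replayed in user space. This is a rough sketch under stated assumptions, not part of the patch: the constants are assumed to be the 4-level, non-KASLR defaults from the kernel's x86-64 memory-map documentation (vmalloc area from 0xffffc90000000000 to 0xffffe8ffffffffff, PGDIR_SHIFT of 39), and the `SKETCH_*` macros and both helper functions are hypothetical names introduced here.

```c
#include <stdint.h>

/* Assumed 4-level defaults without KASLR; not taken from the patch itself. */
#define SKETCH_VMALLOC_START	0xffffc90000000000ULL
#define SKETCH_VMALLOC_END	0xffffe8ffffffffffULL
#define SKETCH_PGDIR_SHIFT	39
#define SKETCH_PGDIR_SIZE	(1ULL << SKETCH_PGDIR_SHIFT)

/* Same rounding as the kernel's ALIGN() for a power-of-two alignment. */
#define SKETCH_ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))

/* Index of the top-level slot that covers addr (512 slots per table). */
static unsigned int sketch_pgd_index(uint64_t addr)
{
	return (unsigned int)((addr >> SKETCH_PGDIR_SHIFT) & 511);
}

/*
 * Replay the loop header from the patch and count its iterations.
 * The range must end well below UINT64_MAX, otherwise addr would wrap.
 */
static unsigned int sketch_count_steps(uint64_t start, uint64_t end)
{
	unsigned int steps = 0;
	uint64_t addr;

	for (addr = start; addr <= end;
	     addr = SKETCH_ALIGN(addr + 1, SKETCH_PGDIR_SIZE))
		steps++;
	return steps;
}
```

Under these assumptions, `sketch_count_steps(SKETCH_VMALLOC_START, SKETCH_VMALLOC_END)` yields 64, one step per 512 GiB top-level slot of the 32 TiB area, covering slot indices 402 through 465. With 4-level paging each step allocates at most one PUD page, so the boot-time cost would be bounded by 64 pages in this configuration.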