| Message ID | 20241021051151.4664-1-suhua.tanke@gmail.com (mailing list archive) |
|---|---|
| State | New |
| Series | memblock: uniformly initialize all reserved pages to MIGRATE_MOVABLE |
From: Mike Rapoport (Microsoft) <rppt@kernel.org>

On Mon, 21 Oct 2024 13:11:51 +0800, Hua Su wrote:
> Currently when CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set, the reserved
> pages are initialized to MIGRATE_MOVABLE by default in memmap_init.
>
> Reserved memory mainly store the metadata of struct page. When
> HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON=Y and hugepages are allocated,
> the HVO will remap the vmemmap virtual address range to the page which
> vmemmap_reuse is mapped to. The pages previously mapping the range will
> be freed to the buddy system.
>
> [...]

Applied to for-next branch of memblock.git tree, thanks!

[1/1] memblock: uniformly initialize all reserved pages to MIGRATE_MOVABLE
      commit: ad48825232a91a382f665bb7c3bf0044027791d4

tree: https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
branch: for-next

--
Sincerely yours,
Mike.
Hello,

kernel test robot noticed "kernel_BUG_at_include/linux/mm.h" on:

commit: 0a19e28247d042d639e5a46c3698adeda268a7a2 ("[PATCH] memblock: uniformly initialize all reserved pages to MIGRATE_MOVABLE")
url: https://github.com/intel-lab-lkp/linux/commits/Hua-Su/memblock-uniformly-initialize-all-reserved-pages-to-MIGRATE_MOVABLE/20241021-131358
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/all/20241021051151.4664-1-suhua.tanke@gmail.com/
patch subject: [PATCH] memblock: uniformly initialize all reserved pages to MIGRATE_MOVABLE

in testcase: boot

config: x86_64-randconfig-012-20241023
compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)

+------------------------------------------+------------+------------+
|                                          | a8883372ec | 0a19e28247 |
+------------------------------------------+------------+------------+
| boot_successes                           | 18         | 0          |
| boot_failures                            | 0          | 18         |
| kernel_BUG_at_include/linux/mm.h         | 0          | 18         |
| Oops:invalid_opcode:#[##]SMP_PTI         | 0          | 18         |
| RIP:page_zone                            | 0          | 18         |
| Kernel_panic-not_syncing:Fatal_exception | 0          | 18         |
+------------------------------------------+------------+------------+

If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags

| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202410251024.eb4a89f1-oliver.sang@intel.com

[    0.262363][    T0] ------------[ cut here ]------------
[    0.262921][    T0] kernel BUG at include/linux/mm.h:1637!
[    0.263532][    T0] Oops: invalid opcode: 0000 [#1] SMP PTI
[    0.264140][    T0] CPU: 0 UID: 0 PID: 0 Comm: swapper Tainted: G                T  6.12.0-rc3-00235-g0a19e28247d0 #1
[    0.265300][    T0] Tainted: [T]=RANDSTRUCT
[    0.265762][    T0] RIP: 0010:page_zone (include/linux/mm.h:1858)
[    0.266284][    T0] Code: 43 08 89 ee 48 89 df 31 d2 5b 5d 41 5c 41 5d 41 5e e9 f1 08 02 00 48 8b 07 48 ff c0 75 0e 48 c7 c6 27 2e 99 ac e8 42 73 fd ff <0f> 0b 48 8b 07 48 c1 e8 3e 48 69 c0 40 06 00 00 48 05 c0 63 6c ad
All code
========
   0:	43 08 89 ee 48 89 df 	rex.XB or %cl,-0x2076b712(%r9)
   7:	31 d2                	xor    %edx,%edx
   9:	5b                   	pop    %rbx
   a:	5d                   	pop    %rbp
   b:	41 5c                	pop    %r12
   d:	41 5d                	pop    %r13
   f:	41 5e                	pop    %r14
  11:	e9 f1 08 02 00       	jmpq   0x20907
  16:	48 8b 07             	mov    (%rdi),%rax
  19:	48 ff c0             	inc    %rax
  1c:	75 0e                	jne    0x2c
  1e:	48 c7 c6 27 2e 99 ac 	mov    $0xffffffffac992e27,%rsi
  25:	e8 42 73 fd ff       	callq  0xfffffffffffd736c
  2a:*	0f 0b                	ud2    	<-- trapping instruction
  2c:	48 8b 07             	mov    (%rdi),%rax
  2f:	48 c1 e8 3e          	shr    $0x3e,%rax
  33:	48 69 c0 40 06 00 00 	imul   $0x640,%rax,%rax
  3a:	48 05 c0 63 6c ad    	add    $0xffffffffad6c63c0,%rax

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	48 8b 07             	mov    (%rdi),%rax
   5:	48 c1 e8 3e          	shr    $0x3e,%rax
   9:	48 69 c0 40 06 00 00 	imul   $0x640,%rax,%rax
  10:	48 05 c0 63 6c ad    	add    $0xffffffffad6c63c0,%rax
[    0.268346][    T0] RSP: 0000:fffffffface03dc0 EFLAGS: 00010046
[    0.268988][    T0] RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000000000
[    0.269844][    T0] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    0.270685][    T0] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[    0.271486][    T0] R10: 0000000000000000 R11: 6d75642065676170 R12: 0000000000000001
[    0.272336][    T0] R13: 0000000000159400 R14: fffff7bb05650000 R15: ffff9bfa1ffff178
[    0.273172][    T0] FS:  0000000000000000(0000) GS:ffff9bfcefa00000(0000) knlGS:0000000000000000
[    0.274145][    T0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.274817][    T0] CR2: ffff9bfcfffff000 CR3: 000000015c2b2000 CR4: 00000000000000b0
[    0.275658][    T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.276502][    T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.277343][    T0] Call Trace:
[    0.277691][    T0]  <TASK>
[    0.277992][    T0]  ? __die_body (arch/x86/kernel/dumpstack.c:421)
[    0.278448][    T0]  ? die (arch/x86/kernel/dumpstack.c:449)
[    0.278838][    T0]  ? do_trap (arch/x86/kernel/traps.c:156 arch/x86/kernel/traps.c:197)
[    0.279276][    T0]  ? page_zone (include/linux/mm.h:1858)
[    0.279720][    T0]  ? page_zone (include/linux/mm.h:1858)
[    0.280170][    T0]  ? do_error_trap (arch/x86/kernel/traps.c:218)
[    0.280648][    T0]  ? page_zone (include/linux/mm.h:1858)
[    0.281095][    T0]  ? exc_invalid_op (arch/x86/kernel/traps.c:316)
[    0.281597][    T0]  ? page_zone (include/linux/mm.h:1858)
[    0.282041][    T0]  ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
[    0.282582][    T0]  ? page_zone (include/linux/mm.h:1858)
[    0.283027][    T0]  set_pfnblock_flags_mask (mm/page_alloc.c:408)
[    0.283583][    T0]  reserve_bootmem_region (mm/mm_init.c:729 mm/mm_init.c:765)
[    0.284142][    T0]  free_low_memory_core_early (mm/memblock.c:2192 mm/memblock.c:2205)
[    0.284736][    T0]  ? swiotlb_init_io_tlb_pool+0x86/0x133
[    0.285419][    T0]  memblock_free_all (mm/memblock.c:2252)
[    0.285925][    T0]  mem_init (arch/x86/mm/init_64.c:1360)
[    0.286332][    T0]  mm_core_init (mm/mm_init.c:2658)
[    0.286790][    T0]  start_kernel (init/main.c:965)
[    0.287272][    T0]  x86_64_start_reservations (arch/x86/kernel/head64.c:381)
[    0.287850][    T0]  x86_64_start_kernel (arch/x86/kernel/ebda.c:57)
[    0.288377][    T0]  common_startup_64 (arch/x86/kernel/head_64.S:414)
[    0.288899][    T0]  </TASK>
[    0.289213][    T0] Modules linked in:
[    0.289626][    T0] ---[ end trace 0000000000000000 ]---
[    0.290175][    T0] RIP: 0010:page_zone (include/linux/mm.h:1858)
[    0.290680][    T0] Code: 43 08 89 ee 48 89 df 31 d2 5b 5d 41 5c 41 5d 41 5e e9 f1 08 02 00 48 8b 07 48 ff c0 75 0e 48 c7 c6 27 2e 99 ac e8 42 73 fd ff <0f> 0b 48 8b 07 48 c1 e8 3e 48 69 c0 40 06 00 00 48 05 c0 63 6c ad

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241025/202410251024.eb4a89f1-oliver.sang@intel.com
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 4ba5607aaf19..6dbf2df23eee 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -722,6 +722,10 @@ static void __meminit init_reserved_page(unsigned long pfn, int nid)
 		if (zone_spans_pfn(zone, pfn))
 			break;
 	}
+
+	if (pageblock_aligned(pfn))
+		set_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE);
+
 	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
 }
 #else
Currently when CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set, the reserved
pages are initialized to MIGRATE_MOVABLE by default in memmap_init.

Reserved memory mainly stores the metadata of struct pages. When
HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON=Y and hugepages are allocated,
HVO remaps the vmemmap virtual address range to the page that
vmemmap_reuse is mapped to. The pages that previously mapped the range
are freed to the buddy system.

Before this patch, when CONFIG_DEFERRED_STRUCT_PAGE_INIT was not set,
the freed memory was placed on the Movable list; when
CONFIG_DEFERRED_STRUCT_PAGE_INIT=Y, the freed memory was placed on the
Unmovable list. After this patch, the freed memory is placed on the
Movable list regardless of whether CONFIG_DEFERRED_STRUCT_PAGE_INIT is
set.

E.g. tested on a virtual machine (1000GB) with an Intel(R) Xeon(R)
Platinum 8358P CPU. After VM start:

echo 500000 > /proc/sys/vm/nr_hugepages
cat /proc/meminfo | grep -i huge
HugePages_Total:   500000
HugePages_Free:    500000
HugePages_Rsvd:         0
HugePages_Surp:         0
Hugepagesize:        2048 kB
Hugetlb:         1024000000 kB

cat /proc/pagetypeinfo, before:
Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
Node    0, zone   Normal, type    Unmovable     51      2      1     28     53     35     35     43     40     69   3852
Node    0, zone   Normal, type      Movable   6485   4610    666    202    200    185    208     87     54      2    240
Node    0, zone   Normal, type  Reclaimable      2      2      1     23     13      1      2      1      0      1      0
Node    0, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0

Unmovable ≈ 15GB

after:
Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
Node    0, zone   Normal, type    Unmovable      0      1      1      0      0      0      0      1      1      1      0
Node    0, zone   Normal, type      Movable   1563   4107   1119    189    256    368    286    132    109      4   3841
Node    0, zone   Normal, type  Reclaimable      2      2      1     23     13      1      2      1      0      1      0
Node    0, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0

Signed-off-by: Hua Su <suhua.tanke@gmail.com>
---
 mm/mm_init.c | 4 ++++
 1 file changed, 4 insertions(+)