Message ID | 1571131302-32290-1-git-send-email-anshuman.khandual@arm.com (mailing list archive) |
---|---|
Series | mm/debug: Add tests validating architecture page table helpers |
The x86 will crash with linux-next during boot due to this series (v5) with the
config below plus CONFIG_DEBUG_VM_PGTABLE=y. I am not sure if v6 would address
it.

https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config

[ 33.862600][ T1] page:ffffea0009000000 is uninitialized and poisoned
[ 33.862608][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
ffffff871140][ T1] ? _raw_spin_unlock_irq+0x27/0x40
[ 33.871140][ T1] ? rest_init+0x307/0x307
[ 33.871140][ T1] kernel_init+0x11/0x139
[ 33.871140][ T1] ? rest_init+0x307/0x307
[ 33.871140][ T1] ret_from_fork+0x27/0x50
[ 33.871140][ T1] Modules linked in:
[ 33.871140][ T1] ---[ end trace e99d392b0f7befbd ]---
[ 33.871140][ T1] RIP: 0010:alloc_gigantic_page_order+0x3fe/0x490

On Tue, 2019-10-15 at 14:51 +0530, Anshuman Khandual wrote:
> This series adds a test validation for architecture exported page table
> helpers. The test patch in the series adds basic transformation tests at
> various levels of the page table. Before that, the series exports the
> gigantic page allocation function from HugeTLB.
>
> This test was originally suggested by Catalin during arm64 THP migration
> RFC discussion earlier. Going forward it can include more specific tests
> with respect to various generic MM functions like THP, HugeTLB etc and
> platform specific tests.
>
> https://lore.kernel.org/linux-mm/20190628102003.GA56463@arrakis.emea.arm.com/
>
> Changes in V6:
>
> - Moved alloc_gigantic_page_order() into mm/page_alloc.c per Michal
> - Moved alloc_gigantic_page_order() within CONFIG_CONTIG_ALLOC in the test
> - Folded Andrew's include/asm-generic/pgtable.h fix into the test patch 2/2
>
> Changes in V5: (https://patchwork.kernel.org/project/linux-mm/list/?series=185991)
>
> - Redefined and moved X86 mm_p4d_folded() into a different header per Kirill/Ingo
> - Updated the config option comment per Ingo and dropped 'kernel module' reference
> - Updated the commit message and dropped 'kernel module' reference
> - Changed DEBUG_ARCH_PGTABLE_TEST into DEBUG_VM_PGTABLE per Ingo
> - Moved config option from mm/Kconfig.debug into lib/Kconfig.debug
> - Renamed core test function arch_pgtable_tests() as debug_vm_pgtable()
> - Renamed mm/arch_pgtable_test.c as mm/debug_vm_pgtable.c
> - debug_vm_pgtable() gets called from kernel_init_freeable() after init_mm_internals()
> - Added an entry in Documentation/features/debug/ per Ingo
> - Enabled the test on arm64 and x86 platforms for now
>
> Changes in V4: (https://patchwork.kernel.org/project/linux-mm/list/?series=183465)
>
> - Disable DEBUG_ARCH_PGTABLE_TEST for ARM and IA64 platforms
>
> Changes in V3: (https://lore.kernel.org/patchwork/project/lkml/list/?series=411216)
>
> - Changed test trigger from module format into late_initcall()
> - Marked all functions with __init to be freed after completion
> - Changed all __PGTABLE_PXX_FOLDED checks as mm_pxx_folded()
> - Folded in PPC32 fixes from Christophe
>
> Changes in V2:
>
> https://lore.kernel.org/linux-mm/1568268173-31302-1-git-send-email-anshuman.khandual@arm.com/T/#t
>
> - Fixed small typo error in MODULE_DESCRIPTION()
> - Fixed m68k build problems for lvalue concerns in pmd_xxx_tests()
> - Fixed dynamic page table level folding problems on x86 as per Kirill
> - Fixed second pointers during pxx_populate_tests() per Kirill and Gerald
> - Allocate and free pte table with pte_alloc_one/pte_free per Kirill
> - Modified pxx_clear_tests() to accommodate s390 lower 12 bits situation
> - Changed RANDOM_NZVALUE value from 0xbe to 0xff
> - Changed allocation, usage, free sequence for saved_ptep
> - Renamed VMA_FLAGS as VMFLAGS
> - Implemented a new method for random vaddr generation
> - Implemented some other cleanups
> - Dropped extern reference to mm_alloc()
> - Created and exported new alloc_gigantic_page_order()
> - Dropped the custom allocator and used new alloc_gigantic_page_order()
>
> Changes in V1:
>
> https://lore.kernel.org/linux-mm/1567497706-8649-1-git-send-email-anshuman.khandual@arm.com/
>
> - Added fallback mechanism for PMD aligned memory allocation failure
>
> Changes in RFC V2:
>
> https://lore.kernel.org/linux-mm/1565335998-22553-1-git-send-email-anshuman.khandual@arm.com/T/#u
>
> - Moved test module and its config from lib/ to mm/
> - Renamed config TEST_ARCH_PGTABLE as DEBUG_ARCH_PGTABLE_TEST
> - Renamed file from test_arch_pgtable.c to arch_pgtable_test.c
> - Added relevant MODULE_DESCRIPTION() and MODULE_AUTHOR() details
> - Dropped loadable module config option
> - Basic tests now use memory blocks with required size and alignment
> - PUD aligned memory block gets allocated with alloc_contig_range()
> - If PUD aligned memory could not be allocated it falls back on PMD aligned
>   memory block from page allocator and pud_* tests are skipped
> - Clear and populate tests now operate on real in memory page table entries
> - Dummy mm_struct gets allocated with mm_alloc()
> - Dummy page table entries get allocated with [pud|pmd|pte]_alloc_[map]()
> - Simplified [p4d|pgd]_basic_tests(), now has random values in the entries
>
> Original RFC V1:
>
> https://lore.kernel.org/linux-mm/1564037723-26676-1-git-send-email-anshuman.khandual@arm.com/
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Steven Price <Steven.Price@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Sri Krishna chowdary <schowdary@nvidia.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Russell King - ARM Linux <linux@armlinux.org.uk>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Vineet Gupta <vgupta@synopsys.com>
> Cc: James Hogan <jhogan@kernel.org>
> Cc: Paul Burton <paul.burton@mips.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Kirill A. Shutemov <kirill@shutemov.name>
> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> Cc: Christophe Leroy <christophe.leroy@c-s.fr>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: linux-snps-arc@lists.infradead.org
> Cc: linux-mips@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-ia64@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-s390@vger.kernel.org
> Cc: linux-sh@vger.kernel.org
> Cc: sparclinux@vger.kernel.org
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
>
>
> Anshuman Khandual (2):
>   mm/page_alloc: Make alloc_gigantic_page() available for general use
>   mm/debug: Add tests validating architecture page table helpers
>
>  .../debug/debug-vm-pgtable/arch-support.txt |  34 ++
>  arch/arm64/Kconfig                          |   1 +
>  arch/x86/Kconfig                            |   1 +
>  arch/x86/include/asm/pgtable_64.h           |   6 +
>  include/asm-generic/pgtable.h               |   6 +
>  include/linux/gfp.h                         |   3 +
>  init/main.c                                 |   1 +
>  lib/Kconfig.debug                           |  21 +
>  mm/Makefile                                 |   1 +
>  mm/debug_vm_pgtable.c                       | 450 ++++++++++++++++++
>  mm/hugetlb.c                                |  76 +--
>  mm/page_alloc.c                             |  98 ++++
>  12 files changed, 623 insertions(+), 75 deletions(-)
>  create mode 100644 Documentation/features/debug/debug-vm-pgtable/arch-support.txt
>  create mode 100644 mm/debug_vm_pgtable.c
>
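
For readers unfamiliar with what a "basic transformation test" on a page table helper looks like, the following is a minimal userspace sketch, not code from the series. The `pte_t` layout and the three flag bits are assumptions made up for illustration; only the helper names (`pte_mkyoung()`, `pte_mkold()` and so on) mirror names that exist in the kernel. The idea it shows is that each "make" helper must be observable through its matching query helper, and each "clear" helper must undo it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page table entry: one 64-bit word with three made-up flag bits. */
typedef struct { uint64_t val; } pte_t;

#define PTE_YOUNG (UINT64_C(1) << 0)
#define PTE_DIRTY (UINT64_C(1) << 1)
#define PTE_WRITE (UINT64_C(1) << 2)

/* Query helpers: report whether a flag is set. */
static bool pte_young(pte_t p) { return (p.val & PTE_YOUNG) != 0; }
static bool pte_dirty(pte_t p) { return (p.val & PTE_DIRTY) != 0; }
static bool pte_write(pte_t p) { return (p.val & PTE_WRITE) != 0; }

/* Transformation helpers: return a modified copy of the entry. */
static pte_t pte_mkyoung(pte_t p)   { p.val |= PTE_YOUNG;  return p; }
static pte_t pte_mkold(pte_t p)     { p.val &= ~PTE_YOUNG; return p; }
static pte_t pte_mkdirty(pte_t p)   { p.val |= PTE_DIRTY;  return p; }
static pte_t pte_mkclean(pte_t p)   { p.val &= ~PTE_DIRTY; return p; }
static pte_t pte_mkwrite(pte_t p)   { p.val |= PTE_WRITE;  return p; }
static pte_t pte_wrprotect(pte_t p) { p.val &= ~PTE_WRITE; return p; }

/* Count the checks that fail; 0 means every set/clear pair round-trips. */
static int pte_basic_tests(pte_t pte)
{
	int failures = 0;

	failures += !pte_young(pte_mkyoung(pte));
	failures += pte_young(pte_mkold(pte));
	failures += !pte_dirty(pte_mkdirty(pte));
	failures += pte_dirty(pte_mkclean(pte));
	failures += !pte_write(pte_mkwrite(pte));
	failures += pte_write(pte_wrprotect(pte));

	return failures;
}
```

Calling `pte_basic_tests()` on any starting value, for example an all-zero entry, should return 0 for a consistent set of helpers. A real in-kernel test would warn rather than count failures and would operate on the architecture's own entry types.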
On 10/15/2019 08:11 PM, Qian Cai wrote:
> The x86 will crash with linux-next during boot due to this series (v5) with the
> config below plus CONFIG_DEBUG_VM_PGTABLE=y. I am not sure if v6 would address
> it.
>
> https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config
>
> [ 33.862600][ T1] page:ffffea0009000000 is uninitialized and poisoned
> [ 33.862608][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
> ffffff871140][ T1] ? _raw_spin_unlock_irq+0x27/0x40
> [ 33.871140][ T1] ? rest_init+0x307/0x307
> [ 33.871140][ T1] kernel_init+0x11/0x139
> [ 33.871140][ T1] ? rest_init+0x307/0x307
> [ 33.871140][ T1] ret_from_fork+0x27/0x50
> [ 33.871140][ T1] Modules linked in:
> [ 33.871140][ T1] ---[ end trace e99d392b0f7befbd ]---
> [ 33.871140][ T1] RIP: 0010:alloc_gigantic_page_order+0x3fe/0x490

Hmm, with defconfig (DEBUG_VM=y and DEBUG_VM_PGTABLE=y) it does not crash but
with the config above, it does. Just wondering if it is possible that these
pages might not have been initialized yet because DEFERRED_STRUCT_PAGE_INIT=y?

[ 13.898549][ T1] page:ffffea0005000000 is uninitialized and poisoned
[ 13.898549][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[ 13.898549][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[ 13.898549][ T1] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
[ 13.898549][ T1] ------------[ cut here ]------------
[ 13.898549][ T1] kernel BUG at ./include/linux/mm.h:1107!
[ 13.898549][ T1] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 13.898549][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-next-20191015+ #
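
The "uninitialized and poisoned" report and the all-ones `raw:` dump lines can be read as follows: a page descriptor that has not been initialised yet is still filled with 0xff bytes, and the debug check recognises that pattern. Below is a small userspace model of that idea, offered as a sketch only. `struct page_model`, `poison_page()`, `init_page()` and `page_poisoned()` are hypothetical names for this illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in for a page descriptor; only the flags word matters here. */
struct page_model {
	unsigned long flags;
	unsigned long other[3];
};

/* All bits set, matching the ffffffffffffffff words in the dump above. */
#define POISON_PATTERN (~0UL)

/* Model a descriptor whose memory has been poisoned but not yet initialised. */
static void poison_page(struct page_model *p)
{
	memset(p, 0xff, sizeof(*p));
}

/* Model the (possibly deferred) initialisation of the descriptor. */
static void init_page(struct page_model *p)
{
	memset(p, 0, sizeof(*p));
}

/* A descriptor counts as poisoned while its flags word is still all ones. */
static bool page_poisoned(const struct page_model *p)
{
	return p->flags == POISON_PATTERN;
}
```

In this model, touching a descriptor between `poison_page()` and `init_page()` is exactly the situation the check is meant to catch, which is consistent with the suspicion about DEFERRED_STRUCT_PAGE_INIT above.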
On Tue, 2019-10-15 at 20:51 +0530, Anshuman Khandual wrote:
>
> On 10/15/2019 08:11 PM, Qian Cai wrote:
> > The x86 will crash with linux-next during boot due to this series (v5) with the
> > config below plus CONFIG_DEBUG_VM_PGTABLE=y. I am not sure if v6 would address
> > it.
> >
> > https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config
> >
> > [ 33.862600][ T1] page:ffffea0009000000 is uninitialized and poisoned
> > [ 33.862608][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > ffffff871140][ T1] ? _raw_spin_unlock_irq+0x27/0x40
> > [ 33.871140][ T1] ? rest_init+0x307/0x307
> > [ 33.871140][ T1] kernel_init+0x11/0x139
> > [ 33.871140][ T1] ? rest_init+0x307/0x307
> > [ 33.871140][ T1] ret_from_fork+0x27/0x50
> > [ 33.871140][ T1] Modules linked in:
> > [ 33.871140][ T1] ---[ end trace e99d392b0f7befbd ]---
> > [ 33.871140][ T1] RIP: 0010:alloc_gigantic_page_order+0x3fe/0x490
>
> Hmm, with defconfig (DEBUG_VM=y and DEBUG_VM_PGTABLE=y) it does not crash but
> with the config above, it does. Just wondering if it is possible that these
> pages might not have been initialized yet because DEFERRED_STRUCT_PAGE_INIT=y?

Yes, this patch works fine.

diff --git a/init/main.c b/init/main.c
index 676d8020dd29..591be8f9e8e0 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1177,7 +1177,6 @@ static noinline void __init kernel_init_freeable(void)
 	workqueue_init();
 
 	init_mm_internals();
-	debug_vm_pgtable();
 
 	do_pre_smp_initcalls();
 	lockup_detector_init();
@@ -1186,6 +1185,8 @@ static noinline void __init kernel_init_freeable(void)
 	sched_init_smp();
 
 	page_alloc_init_late();
+	debug_vm_pgtable();
+
 	/* Initialize page ext after all struct pages are initialized. */
 	page_ext_init();

>
> [ 13.898549][ T1] page:ffffea0005000000 is uninitialized and poisoned
> [ 13.898549][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> [ 13.898549][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> [ 13.898549][ T1] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> [ 13.898549][ T1] ------------[ cut here ]------------
> [ 13.898549][ T1] kernel BUG at ./include/linux/mm.h:1107!
> [ 13.898549][ T1] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> [ 13.898549][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-next-20191015+ #
On Tue 15-10-19 20:51:11, Anshuman Khandual wrote:
>
>
> On 10/15/2019 08:11 PM, Qian Cai wrote:
> > The x86 will crash with linux-next during boot due to this series (v5) with the
> > config below plus CONFIG_DEBUG_VM_PGTABLE=y. I am not sure if v6 would address
> > it.
> >
> > https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config
> >
> > [ 33.862600][ T1] page:ffffea0009000000 is uninitialized and poisoned
> > [ 33.862608][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > ffffff871140][ T1] ? _raw_spin_unlock_irq+0x27/0x40
> > [ 33.871140][ T1] ? rest_init+0x307/0x307
> > [ 33.871140][ T1] kernel_init+0x11/0x139
> > [ 33.871140][ T1] ? rest_init+0x307/0x307
> > [ 33.871140][ T1] ret_from_fork+0x27/0x50
> > [ 33.871140][ T1] Modules linked in:
> > [ 33.871140][ T1] ---[ end trace e99d392b0f7befbd ]---
> > [ 33.871140][ T1] RIP: 0010:alloc_gigantic_page_order+0x3fe/0x490
>
> Hmm, with defconfig (DEBUG_VM=y and DEBUG_VM_PGTABLE=y) it does not crash but
> with the config above, it does. Just wondering if it is possible that these
> pages might not have been initialized yet because DEFERRED_STRUCT_PAGE_INIT=y?

Quite likely. You need to wait for page_alloc_init_late to finish.

>
> [ 13.898549][ T1] page:ffffea0005000000 is uninitialized and poisoned
> [ 13.898549][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> [ 13.898549][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> [ 13.898549][ T1] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> [ 13.898549][ T1] ------------[ cut here ]------------
> [ 13.898549][ T1] kernel BUG at ./include/linux/mm.h:1107!
> [ 13.898549][ T1] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> [ 13.898549][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-next-20191015+ #
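
The ordering constraint discussed here (run the test only after deferred page initialisation has completed) can be illustrated with a toy userspace model. This is a sketch under stated assumptions, not kernel code: `NR_PAGES`, `EARLY_PAGES`, `early_init()`, `deferred_init_finish()` and `range_usable()` are all hypothetical names, with `deferred_init_finish()` standing in for the point after which `page_alloc_init_late()` has finished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES    16	/* hypothetical size of the modelled memory */
#define EARLY_PAGES 4	/* hypothetical: descriptors initialised before deferred init */

static bool page_initialised[NR_PAGES];

/* Early boot: only the first EARLY_PAGES descriptors are initialised. */
static void early_init(void)
{
	for (size_t i = 0; i < EARLY_PAGES; i++)
		page_initialised[i] = true;
}

/* Models the state once deferred initialisation has fully completed. */
static void deferred_init_finish(void)
{
	for (size_t i = 0; i < NR_PAGES; i++)
		page_initialised[i] = true;
}

/* A contiguous range may be handed out only if every descriptor in it is ready. */
static bool range_usable(size_t start, size_t count)
{
	if (start > NR_PAGES || count > NR_PAGES - start)
		return false;

	for (size_t i = start; i < start + count; i++)
		if (!page_initialised[i])
			return false;

	return true;
}
```

In this model, a large contiguous request made right after `early_init()` spans descriptors that are not ready, while the same request succeeds once `deferred_init_finish()` has run. That is the reason for moving a caller that needs a large contiguous block to after the late initialisation step.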
On 10/16/2019 12:12 AM, Qian Cai wrote:
> On Tue, 2019-10-15 at 20:51 +0530, Anshuman Khandual wrote:
>>
>> On 10/15/2019 08:11 PM, Qian Cai wrote:
>>> The x86 will crash with linux-next during boot due to this series (v5) with the
>>> config below plus CONFIG_DEBUG_VM_PGTABLE=y. I am not sure if v6 would address
>>> it.
>>>
>>> https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config
>>>
>>> [ 33.862600][ T1] page:ffffea0009000000 is uninitialized and poisoned
>>> [ 33.862608][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
>>> ffffff871140][ T1] ? _raw_spin_unlock_irq+0x27/0x40
>>> [ 33.871140][ T1] ? rest_init+0x307/0x307
>>> [ 33.871140][ T1] kernel_init+0x11/0x139
>>> [ 33.871140][ T1] ? rest_init+0x307/0x307
>>> [ 33.871140][ T1] ret_from_fork+0x27/0x50
>>> [ 33.871140][ T1] Modules linked in:
>>> [ 33.871140][ T1] ---[ end trace e99d392b0f7befbd ]---
>>> [ 33.871140][ T1] RIP: 0010:alloc_gigantic_page_order+0x3fe/0x490
>>
>> Hmm, with defconfig (DEBUG_VM=y and DEBUG_VM_PGTABLE=y) it does not crash but
>> with the config above, it does. Just wondering if it is possible that these
>> pages might not have been initialized yet because DEFERRED_STRUCT_PAGE_INIT=y?
>
> Yes, this patch works fine.
>
> diff --git a/init/main.c b/init/main.c
> index 676d8020dd29..591be8f9e8e0 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1177,7 +1177,6 @@ static noinline void __init kernel_init_freeable(void)
>  	workqueue_init();
>  
>  	init_mm_internals();
> -	debug_vm_pgtable();
>  
>  	do_pre_smp_initcalls();
>  	lockup_detector_init();
> @@ -1186,6 +1185,8 @@ static noinline void __init kernel_init_freeable(void)
>  	sched_init_smp();
>  
>  	page_alloc_init_late();
> +	debug_vm_pgtable();
> +
>  	/* Initialize page ext after all struct pages are initialized. */
>  	page_ext_init();
>

Sure, will keep this in mind if we end up with a memory allocation approach
for this test at all.

>>
>> [ 13.898549][ T1] page:ffffea0005000000 is uninitialized and poisoned
>> [ 13.898549][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
>> [ 13.898549][ T1] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
>> [ 13.898549][ T1] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
>> [ 13.898549][ T1] ------------[ cut here ]------------
>> [ 13.898549][ T1] kernel BUG at ./include/linux/mm.h:1107!
>> [ 13.898549][ T1] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
>> [ 13.898549][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-next-20191015+ #
>