Message ID | 20241021004613.18793-3-dpsmith@apertussolutions.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Boot modules for Hyperlaunch | expand |
On 21/10/2024 1:45 am, Daniel P. Smith wrote: > To start transitioning consider_modules() over to struct boot_module, begin > with taking the array of struct boot_modules but use the temporary struct > element mod. > > No functional change intended. > > Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com> > Reviewed-by: Jason Andryuk <jason.andryuk@amd.com> I've taken patch 1 (currently in testing to push). However, this change breaks PV Shim (same test case that I debugged the OSSTest failure with on Saturday). (XEN) PVH start info: (pa 0000ffc0) (XEN) version: 1 (XEN) flags: 0 (XEN) nr_modules: 2 (XEN) modlist_pa: 000000000000ff60 (XEN) cmdline_pa: 000000000000ffa0 (XEN) cmdline: 'pv-shim console=xen,pv' (XEN) rsdp_pa: 00000000fc008000 (XEN) mod[0].pa: 0000000001042000 (XEN) mod[0].size: 0000000006907440 (XEN) mod[0].cmdline_pa: 0000000000000000 (XEN) mod[1].pa: 00000000016d9000 (XEN) mod[1].size: 0000000019870751 (XEN) mod[1].cmdline_pa: 0000000000000000 (XEN) Bootloader: PVH Directboot (XEN) Command line: pv-shim console=xen,pv (XEN) Xen image load base address: 0 ... (XEN) ----[ Xen-4.20-unstable x86_64 debug=y ubsan=y Not tainted ]---- (XEN) CPU: 0 (XEN) RIP: e008:[<ffff82d040a51d38>] arch/x86/pv/dom0_build.c#dom0_construct+0x2020/0x24eb ... (XEN) Xen call trace: (XEN) [<ffff82d040a51d38>] R arch/x86/pv/dom0_build.c#dom0_construct+0x2020/0x24eb (XEN) [<ffff82d040a52222>] F dom0_construct_pv+0x1f/0xcd (XEN) [<ffff82d040a7bb27>] F construct_dom0+0xad/0xc0 (XEN) [<ffff82d040a6eb3f>] F __start_xen+0x50f2/0x53aa (XEN) [<ffff82d040206184>] F __high_start+0x94/0xa0 ... (XEN) Pagetable walk from ffff8287fffffff8: (XEN) L4[0x105] = 000000001fa17063 ffffffffffffffff (XEN) L3[0x01f] = 0000000000000000 ffffffffffffffff (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) FATAL PAGE FAULT (XEN) [error_code=0000] (XEN) Faulting linear address: ffff8287fffffff8 (XEN) **************************************** We've got a wild access, but it's not clear exactly what's going on here. addr2line says it's line 904 which is in the middle of the page_list_for_each() loop setting up the physmap. My gut feeling is that the pointer -> array conversion in consider_modules() isn't correct, because that's literally the only non-trivial change, but I can't spot anything concretely wrong... ~Andrew
On 21/10/2024 10:53 am, Andrew Cooper wrote: > On 21/10/2024 1:45 am, Daniel P. Smith wrote: >> To start transitioning consider_modules() over to struct boot_module, begin >> with taking the array of struct boot_modules but use the temporary struct >> element mod. >> >> No functional change intended. >> >> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com> >> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com> > I've taken patch 1 (currently in testing to push). > > However, this change breaks PV Shim (same test case that I debugged the > OSSTest failure with on Saturday). > > (XEN) PVH start info: (pa 0000ffc0) > (XEN) version: 1 > (XEN) flags: 0 > (XEN) nr_modules: 2 > (XEN) modlist_pa: 000000000000ff60 > (XEN) cmdline_pa: 000000000000ffa0 > (XEN) cmdline: 'pv-shim console=xen,pv' > (XEN) rsdp_pa: 00000000fc008000 > (XEN) mod[0].pa: 0000000001042000 > (XEN) mod[0].size: 0000000006907440 > (XEN) mod[0].cmdline_pa: 0000000000000000 > (XEN) mod[1].pa: 00000000016d9000 > (XEN) mod[1].size: 0000000019870751 > (XEN) mod[1].cmdline_pa: 0000000000000000 > (XEN) Bootloader: PVH Directboot > (XEN) Command line: pv-shim console=xen,pv > (XEN) Xen image load base address: 0 > ... > (XEN) ----[ Xen-4.20-unstable x86_64 debug=y ubsan=y Not tainted ]---- > (XEN) CPU: 0 > (XEN) RIP: e008:[<ffff82d040a51d38>] > arch/x86/pv/dom0_build.c#dom0_construct+0x2020/0x24eb > ... > (XEN) Xen call trace: > (XEN) [<ffff82d040a51d38>] R > arch/x86/pv/dom0_build.c#dom0_construct+0x2020/0x24eb > (XEN) [<ffff82d040a52222>] F dom0_construct_pv+0x1f/0xcd > (XEN) [<ffff82d040a7bb27>] F construct_dom0+0xad/0xc0 > (XEN) [<ffff82d040a6eb3f>] F __start_xen+0x50f2/0x53aa > (XEN) [<ffff82d040206184>] F __high_start+0x94/0xa0 > ... > (XEN) Pagetable walk from ffff8287fffffff8: > (XEN) L4[0x105] = 000000001fa17063 ffffffffffffffff > (XEN) L3[0x01f] = 0000000000000000 ffffffffffffffff > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) FATAL PAGE FAULT > (XEN) [error_code=0000] > (XEN) Faulting linear address: ffff8287fffffff8 > (XEN) **************************************** > > We've got a wild access, but it's not clear exactly what's going on here. > > addr2line says it's line 904 which is in the middle of the > page_list_for_each() loop setting up the physmap. > > My gut feeling is that the pointer -> array conversion in > consider_modules() isn't correct, because that's literally the only > non-trivial change, but I can't spot anything concretely wrong... To follow up, this isn't really the fault of this patch. It's a hindenbug left by prior patches. Fix at https://lore.kernel.org/xen-devel/20241022124114.84498-1-andrew.cooper3@citrix.com/T/#u ~Andrew
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index db670258d650..1eafa0a61e0e 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -645,7 +645,7 @@ static void __init noinline move_xen(void) #undef BOOTSTRAP_MAP_LIMIT static uint64_t __init consider_modules( - uint64_t s, uint64_t e, uint32_t size, const module_t *mod, + uint64_t s, uint64_t e, uint32_t size, const struct boot_module mods[], unsigned int nr_mods, unsigned int this_mod) { unsigned int i; @@ -655,20 +655,20 @@ static uint64_t __init consider_modules( for ( i = 0; i < nr_mods ; ++i ) { - uint64_t start = (uint64_t)mod[i].mod_start << PAGE_SHIFT; - uint64_t end = start + PAGE_ALIGN(mod[i].mod_end); + uint64_t start = pfn_to_paddr(mods[i].mod->mod_start); + uint64_t end = start + PAGE_ALIGN(mods[i].mod->mod_end); if ( i == this_mod ) continue; if ( s < end && start < e ) { - end = consider_modules(end, e, size, mod + i + 1, + end = consider_modules(end, e, size, &mods[i + 1], nr_mods - i - 1, this_mod - i - 1); if ( end ) return end; - return consider_modules(s, start, size, mod + i + 1, + return consider_modules(s, start, size, &mods[i + 1], nr_mods - i - 1, this_mod - i - 1); } } @@ -1451,7 +1451,7 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) { /* Don't overlap with modules. */ end = consider_modules(s, e, reloc_size + mask, - mod, bi->nr_modules, -1); + bi->mods, bi->nr_modules, -1); end &= ~mask; } else @@ -1486,7 +1486,7 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) continue; /* Don't overlap with other modules (or Xen itself). */ - end = consider_modules(s, e, size, mod, + end = consider_modules(s, e, size, bi->mods, bi->nr_modules + relocated, j); if ( highmem_start && end > highmem_start ) @@ -1513,7 +1513,7 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) while ( !kexec_crash_area.start ) { /* Don't overlap with modules (or Xen itself). */ - e = consider_modules(s, e, PAGE_ALIGN(kexec_crash_area.size), mod, + e = consider_modules(s, e, PAGE_ALIGN(kexec_crash_area.size), bi->mods, bi->nr_modules + relocated, -1); if ( s >= e ) break;