Message ID | 20240119145241.769622-1-bhe@redhat.com (mailing list archive) |
---|---|
Series | Split crash out from kexec and clean up related config items |
Hi Baoquan,

On 19/01/24 8:22 pm, Baoquan He wrote:
> Motivation:
> =============
> Previously, LKP reported a building error. When investigating, it can't
> be resolved reasonably with the present messy kdump config items.
>
>  https://lore.kernel.org/oe-kbuild-all/202312182200.Ka7MzifQ-lkp@intel.com/
>
> The kdump (crash dumping) related config items could cause confusion:
>
> Firstly,
> ---
> CRASH_CORE enables codes including
>  - crashkernel reservation;
>  - elfcorehdr updating;
>  - vmcoreinfo exporting;
>  - crash hotplug handling;
>
> Now fadump of powerpc, kcore dynamic debugging and kdump all select
> CRASH_CORE, while
>  - fadump needs crashkernel parsing, vmcoreinfo exporting, and accessing
>    global variable 'elfcorehdr_addr';
>  - kcore only needs vmcoreinfo exporting;
>  - kdump needs all of the current kernel/crash_core.c.
>
> So only enabling PROC_KCORE or FA_DUMP will enable CRASH_CORE, this
> misleads people into thinking crash dumping is enabled when actually
> it's not.
>
> Secondly,
> ---
> It's not reasonable to allow KEXEC_CORE select CRASH_CORE.
>
> Because KEXEC_CORE enables codes which allocate control pages, copy
> kexec/kdump segments, and prepare for switching. These codes are
> shared by both kexec reboot and kdump. We could want kexec reboot,
> but disable kdump. In that case, CRASH_CORE should not be selected.
>
>  --------------------
>  CONFIG_CRASH_CORE=y
>  CONFIG_KEXEC_CORE=y
>  CONFIG_KEXEC=y
>  CONFIG_KEXEC_FILE=y
>  ---------------------
>
> Thirdly,
> ---
> It's not reasonable to allow CRASH_DUMP select KEXEC_CORE.
>
> That could make KEXEC_CORE and CRASH_DUMP be enabled independently from
> KEXEC or KEXEC_FILE. However, w/o KEXEC or KEXEC_FILE, the KEXEC_CORE
> code built in doesn't make any sense because no kernel loading or
> switching will happen to utilize the KEXEC_CORE code.
>  ---------------------
>  CONFIG_CRASH_CORE=y
>  CONFIG_KEXEC_CORE=y
>  CONFIG_CRASH_DUMP=y
>  ---------------------
>
> In this case, what is worse, on arch sh and arm, KEXEC relies on MMU,
> while CRASH_DUMP can still be enabled when !MMU, then compiling error is
> seen as the lkp test robot reported in above link.
>
>  ------arch/sh/Kconfig------
>  config ARCH_SUPPORTS_KEXEC
>          def_bool MMU
>
>  config ARCH_SUPPORTS_CRASH_DUMP
>          def_bool BROKEN_ON_SMP
>  ---------------------------
>
> Changes:
> ===========
> 1, split out crash_reserve.c from crash_core.c;
> 2, split out vmcore_info.c from crash_core.c;
> 3, move crash related codes in kexec_core.c into crash_core.c;
> 4, remove dependency of FA_DUMP on CRASH_DUMP;
> 5, clean up kdump related config items;
> 6, wrap up crash codes in crash related ifdefs on all 9 arch-es
>    which support crash dumping;
>
> Achievement:
> ===========
> With above changes, I can rearrange the config item logic as below (the right
> item depends on or is selected by the left item):
>
>     PROC_KCORE -----------> VMCORE_INFO
>
>                |----------> VMCORE_INFO
>     FA_DUMP----|
>                |----------> CRASH_RESERVE

FA_DUMP also needs PROC_VMCORE (CRASH_DUMP by dependency, I guess).
So, the FA_DUMP related changes here will need a relook..

>                                                     ---->VMCORE_INFO
>                                                    /
>                                                    |---->CRASH_RESERVE
>     KEXEC      --|                                /|
>                  |--> KEXEC_CORE--> CRASH_DUMP-->/-|---->PROC_VMCORE
>     KEXEC_FILE --|                               \ |
>                                                    \---->CRASH_HOTPLUG
>
>
>     KEXEC      --|
>                  |--> KEXEC_CORE (for kexec reboot only)
>     KEXEC_FILE --|
>
> Test
> ========
> On all 8 architectures, including x86_64, arm64, s390x, sh, arm, mips,
> riscv, loongarch, I did below three cases of config item setting and
> building all passed. Let me take configs on x86_64 as example here:
>
> (1) Both CONFIG_KEXEC and KEXEC_FILE are unset, then all kexec/kdump
> items are unset automatically:
> # Kexec and crash features
> # CONFIG_KEXEC is not set
> # CONFIG_KEXEC_FILE is not set
> # end of Kexec and crash features
>
> (2) set CONFIG_KEXEC_FILE and 'make olddefconfig':
> ---------------
> # Kexec and crash features
> CONFIG_CRASH_RESERVE=y
> CONFIG_VMCORE_INFO=y
> CONFIG_KEXEC_CORE=y
> CONFIG_KEXEC_FILE=y
> CONFIG_CRASH_DUMP=y
> CONFIG_CRASH_HOTPLUG=y
> CONFIG_CRASH_MAX_MEMORY_RANGES=8192
> # end of Kexec and crash features
> ---------------
>
> (3) unset CONFIG_CRASH_DUMP in case 2 and execute 'make olddefconfig':
> ------------------------
> # Kexec and crash features
> CONFIG_KEXEC_CORE=y
> CONFIG_KEXEC_FILE=y
> # end of Kexec and crash features
> ------------------------
>
> Note:
> For ppc, it needs investigation to make clear how to split out crash
> code in arch folder.

On powerpc, both kdump and fadump need PROC_VMCORE & CRASH_DUMP.
Hope that clears things. So, patch 3/14 breaks things for FA_DUMP..

> Hope Hari and Pingfan can help have a look, see if
> it's doable. Now, I make it either have both kexec and crash enabled, or
> disable both of them altogether.

Sure. I will take a closer look...

Thanks
Hari
On 02/02/24 at 10:53am, Hari Bathini wrote:
> Hi Baoquan,
>
> On 19/01/24 8:22 pm, Baoquan He wrote:
> > Motivation:
> > =============
> > Previously, LKP reported a building error. When investigating, it can't
> > be resolved reasonably with the present messy kdump config items.
> >
> >  https://lore.kernel.org/oe-kbuild-all/202312182200.Ka7MzifQ-lkp@intel.com/
> >
> > The kdump (crash dumping) related config items could cause confusion:
> >
> > Firstly,
> > ---
> > CRASH_CORE enables codes including
> >  - crashkernel reservation;
> >  - elfcorehdr updating;
> >  - vmcoreinfo exporting;
> >  - crash hotplug handling;
> >
> > Now fadump of powerpc, kcore dynamic debugging and kdump all select
> > CRASH_CORE, while
> >  - fadump needs crashkernel parsing, vmcoreinfo exporting, and accessing
> >    global variable 'elfcorehdr_addr';
> >  - kcore only needs vmcoreinfo exporting;
> >  - kdump needs all of the current kernel/crash_core.c.
> >
> > So only enabling PROC_KCORE or FA_DUMP will enable CRASH_CORE, this
> > misleads people into thinking crash dumping is enabled when actually
> > it's not.
> >
> > Secondly,
> > ---
> > It's not reasonable to allow KEXEC_CORE select CRASH_CORE.
> >
> > Because KEXEC_CORE enables codes which allocate control pages, copy
> > kexec/kdump segments, and prepare for switching. These codes are
> > shared by both kexec reboot and kdump. We could want kexec reboot,
> > but disable kdump. In that case, CRASH_CORE should not be selected.
> >
> >  --------------------
> >  CONFIG_CRASH_CORE=y
> >  CONFIG_KEXEC_CORE=y
> >  CONFIG_KEXEC=y
> >  CONFIG_KEXEC_FILE=y
> >  ---------------------
> >
> > Thirdly,
> > ---
> > It's not reasonable to allow CRASH_DUMP select KEXEC_CORE.
> >
> > That could make KEXEC_CORE and CRASH_DUMP be enabled independently from
> > KEXEC or KEXEC_FILE. However, w/o KEXEC or KEXEC_FILE, the KEXEC_CORE
> > code built in doesn't make any sense because no kernel loading or
> > switching will happen to utilize the KEXEC_CORE code.
> >  ---------------------
> >  CONFIG_CRASH_CORE=y
> >  CONFIG_KEXEC_CORE=y
> >  CONFIG_CRASH_DUMP=y
> >  ---------------------
> >
> > In this case, what is worse, on arch sh and arm, KEXEC relies on MMU,
> > while CRASH_DUMP can still be enabled when !MMU, then compiling error is
> > seen as the lkp test robot reported in above link.
> >
> >  ------arch/sh/Kconfig------
> >  config ARCH_SUPPORTS_KEXEC
> >          def_bool MMU
> >
> >  config ARCH_SUPPORTS_CRASH_DUMP
> >          def_bool BROKEN_ON_SMP
> >  ---------------------------
> >
> > Changes:
> > ===========
> > 1, split out crash_reserve.c from crash_core.c;
> > 2, split out vmcore_info.c from crash_core.c;
> > 3, move crash related codes in kexec_core.c into crash_core.c;
> > 4, remove dependency of FA_DUMP on CRASH_DUMP;
> > 5, clean up kdump related config items;
> > 6, wrap up crash codes in crash related ifdefs on all 9 arch-es
> >    which support crash dumping;
> >
> > Achievement:
> > ===========
> > With above changes, I can rearrange the config item logic as below (the right
> > item depends on or is selected by the left item):
> >
> >     PROC_KCORE -----------> VMCORE_INFO
> >
> >                |----------> VMCORE_INFO
> >     FA_DUMP----|
> >                |----------> CRASH_RESERVE
>
> FA_DUMP also needs PROC_VMCORE (CRASH_DUMP by dependency, I guess).
> So, the FA_DUMP related changes here will need a relook..

Thanks for checking this. So FA_DUMP needs vmcoreinfo exporting,
crashkernel reservation and /proc/vmcore. Then it's easy to adjust the
kernel config item of FA_DUMP to make it select CRASH_DUMP. Apart from
this, do you have any concerns about the current code and Kconfig
refactoring?

                          ---->VMCORE_INFO
                         /|
FA_DUMP--> CRASH_DUMP-->/-|---->CRASH_RESERVE
                        \ |
                          \---->PROC_VMCORE

>
> >                                                     ---->VMCORE_INFO
> >                                                    /
> >                                                    |---->CRASH_RESERVE
> >     KEXEC      --|                                /|
> >                  |--> KEXEC_CORE--> CRASH_DUMP-->/-|---->PROC_VMCORE
> >     KEXEC_FILE --|                               \ |
> >                                                    \---->CRASH_HOTPLUG
> >
> >
> >     KEXEC      --|
> >                  |--> KEXEC_CORE (for kexec reboot only)
> >     KEXEC_FILE --|
> >
> > Test
> > ========
> > On all 8 architectures, including x86_64, arm64, s390x, sh, arm, mips,
> > riscv, loongarch, I did below three cases of config item setting and
> > building all passed. Let me take configs on x86_64 as example here:
> >
> > (1) Both CONFIG_KEXEC and KEXEC_FILE are unset, then all kexec/kdump
> > items are unset automatically:
> > # Kexec and crash features
> > # CONFIG_KEXEC is not set
> > # CONFIG_KEXEC_FILE is not set
> > # end of Kexec and crash features
> >
> > (2) set CONFIG_KEXEC_FILE and 'make olddefconfig':
> > ---------------
> > # Kexec and crash features
> > CONFIG_CRASH_RESERVE=y
> > CONFIG_VMCORE_INFO=y
> > CONFIG_KEXEC_CORE=y
> > CONFIG_KEXEC_FILE=y
> > CONFIG_CRASH_DUMP=y
> > CONFIG_CRASH_HOTPLUG=y
> > CONFIG_CRASH_MAX_MEMORY_RANGES=8192
> > # end of Kexec and crash features
> > ---------------
> >
> > (3) unset CONFIG_CRASH_DUMP in case 2 and execute 'make olddefconfig':
> > ------------------------
> > # Kexec and crash features
> > CONFIG_KEXEC_CORE=y
> > CONFIG_KEXEC_FILE=y
> > # end of Kexec and crash features
> > ------------------------
> >
> > Note:
> > For ppc, it needs investigation to make clear how to split out crash
> > code in arch folder.
>
> On powerpc, both kdump and fadump need PROC_VMCORE & CRASH_DUMP.
> Hope that clears things. So, patch 3/14 breaks things for FA_DUMP..

I see it now. We can easily fix that with the patch below. What do you
think?

By the way, do you have a chance to help test these on a powerpc system?
I can find a ppc64le machine, but I don't know how to test fadump.
From fa8e6c3930d4f22f2b3768399c5bf0523c17adde Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@redhat.com>
Date: Sun, 4 Feb 2024 11:06:54 +0800
Subject: [PATCH] power/fadump: make FA_DUMP select CRASH_DUMP
Content-type: text/plain

FA_DUMP which is similar with kdump needs vmcoreinfo exporting,
crashkernel reservation and /proc/vmcore file. After refactoring crash
related codes and Kconfig items, make FA_DUMP select CRASH_DUMP. Now the
dependency layout is like below:

                          ---->VMCORE_INFO
                         /|
FA_DUMP--> CRASH_DUMP-->/-|---->CRASH_RESERVE
                        \ |
                          \---->PROC_VMCORE

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/powerpc/Kconfig | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index f182fb354bef..d5d4c890f010 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -695,8 +695,7 @@ config ARCH_SELECTS_CRASH_DUMP
 config FA_DUMP
 	bool "Firmware-assisted dump"
 	depends on PPC64 && (PPC_RTAS || PPC_POWERNV)
-	select VMCORE_INFO
-	select CRASH_RESERVE
+	select CRASH_DUMP
 	help
 	  A robust mechanism to get reliable kernel crash dump with
 	  assistance from firmware. This approach does not use kexec,
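Note for readers of this archive: the effect of the follow-up patch can be checked with a small reachability sketch. This is only a toy model, not kernel code. It assumes every arrow in the two FA_DUMP diagrams means "enabling the left item makes the right item enabled or available", which glosses over the difference between `select` and `depends on` in real Kconfig. The names `closure`, `missing`, `BEFORE_FIX`, `AFTER_FIX` and `FA_DUMP_NEEDS` are made up for illustration; the required set is taken from the summary in the thread (vmcoreinfo exporting, crashkernel reservation, /proc/vmcore).

```python
# Toy model of the FA_DUMP layouts drawn in this thread (not kernel code).
# Assumption: an arrow A -> B means "enabling A makes B enabled/available".

def closure(edges, start):
    """Return every item reachable from `start` by following arrows."""
    seen = set()
    stack = [start]
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        stack.extend(edges.get(item, ()))
    return seen

# Layout from the cover letter: FA_DUMP points at two items directly.
BEFORE_FIX = {
    "FA_DUMP": ["VMCORE_INFO", "CRASH_RESERVE"],
}

# Layout from the follow-up patch: FA_DUMP goes through CRASH_DUMP.
AFTER_FIX = {
    "FA_DUMP": ["CRASH_DUMP"],
    "CRASH_DUMP": ["VMCORE_INFO", "CRASH_RESERVE", "PROC_VMCORE"],
}

# What fadump needs, per the discussion above.
FA_DUMP_NEEDS = {"VMCORE_INFO", "CRASH_RESERVE", "PROC_VMCORE"}

def missing(edges):
    """Items fadump needs that FA_DUMP does not reach in this layout."""
    return FA_DUMP_NEEDS - closure(edges, "FA_DUMP")

if __name__ == "__main__":
    print("missing before fix:", sorted(missing(BEFORE_FIX)))
    print("missing after fix: ", sorted(missing(AFTER_FIX)))
```

In this model the cover-letter layout leaves PROC_VMCORE unreachable from FA_DUMP, which matches the objection raised in the review, while the layout from the follow-up patch reaches all three items.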
Hi Baoquan,

On 04/02/24 8:56 am, Baoquan He wrote:
>>> Hope Hari and Pingfan can help have a look, see if
>>> it's doable. Now, I make it either have both kexec and crash enabled, or
>>> disable both of them altogether.
>>
>> Sure. I will take a closer look...
> Thanks a lot. Please feel free to post patches to make that, or I can do
> it with your support or suggestion.

Tested your changes and on top of these changes, came up with the below
changes to get it working for powerpc:

https://lore.kernel.org/all/20240213113150.1148276-1-hbathini@linux.ibm.com/

Please take a look.

Thanks
Hari
On 02/21/24 at 11:15am, Hari Bathini wrote:
> Hi Baoquan,
>
> On 04/02/24 8:56 am, Baoquan He wrote:
> > > > Hope Hari and Pingfan can help have a look, see if
> > > > it's doable. Now, I make it either have both kexec and crash enabled, or
> > > > disable both of them altogether.
> > >
> > > Sure. I will take a closer look...
> > Thanks a lot. Please feel free to post patches to make that, or I can do
> > it with your support or suggestion.
>
> Tested your changes and on top of these changes, came up with the below
> changes to get it working for powerpc:
>
>
> https://lore.kernel.org/all/20240213113150.1148276-1-hbathini@linux.ibm.com/
>
> Please take a look.

I added a comment on patch 1 asking whether "struct crash_mem" is
appropriate for covering cases other than kdump memory regions. I am
wondering whether its name needs to be adjusted, or whether the other
kinds of memory you mentioned could use other structures or a new one.
If it has to be done like that, it's fine.
On Wed, 21 Feb 2024 11:15:00 +0530 Hari Bathini <hbathini@linux.ibm.com> wrote:

> On 04/02/24 8:56 am, Baoquan He wrote:
> >>> Hope Hari and Pingfan can help have a look, see if
> >>> it's doable. Now, I make it either have both kexec and crash enabled, or
> >>> disable both of them altogether.
> >>
> >> Sure. I will take a closer look...
> > Thanks a lot. Please feel free to post patches to make that, or I can do
> > it with your support or suggestion.
>
> Tested your changes and on top of these changes, came up with the below
> changes to get it working for powerpc:
>
>
> https://lore.kernel.org/all/20240213113150.1148276-1-hbathini@linux.ibm.com/

So can we take it that you're OK with Baoquan's series as-is?

Baoquan, do you believe the patches in mm-unstable are ready for moving
into mm-stable in preparation for an upstream merge?
On 22/02/24 2:27 am, Andrew Morton wrote:
> On Wed, 21 Feb 2024 11:15:00 +0530 Hari Bathini <hbathini@linux.ibm.com> wrote:
>
>> On 04/02/24 8:56 am, Baoquan He wrote:
>>>>> Hope Hari and Pingfan can help have a look, see if
>>>>> it's doable. Now, I make it either have both kexec and crash enabled, or
>>>>> disable both of them altogether.
>>>>
>>>> Sure. I will take a closer look...
>>> Thanks a lot. Please feel free to post patches to make that, or I can do
>>> it with your support or suggestion.
>>
>> Tested your changes and on top of these changes, came up with the below
>> changes to get it working for powerpc:
>>
>>
>> https://lore.kernel.org/all/20240213113150.1148276-1-hbathini@linux.ibm.com/
>
> So can we take it that you're OK with Baoquan's series as-is?

Hi Andrew,

If you mean

v3 (https://lore.kernel.org/all/20240124051254.67105-1-bhe@redhat.com/)
+
follow-up from Baoquan
(https://lore.kernel.org/all/Zb8D1ASrgX0qVm9z@MiWiFi-R3L-srv/)

Yes.

My changes are based on top of the above patches..

Thanks
Hari
On 02/21/24 at 12:57pm, Andrew Morton wrote:
> On Wed, 21 Feb 2024 11:15:00 +0530 Hari Bathini <hbathini@linux.ibm.com> wrote:
>
> > On 04/02/24 8:56 am, Baoquan He wrote:
> > >>> Hope Hari and Pingfan can help have a look, see if
> > >>> it's doable. Now, I make it either have both kexec and crash enabled, or
> > >>> disable both of them altogether.
> > >>
> > >> Sure. I will take a closer look...
> > > Thanks a lot. Please feel free to post patches to make that, or I can do
> > > it with your support or suggestion.
> >
> > Tested your changes and on top of these changes, came up with the below
> > changes to get it working for powerpc:
> >
> >
> > https://lore.kernel.org/all/20240213113150.1148276-1-hbathini@linux.ibm.com/
>
> So can we take it that you're OK with Baoquan's series as-is?
>
> Baoquan, do you believe the patches in mm-unstable are ready for moving
> into mm-stable in preparation for an upstream merge?

Yeah, I think they are ready to go for merging.

For Hari's patchset, the main part was planned before. I am not
familiar with fadump on powerpc, so the Kconfig fix from Hari, with his
expertise, is a good guarantee. Of course, I will await Hari's comment
on that.
On Thu, 22 Feb 2024 10:47:29 +0530 Hari Bathini <hbathini@linux.ibm.com> wrote:

>
>
> On 22/02/24 2:27 am, Andrew Morton wrote:
> > On Wed, 21 Feb 2024 11:15:00 +0530 Hari Bathini <hbathini@linux.ibm.com> wrote:
> >
> >> On 04/02/24 8:56 am, Baoquan He wrote:
> >>>>> Hope Hari and Pingfan can help have a look, see if
> >>>>> it's doable. Now, I make it either have both kexec and crash enabled, or
> >>>>> disable both of them altogether.
> >>>>
> >>>> Sure. I will take a closer look...
> >>> Thanks a lot. Please feel free to post patches to make that, or I can do
> >>> it with your support or suggestion.
> >>
> >> Tested your changes and on top of these changes, came up with the below
> >> changes to get it working for powerpc:
> >>
> >>
> >> https://lore.kernel.org/all/20240213113150.1148276-1-hbathini@linux.ibm.com/
> >
> > So can we take it that you're OK with Baoquan's series as-is?
>
> Hi Andrew,
>
> If you mean
>
> v3 (https://lore.kernel.org/all/20240124051254.67105-1-bhe@redhat.com/)
> +
> follow-up from Baoquan
> (https://lore.kernel.org/all/Zb8D1ASrgX0qVm9z@MiWiFi-R3L-srv/)
>
> Yes.
>

Can I add your Acked-by: and/or Tested-by: to the patches in this series?
On 23/02/24 2:59 am, Andrew Morton wrote:
> On Thu, 22 Feb 2024 10:47:29 +0530 Hari Bathini <hbathini@linux.ibm.com> wrote:
>
>>
>>
>> On 22/02/24 2:27 am, Andrew Morton wrote:
>>> On Wed, 21 Feb 2024 11:15:00 +0530 Hari Bathini <hbathini@linux.ibm.com> wrote:
>>>
>>>> On 04/02/24 8:56 am, Baoquan He wrote:
>>>>>>> Hope Hari and Pingfan can help have a look, see if
>>>>>>> it's doable. Now, I make it either have both kexec and crash enabled, or
>>>>>>> disable both of them altogether.
>>>>>>
>>>>>> Sure. I will take a closer look...
>>>>> Thanks a lot. Please feel free to post patches to make that, or I can do
>>>>> it with your support or suggestion.
>>>>
>>>> Tested your changes and on top of these changes, came up with the below
>>>> changes to get it working for powerpc:
>>>>
>>>>
>>>> https://lore.kernel.org/all/20240213113150.1148276-1-hbathini@linux.ibm.com/
>>>
>>> So can we take it that you're OK with Baoquan's series as-is?
>>
>> Hi Andrew,
>>
>> If you mean
>>
>> v3 (https://lore.kernel.org/all/20240124051254.67105-1-bhe@redhat.com/)
>> +
>> follow-up from Baoquan
>> (https://lore.kernel.org/all/Zb8D1ASrgX0qVm9z@MiWiFi-R3L-srv/)
>>
>> Yes.
>>
>
> Can I add your Acked-by: and/or Tested-by: to the patches in this series?

Sure, Andrew.

Acked-by: Hari Bathini <hbathini@linux.ibm.com>

for..

Patches 1-5 & 8 in:

https://lore.kernel.org/all/20240124051254.67105-1-bhe@redhat.com/

and this follow-up patch:

https://lore.kernel.org/all/Zb8D1ASrgX0qVm9z@MiWiFi-R3L-srv/

Thanks
Hari