Message ID: 20241025151134.1275575-1-david@redhat.com (mailing list archive)
Series: fs/proc/vmcore: kdump support for virtio-mem on s390
On 10/25/24 at 05:11pm, David Hildenbrand wrote:
> This is based on "[PATCH v3 0/7] virtio-mem: s390 support" [1], which adds
> virtio-mem support on s390.
>
> The only "different than everything else" thing about virtio-mem on s390
> is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
> during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
> crash kernel must detect memory ranges of the crashed/panicked kernel to
> include via PT_LOAD in the vmcore.
>
> On other architectures, all RAM regions (boot + hotplugged) can easily be
> observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
> the elfcore hdr.
>
> On s390, information about "ordinary" memory (heh, "storage") can be
> obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
> that information is stored early during boot in the "physmem" memblock
> data structure.
>
> But virtio-mem memory is always detected by a device driver, which is
> usually built as a module. So in the crash kernel, this memory can only be
> properly detected once the virtio-mem driver has started up.
>
> The virtio-mem driver already supports the "kdump mode", where it won't
> hotplug any memory but instead queries the device to implement the
> pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
> the vmcore.
>
> With this series, if the virtio-mem driver is included in the kdump
> initrd -- which dracut already takes care of under Fedora/RHEL -- it will
> now detect the device RAM ranges on s390 once it probes the devices, to add
> them to the vmcore using the same callback mechanism we already have for
> pfn_is_ram().
>
> To add these device RAM ranges to the vmcore ("patch the vmcore"), we will
> add new PT_LOAD entries that describe these memory ranges, and update
> all offsets and the vmcore size so it is all consistent.
>
> Note that makedumpfile is shaky with v6.12-rcX; I made the "obvious" things
> (e.g., free page detection) work again while testing, as documented in [2].
>
> Creating the dumps using makedumpfile seems to work fine, and the
> dump regions (PT_LOAD) are as expected. I have yet to check in more detail
> whether the created dumps are good (IOW, the right memory was dumped, but it
> looks like makedumpfile reads the right memory when interpreting the
> kernel data structures, which is promising).
>
> Patch #1 -- #6 are vmcore preparations and cleanups

Thanks for CCing me. I will review patches 1-6, the vmcore part, next week.
On 10/25/24 at 05:11pm, David Hildenbrand wrote:
> This is based on "[PATCH v3 0/7] virtio-mem: s390 support" [1], which adds
> virtio-mem support on s390.
>
> The only "different than everything else" thing about virtio-mem on s390
> is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
> during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
> crash kernel must detect memory ranges of the crashed/panicked kernel to
> include via PT_LOAD in the vmcore.
>
> On other architectures, all RAM regions (boot + hotplugged) can easily be
> observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
> the elfcore hdr.
>
> On s390, information about "ordinary" memory (heh, "storage") can be
> obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
> that information is stored early during boot in the "physmem" memblock
> data structure.
>
> But virtio-mem memory is always detected by a device driver, which is
> usually built as a module. So in the crash kernel, this memory can only be
                                       ~~~~~~~~~~~
Is it the 1st kernel or the 2nd kernel?
Usually we call the 1st kernel the panicked/crashed kernel, and the
2nd kernel the kdump kernel.

> properly detected once the virtio-mem driver has started up.
>
> The virtio-mem driver already supports the "kdump mode", where it won't
> hotplug any memory but instead queries the device to implement the
> pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
> the vmcore.
>
> With this series, if the virtio-mem driver is included in the kdump
> initrd -- which dracut already takes care of under Fedora/RHEL -- it will
> now detect the device RAM ranges on s390 once it probes the devices, to add
> them to the vmcore using the same callback mechanism we already have for
> pfn_is_ram().

Do you mean that on s390 the virtio-mem memory regions will be detected and
added to the vmcore in the kdump kernel when the virtio-mem driver is
initialized? I am not sure if I understand it correctly.

> To add these device RAM ranges to the vmcore ("patch the vmcore"), we will
> add new PT_LOAD entries that describe these memory ranges, and update
> all offsets and the vmcore size so it is all consistent.
>
> Note that makedumpfile is shaky with v6.12-rcX; I made the "obvious" things
> (e.g., free page detection) work again while testing, as documented in [2].
>
> Creating the dumps using makedumpfile seems to work fine, and the
> dump regions (PT_LOAD) are as expected. I have yet to check in more detail
> whether the created dumps are good (IOW, the right memory was dumped, but it
> looks like makedumpfile reads the right memory when interpreting the
> kernel data structures, which is promising).
>
> Patch #1 -- #6 are vmcore preparations and cleanups
> Patch #7 adds the infrastructure for drivers to report device RAM
> Patch #8 + #9 are virtio-mem preparations
> Patch #10 implements virtio-mem support to report device RAM
> Patch #11 activates it for s390, implementing a new function to fill
> PT_LOAD entry for device RAM
>
> [1] https://lkml.kernel.org/r/20241025141453.1210600-1-david@redhat.com
> [2] https://github.com/makedumpfile/makedumpfile/issues/16
>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Alexander Gordeev <agordeev@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Sven Schnelle <svens@linux.ibm.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Cc: "Eugenio Pérez" <eperezma@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Cc: Janosch Frank <frankja@linux.ibm.com>
> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Cc: Eric Farman <farman@linux.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
>
> David Hildenbrand (11):
>   fs/proc/vmcore: convert vmcore_cb_lock into vmcore_mutex
>   fs/proc/vmcore: replace vmcoredd_mutex by vmcore_mutex
>   fs/proc/vmcore: disallow vmcore modifications after the vmcore was
>     opened
>   fs/proc/vmcore: move vmcore definitions from kcore.h to crash_dump.h
>   fs/proc/vmcore: factor out allocating a vmcore memory node
>   fs/proc/vmcore: factor out freeing a list of vmcore ranges
>   fs/proc/vmcore: introduce PROC_VMCORE_DEVICE_RAM to detect device RAM
>     ranges in 2nd kernel
>   virtio-mem: mark device ready before registering callbacks in kdump
>     mode
>   virtio-mem: remember usable region size
>   virtio-mem: support CONFIG_PROC_VMCORE_DEVICE_RAM
>   s390/kdump: virtio-mem kdump support (CONFIG_PROC_VMCORE_DEVICE_RAM)
>
>  arch/s390/Kconfig             |   1 +
>  arch/s390/kernel/crash_dump.c |  39 +++--
>  drivers/virtio/Kconfig        |   1 +
>  drivers/virtio/virtio_mem.c   | 103 +++++++++++++-
>  fs/proc/Kconfig               |  25 ++++
>  fs/proc/vmcore.c              | 258 +++++++++++++++++++++++++---------
>  include/linux/crash_dump.h    |  47 +++++++
>  include/linux/kcore.h         |  13 --
>  8 files changed, 396 insertions(+), 91 deletions(-)
>
> --
> 2.46.1
On 15.11.24 09:46, Baoquan He wrote:
> On 10/25/24 at 05:11pm, David Hildenbrand wrote:
>> This is based on "[PATCH v3 0/7] virtio-mem: s390 support" [1], which adds
>> virtio-mem support on s390.
>>
>> The only "different than everything else" thing about virtio-mem on s390
>> is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
>> during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
>> crash kernel must detect memory ranges of the crashed/panicked kernel to
>> include via PT_LOAD in the vmcore.
>>
>> On other architectures, all RAM regions (boot + hotplugged) can easily be
>> observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
>> the elfcore hdr.
>>
>> On s390, information about "ordinary" memory (heh, "storage") can be
>> obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
>> that information is stored early during boot in the "physmem" memblock
>> data structure.
>>
>> But virtio-mem memory is always detected by a device driver, which is
>> usually built as a module. So in the crash kernel, this memory can only be
>                                         ~~~~~~~~~~~
> Is it the 1st kernel or the 2nd kernel?
> Usually we call the 1st kernel the panicked/crashed kernel, and the
> 2nd kernel the kdump kernel.

It should have been called "kdump (2nd) kernel" here indeed.

>> properly detected once the virtio-mem driver has started up.
>>
>> The virtio-mem driver already supports the "kdump mode", where it won't
>> hotplug any memory but instead queries the device to implement the
>> pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
>> the vmcore.
>>
>> With this series, if the virtio-mem driver is included in the kdump
>> initrd -- which dracut already takes care of under Fedora/RHEL -- it will
>> now detect the device RAM ranges on s390 once it probes the devices, to add
>> them to the vmcore using the same callback mechanism we already have for
>> pfn_is_ram().
>
> Do you mean that on s390 the virtio-mem memory regions will be detected and
> added to the vmcore in the kdump kernel when the virtio-mem driver is
> initialized? I am not sure if I understand it correctly.

Yes, exactly. In the kdump kernel, the driver gets probed and registers the
vmcore callbacks. From there, we detect and add the device regions.

Thanks!
On 11/15/24 at 09:55am, David Hildenbrand wrote:
> On 15.11.24 09:46, Baoquan He wrote:
> > On 10/25/24 at 05:11pm, David Hildenbrand wrote:
> > > This is based on "[PATCH v3 0/7] virtio-mem: s390 support" [1], which adds
> > > virtio-mem support on s390.
> > >
> > > The only "different than everything else" thing about virtio-mem on s390
> > > is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
> > > during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
> > > crash kernel must detect memory ranges of the crashed/panicked kernel to
> > > include via PT_LOAD in the vmcore.
> > >
> > > On other architectures, all RAM regions (boot + hotplugged) can easily be
> > > observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
> > > the elfcore hdr.
> > >
> > > On s390, information about "ordinary" memory (heh, "storage") can be
> > > obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
> > > that information is stored early during boot in the "physmem" memblock
> > > data structure.
> > >
> > > But virtio-mem memory is always detected by a device driver, which is
> > > usually built as a module. So in the crash kernel, this memory can only be
> >                                           ~~~~~~~~~~~
> > Is it the 1st kernel or the 2nd kernel?
> > Usually we call the 1st kernel the panicked/crashed kernel, and the
> > 2nd kernel the kdump kernel.
>
> It should have been called "kdump (2nd) kernel" here indeed.
>
> > > properly detected once the virtio-mem driver has started up.
> > >
> > > The virtio-mem driver already supports the "kdump mode", where it won't
> > > hotplug any memory but instead queries the device to implement the
> > > pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
> > > the vmcore.
> > >
> > > With this series, if the virtio-mem driver is included in the kdump
> > > initrd -- which dracut already takes care of under Fedora/RHEL -- it will
> > > now detect the device RAM ranges on s390 once it probes the devices, to add
> > > them to the vmcore using the same callback mechanism we already have for
> > > pfn_is_ram().
> >
> > Do you mean that on s390 the virtio-mem memory regions will be detected and
> > added to the vmcore in the kdump kernel when the virtio-mem driver is
> > initialized? I am not sure if I understand it correctly.
>
> Yes, exactly. In the kdump kernel, the driver gets probed and registers the
> vmcore callbacks. From there, we detect and add the device regions.

I see now, thanks for your confirmation.