Message ID | 52CD6E33.400@redhat.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: [..] > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > [ 1.605045] PCI host bridge to bus 0000:ff > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > [ 1.768096] Call Trace: [348/1928] > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > [ 1.783076] [<ffffffff810bc28a>] ? on_each_cpu_mask+0x2a/0x60 > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > [ 1.869616] [<ffffffff81303b14>] ? 
acpi_bus_type_and_status+0x90/0x90
> [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90
> [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5
> [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d
> [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160
> [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6

So basically acpi thinks that some memory block is a hot plug memory and tries to add it. And that consumes lots of memory and we don't have that memory in second kernel.

For this reason, we pass a custom E820 map to the second kernel so that it only initializes page tables and the memmap array for a very small physical memory range.

Now the question is what hot plug memory is. In this case we have not physically plugged in any memory. So why is ACPI treating this memory as a hot-add operation? Are there memory hotplug slots, and are these ranges always considered hot-added memory? IOW, what if I hotplug memory and then reboot the system? Will the new E820 map contain this new memory range or not?

I guess the simplest way to solve this problem might be to disable memory hotplug in the kdump kernel. Is there any command line parameter to do that?

If we disable memory hotplug in the second kernel, and hot plug memory is passed in the E820 map, will it still work? Can I access that memory in the second kernel?

Thanks
Vivek

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
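For a sense of scale, the numbers in the log can be checked with a little arithmetic. The sketch below is only an illustration and rests on assumptions that are not stated in the thread: 4 KiB base pages, a 64-byte struct page, and 128 MiB sparse-memory sections, which are typical for x86_64 but should be verified for the kernel in question. The helper names are made up for this example.

```c
#include <stdint.h>

#define PAGE_BYTES        4096ULL          /* assumed base page size */
#define STRUCT_PAGE_BYTES 64ULL            /* assumed sizeof(struct page) */
#define SECTION_BYTES     (128ULL << 20)   /* assumed sparsemem section size */

/* Bytes of memmap needed to describe the inclusive range [start, end]. */
static uint64_t memmap_bytes(uint64_t start, uint64_t end)
{
    uint64_t pages = (end - start + 1) / PAGE_BYTES;
    return pages * STRUCT_PAGE_BYTES;
}

/* Size in bytes of one page-allocator request of the given order. */
static uint64_t order_bytes(unsigned int order)
{
    return PAGE_BYTES << order;
}
```

Under these assumptions, memmap_bytes(0x100000000, 0x87fffffff) comes to 480 MiB for the 30 GiB range reported by init_memory_mapping, and one 128 MiB section needs a single 2 MiB block, which is the same size as the order:9 request that fails in vmemmap_alloc_block().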
On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote: > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: > > [..] > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > > [ 1.605045] PCI host bridge to bus 0000:ff > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > > [ 1.768096] Call Trace: [348/1928] > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > > [ 1.783076] [<ffffffff810bc28a>] ? 
on_each_cpu_mask+0x2a/0x60 > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5 > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160 > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6 > > So basically acpi thinks that some memory block is a hot plug memory > and tries to add it. And that consumes lots of memory and we don't have > that memory in second kernel. That's not exactly the case. What seems to happen is that there is an ACPI memory object in the ACPI namespace and the ACPI memory hotplug driver attempts to bind to it. That driver attempts to find removable memory blocks associated with that object and to add them to the memory map. Why don't you simply append acpi=off to the kexec command line? That should make the problem go away. Thanks!
On Thu, 2014-01-09 at 00:07 +0100, Rafael J. Wysocki wrote: > On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote: > > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: > > > > [..] > > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > > > [ 1.605045] PCI host bridge to bus 0000:ff > > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > > > [ 1.768096] Call Trace: [348/1928] > > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > > > [ 1.783076] [<ffffffff810bc28a>] ? 
on_each_cpu_mask+0x2a/0x60 > > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5 > > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d > > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160 > > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6 > > > > So basically acpi thinks that some memory block is a hot plug memory > > and tries to add it. And that consumes lots of memory and we don't have > > that memory in second kernel. > > That's not exactly the case. What seems to happen is that there is an ACPI > memory object in the ACPI namespace and the ACPI memory hotplug driver > attempts to bind to it. That driver attempts to find removable memory blocks > associated with that object and to add them to the memory map. > > Why don't you simply append acpi=off to the kexec command line? That should > make the problem go away. Yes, that should work, but Baoquan's approach makes sense to me. 
When memmap=exactmap is specified, the kernel should ignore any memory information from the firmware.

Thanks,
-Toshi
On 01/09/14 at 12:07am, Rafael J. Wysocki wrote: > On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote: > > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: > > > > [..] > > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > > > [ 1.605045] PCI host bridge to bus 0000:ff > > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > > > [ 1.768096] Call Trace: [348/1928] > > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > > > [ 1.783076] [<ffffffff810bc28a>] ? 
on_each_cpu_mask+0x2a/0x60 > > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5 > > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d > > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160 > > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6 > > > > So basically acpi thinks that some memory block is a hot plug memory > > and tries to add it. And that consumes lots of memory and we don't have > > that memory in second kernel. > > That's not exactly the case. What seems to happen is that there is an ACPI > memory object in the ACPI namespace and the ACPI memory hotplug driver > attempts to bind to it. That driver attempts to find removable memory blocks > associated with that object and to add them to the memory map. Yeah, since kdump kernel will detect rsdp for legacy machine in the first 1K of the EBDA or between E0000 and FFFFF. > > Why don't you simply append acpi=off to the kexec command line? That should > make the problem go away. 
acpi=off doesn't work: the kdump kernel hangs immediately after the crash is triggered. Because the kdump kernel needs ACPI information, we can't disable it.
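The legacy RSDP lookup mentioned above (first 1K of the EBDA, then E0000 to FFFFF) can be sketched roughly as follows. This is a simplified userspace illustration, not the kernel's code: it assumes the caller has already copied the region into a buffer, and it only validates the 20-byte ACPI 1.0 structure (the "RSD PTR " signature on a 16-byte boundary, with all 20 bytes summing to zero). The function name find_rsdp is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Scan buf on 16-byte boundaries for an ACPI 1.0 RSDP.
 * Returns the offset of the first candidate whose signature matches
 * and whose first 20 bytes sum to zero (mod 256), or -1 if none.
 */
static long find_rsdp(const uint8_t *buf, size_t len)
{
    for (size_t off = 0; off + 20 <= len; off += 16) {
        if (memcmp(buf + off, "RSD PTR ", 8) != 0)
            continue;

        uint8_t sum = 0;
        for (size_t i = 0; i < 20; i++)
            sum += buf[off + i];

        if (sum == 0)
            return (long)off;
    }
    return -1;
}
```

Because this search reads firmware memory directly, the second kernel finds the same ACPI tables, including any memory device objects, regardless of the memory map passed on its command line.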
On Wednesday, January 08, 2014 05:11:48 PM Toshi Kani wrote: > On Thu, 2014-01-09 at 00:07 +0100, Rafael J. Wysocki wrote: > > On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote: > > > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: > > > > > > [..] > > > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > > > > [ 1.605045] PCI host bridge to bus 0000:ff > > > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > > > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > > > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > > > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > > > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > > > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > > > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > > > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > > > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > > > > [ 1.768096] Call Trace: [348/1928] > > > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > > > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > > > > [ 1.783076] [<ffffffff810bc28a>] ? 
on_each_cpu_mask+0x2a/0x60 > > > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > > > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > > > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > > > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > > > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > > > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > > > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > > > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > > > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > > > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > > > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > > > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > > > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5 > > > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d > > > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160 > > > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6 > > > > > > So basically acpi thinks that some memory block is a hot plug memory > > > and tries to add it. And that consumes lots of memory and we don't have > > > that memory in second kernel. > > > > That's not exactly the case. What seems to happen is that there is an ACPI > > memory object in the ACPI namespace and the ACPI memory hotplug driver > > attempts to bind to it. That driver attempts to find removable memory blocks > > associated with that object and to add them to the memory map. > > > > Why don't you simply append acpi=off to the kexec command line? That should > > make the problem go away. > > Yes, that should work, but Baoquan's approach makes sense to me. 
When
> memmap=exactmap is specified, the kernel should ignore any memory
> information from the firmware.

OK

Baoquan, please modify your patch to get rid of the #ifdef CONFIG_X86 in acpi_memory_hotplug_init(). For example, you can add a function returning true if use_exactmap is set and false otherwise and make acpi_memory_hotplug_init() call that function. Alternatively, you can define arch-independent no_memory_hotplug (instead of use_exactmap) and set it for memmap=exactmap.

Thanks!
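The first suggestion could look roughly like the userspace mock below. It is a sketch of the shape of the change only, not the actual patch: arch_memmap_is_exact(), parse_memmap_arg() and acpi_memory_hotplug_init_sketch() are hypothetical names, and the real acpi_memory_hotplug_init() registers a scan handler rather than returning a flag.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for the x86 flag from the patch under review. */
static bool use_exactmap;

/* Hypothetical arch-side parsing of the value of one "memmap=" argument. */
static void parse_memmap_arg(const char *value)
{
    if (strcmp(value, "exactmap") == 0)
        use_exactmap = true;
}

/*
 * Hypothetical accessor. Architectures without memmap=exactmap would
 * provide a stub that always returns false, so the ACPI code needs
 * no #ifdef CONFIG_X86.
 */
static bool arch_memmap_is_exact(void)
{
    return use_exactmap;
}

/*
 * Mock of the init path: returns true if the memory hotplug handler
 * would be registered, false if registration is skipped.
 */
static bool acpi_memory_hotplug_init_sketch(void)
{
    if (arch_memmap_is_exact())
        return false;   /* user supplied the map; ignore firmware memory devices */
    return true;
}
```

Hiding the flag behind a function keeps the architecture-specific detail out of the ACPI driver, which is the point of dropping the #ifdef.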
On Thu, Jan 09, 2014 at 12:07:17AM +0100, Rafael J. Wysocki wrote: > On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote: > > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: > > > > [..] > > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > > > [ 1.605045] PCI host bridge to bus 0000:ff > > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > > > [ 1.768096] Call Trace: [348/1928] > > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > > > [ 1.783076] [<ffffffff810bc28a>] ? 
on_each_cpu_mask+0x2a/0x60 > > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5 > > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d > > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160 > > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6 > > > > So basically acpi thinks that some memory block is a hot plug memory > > and tries to add it. And that consumes lots of memory and we don't have > > that memory in second kernel. > > That's not exactly the case. What seems to happen is that there is an ACPI > memory object in the ACPI namespace and the ACPI memory hotplug driver > attempts to bind to it. That driver attempts to find removable memory blocks > associated with that object and to add them to the memory map. > > Why don't you simply append acpi=off to the kexec command line? That should > make the problem go away. I think we need to initialize acpi because we rely on it for other tables and things. 
In fact, everything in the second kernel re-initializes, so why should ACPI be an exception? We want the second kernel's boot path to be as close as possible to the first kernel's so that the chances of a successful boot are higher. So I don't think turning off ACPI is the way to go here.

The key question is why this memory is still being considered hotplugged memory when nothing has been hotplugged. I think ACPI should not treat this memory as hotplug memory. And if ACPI does not have a way to figure that out, then disabling memory hotplug functionality makes sense to me.

Thanks
Vivek
On Wed, Jan 08, 2014 at 05:11:48PM -0700, Toshi Kani wrote: > On Thu, 2014-01-09 at 00:07 +0100, Rafael J. Wysocki wrote: > > On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote: > > > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: > > > > > > [..] > > > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > > > > [ 1.605045] PCI host bridge to bus 0000:ff > > > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > > > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > > > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > > > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > > > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > > > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > > > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > > > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > > > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > > > > [ 1.768096] Call Trace: [348/1928] > > > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > > > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > > > > [ 1.783076] [<ffffffff810bc28a>] ? 
on_each_cpu_mask+0x2a/0x60 > > > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > > > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > > > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > > > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > > > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > > > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > > > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > > > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > > > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > > > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > > > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > > > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > > > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5 > > > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d > > > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160 > > > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6 > > > > > > So basically acpi thinks that some memory block is a hot plug memory > > > and tries to add it. And that consumes lots of memory and we don't have > > > that memory in second kernel. > > > > That's not exactly the case. What seems to happen is that there is an ACPI > > memory object in the ACPI namespace and the ACPI memory hotplug driver > > attempts to bind to it. That driver attempts to find removable memory blocks > > associated with that object and to add them to the memory map. > > > > Why don't you simply append acpi=off to the kexec command line? That should > > make the problem go away. > > Yes, that should work, but Baoquan's approach makes sense to me. 
When
> memmap=exactmap is specified, the kernel should ignore any memory
> information from the firmware.

memmap=exactmap only applies to the E820 map. It does not say that memory cannot be hotplugged later. So to me, specifying exactmap does not imply that memory hotplugging is disabled.

IMO, it makes sense to have a separate knob to disable memory hotplug behavior.

Also, from the kdump point of view, I don't want to rely on exactmap, as in the new implementation I am planning to move away from it. I will pass the new memory map in bootparams and stop passing it on the command line.

Thanks
Vivek
On Thu, Jan 09, 2014 at 02:10:26PM +0100, Rafael J. Wysocki wrote: > On Wednesday, January 08, 2014 05:11:48 PM Toshi Kani wrote: > > On Thu, 2014-01-09 at 00:07 +0100, Rafael J. Wysocki wrote: > > > On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote: > > > > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: > > > > > > > > [..] > > > > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > > > > > [ 1.605045] PCI host bridge to bus 0000:ff > > > > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > > > > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > > > > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > > > > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > > > > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > > > > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > > > > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > > > > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > > > > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > > > > > [ 1.768096] Call Trace: [348/1928] > > > > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > > > > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > > > > > [ 1.783076] [<ffffffff810bc28a>] ? 
on_each_cpu_mask+0x2a/0x60 > > > > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > > > > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > > > > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > > > > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > > > > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > > > > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > > > > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > > > > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > > > > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > > > > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > > > > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > > > > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > > > > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5 > > > > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d > > > > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160 > > > > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6 > > > > > > > > So basically acpi thinks that some memory block is a hot plug memory > > > > and tries to add it. And that consumes lots of memory and we don't have > > > > that memory in second kernel. > > > > > > That's not exactly the case. What seems to happen is that there is an ACPI > > > memory object in the ACPI namespace and the ACPI memory hotplug driver > > > attempts to bind to it. That driver attempts to find removable memory blocks > > > associated with that object and to add them to the memory map. > > > > > > Why don't you simply append acpi=off to the kexec command line? That should > > > make the problem go away. 
> >
> > Yes, that should work, but Baoquan's approach makes sense to me. When
> > memmap=exactmap is specified, the kernel should ignore any memory
> > information from the firmware.
>
> OK
>
> Baoquan, please modify your patch to get rid of the #ifdef CONFIG_X86 in
> acpi_memory_hotplug_init(). For example, you can add a function returning true
> if use_exactmap is set and false otherwise and make acpi_memory_hotplug_init()
> call that function. Alternatively, you can define arch-independent
> no_memory_hotplug (instead of use_exactmap) and set it for memmap=exactmap.
>

Prarit sent a patch to introduce a no_memory_hotplug command line parameter.

I still think that memmap=exactmap does not necessarily mean that memory hotplug is disabled. What about the mem= parameter? If somebody specifies mem=1G, should that mean there cannot be any hotplugged memory?

I think we should at least define a new command line parameter to disable memory hotplug. After that, users can specify both memmap=exactmap and "no_mem_hotplug" on the command line and control the behavior of the kernel.

Thanks
Vivek
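A separate knob of this kind might behave like the sketch below. It is a plain userspace illustration: the kernel has its own parameter-parsing machinery, the helper names are invented for this example, and the flag spelling "no_mem_hotplug" is simply the one used in this message, not a confirmed kernel parameter.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if flag appears in cmdline as a whole space-separated word. */
static bool cmdline_has_flag(const char *cmdline, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = cmdline;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == cmdline) || (p[-1] == ' ');
        bool ends = (p[n] == '\0') || (p[n] == ' ');

        if (starts && ends)
            return true;
        p += n;
    }
    return false;
}

/* Hotplug stays enabled unless the user explicitly opts out. */
static bool memory_hotplug_allowed(const char *cmdline)
{
    return !cmdline_has_flag(cmdline, "no_mem_hotplug");
}
```

With this split, memmap=exactmap and mem= only shape the boot-time memory map, and the hotplug decision is made by the dedicated flag alone.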
On Thu, 2014-01-09 at 09:50 -0500, Vivek Goyal wrote: > On Wed, Jan 08, 2014 at 05:11:48PM -0700, Toshi Kani wrote: > > On Thu, 2014-01-09 at 00:07 +0100, Rafael J. Wysocki wrote: > > > On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote: > > > > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: > > > > > > > > [..] > > > > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > > > > > [ 1.605045] PCI host bridge to bus 0000:ff > > > > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > > > > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > > > > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > > > > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > > > > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > > > > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > > > > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > > > > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > > > > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > > > > > [ 1.768096] Call Trace: [348/1928] > > > > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > > > > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > > > > > [ 1.783076] [<ffffffff810bc28a>] ? 
on_each_cpu_mask+0x2a/0x60 > > > > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > > > > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > > > > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > > > > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > > > > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > > > > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > > > > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > > > > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > > > > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > > > > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > > > > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > > > > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > > > > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5 > > > > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d > > > > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160 > > > > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6 > > > > > > > > So basically acpi thinks that some memory block is a hot plug memory > > > > and tries to add it. And that consumes lots of memory and we don't have > > > > that memory in second kernel. > > > > > > That's not exactly the case. What seems to happen is that there is an ACPI > > > memory object in the ACPI namespace and the ACPI memory hotplug driver > > > attempts to bind to it. That driver attempts to find removable memory blocks > > > associated with that object and to add them to the memory map. > > > > > > Why don't you simply append acpi=off to the kexec command line? That should > > > make the problem go away. 
> > > > Yes, that should work, but Baoquan's approach makes sense to me. When > > memmap=exactmap is specified, the kernel should ignore any memory > > information from the firmware. > > memmap=exactmap is only for E820 map. It does not say that later memory > can not be hotplugged. So to me specifying exactmap does not imply that > memory hotplugging is disabled. There are multiple ways to describe memory range info in the firmware: e820, the EFI memory descriptor table, and ACPI memory device objects. They basically provide the same info. This problem happens when the firmware implements ACPI memory device objects. These objects are necessary to support memory hotplug, but their presence does not mean that the system supports hotplug. They are optional objects that firmware vendors may choose to implement. While the exactmap option does not imply that memory hotplug is disabled, it does require that the kernel only consumes user-supplied memory range information. Hence, Baoquan's approach makes sense to me. > IMO, it makes sense to have a separate knob to disable memory hotplug > behavior. Regular users do not know if their systems implement ACPI memory device objects or not. So, asking users to specify a separate option when their systems implement ACPI memory objects is tricky, IMO. > Also from kdump point of view, I don't want to rely on exactmap as in > new implementation I am planning to move away from exactmap. I will > pass new memory map in bootparams and stop passing it on command line. I think we still need a flag that indicates the kernel can only consume the new memory map in bootparams, and cannot obtain one from the firmware. Thanks, -Toshi
On Thu, 2014-01-09 at 09:53 -0500, Vivek Goyal wrote: > On Thu, Jan 09, 2014 at 02:10:26PM +0100, Rafael J. Wysocki wrote: > > On Wednesday, January 08, 2014 05:11:48 PM Toshi Kani wrote: > > > On Thu, 2014-01-09 at 00:07 +0100, Rafael J. Wysocki wrote: > > > > On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote: > > > > > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote: > > > > > > > > > > [..] > > > > > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. > > > > > > [ 1.605045] PCI host bridge to bus 0000:ff > > > > > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff] > > > > > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added > > > > > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff] > > > > > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0 > > > > > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1 > > > > > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010 > > > > > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950 > > > > > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001 > > > > > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200 > > > > > > [ 1.768096] Call Trace: [348/1928] > > > > > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b > > > > > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160 > > > > > > [ 1.783076] [<ffffffff810bc28a>] ? 
on_each_cpu_mask+0x2a/0x60 > > > > > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00 > > > > > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba > > > > > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b > > > > > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b > > > > > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35 > > > > > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185 > > > > > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240 > > > > > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0 > > > > > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0 > > > > > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d > > > > > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd > > > > > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f > > > > > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90 > > > > > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5 > > > > > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d > > > > > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160 > > > > > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6 > > > > > > > > > > So basically acpi thinks that some memory block is a hot plug memory > > > > > and tries to add it. And that consumes lots of memory and we don't have > > > > > that memory in second kernel. > > > > > > > > That's not exactly the case. What seems to happen is that there is an ACPI > > > > memory object in the ACPI namespace and the ACPI memory hotplug driver > > > > attempts to bind to it. That driver attempts to find removable memory blocks > > > > associated with that object and to add them to the memory map. > > > > > > > > Why don't you simply append acpi=off to the kexec command line? 
That should > > > > make the problem go away. > > > > > > Yes, that should work, but Baoquan's approach makes sense to me. When > > > memmap=exactmap is specified, the kernel should ignore any memory > > > information from the firmware. > > > > OK > > > > Baoquan, please modify your patch to get rid of the #ifdef CONFIG_X86 in > > acpi_memory_hotplug_init(). For example, you can add a function returning true > > if use_exactmap is set and false otherwise and make acpi_memory_hotplug_init() > > call that function. Alternatively, you can define arch-independent > > no_memory_hotplug (instead of use_exactmap) and set if for memmap=exactmap. > > > > Prarit sent a patch to introduce no_memory_hotplug command line. I still > think that memmap=exactmap does not necessarily mean that memory hotplug > is disabled. > > What about mem= parameter. If somebody specifies mem=1G, should that mean > there can not be any hotplugged memory. Good point. Yes, I think we need to ignore ACPI memory objects in this case as well. I suppose the use of this option is limited to specific test purposes, so disabling memory hotplug is not a big issue here. Thanks, -Toshi
On Thu, Jan 09, 2014 at 09:03:59AM -0700, Toshi Kani wrote: [..] > > > > > So basically acpi thinks that some memory block is a hot plug memory > > > > > and tries to add it. And that consumes lots of memory and we don't have > > > > > that memory in second kernel. > > > > > > > > That's not exactly the case. What seems to happen is that there is an ACPI > > > > memory object in the ACPI namespace and the ACPI memory hotplug driver > > > > attempts to bind to it. That driver attempts to find removable memory blocks > > > > associated with that object and to add them to the memory map. > > > > > > > > Why don't you simply append acpi=off to the kexec command line? That should > > > > make the problem go away. > > > > > > Yes, that should work, but Baoquan's approach makes sense to me. When > > > memmap=exactmap is specified, the kernel should ignore any memory > > > information from the firmware. > > > > memmap=exactmap is only for E820 map. It does not say that later memory > > can not be hotplugged. So to me specifying exactmap does not imply that > > memory hotplugging is disabled. > > There are multiple ways to describe memory range info in the firmware; > e820, EFI memory descriptor table, and ACPI memory device objects. They > basically provide the same info. So ACPI memory device objects contain all the memory ranges as exported in E820? > > This problem happens when the firmware implements ACPI memory device > objects, which are necessary to support memory hotplug, but do not mean > that the system always supports hotplug when they exist. They are > optional objects that firmware vendors may choose to implement. This is confusing. So even if memory hotplug is not supported, ACPI memory device objects might be present. What's the purpose? How do they help? If they represent the same info that the firmware provided early via a BIOS call (the E820 map), then how does the system later avoid adding the same memory ranges? IOW, in terms of design, what's the objective?
Why create this additional path for getting memory information? > > While the exactmap option does not imply that memory hotplug is > disabled, But Bao's approach will disable memory hotplug on exactmap. > it does require that the kernel only consumes user-supplied > memory range information. Hence, Baoquan's approach makes sense to me. I am fine with this as long as memmap=exactmap is not the only way to disable memory hotplug. I need another way too so that users who are not using exactmap can still disable memory hotplug. > > > IMO, it makes sense to have a separate knob to disable memory hotplug > > behavior. > > Regular users do not know if their systems implement ACPI memory device > objects or not. So, asking users to specify a separate option when > their systems implement ACPI memory objects is tricky, IMO. They can always specify no_memory_hotplug, irrespective of whether the kernel supports memory hotplug or not. Anyway, I don't mind if one implicitly disables memory hotplug if memmap=exactmap or mem=X is specified. It is just a matter of figuring out what the more intuitive behavior is from the user's point of view. But I do want a separate path to disable memory hotplug so that even if I am not using memmap=exactmap or mem=X, I should be able to disable memory hotplug. > > > Also from kdump point of view, I don't want to rely on exactmap as in > > new implementation I am planning to move away from exactmap. I will > > pass new memory map in bootparams and stop passing it on command line. > > I think we still need a flag that indicates the kernel can only consume > the new memory map in bootparams, and cannot to obtain from the > firmware. I think creating a new command line option is simpler than creating a new flag in bootparam which in turn disables memory hotplug. More users can use that option. For example, if for some reason the hotplug code is crashing, one can just disable it on the command line as a workaround and move on.
Thanks Vivek
On Thu, 2014-01-09 at 11:24 -0500, Vivek Goyal wrote: > On Thu, Jan 09, 2014 at 09:03:59AM -0700, Toshi Kani wrote: > > [..] > > > > > > So basically acpi thinks that some memory block is a hot plug memory > > > > > > and tries to add it. And that consumes lots of memory and we don't have > > > > > > that memory in second kernel. > > > > > > > > > > That's not exactly the case. What seems to happen is that there is an ACPI > > > > > memory object in the ACPI namespace and the ACPI memory hotplug driver > > > > > attempts to bind to it. That driver attempts to find removable memory blocks > > > > > associated with that object and to add them to the memory map. > > > > > > > > > > Why don't you simply append acpi=off to the kexec command line? That should > > > > > make the problem go away. > > > > > > > > Yes, that should work, but Baoquan's approach makes sense to me. When > > > > memmap=exactmap is specified, the kernel should ignore any memory > > > > information from the firmware. > > > > > > memmap=exactmap is only for E820 map. It does not say that later memory > > > can not be hotplugged. So to me specifying exactmap does not imply that > > > memory hotplugging is disabled. > > > > There are multiple ways to describe memory range info in the firmware; > > e820, EFI memory descriptor table, and ACPI memory device objects. They > > basically provide the same info. > > So ACPI memory device objects contain all the memory ranges as exported > in E820? Yes. (Some vendors might choose to implement some portion of memory with memory objects, but I think they are special cases.) > > This problem happens when the firmware implements ACPI memory device > > objects, which are necessary to support memory hotplug, but do not mean > > that the system always supports hotplug when they exist. They are > > optional objects that firmware vendors may choose to implement. > > This is confusing. 
So even if memory hotplug is not supported, ACPI memory > device objects might be present. What's the purpose? How do they help. They do not help at this point, but memory objects can be present without hotplug support. There is nothing wrong with that per the spec. > If they represent same info as firmware provided using a BIOS call early > (E820 map), then how does system later avoid adding same memory ranges. The kernel attempts to add them, but the add fails with -EEXIST because the range is already there. > IOW, in terms of design, what's the objective. Why to create this > additional path of getting memory information. To support memory hot-remove requests, ACPI objects need to be initialized for the existing memory ranges beforehand. > > While the exactmap option does not imply that memory hotplug is > > disabled, > > But Bao's approach will disable memory hotplug on exactmap. Right, but it does not seem worthwhile to add complexity to support memory hotplug and exactmap at the same time. > > it does require that the kernel only consumes user-supplied > > memory range information. Hence, Baoquan's approach makes sense to me. > > I am fine with this as long as memmap=exactmap is not the only way to > disable memory hotplug. I need another way too so that users who are > not using exactmap can still disable memory hotplug. There is a config option to enable/disable memory hotplug. You are right that the exactmap option is not the way to disable memory hotplug. This option requests the kernel to use user-supplied memory ranges only, so memory hotplug will not be supported under this constraint. > > > IMO, it makes sense to have a separate knob to disable memory hotplug > > > behavior. > > > > Regular users do not know if their systems implement ACPI memory device > > objects or not. So, asking users to specify a separate option when > > their systems implement ACPI memory objects is tricky, IMO.
> > They can always specify no_memory_hotplug, irrespective of the fact that > kernel supports memory hotplug or not. > > Anyway, I don't mind if one implicitly disables memory hotplug if > memmap=exactmap or mem=X is specified. It is just a matter of figuring > how what should be a more intutive behavior from user's point of view. Since memory hotplug won't work under the constraint of the exactmap option, it seems natural to disable it. > But I do want a separate path to disable memory hotplug so that even > if I am not using memmap=exactmap or mem=X, I should be able to disable > memory hotplug. I think this is a separate topic. > > > Also from kdump point of view, I don't want to rely on exactmap as in > > > new implementation I am planning to move away from exactmap. I will > > > pass new memory map in bootparams and stop passing it on command line. > > > > I think we still need a flag that indicates the kernel can only consume > > the new memory map in bootparams, and cannot to obtain from the > > firmware. > > I think creating a new command line option is simpler as compared to > creating a new flag in bootparam which in turn disables memory hotplug. > More users can use that option. For example, if for some reason hotplug > code is crashing, one can just disable it on command line as work around > and move on. I do not have a strong opinion about having such an option. However, I think it is more user friendly to have the exactmap option work on its own on any platform. Thanks, -Toshi
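To illustrate the -EEXIST behavior mentioned in the reply above, here is a minimal userspace sketch. It is not kernel code: `model_add_memory()`, `struct mem_range` and the fixed-size table are hypothetical names made up for this example, and the only thing taken from the thread is the observable result, namely that adding a range that is already present fails with -EEXIST.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 16

/* Hypothetical stand-in for the kernel's record of present memory. */
struct mem_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

static struct mem_range ranges[MAX_RANGES];
static size_t nr_ranges;

/*
 * Model of an "add memory" request: refuse any range that overlaps one
 * that is already registered, the way a range reported both by E820 and
 * by an ACPI memory device object would be refused the second time.
 * Address overflow of start + size is not handled in this sketch.
 */
int model_add_memory(uint64_t start, uint64_t size)
{
	if (size == 0)
		return -EINVAL;

	uint64_t end = start + size - 1;

	for (size_t i = 0; i < nr_ranges; i++) {
		/* Two inclusive ranges overlap if each starts before the other ends. */
		if (start <= ranges[i].end && ranges[i].start <= end)
			return -EEXIST;
	}

	if (nr_ranges == MAX_RANGES)
		return -ENOMEM;

	ranges[nr_ranges].start = start;
	ranges[nr_ranges].end = end;
	nr_ranges++;
	return 0;
}
```

In this model, the first registration of a range plays the role of the boot-time memory map, and a later call with the same range plays the role of the ACPI memory hotplug driver binding to a memory device object that describes memory the kernel already uses.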
On Thu, Jan 09, 2014 at 10:24:25AM -0700, Toshi Kani wrote: [..] > > I think creating a new command line option is simpler as compared to > > creating a new flag in bootparam which in turn disables memory hotplug. > > More users can use that option. For example, if for some reason hotplug > > code is crashing, one can just disable it on command line as work around > > and move on. > > I do not have a strong opinion about having such option. However, I > think it is more user friendly to keep the exactmap option works alone > on any platforms. I think we should create an internal variable that disables memory hotplug, set that variable based on memmap=exactmap and mem=X, and also provide a way to disable memory hotplug directly using a command line option. Current kexec-tools can use memmap=exactmap and be happy. I am writing a new kexec syscall and will not be using memmap=exactmap, so I would need to use that command line option to disable memory hotplug behavior. Thanks Vivek
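The proposal above (one internal variable, set by memmap=exactmap, by mem=X, or by a dedicated option) can be sketched in plain userspace C as follows. This is only an illustration of the idea under stated assumptions: `memory_hotplug_disabled` and `parse_cmdline_for_hotplug()` are hypothetical names, the dedicated option is spelled `no_memory_hotplug` as in the patch name mentioned earlier in the thread, and the real kernel would use its own early parameter parsing rather than this string scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag; the thread only agrees that such a variable should exist. */
static bool memory_hotplug_disabled;

/* Return true if a single command line token should disable memory hotplug. */
static bool token_disables_hotplug(const char *tok, size_t len)
{
	static const char exactmap[] = "memmap=exactmap";
	static const char nohp[] = "no_memory_hotplug";	/* assumed spelling */

	if (len == strlen(exactmap) && strncmp(tok, exactmap, len) == 0)
		return true;
	if (len == strlen(nohp) && strncmp(tok, nohp, len) == 0)
		return true;
	/* mem=X with a non-empty value, e.g. mem=1G */
	if (len > 4 && strncmp(tok, "mem=", 4) == 0)
		return true;
	return false;
}

/* Scan a space-separated command line and set the flag accordingly. */
bool parse_cmdline_for_hotplug(const char *cmdline)
{
	assert(cmdline != NULL);

	memory_hotplug_disabled = false;

	const char *p = cmdline;
	while (*p) {
		while (*p == ' ')
			p++;
		size_t len = strcspn(p, " ");
		if (len && token_disables_hotplug(p, len))
			memory_hotplug_disabled = true;
		p += len;
	}
	return memory_hotplug_disabled;
}
```

With this shape, a caller such as the ACPI memory hotplug init path would only need to test the one flag, which is the arch-independent check asked for earlier in the thread. Other `memmap=` forms such as `memmap=64M@16M` deliberately leave the flag alone.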
On Thu, 2014-01-09 at 13:23 -0500, Vivek Goyal wrote: > On Thu, Jan 09, 2014 at 10:24:25AM -0700, Toshi Kani wrote: > > [..] > > > I think creating a new command line option is simpler as compared to > > > creating a new flag in bootparam which in turn disables memory hotplug. > > > More users can use that option. For example, if for some reason hotplug > > > code is crashing, one can just disable it on command line as work around > > > and move on. > > > > I do not have a strong opinion about having such option. However, I > > think it is more user friendly to keep the exactmap option works alone > > on any platforms. > > I think we should create internally a variable which will disable memory > hotplug. And set that variable based on memmap=exactmap, mem=X and also > provide a way to disable memory hotplug directly using command line > option. > > Current kexec-tools can use memmap=exactmap and be happy. I am writing > a new kexec syscall and will not be using memmap=exactmap and would need > to use that command line option to disable memory hotplug behavior. Sounds good to me. Thanks, -Toshi
On Thu, Jan 09, 2014 at 11:34:30AM -0700, Toshi Kani wrote: > On Thu, 2014-01-09 at 13:23 -0500, Vivek Goyal wrote: > > On Thu, Jan 09, 2014 at 10:24:25AM -0700, Toshi Kani wrote: > > > > [..] > > > > I think creating a new command line option is simpler as compared to > > > > creating a new flag in bootparam which in turn disables memory hotplug. > > > > More users can use that option. For example, if for some reason hotplug > > > > code is crashing, one can just disable it on command line as work around > > > > and move on. > > > > > > I do not have a strong opinion about having such option. However, I > > > think it is more user friendly to keep the exactmap option works alone > > > on any platforms. > > > > I think we should create internally a variable which will disable memory > > hotplug. And set that variable based on memmap=exactmap, mem=X and also > > provide a way to disable memory hotplug directly using command line > > option. > > > > Current kexec-tools can use memmap=exactmap and be happy. I am writing > > a new kexec syscall and will not be using memmap=exactmap and would need > > to use that command line option to disable memory hotplug behavior. > > Sounds good to me. Nobody responded to my other question, so I will ask it again. Assume we have disabled hotplug memory in the second kernel. The first kernel saw hotplug memory, and assume the crash kernel reserved region came from there. We will pass this memory in bootparams to the second kernel and it will show up in the E820 map. It should still be accessible in the second kernel, is that right? Or is there some dependency on ACPI doing some magic before this memory range is available in the second kernel? Thanks Vivek
On Thu, 2014-01-09 at 16:27 -0500, Vivek Goyal wrote: > On Thu, Jan 09, 2014 at 11:34:30AM -0700, Toshi Kani wrote: > > On Thu, 2014-01-09 at 13:23 -0500, Vivek Goyal wrote: > > > On Thu, Jan 09, 2014 at 10:24:25AM -0700, Toshi Kani wrote: > > > > > > [..] > > > > > I think creating a new command line option is simpler as compared to > > > > > creating a new flag in bootparam which in turn disables memory hotplug. > > > > > More users can use that option. For example, if for some reason hotplug > > > > > code is crashing, one can just disable it on command line as work around > > > > > and move on. > > > > > > > > I do not have a strong opinion about having such option. However, I > > > > think it is more user friendly to keep the exactmap option works alone > > > > on any platforms. > > > > > > I think we should create internally a variable which will disable memory > > > hotplug. And set that variable based on memmap=exactmap, mem=X and also > > > provide a way to disable memory hotplug directly using command line > > > option. > > > > > > Current kexec-tools can use memmap=exactmap and be happy. I am writing > > > a new kexec syscall and will not be using memmap=exactmap and would need > > > to use that command line option to disable memory hotplug behavior. > > > > Sounds good to me. > > Nobody responded to my other question, so I would ask it again. > > Assume we have disabled hotplug memory in second kernel. First kernel > saw hotplug memory and assume crash kernel reserved region came from > there. We will pass this memory in bootparams to second kernel and it > will show up in E820 map. It should still be accessible in second kernel, > is that right? Yes. > Or there is some dependency on ACPI doing some magic before this memory > range is available in second kernel? No. The 1st kernel reserves the crash kernel region, which cannot be hot-deleted. So, this region continues to be accessible by the 2nd kernel without any operation. 
I am more curious to know how makedumpfile decides what memory ranges to dump. The 1st kernel may have performed memory hot-add / delete operations before a crash, so it needs to know the valid physical address range at the time of crash, and may not rely on the E820 map from BIOS (which is stale). Am I right to assume that makedumpfile gets it from the page tables of the 1st kernel? Thanks, -Toshi
On Thursday, January 09, 2014 11:34:30 AM Toshi Kani wrote: > On Thu, 2014-01-09 at 13:23 -0500, Vivek Goyal wrote: > > On Thu, Jan 09, 2014 at 10:24:25AM -0700, Toshi Kani wrote: > > > > [..] > > > > I think creating a new command line option is simpler as compared to > > > > creating a new flag in bootparam which in turn disables memory hotplug. > > > > More users can use that option. For example, if for some reason hotplug > > > > code is crashing, one can just disable it on command line as work around > > > > and move on. > > > > > > I do not have a strong opinion about having such option. However, I > > > think it is more user friendly to keep the exactmap option works alone > > > on any platforms. > > > > I think we should create internally a variable which will disable memory > > hotplug. And set that variable based on memmap=exactmap, mem=X and also > > provide a way to disable memory hotplug directly using command line > > option. > > > > Current kexec-tools can use memmap=exactmap and be happy. I am writing > > a new kexec syscall and will not be using memmap=exactmap and would need > > to use that command line option to disable memory hotplug behavior. > > Sounds good to me. Agreed. Thanks!
On 01/09/14 at 02:56pm, Toshi Kani wrote: > On Thu, 2014-01-09 at 16:27 -0500, Vivek Goyal wrote: > > On Thu, Jan 09, 2014 at 11:34:30AM -0700, Toshi Kani wrote: > > > On Thu, 2014-01-09 at 13:23 -0500, Vivek Goyal wrote: > > > > On Thu, Jan 09, 2014 at 10:24:25AM -0700, Toshi Kani wrote: > > > > > > > > [..] > > > > > > I think creating a new command line option is simpler as compared to > > > > > > creating a new flag in bootparam which in turn disables memory hotplug. > > > > > > More users can use that option. For example, if for some reason hotplug > > > > > > code is crashing, one can just disable it on command line as work around > > > > > > and move on. > > > > > > > > > > I do not have a strong opinion about having such option. However, I > > > > > think it is more user friendly to keep the exactmap option works alone > > > > > on any platforms. > > > > > > > > I think we should create internally a variable which will disable memory > > > > hotplug. And set that variable based on memmap=exactmap, mem=X and also > > > > provide a way to disable memory hotplug directly using command line > > > > option. > > > > > > > > Current kexec-tools can use memmap=exactmap and be happy. I am writing > > > > a new kexec syscall and will not be using memmap=exactmap and would need > > > > to use that command line option to disable memory hotplug behavior. > > > > > > Sounds good to me. > > > > Nobody responded to my other question, so I would ask it again. > > > > Assume we have disabled hotplug memory in second kernel. First kernel > > saw hotplug memory and assume crash kernel reserved region came from > > there. We will pass this memory in bootparams to second kernel and it > > will show up in E820 map. It should still be accessible in second kernel, > > is that right? > > Yes. > > > Or there is some dependency on ACPI doing some magic before this memory > > range is available in second kernel? > > No. 
The 1st kernel reserves the crash kernel region, which cannot be > hot-deleted. So, this region continues to be accessible by the 2nd > kernel without any operation. What I understand now is that if several memory sections are reserved for the crash kernel, then in the 2nd kernel they are just like normal memory. In the ns object tree, they are not treated as hotplug memory. Otherwise, any hotplug memory that is not reserved for the 2nd kernel can be parsed, needs to be added as hotplug memory, and is put into the movable zone. Am I right? The other question: the e820 reservation is done earlier than ACPI initialization, because acpi_early_init() is invoked very late in start_kernel(). Does that mean that at the very beginning all memory is in e820, and later, when acpi_early_init() is called and hotplug memory is detected, it will be moved to a different place or needs to be marked with a specific flag? > > I am more curious to know how makedumpfile decides what memory ranges to > dump. The 1st kernel may have performed memory hot-add / delete > operations before a crash, so it needs to know the valid physical > address range at the time of crash, and may not rely on the E820 map > from BIOS (which is stale). Am I right to assume that makedumpfile gets > it from the page tables of the 1st kernel? makedumpfile just does the dump; which memory ranges to dump is decided in the 1st kernel by kexec-tools. When kexec-tools is executed in the 1st kernel, it finds all System RAM regions, excluding the regions reserved for the kdump kernel, and then builds a logical ELF file. Each load segment is one of these System RAM regions, and its address and length are written into the program header. Then makedumpfile just reads this ELF file, reads all of the segments, and dumps them. If a hotplug memory is removed after kexec-tools has run and before a crash, udev will detect this and trigger a kdump restart; kexec-tools is executed again and the System RAM region information is stored anew. The logical file header will be passed to the 2nd kernel.
> > Thanks, > -Toshi
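The description above of how the load segments are chosen (every System RAM region, minus the region reserved for the kdump kernel) can be sketched as a small range subtraction. This is a hedged illustration, not kexec-tools source: `struct range` and `build_load_segments()` are hypothetical names, the output entries merely stand in for the address and length that would go into each ELF program header, and a single reserved region is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* A physical address range; end is inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * For each System RAM range, emit the parts that do not overlap the
 * crash kernel reservation. Each emitted range stands in for one load
 * segment of the logical ELF file. Returns the number of segments
 * written; segments beyond max_out are silently dropped in this sketch.
 */
size_t build_load_segments(const struct range *ram, size_t nr_ram,
			   struct range crash,
			   struct range *out, size_t max_out)
{
	size_t n = 0;

	for (size_t i = 0; i < nr_ram; i++) {
		struct range r = ram[i];

		/* No overlap with the reservation: keep the range whole. */
		if (crash.end < r.start || crash.start > r.end) {
			if (n < max_out)
				out[n++] = r;
			continue;
		}

		/* Part of the RAM range below the reservation. */
		if (r.start < crash.start && n < max_out) {
			out[n].start = r.start;
			out[n].end = crash.start - 1;
			n++;
		}

		/* Part of the RAM range above the reservation. */
		if (crash.end < r.end && n < max_out) {
			out[n].start = crash.end + 1;
			out[n].end = r.end;
			n++;
		}
	}
	return n;
}
```

Re-running such a step after a memory hot-remove, as the udev-triggered kdump restart described above does, would simply produce a new segment list from the updated System RAM ranges.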
(2014/01/10 16:11), Baoquan wrote:
> On 01/09/14 at 02:56pm, Toshi Kani wrote:
>> On Thu, 2014-01-09 at 16:27 -0500, Vivek Goyal wrote:
>>> On Thu, Jan 09, 2014 at 11:34:30AM -0700, Toshi Kani wrote:
>>>> On Thu, 2014-01-09 at 13:23 -0500, Vivek Goyal wrote:
>>>>> On Thu, Jan 09, 2014 at 10:24:25AM -0700, Toshi Kani wrote:
>>>>>
>>>>> [..]
>>>>>>> I think creating a new command line option is simpler as compared to
>>>>>>> creating a new flag in bootparam which in turn disables memory hotplug.
>>>>>>> More users can use that option. For example, if for some reason hotplug
>>>>>>> code is crashing, one can just disable it on command line as work around
>>>>>>> and move on.
>>>>>>
>>>>>> I do not have a strong opinion about having such option. However, I
>>>>>> think it is more user friendly to keep the exactmap option works alone
>>>>>> on any platforms.
>>>>>
>>>>> I think we should create internally a variable which will disable memory
>>>>> hotplug. And set that variable based on memmap=exactmap, mem=X and also
>>>>> provide a way to disable memory hotplug directly using command line
>>>>> option.
>>>>>
>>>>> Current kexec-tools can use memmap=exactmap and be happy. I am writing
>>>>> a new kexec syscall and will not be using memmap=exactmap and would need
>>>>> to use that command line option to disable memory hotplug behavior.
>>>>
>>>> Sounds good to me.
>>>
>>> Nobody responded to my other question, so I would ask it again.
>>>
>>> Assume we have disabled hotplug memory in second kernel. First kernel
>>> saw hotplug memory and assume crash kernel reserved region came from
>>> there. We will pass this memory in bootparams to second kernel and it
>>> will show up in E820 map. It should still be accessible in second kernel,
>>> is that right?
>>
>> Yes.
>>
>>> Or there is some dependency on ACPI doing some magic before this memory
>>> range is available in second kernel?
>>
>> No. The 1st kernel reserves the crash kernel region, which cannot be
>> hot-deleted. So, this region continues to be accessible by the 2nd
>> kernel without any operation.
>

If my understanding is correct:

> Now what I understand is if a several memsection is reserved for
> crashkernel, then in 2nd kernel, they are just like normal memory.

correct.

> In ns
> object tree, they are not treated as hotplug memory.

wrong.
They are treated as hotplug memory. But the memory cannot be hot-removed
because it contains kernel memory.

> Otherwise, any hotplug memory which is not reserved for 2nd kernel can
> be parsed and need be added as hotplug memory, and add them into movable
> zone.

wrong.
The memory is allocated as normal zone and it is offline.

>
> Am I right?
>
> The other question, e820 reserve is done earlier than acpi
> initialization, because acpi_early_init() invocation is very late in
> start_kernel(). Does that means at the very beginning all memorys are in
> e820, later when acpi_early_init is called, hotplug memory is detected,
> they will be moved to different place or need be marked with a specific
> flag?

No.

Thanks,
Yasuaki Ishimatsu

>
>>
>> I am more curious to know how makedumpfile decides what memory ranges to
>> dump. The 1st kernel may have performed memory hot-add / delete
>> operations before a crash, so it needs to know the valid physical
>> address range at the time of crash, and may not rely on the E820 map
>> from BIOS (which is stale). Am I right to assume that makedumpfile gets
>> it from the page tables of the 1st kernel?
>
> makedumpfile just do the dump, what memory ranges to dump is decided in
> 1st kernel by kexec-tools. In 1st kernel, if kexec-tools executed, it
> will find all System Ram memorys which exclude the reserved regions for
> kdump kernel, then build a logical elf file, each load segment is one of
> these System Ram memory regions, its addr and length is written into the
> program header.
>
> Then makedumpfile just read this elf file, and read all of them and
> dump.
>
> If after kexec-tools execution and before crash, a hotplug memory is
> removed, udev will check this and trigger a kdump restart, kexec-tools
> is executed again, System Ram region information are stored. The logical
> file header will be passed to 2nd kernel.
>
>>
>> Thanks,
>> -Toshi
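Vivek's proposal quoted above (one internal variable that disables memory hotplug, set by memmap=exactmap, by mem=X, or by a dedicated command line option) could be modelled roughly as below. This is only a user-space sketch, not kernel code: the function names are hypothetical, the option name `no_memory_hotplug` is borrowed from the later mention of Prarit's patch in this thread, and treating every `mem=` value as disabling hotplug is an assumption about how the proposal would be implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: decide whether a single command line token
 * should switch memory hotplug off. Tokens are not NUL-terminated,
 * so the length is passed explicitly.
 */
static bool token_disables_hotplug(const char *tok, size_t len)
{
	static const char exactmap[] = "memmap=exactmap";
	static const char nohp[] = "no_memory_hotplug"; /* assumed option name */

	if (len == strlen(exactmap) && strncmp(tok, exactmap, len) == 0)
		return true;
	if (len == strlen(nohp) && strncmp(tok, nohp, len) == 0)
		return true;
	/* mem=X with a non-empty value; "memmap=..." does not match "mem=" */
	if (len > 4 && strncmp(tok, "mem=", 4) == 0)
		return true;
	return false;
}

/* Scan a space-separated command line and report the resulting flag. */
bool cmdline_disables_memory_hotplug(const char *cmdline)
{
	const char *p = cmdline;

	while (*p) {
		while (*p == ' ')
			p++;
		size_t len = strcspn(p, " ");
		if (len && token_disables_hotplug(p, len))
			return true;
		p += len;
	}
	return false;
}
```

With this model, a command line containing `memmap=exactmap` or `mem=4G` sets the flag, while one containing only a `memmap=` range entry does not.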
>In ns
> >object tree, they are not treated as hotplug memory.
>
> wrong.
> They are treated as hotplug memory. But the memory cannot be hot-removed
> because it contains kernel memory.
>
> >Otherwise, any hotplug memory which is not reserved for 2nd kernel can
> >be parsed and need be added as hotplug memory, and add them into movable
> >zone.
>
> wrong.
> The memory is allocated as normal zone and it is offline.

Hi,

Thanks for answering.

I am confused. Now the fact is in 1st kernel memory is reserved for
crashkernel and passed to 2nd kernel by exactmap. Then in 2nd kernel,
reserved memory regions are added into e820. Later, hotplug memory still
triggers add_memory and causes the bug I reported.

>
> >
> >Am I right?
> >
>
> >The other question, e820 reserve is done earlier than acpi
> >initialization, because acpi_early_init() invocation is very late in
> >start_kernel(). Does that means at the very beginning all memorys are in
> >e820, later when acpi_early_init is called, hotplug memory is detected,
> >they will be moved to different place or need be marked with a specific
> >flag?
>
> No.
>
> Thanks,
> Yasuaki Ishimatsu
>
(2014/01/10 18:14), Baoquan wrote:
>
> >In ns
>>> object tree, they are not treated as hotplug memory.
>>
>> wrong.
>> They are treated as hotplug memory. But the memory cannot be hot-removed
>> because it contains kernel memory.
>>
>>> Otherwise, any hotplug memory which is not reserved for 2nd kernel can
>>> be parsed and need be added as hotplug memory, and add them into movable
>>> zone.
>>
>> wrong.
>> The memory is allocated as normal zone and it is offline.
>
> Hi,
>
> Thanks for answering.
>
> I am confused. Now the fact is in 1st kernel memory is reserved for
> crashkernel and passed to 2nd kernel by exactmap. Then in 2nd kernel,
> reserved memory regions are added into e820. Later, hotplug memory still
> triggers add_memory and causes the bug I reported.

Does the issue occur even if you apply the following patch from Prarit to
your kernel and add the no_memory_hotplug boot option to the 2nd kernel?

http://marc.info/?l=linux-acpi&m=138922019607796&w=2

Thanks,
Yasuaki Ishimatsu

>
>>
>>>
>>> Am I right?
>>>
>>
>>> The other question, e820 reserve is done earlier than acpi
>>> initialization, because acpi_early_init() invocation is very late in
>>> start_kernel(). Does that means at the very beginning all memorys are in
>>> e820, later when acpi_early_init is called, hotplug memory is detected,
>>> they will be moved to different place or need be marked with a specific
>>> flag?
>>
>> No.
>>
>> Thanks,
>> Yasuaki Ishimatsu
>>
On 01/10/14 at 06:35pm, Yasuaki Ishimatsu wrote:
> (2014/01/10 18:14), Baoquan wrote:
> >
> > >In ns
> >>>object tree, they are not treated as hotplug memory.
> >>
> >>wrong.
> >>They are treated as hotplug memory. But the memory cannot be hot-removed
> >>because it contains kernel memory.
> >>
> >>>Otherwise, any hotplug memory which is not reserved for 2nd kernel can
> >>>be parsed and need be added as hotplug memory, and add them into movable
> >>>zone.
> >>
> >>wrong.
> >>The memory is allocated as normal zone and it is offline.
> >
> >Hi,
> >
> >Thanks for answering.
> >
> >
> >I am confused. Now the fact is in 1st kernel memory is reserved for
> >crashkernel and passed to 2nd kernel by exactmap. Then in 2nd kernel,
> >reserved memory regions are added into e820. Later, hotplug memory still
> >triggers add_memory and causes the bug I reported.
>
> Does the issue occur even if you apply the following patch from Prarit to
> your kernel and add the no_memory_hotplug boot option to the 2nd kernel?
>
> http://marc.info/?l=linux-acpi&m=138922019607796&w=2

This issue is the same as Prarit's. He posted the formal patch. But
there are still some questions we would like answered.

>
> Thanks,
> Yasuaki Ishimatsu
>
> >
> >>
> >>>
> >>>Am I right?
> >>>
> >>
> >>>The other question, e820 reserve is done earlier than acpi
> >>>initialization, because acpi_early_init() invocation is very late in
> >>>start_kernel(). Does that means at the very beginning all memorys are in
> >>>e820, later when acpi_early_init is called, hotplug memory is detected,
> >>>they will be moved to different place or need be marked with a specific
> >>>flag?
> >>
> >>No.
> >>
> >>Thanks,
> >>Yasuaki Ishimatsu
> >>
On Fri, 2014-01-10 at 17:14 +0800, Baoquan wrote:
 :
> >
> > >Otherwise, any hotplug memory which is not reserved for 2nd kernel can
> > >be parsed and need be added as hotplug memory, and add them into movable
> > >zone.
> >
> > wrong.
> > The memory is allocated as normal zone and it is offline.

This is "logical" offline, which means that the memory is accessible,
but the 1st kernel does not use it.

> Hi,
>
> Thanks for answering.
>
> I am confused. Now the fact is in 1st kernel memory is reserved for
> crashkernel and passed to 2nd kernel by exactmap. Then in 2nd kernel,
> reserved memory regions are added into e820.

Right. And this memory is accessible.

> Later, hotplug memory still
> triggers add_memory and causes the bug I reported.

This is because, without your change, the 2nd kernel gets all memory
ranges from ACPI. This is bad: not only does it cause the panic you
reported, but it can also overwrite the 1st kernel's memory.

Thanks,
-Toshi
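To make the overwrite risk concrete: a range that ACPI reports but that lies outside the exactmap ranges handed to the 2nd kernel still belongs to the 1st kernel. A minimal sketch of such a containment check follows. It is a user-space illustration only, `struct mem_range` and `range_in_exactmap()` are hypothetical names, and it is not what the patch in this thread does (the patch skips registering the memory hotplug handler altogether).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end). */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true only if [start, end) lies entirely inside one of the
 * ranges the 2nd kernel was given. Anything else may be memory that
 * the crashed 1st kernel was using.
 */
bool range_in_exactmap(const struct mem_range *map, size_t nr,
		       uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < nr; i++)
		if (start >= map[i].start && end <= map[i].end)
			return true;
	return false;
}
```

For example, with a single allowed range of 0x01000000 to 0x7c000000 (an illustrative value), a device range starting at 0x100000000 would fail the check and should not be added.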
On Fri, 2014-01-10 at 15:11 +0800, Baoquan wrote:
> On 01/09/14 at 02:56pm, Toshi Kani wrote:
 :
> > I am more curious to know how makedumpfile decides what memory ranges to
> > dump. The 1st kernel may have performed memory hot-add / delete
> > operations before a crash, so it needs to know the valid physical
> > address range at the time of crash, and may not rely on the E820 map
> > from BIOS (which is stale). Am I right to assume that makedumpfile gets
> > it from the page tables of the 1st kernel?
>
> makedumpfile just do the dump, what memory ranges to dump is decided in
> 1st kernel by kexec-tools. In 1st kernel, if kexec-tools executed, it
> will find all System Ram memorys which exclude the reserved regions for
> kdump kernel, then build a logical elf file, each load segment is one of
> these System Ram memory regions, its addr and length is written into the
> program header.
>
> Then makedumpfile just read this elf file, and read all of them and
> dump.
>
> If after kexec-tools execution and before crash, a hotplug memory is
> removed, udev will check this and trigger a kdump restart, kexec-tools
> is executed again, System Ram region information are stored. The logical
> file header will be passed to 2nd kernel.

Oh, that's how it works. Thanks for the explanation!

In case of hot-delete, ideally, the elf file should be updated after a
memory region is taken offline but before it is ejected. But it is
difficult and error-prone to coordinate such a sequence with user space.
So, the current scheme sounds good to me.

Thanks,
-Toshi
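The scheme described above (walk the System RAM regions, leave out the region reserved for the kdump kernel, and record each remaining piece as a load segment with its address and length) can be sketched roughly as follows. This is a standalone user-space model and not kexec-tools code: `struct mem_range`, `struct load_segment` and `build_load_segments()` are hypothetical names, `load_segment` merely stands in for the address and length fields of an ELF program header, and a single reserved region is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end). */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/* Stand-in for the address/length fields of one load segment header. */
struct load_segment {
	uint64_t paddr;
	uint64_t memsz;
};

/*
 * Emit one segment per System RAM piece that is not covered by the
 * reserved (crashkernel) range. A RAM region that straddles the
 * reservation is split in two. Returns the number of segments written,
 * never more than max_out.
 */
size_t build_load_segments(const struct mem_range *ram, size_t nr_ram,
			   struct mem_range reserved,
			   struct load_segment *out, size_t max_out)
{
	size_t n = 0;

	for (size_t i = 0; i < nr_ram; i++) {
		uint64_t s = ram[i].start, e = ram[i].end;

		/* Part of this region below the reservation, if any. */
		uint64_t lo_end = e < reserved.start ? e : reserved.start;
		if (s < lo_end) {
			if (n == max_out)
				return n;
			out[n].paddr = s;
			out[n].memsz = lo_end - s;
			n++;
		}

		/* Part of this region above the reservation, if any. */
		uint64_t hi_start = s > reserved.end ? s : reserved.end;
		if (hi_start < e) {
			if (n == max_out)
				return n;
			out[n].paddr = hi_start;
			out[n].memsz = e - hi_start;
			n++;
		}
	}
	return n;
}
```

Rebuilding this list whenever memory is hot-added or removed corresponds to the udev-triggered kdump restart mentioned above.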
Index: linux/arch/x86/include/asm/e820.h
===================================================================
--- linux.orig/arch/x86/include/asm/e820.h
+++ linux/arch/x86/include/asm/e820.h
@@ -69,6 +69,7 @@ static inline bool is_ISA_range(u64 s, u
 {
 	return s >= ISA_START_ADDRESS && e <= ISA_END_ADDRESS;
 }
+extern bool use_exactmap;
 
 #endif /* __ASSEMBLY__ */
 #include <linux/ioport.h>
Index: linux/arch/x86/kernel/e820.c
===================================================================
--- linux.orig/arch/x86/kernel/e820.c
+++ linux/arch/x86/kernel/e820.c
@@ -838,6 +838,7 @@ static int __init parse_memopt(char *p)
 }
 early_param("mem", parse_memopt);
 
+bool use_exactmap;
 static int __init parse_memmap_one(char *p)
 {
 	char *oldp;
@@ -857,6 +858,7 @@ static int __init parse_memmap_one(char
 #endif
 		e820.nr_map = 0;
 		userdef = 1;
+		use_exactmap = true;
 		return 0;
 	}
 
Index: linux/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux.orig/drivers/acpi/acpi_memhotplug.c
+++ linux/drivers/acpi/acpi_memhotplug.c
@@ -363,5 +363,9 @@ static void acpi_memory_device_remove(st
 
 void __init acpi_memory_hotplug_init(void)
 {
+#ifdef CONFIG_X86
+	if (use_exactmap)
+		return;
+#endif
 	acpi_scan_add_handler_with_hotplug(&memory_device_handler, "memory");
 }