Message ID | 1389650161-13292-1-git-send-email-prarit@redhat.com (mailing list archive) |
---|---|
State | Superseded, archived |
On Mon, Jan 13, 2014 at 4:56 PM, Prarit Bhargava <prarit@redhat.com> wrote:
> When booting a kexec/kdump kernel on a system that has specific memory hotplug
> regions the boot will fail with warnings like:
>
> [ 2.939467] swapper/0: page allocation failure: order:9, mode:0x84d0
> [ 2.946564] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> 3.10.0-65.el7.x86_64 #1
> [ 2.954532] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS
> QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011
> [ 2.964926] 0000000000000000 ffff8800341bd8c8 ffffffff815bcc67
> ffff8800341bd950
> [ 2.973224] ffffffff8113b1a0 ffff880036339b00 0000000000000009
> 00000000000084d0
> [ 2.981523] ffff8800341bd950 ffffffff815b87ee 0000000000000000
> 0000000000000200
> [ 2.989821] Call Trace:
> [ 2.992560] [<ffffffff815bcc67>] dump_stack+0x19/0x1b
> [ 2.998300] [<ffffffff8113b1a0>] warn_alloc_failed+0xf0/0x160
> [ 3.004817] [<ffffffff815b87ee>] ?
> __alloc_pages_direct_compact+0xac/0x196
> [ 3.012594] [<ffffffff8113f14f>] __alloc_pages_nodemask+0x7ff/0xa00
> [ 3.019692] [<ffffffff815b417c>] vmemmap_alloc_block+0x62/0xba
> [ 3.026303] [<ffffffff815b41e9>] vmemmap_alloc_block_buf+0x15/0x3b
> [ 3.033302] [<ffffffff815b1ff6>] vmemmap_populate+0xb4/0x21b
> [ 3.039718] [<ffffffff815b461d>] sparse_mem_map_populate+0x27/0x35
> [ 3.046717] [<ffffffff815b400f>] sparse_add_one_section+0x7a/0x185
> [ 3.053720] [<ffffffff815a1e9f>] __add_pages+0xaf/0x240
> [ 3.059656] [<ffffffff81047359>] arch_add_memory+0x59/0xd0
> [ 3.065877] [<ffffffff815a21d9>] add_memory+0xb9/0x1b0
> [ 3.071713] [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d
> [ 3.078813] [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd
> [ 3.085719] [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f
> [ 3.092716] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.100004] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.107293] [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5
> [ 3.113904] [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d
> [ 3.119933] [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160
> [ 3.126153] [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6
> [ 3.131987] [<ffffffff81a1fd58>] ? acpi_sleep_proc_init+0x2a/0x2a
> [ 3.138889] [<ffffffff810020e2>] do_one_initcall+0xe2/0x190
> [ 3.145210] [<ffffffff819e20c4>] kernel_init_freeable+0x17c/0x207
> [ 3.152111] [<ffffffff819e18d0>] ? do_early_param+0x88/0x88
> [ 3.158430] [<ffffffff8159fea0>] ? rest_init+0x80/0x80
> [ 3.164264] [<ffffffff8159feae>] kernel_init+0xe/0x180
> [ 3.170097] [<ffffffff815cca2c>] ret_from_fork+0x7c/0xb0
> [ 3.176123] [<ffffffff8159fea0>] ? rest_init+0x80/0x80
> [ 3.181956] Mem-Info:
> [ 3.184490] Node 0 DMA per-cpu:
> [ 3.188007] CPU 0: hi: 0, btch: 1 usd: 0
> [ 3.193353] Node 0 DMA32 per-cpu:
> [ 3.197060] CPU 0: hi: 42, btch: 7 usd: 0
> [ 3.202410] active_anon:0 inactive_anon:0 isolated_anon:0
> [ 3.202410] active_file:0 inactive_file:0 isolated_file:0
> [ 3.202410] unevictable:0 dirty:0 writeback:0 unstable:0
> [ 3.202410] free:872 slab_reclaimable:13 slab_unreclaimable:1880
> [ 3.202410] mapped:0 shmem:0 pagetables:0 bounce:0
> [ 3.202410] free_cma:0
>
> because the system has run out of memory at boot time. This occurs
> because of the following sequence in the boot:
>
> Main kernel boots and sets E820 map. The second kernel is booted with a
> map generated by the kdump service using memmap= and memmap=exactmap.
> These parameters are added to the kernel parameters of the kexec/kdump
> kernel. The kexec/kdump kernel has limited memory resources so as not
> to severely impact the main kernel.
>
> The system then panics and the kdump/kexec kernel boots (which is a
> completely new kernel boot). During this boot ACPI is initialized and the
> kernel (as can be seen above) traverses the ACPI namespace and finds an
> entry for a memory device to be hotadded.
>
> ie)
>
> [ 3.053720] [<ffffffff815a1e9f>] __add_pages+0xaf/0x240
> [ 3.059656] [<ffffffff81047359>] arch_add_memory+0x59/0xd0
> [ 3.065877] [<ffffffff815a21d9>] add_memory+0xb9/0x1b0
> [ 3.071713] [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d
> [ 3.078813] [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd
> [ 3.085719] [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f
> [ 3.092716] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.100004] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.107293] [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5
> [ 3.113904] [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d
> [ 3.119933] [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160
> [ 3.126153] [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6
>
> At this point the kernel adds page table information and the kexec/kdump
> kernel runs out of memory.
>
> This can also be reproduced with a "regular" kernel by using the
> memmap=exactmap and mem=X parameters on the main kernel and booting.
>
> This patchset resolves the problem by adding a kernel parameter,
> acpi_no_memhotplug, to disable ACPI memory hotplug. ACPI memory hotplug
> should also be disabled by default when a user specified a memory mapping with
> "memmap=exactmap" or "mem=X".
>
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Len Brown <lenb@kernel.org>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Linn Crosetto <linn@hp.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: Yinghai Lu <yinghai@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Toshi Kani <toshi.kani@hp.com>
> Cc: Tang Chen <tangchen@cn.fujitsu.com>
> Cc: Wen Congyang <wency@cn.fujitsu.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: kosaki.motohiro@gmail.com
> Cc: dyoung@redhat.com
> Cc: Toshi Kani <toshi.kani@hp.com>
> Cc: linux-acpi@vger.kernel.org
> Cc: linux-mm@kvack.org

I think we need a knob to manually enable mem-hotplug when memmap is
specified. But that is another story.

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

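For readers working through the log quoted above, the two numbers that matter are "order:9" in the failure line and "free:872" in Mem-Info. The sketch below only does the arithmetic on those two values. It assumes 4 KiB base pages, which is the usual x86_64 configuration but is not stated in the log, and the helper names are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: 4 KiB base pages (typical for x86_64; not stated in the log). */
#define PAGE_SIZE_BYTES 4096UL

/* Bytes covered by a buddy-allocator request of the given order (2^order pages). */
unsigned long order_to_bytes(unsigned int order)
{
	assert(order < 20); /* keeps the result within a 32-bit unsigned long */
	return (1UL << order) * PAGE_SIZE_BYTES;
}

/* Bytes covered by a count of pages, e.g. the "free:" field of Mem-Info. */
unsigned long pages_to_bytes(unsigned long pages)
{
	return pages * PAGE_SIZE_BYTES;
}

/* Print the two values taken from the quoted log: order:9 and free:872. */
void print_alloc_summary(void)
{
	printf("order:9 request: %lu KiB, physically contiguous\n",
	       order_to_bytes(9) / 1024);
	printf("free:872 pages:  %lu KiB in total\n",
	       pages_to_bytes(872) / 1024);
}
```

Calling print_alloc_summary() from a main() shows a 2048 KiB contiguous request against 3488 KiB of free memory in total. Under the page-size assumption, that is consistent with the description's statement that the kdump kernel has run out of memory while populating vmemmap for the hot-added region; the log alone does not say whether fragmentation or allocator reserves decided the failure.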
On 01/13/2014 05:17 PM, KOSAKI Motohiro wrote:
> On Mon, Jan 13, 2014 at 4:56 PM, Prarit Bhargava <prarit@redhat.com> wrote:
>> When booting a kexec/kdump kernel on a system that has specific memory hotplug
>> regions the boot will fail with warnings like:
>>
>> [...]
>>
>> This patchset resolves the problem by adding a kernel parameter,
>> acpi_no_memhotplug, to disable ACPI memory hotplug. ACPI memory hotplug
>> should also be disabled by default when a user specified a memory mapping with
>> "memmap=exactmap" or "mem=X".
>
> I think we need a knob to manually enable mem-hotplug when memmap is
> specified. But that is another story.
>
> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

As mentioned, self-NAK. I have seen a system where I needed to specify
memmap=exactmap and that also had hotplug memory. I will only keep the
acpi_no_memhotplug option in the next version of the patch.

P.

(2014/01/14 8:41), Prarit Bhargava wrote:
> On 01/13/2014 05:17 PM, KOSAKI Motohiro wrote:
>> On Mon, Jan 13, 2014 at 4:56 PM, Prarit Bhargava <prarit@redhat.com> wrote:
>>> [...]
>>
>> I think we need a knob to manually enable mem-hotplug when memmap is
>> specified. But that is another story.
>>
>> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
> As mentioned, self-NAK. I have seen a system where I needed to specify
> memmap=exactmap and that also had hotplug memory. I will only keep the
> acpi_no_memhotplug option in the next version of the patch.

Your following first patch is simple and makes sense.

http://marc.info/?l=linux-acpi&m=138922019607796&w=2

Thanks,
Yasuaki Ishimatsu

On Tue, 2014-01-14 at 10:11 +0900, Yasuaki Ishimatsu wrote:
 :
> >> I think we need a knob to manually enable mem-hotplug when memmap is
> >> specified. But that is another story.
> >>
> >> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> >
> > As mentioned, self-NAK. I have seen a system where I needed to specify
> > memmap=exactmap and that also had hotplug memory. I will only keep the
> > acpi_no_memhotplug option in the next version of the patch.
>
> Your following first patch is simple and makes sense.
>
> http://marc.info/?l=linux-acpi&m=138922019607796&w=2

This option also requires changing kexec-tools to specify the new
option for kdump. It won't be simpler.

Thanks,
-Toshi

On 01/13/14 at 04:56pm, Prarit Bhargava wrote:
> When booting a kexec/kdump kernel on a system that has specific memory hotplug
> regions the boot will fail with warnings like:
>
> [...]
>
> This patchset resolves the problem by adding a kernel parameter,
> acpi_no_memhotplug, to disable ACPI memory hotplug. ACPI memory hotplug
> should also be disabled by default when a user specified a memory mapping with
> "memmap=exactmap" or "mem=X".
>
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> [...]
> ---
>  Documentation/kernel-parameters.txt |    3 +++
>  arch/x86/kernel/e820.c              |    4 ++++
>  drivers/acpi/acpi_memhotplug.c      |   18 ++++++++++++++++++
>  include/linux/memory_hotplug.h      |    9 +++++++++
>  4 files changed, 34 insertions(+)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index b9e9bd8..ea93f75 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -343,6 +343,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			no: ACPI OperationRegions are not marked as reserved,
>  			no further checks are performed.
>
> +	acpi_no_memhotplug [ACPI] Disable memory hotplug. Useful for kexec
> +			and kdump kernels.
> +
>  	add_efi_memmap	[EFI; X86] Include EFI memory map in
>  			kernel's map of available physical RAM.
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index 174da5f..3c431fe 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -20,6 +20,7 @@
>  #include <linux/firmware-map.h>
>  #include <linux/memblock.h>
>  #include <linux/sort.h>
> +#include <linux/memory_hotplug.h>
>
>  #include <asm/e820.h>
>  #include <asm/proto.h>
> @@ -834,6 +835,8 @@ static int __init parse_memopt(char *p)
>  		return -EINVAL;
>  	e820_remove_range(mem_size, ULLONG_MAX - mem_size, E820_RAM, 1);
>
> +	set_acpi_no_memhotplug();
> +
>  	return 0;
>  }
>  early_param("mem", parse_memopt);
> @@ -857,6 +860,7 @@ static int __init parse_memmap_one(char *p)
>  #endif
>  		e820.nr_map = 0;
>  		userdef = 1;
> +		set_acpi_no_memhotplug();
>  		return 0;
>  	}
>
> diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> index 551dad7..d104a7d 100644
> --- a/drivers/acpi/acpi_memhotplug.c
> +++ b/drivers/acpi/acpi_memhotplug.c
> @@ -361,7 +361,25 @@ static void acpi_memory_device_remove(struct acpi_device *device)
>  	acpi_memory_device_free(mem_device);
>  }
>
> +static bool acpi_no_memhotplug;
> +
> +void set_acpi_no_memhotplug(void)
> +{
> +	acpi_no_memhotplug = true;
> +	pr_info_once("ACPI: Memory Hotplug Disabled\n");
> +}
> +
>  void __init acpi_memory_hotplug_init(void)
>  {
> +	if (acpi_no_memhotplug)
> +		return;
> +
>  	acpi_scan_add_handler_with_hotplug(&memory_device_handler, "memory");
>  }
> +
> +static int __init disable_acpi_memory_hotplug(char *str)
> +{
> +	set_acpi_no_memhotplug();
> +	return 1;
> +}
> +__setup("acpi_no_memhotplug", disable_acpi_memory_hotplug);
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 4ca3d95..3cdb6e0 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -12,6 +12,15 @@ struct pglist_data;
>  struct mem_section;
>  struct memory_block;
>
> +#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
> +/* set flag to disable ACPI memory hotplug */
> +extern void set_acpi_no_memhotplug(void);
> +#else
> +static inline void set_acpi_no_memhotplug(void)
> +{
> +}
> +#endif
> +
>  #ifdef CONFIG_MEMORY_HOTPLUG
>
>  /*
> --
> 1.7.9.3
>

Acked-by: Dave Young <dyoung@redhat.com>

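The control flow of the quoted diff can be hard to see across four files, so here is a small user-space model of it. This is only a sketch and not kernel code: parse_cmdline(), reset_model() and acpi_memory_hotplug_init_model() are hypothetical stand-ins for the kernel's early_param/__setup handling and for acpi_memory_hotplug_init(), and the tokenising is deliberately simplistic. The three triggers (acpi_no_memhotplug, mem=X, memmap=exactmap) are the ones in the patch as posted in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mirrors the flag the quoted diff adds to drivers/acpi/acpi_memhotplug.c. */
static bool acpi_no_memhotplug;

void set_acpi_no_memhotplug(void)
{
	acpi_no_memhotplug = true;
}

/* Hypothetical helper: clear the flag so several command lines can be tried. */
void reset_model(void)
{
	acpi_no_memhotplug = false;
}

/*
 * Hypothetical stand-in for boot parameter parsing: walk a space-separated
 * command line and set the flag for each trigger used by the posted patch.
 */
void parse_cmdline(const char *cmdline)
{
	char buf[256];
	char *tok;

	assert(cmdline != NULL);
	strncpy(buf, cmdline, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
		if (strcmp(tok, "acpi_no_memhotplug") == 0 ||
		    strncmp(tok, "mem=", 4) == 0 ||
		    strcmp(tok, "memmap=exactmap") == 0)
			set_acpi_no_memhotplug();
	}
}

/* Returns true if the memory hotplug scan handler would be registered. */
bool acpi_memory_hotplug_init_model(void)
{
	if (acpi_no_memhotplug)
		return false;	/* the patch returns before registering */
	return true;
}
```

In this model a command line such as "memmap=exactmap memmap=640K@0" leaves the handler unregistered, while "memmap=640K@0" alone does not. The implicit mem= and memmap=exactmap triggers are the part of this version that draws objections elsewhere in the thread; dropping them leaves only the first strcmp() test.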
(2014/01/14 10:41), Toshi Kani wrote:
> On Tue, 2014-01-14 at 10:11 +0900, Yasuaki Ishimatsu wrote:
>  :
>>>> I think we need a knob to manually enable mem-hotplug when memmap is
>>>> specified. But that is another story.
>>>>
>>>> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>
>>> As mentioned, self-NAK. I have seen a system where I needed to specify
>>> memmap=exactmap and that also had hotplug memory. I will only keep the
>>> acpi_no_memhotplug option in the next version of the patch.
>>
>> Your following first patch is simple and makes sense.
>>
>> http://marc.info/?l=linux-acpi&m=138922019607796&w=2
>
> This option also requires changing kexec-tools to specify the new
> option for kdump. It won't be simpler.

Hmm.
I use the memm= boot option and hotplug memory for memory hot-remove.
At least, the patch cannot be accepted.

Thanks,
Yasuaki Ishimatsu

On 01/13/2014 09:43 PM, Yasuaki Ishimatsu wrote:
> (2014/01/14 10:41), Toshi Kani wrote:
>> On Tue, 2014-01-14 at 10:11 +0900, Yasuaki Ishimatsu wrote:
>>  :
>>>>> I think we need a knob to manually enable mem-hotplug when memmap is
>>>>> specified. But that is another story.
>>>>>
>>>>> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>>
>>>> As mentioned, self-NAK. I have seen a system where I needed to specify
>>>> memmap=exactmap and that also had hotplug memory. I will only keep the
>>>> acpi_no_memhotplug option in the next version of the patch.
>>>
>>> Your following first patch is simple and makes sense.
>>>
>>> http://marc.info/?l=linux-acpi&m=138922019607796&w=2
>>
>> This option also requires changing kexec-tools to specify the new
>> option for kdump. It won't be simpler.
>
> Hmm.
> I use the memm= boot option and hotplug memory for memory hot-remove.
> At least, the patch cannot be accepted.

Thanks for the information, Yasuaki. I will resubmit my first patch, which
only adds the kernel parameter.

P.

On 01/13/2014 08:41 PM, Toshi Kani wrote:
> On Tue, 2014-01-14 at 10:11 +0900, Yasuaki Ishimatsu wrote:
> :
>>>> I think we need a knob to manually enable mem-hotplug when memmap is
>>>> specified. But it is another story.
>>>>
>>>> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>
>>> As mentioned, self-NAK. I have seen a system that I needed to specify
>>> memmap=exactmap & had hotplug memory. I will only keep the acpi_no_memhotplug
>>> option in the next version of the patch.
>>
>>
>> Your following first patch is simple and makes sense.
>>
>> http://marc.info/?l=linux-acpi&m=138922019607796&w=2
>>
>
> In this option, it also requires changing kexec-tools to specify the new
> option for kdump. It won't be simpler.

It will be simpler for the kernel and those of us who have to debug busted e820
maps ;)

Unfortunately I may not be able to give you the automatic disable. I did
contemplate adding a !is_kdump_kernel() to the ACPI memory hotplug init call,
but it seems like that is unacceptable as well.

P.

>
> Thanks,
> -Toshi
>
On Tue, Jan 14, 2014 at 06:05:15AM -0500, Prarit Bhargava wrote:
>
>
> On 01/13/2014 08:41 PM, Toshi Kani wrote:
> > On Tue, 2014-01-14 at 10:11 +0900, Yasuaki Ishimatsu wrote:
> > :
> >>>> I think we need a knob to manually enable mem-hotplug when memmap is
> >>>> specified. But it is another story.
> >>>>
> >>>> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> >>>
> >>> As mentioned, self-NAK. I have seen a system that I needed to specify
> >>> memmap=exactmap & had hotplug memory. I will only keep the acpi_no_memhotplug
> >>> option in the next version of the patch.
> >>
> >>
> >> Your following first patch is simple and makes sense.
> >>
> >> http://marc.info/?l=linux-acpi&m=138922019607796&w=2
> >>
> >
> > In this option, it also requires changing kexec-tools to specify the new
> > option for kdump. It won't be simpler.
>
> It will be simpler for the kernel and those of us who have to debug busted e820
> maps ;)
>
> Unfortunately I may not be able to give you the automatic disable. I did
> contemplate adding a !is_kdump_kernel() to the ACPI memory hotplug init call,
> but it seems like that is unacceptable as well.

Yep, I don't think the hotplug feature should be tied to it being a kdump kernel.

Thanks
Vivek
On Tue, 2014-01-14 at 11:43 +0900, Yasuaki Ishimatsu wrote:
> (2014/01/14 10:41), Toshi Kani wrote:
> > On Tue, 2014-01-14 at 10:11 +0900, Yasuaki Ishimatsu wrote:
> > :
> >>>> I think we need a knob to manually enable mem-hotplug when memmap is
> >>>> specified. But it is another story.
> >>>>
> >>>> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> >>>
> >>> As mentioned, self-NAK. I have seen a system that I needed to specify
> >>> memmap=exactmap & had hotplug memory. I will only keep the acpi_no_memhotplug
> >>> option in the next version of the patch.
> >>
> >>
> >> Your following first patch is simple and makes sense.
> >>
> >> http://marc.info/?l=linux-acpi&m=138922019607796&w=2
> >>
> >
> > In this option, it also requires changing kexec-tools to specify the new
> > option for kdump. It won't be simpler.
>
> Hmm.
> I use memm= boot option and hotplug memory for memory hot-remove.
> At least, the patch cannot be accepted.

Do you mean the mem=nG option? I am curious to know how memory hot-remove
works in this case. Doesn't acpi_memhotplug add all the memory ranges at
boot, and defeat the mem=nG option?

Thanks,
-Toshi
On Tue, Jan 14, 2014 at 06:05:15AM -0500, Prarit Bhargava wrote:

[..]
> >>> As mentioned, self-NAK. I have seen a system that I needed to specify
> >>> memmap=exactmap & had hotplug memory. I will only keep the acpi_no_memhotplug
> >>> option in the next version of the patch.
> >>
> >>
> >> Your following first patch is simple and makes sense.
> >>
> >> http://marc.info/?l=linux-acpi&m=138922019607796&w=2
> >>
> >
> > In this option, it also requires changing kexec-tools to specify the new
> > option for kdump. It won't be simpler.
>
> It will be simpler for the kernel and those of us who have to debug busted e820
> maps ;)
>
> Unfortunately I may not be able to give you the automatic disable. I did
> contemplate adding a !is_kdump_kernel() to the ACPI memory hotplug init call,
> but it seems like that is unacceptable as well.

I think everybody agrees that there has to be a stand-alone command line
option to disable memory hotplug.

Whether to tie it into memmap=exactmap and mem=X is the contentious bit.
So I would suggest just posting a patch to disable memory hotplug using
a command line option; later, more patches can go in if people strongly feel
the need to tie it into memmap=exactmap.

In the meantime, we will modify /etc/sysconfig/kdump to pass
acpi_no_memhotplug so that the user does not have to worry about passing this
parameter and kexec-tools will not have to be modified either.

Thanks
Vivek
On Tue, 2014-01-14 at 10:26 -0500, Vivek Goyal wrote:
> On Tue, Jan 14, 2014 at 06:05:15AM -0500, Prarit Bhargava wrote:
>
> [..]
> > >>> As mentioned, self-NAK. I have seen a system that I needed to specify
> > >>> memmap=exactmap & had hotplug memory. I will only keep the acpi_no_memhotplug
> > >>> option in the next version of the patch.
> > >>
> > >>
> > >> Your following first patch is simple and makes sense.
> > >>
> > >> http://marc.info/?l=linux-acpi&m=138922019607796&w=2
> > >>
> > >
> > > In this option, it also requires changing kexec-tools to specify the new
> > > option for kdump. It won't be simpler.
> >
> > It will be simpler for the kernel and those of us who have to debug busted e820
> > maps ;)
> >
> > Unfortunately I may not be able to give you the automatic disable. I did
> > contemplate adding a !is_kdump_kernel() to the ACPI memory hotplug init call,
> > but it seems like that is unacceptable as well.
>
> I think everybody agrees that there has to be a stand-alone command line
> option to disable memory hotplug.
>
> Whether to tie it into memmap=exactmap and mem=X is the contentious bit.
> So I would suggest just posting a patch to disable memory hotplug using
> a command line option; later, more patches can go in if people strongly feel
> the need to tie it into memmap=exactmap.
>
> In the meantime, we will modify /etc/sysconfig/kdump to pass
> acpi_no_memhotplug so that the user does not have to worry about passing this
> parameter and kexec-tools will not have to be modified either.

Fine by me. Thanks for modifying the /etc/sysconfig/kdump file.
-Toshi
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b9e9bd8..ea93f75 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -343,6 +343,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			no: ACPI OperationRegions are not marked as reserved,
 			no further checks are performed.
 
+	acpi_no_memhotplug [ACPI] Disable memory hotplug. Useful for kexec
+			   and kdump kernels.
+
 	add_efi_memmap	[EFI; X86] Include EFI memory map in
 			kernel's map of available physical RAM.
 
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 174da5f..3c431fe 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -20,6 +20,7 @@
 #include <linux/firmware-map.h>
 #include <linux/memblock.h>
 #include <linux/sort.h>
+#include <linux/memory_hotplug.h>
 
 #include <asm/e820.h>
 #include <asm/proto.h>
@@ -834,6 +835,8 @@ static int __init parse_memopt(char *p)
 		return -EINVAL;
 	e820_remove_range(mem_size, ULLONG_MAX - mem_size, E820_RAM, 1);
 
+	set_acpi_no_memhotplug();
+
 	return 0;
 }
 early_param("mem", parse_memopt);
@@ -857,6 +860,7 @@ static int __init parse_memmap_one(char *p)
 #endif
 		e820.nr_map = 0;
 		userdef = 1;
+		set_acpi_no_memhotplug();
 		return 0;
 	}
 
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 551dad7..d104a7d 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -361,7 +361,25 @@ static void acpi_memory_device_remove(struct acpi_device *device)
 	acpi_memory_device_free(mem_device);
 }
 
+static bool acpi_no_memhotplug;
+
+void set_acpi_no_memhotplug(void)
+{
+	acpi_no_memhotplug = true;
+	pr_info_once("ACPI: Memory Hotplug Disabled\n");
+}
+
 void __init acpi_memory_hotplug_init(void)
 {
+	if (acpi_no_memhotplug)
+		return;
+
 	acpi_scan_add_handler_with_hotplug(&memory_device_handler, "memory");
 }
+
+static int __init disable_acpi_memory_hotplug(char *str)
+{
+	set_acpi_no_memhotplug();
+	return 1;
+}
+__setup("acpi_no_memhotplug", disable_acpi_memory_hotplug);
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 4ca3d95..3cdb6e0 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -12,6 +12,15 @@ struct pglist_data;
 struct mem_section;
 struct memory_block;
 
+#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
+/* set flag to disable ACPI memory hotplug */
+extern void set_acpi_no_memhotplug(void);
+#else
+static inline void set_acpi_no_memhotplug(void)
+{
+}
+#endif
+
 #ifdef CONFIG_MEMORY_HOTPLUG
 
 /*
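To illustrate the control flow of the diff above, here is a minimal userspace sketch in C. It is not kernel code: it only models the single flag that mem=, memmap=exactmap and acpi_no_memhotplug all set in this version of the patch, and the early return in acpi_memory_hotplug_init(). The helper names memhotplug_disabled_by() and model_memory_hotplug_init() are hypothetical, and the token matching is a simplification of the kernel's real parameter parsing.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace stand-in for the flag added to drivers/acpi/acpi_memhotplug.c. */
static bool acpi_no_memhotplug;

void set_acpi_no_memhotplug(void)
{
	acpi_no_memhotplug = true;
}

/*
 * Hypothetical helper (not part of the patch): scan a space-separated
 * command line and set the flag for the three parameters this version of
 * the patch hooks. Returns the resulting flag value.
 */
bool memhotplug_disabled_by(const char *cmdline)
{
	char buf[256];
	char *tok;

	assert(cmdline != NULL);
	acpi_no_memhotplug = false;

	/* strtok() modifies its input, so work on a bounded local copy. */
	strncpy(buf, cmdline, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
		if (strcmp(tok, "acpi_no_memhotplug") == 0 ||
		    strcmp(tok, "memmap=exactmap") == 0 ||
		    strncmp(tok, "mem=", 4) == 0)
			set_acpi_no_memhotplug();
	}
	return acpi_no_memhotplug;
}

/*
 * Model of acpi_memory_hotplug_init(): returns true when the "memory"
 * scan handler would be registered, false when the flag suppresses it.
 */
bool model_memory_hotplug_init(void)
{
	if (acpi_no_memhotplug)
		return false;
	return true;
}
```

In this model a plain memmap=<size>@<addr> entry leaves hotplug enabled; only the exactmap form, mem=, or the explicit parameter suppresses the handler. Later in the thread the mem=/memmap=exactmap coupling is withdrawn, which in this sketch would amount to keeping only the acpi_no_memhotplug comparison.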
When booting a kexec/kdump kernel on a system that has specific memory hotplug
regions, the boot will fail with warnings like:

[ 2.939467] swapper/0: page allocation failure: order:9, mode:0x84d0
[ 2.946564] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-65.el7.x86_64 #1
[ 2.954532] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011
[ 2.964926] 0000000000000000 ffff8800341bd8c8 ffffffff815bcc67 ffff8800341bd950
[ 2.973224] ffffffff8113b1a0 ffff880036339b00 0000000000000009 00000000000084d0
[ 2.981523] ffff8800341bd950 ffffffff815b87ee 0000000000000000 0000000000000200
[ 2.989821] Call Trace:
[ 2.992560] [<ffffffff815bcc67>] dump_stack+0x19/0x1b
[ 2.998300] [<ffffffff8113b1a0>] warn_alloc_failed+0xf0/0x160
[ 3.004817] [<ffffffff815b87ee>] ? __alloc_pages_direct_compact+0xac/0x196
[ 3.012594] [<ffffffff8113f14f>] __alloc_pages_nodemask+0x7ff/0xa00
[ 3.019692] [<ffffffff815b417c>] vmemmap_alloc_block+0x62/0xba
[ 3.026303] [<ffffffff815b41e9>] vmemmap_alloc_block_buf+0x15/0x3b
[ 3.033302] [<ffffffff815b1ff6>] vmemmap_populate+0xb4/0x21b
[ 3.039718] [<ffffffff815b461d>] sparse_mem_map_populate+0x27/0x35
[ 3.046717] [<ffffffff815b400f>] sparse_add_one_section+0x7a/0x185
[ 3.053720] [<ffffffff815a1e9f>] __add_pages+0xaf/0x240
[ 3.059656] [<ffffffff81047359>] arch_add_memory+0x59/0xd0
[ 3.065877] [<ffffffff815a21d9>] add_memory+0xb9/0x1b0
[ 3.071713] [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d
[ 3.078813] [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd
[ 3.085719] [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f
[ 3.092716] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
[ 3.100004] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
[ 3.107293] [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5
[ 3.113904] [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d
[ 3.119933] [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160
[ 3.126153] [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6
[ 3.131987] [<ffffffff81a1fd58>] ? acpi_sleep_proc_init+0x2a/0x2a
[ 3.138889] [<ffffffff810020e2>] do_one_initcall+0xe2/0x190
[ 3.145210] [<ffffffff819e20c4>] kernel_init_freeable+0x17c/0x207
[ 3.152111] [<ffffffff819e18d0>] ? do_early_param+0x88/0x88
[ 3.158430] [<ffffffff8159fea0>] ? rest_init+0x80/0x80
[ 3.164264] [<ffffffff8159feae>] kernel_init+0xe/0x180
[ 3.170097] [<ffffffff815cca2c>] ret_from_fork+0x7c/0xb0
[ 3.176123] [<ffffffff8159fea0>] ? rest_init+0x80/0x80
[ 3.181956] Mem-Info:
[ 3.184490] Node 0 DMA per-cpu:
[ 3.188007] CPU 0: hi: 0, btch: 1 usd: 0
[ 3.193353] Node 0 DMA32 per-cpu:
[ 3.197060] CPU 0: hi: 42, btch: 7 usd: 0
[ 3.202410] active_anon:0 inactive_anon:0 isolated_anon:0
[ 3.202410] active_file:0 inactive_file:0 isolated_file:0
[ 3.202410] unevictable:0 dirty:0 writeback:0 unstable:0
[ 3.202410] free:872 slab_reclaimable:13 slab_unreclaimable:1880
[ 3.202410] mapped:0 shmem:0 pagetables:0 bounce:0
[ 3.202410] free_cma:0

because the system has run out of memory at boot time. This occurs
because of the following sequence in the boot:

The main kernel boots and sets up the E820 map. The second kernel is booted
with a map generated by the kdump service using memmap= and memmap=exactmap.
These parameters are added to the kernel parameters of the kexec/kdump
kernel. The kexec/kdump kernel has limited memory resources so as not
to severely impact the main kernel.

The system then panics and the kdump/kexec kernel boots (which is a
completely new kernel boot). During this boot ACPI is initialized and the
kernel (as can be seen above) traverses the ACPI namespace and finds an
entry for a memory device to be hotadded.

ie)

[ 3.053720] [<ffffffff815a1e9f>] __add_pages+0xaf/0x240
[ 3.059656] [<ffffffff81047359>] arch_add_memory+0x59/0xd0
[ 3.065877] [<ffffffff815a21d9>] add_memory+0xb9/0x1b0
[ 3.071713] [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d
[ 3.078813] [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd
[ 3.085719] [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f
[ 3.092716] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
[ 3.100004] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
[ 3.107293] [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5
[ 3.113904] [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d
[ 3.119933] [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160
[ 3.126153] [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6

At this point the kernel adds page table information and the kexec/kdump
kernel runs out of memory.

This can also be reproduced with a "regular" kernel by using the
memmap=exactmap and mem=X parameters on the main kernel and booting.

This patchset resolves the problem by adding a kernel parameter,
acpi_no_memhotplug, to disable ACPI memory hotplug. ACPI memory hotplug
should also be disabled by default when a user specifies a memory mapping with
"memmap=exactmap" or "mem=X".

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Len Brown <lenb@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Linn Crosetto <linn@hp.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: kosaki.motohiro@gmail.com
Cc: dyoung@redhat.com
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: linux-acpi@vger.kernel.org
Cc: linux-mm@kvack.org
---
 Documentation/kernel-parameters.txt |  3 +++
 arch/x86/kernel/e820.c              |  4 ++++
 drivers/acpi/acpi_memhotplug.c      | 18 ++++++++++++++++++
 include/linux/memory_hotplug.h      |  9 +++++++++
 4 files changed, 34 insertions(+)
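
As a rough check on why the trace reports an order:9 failure in vmemmap_alloc_block(), the C sketch below estimates the memmap cost of hot-adding memory. It assumes 4 KiB pages, a 64-byte struct page and 128 MiB sparsemem sections; these are typical x86_64 values but are not stated anywhere in this thread, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES        4096ULL         /* assumed base page size */
#define STRUCT_PAGE_BYTES 64ULL           /* assumed sizeof(struct page) */
#define SECTION_BYTES     (128ULL << 20)  /* assumed sparsemem section size */

/* Size in bytes of a single buddy allocation of the given order. */
uint64_t order_to_bytes(unsigned int order)
{
	return PAGE_BYTES << order;
}

/*
 * Bytes of struct page array needed to describe hotadd_bytes of new
 * memory, rounded up to whole sections.
 */
uint64_t memmap_bytes_for(uint64_t hotadd_bytes)
{
	uint64_t sections;
	uint64_t pages_per_section = SECTION_BYTES / PAGE_BYTES;

	assert(hotadd_bytes > 0);
	sections = (hotadd_bytes + SECTION_BYTES - 1) / SECTION_BYTES;
	return sections * pages_per_section * STRUCT_PAGE_BYTES;
}
```

Under these assumptions each hot-added section needs one 2 MiB block for its struct page array, which is exactly an order-9 allocation, and hot-adding 1 GiB costs about 16 MiB. The Mem-Info dump in the commit message reports free:872, roughly 3.4 MiB in 4 KiB pages, so a crash kernel in that state cannot satisfy even a few sections' worth of memmap.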