Message ID | 20200430102908.10107-4-david@redhat.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Series | mm/memory_hotplug: Allow to not create firmware memmap entries | expand |
On 4/30/20 3:29 AM, David Hildenbrand wrote: > Currently, when adding memory, we create entries in /sys/firmware/memmap/ > as "System RAM". This does not reflect the reality and will lead to > kexec-tools to add that memory to the fixed-up initial memmap for a > kexec kernel (loaded via kexec_load()). The memory will be considered > initial System RAM by the kexec kernel. > > We should let the kexec kernel decide how to use that memory - just as > we do during an ordinary reboot. ... > - rc = add_memory(numa_node, new_res->start, resource_size(new_res), 0); > + rc = add_memory(numa_node, new_res->start, resource_size(new_res), > + MHP_NO_FIRMWARE_MEMMAP); Looks fine. But, if you send another revision, could you add a comment about the actual goal of MHP_NO_FIRMWARE_MEMMAP? Maybe: /* * MHP_NO_FIRMWARE_MEMMAP ensures that future * kexec'd kernels will not treat this as RAM. */ Not a biggie, though. Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
On 30.04.20 13:23, Dave Hansen wrote: > On 4/30/20 3:29 AM, David Hildenbrand wrote: >> Currently, when adding memory, we create entries in /sys/firmware/memmap/ >> as "System RAM". This does not reflect the reality and will lead to >> kexec-tools to add that memory to the fixed-up initial memmap for a >> kexec kernel (loaded via kexec_load()). The memory will be considered >> initial System RAM by the kexec kernel. >> >> We should let the kexec kernel decide how to use that memory - just as >> we do during an ordinary reboot. > ... >> - rc = add_memory(numa_node, new_res->start, resource_size(new_res), 0); >> + rc = add_memory(numa_node, new_res->start, resource_size(new_res), >> + MHP_NO_FIRMWARE_MEMMAP); > > Looks fine. But, if you send another revision, could you add a comment > about the actual goal of MHP_NO_FIRMWARE_MEMMAP? Maybe: > > /* > * MHP_NO_FIRMWARE_MEMMAP ensures that future > * kexec'd kernels will not treat this as RAM. > */ > > Not a biggie, though. Sure, maybe Andrew can fixup when applying (if no resend is necessary). Thanks Dave! > > Acked-by: Dave Hansen <dave.hansen@linux.intel.com> >
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c index e159184e0ba0..929823a79816 100644 --- a/drivers/dax/kmem.c +++ b/drivers/dax/kmem.c @@ -65,7 +65,8 @@ int dev_dax_kmem_probe(struct device *dev) new_res->flags = IORESOURCE_SYSTEM_RAM; new_res->name = dev_name(dev); - rc = add_memory(numa_node, new_res->start, resource_size(new_res), 0); + rc = add_memory(numa_node, new_res->start, resource_size(new_res), + MHP_NO_FIRMWARE_MEMMAP); if (rc) { release_resource(new_res); kfree(new_res);
Currently, when adding memory, we create entries in /sys/firmware/memmap/ as "System RAM". This does not reflect the reality and will lead to kexec-tools to add that memory to the fixed-up initial memmap for a kexec kernel (loaded via kexec_load()). The memory will be considered initial System RAM by the kexec kernel. We should let the kexec kernel decide how to use that memory - just as we do during an ordinary reboot. Before configuring the namespace: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-33fffffff : namespace0.0 3280000000-32ffffffff : PCI Bus 0000:00 After configuring the namespace: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 148200000-33fffffff : dax0.0 3280000000-32ffffffff : PCI Bus 0000:00 After loading kmem: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 150000000-33fffffff : dax0.0 150000000-33fffffff : System RAM 3280000000-32ffffffff : PCI Bus 0000:00 After a proper reboot: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 148200000-33fffffff : dax0.0 3280000000-32ffffffff : PCI Bus 0000:00 Within the kexec kernel before this change: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 150000000-33fffffff : System RAM 3280000000-32ffffffff : PCI Bus 0000:00 Within the kexec kernel after this change: [root@localhost ~]# cat /proc/iomem ... 140000000-33fffffff : Persistent Memory 140000000-1481fffff : namespace0.0 148200000-33fffffff : dax0.0 3280000000-32ffffffff : PCI Bus 0000:00 /sys/firmware/memmap/ before this change: 0000000000000000-000000000009fc00 (System RAM) 000000000009fc00-00000000000a0000 (Reserved) 00000000000f0000-0000000000100000 (Reserved) 0000000000100000-00000000bffdf000 (System RAM) 00000000bffdf000-00000000c0000000 (Reserved) 00000000feffc000-00000000ff000000 (Reserved) 00000000fffc0000-0000000100000000 (Reserved) 0000000100000000-0000000140000000 (System RAM) 0000000150000000-0000000340000000 (System RAM) /sys/firmware/memmap/ after a proper reboot: 0000000000000000-000000000009fc00 (System RAM) 000000000009fc00-00000000000a0000 (Reserved) 00000000000f0000-0000000000100000 (Reserved) 0000000000100000-00000000bffdf000 (System RAM) 00000000bffdf000-00000000c0000000 (Reserved) 00000000feffc000-00000000ff000000 (Reserved) 00000000fffc0000-0000000100000000 (Reserved) 0000000100000000-0000000140000000 (System RAM) /sys/firmware/memmap/ after this change: 0000000000000000-000000000009fc00 (System RAM) 000000000009fc00-00000000000a0000 (Reserved) 00000000000f0000-0000000000100000 (Reserved) 0000000000100000-00000000bffdf000 (System RAM) 00000000bffdf000-00000000c0000000 (Reserved) 00000000feffc000-00000000ff000000 (Reserved) 00000000fffc0000-0000000100000000 (Reserved) 0000000100000000-0000000140000000 (System RAM) kexec-tools already seem to basically ignore any System RAM that's not on top level when searching for areas to place kexec images - but also for determining crash areas to dump via kdump. This behavior is not changed by this patch. kexec-tools probably has to be fixed to also include this memory in system dumps. Note: kexec_file_load() does the right thing already within the kernel. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: David Hildenbrand <david@redhat.com> --- drivers/dax/kmem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)