Message ID | 20181226133351.106676005@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | PMEM NUMA node and hotness accounting/migration |
On Wed, Dec 26, 2018 at 09:14:47PM +0800, Fengguang Wu wrote:
> From: Fan Du <fan.du@intel.com>
>
> This is a hack to enumerate PMEM as NUMA nodes.
> It's necessary for current BIOS that don't yet fill ACPI HMAT table.
>
> WARNING: take care to backup. It is mutual exclusive with libnvdimm
> subsystem and can destroy ndctl managed namespaces.

Why depend on firmware to present this "correctly"? It seems to me like
less effort all around to have ndctl label some namespaces as being for
this kind of use.

On Wed, Dec 26, 2018 at 07:41:41PM -0800, Matthew Wilcox wrote:
>On Wed, Dec 26, 2018 at 09:14:47PM +0800, Fengguang Wu wrote:
>> From: Fan Du <fan.du@intel.com>
>>
>> This is a hack to enumerate PMEM as NUMA nodes.
>> It's necessary for current BIOS that don't yet fill ACPI HMAT table.
>>
>> WARNING: take care to backup. It is mutual exclusive with libnvdimm
>> subsystem and can destroy ndctl managed namespaces.
>
>Why depend on firmware to present this "correctly"? It seems to me like
>less effort all around to have ndctl label some namespaces as being for
>this kind of use.

Dave Hansen may be more suitable to answer your question. He posted
patches to make PMEM NUMA node coexist with libnvdimm and ndctl:

[PATCH 0/9] Allow persistent memory to be used like normal RAM
https://lkml.org/lkml/2018/10/23/9

That depends on future BIOS. So we did this quick hack to test out
PMEM NUMA node for the existing BIOS.

Thanks,
Fengguang

On Wed, Dec 26, 2018 at 8:11 PM Fengguang Wu <fengguang.wu@intel.com> wrote:
>
> On Wed, Dec 26, 2018 at 07:41:41PM -0800, Matthew Wilcox wrote:
> >On Wed, Dec 26, 2018 at 09:14:47PM +0800, Fengguang Wu wrote:
> >> From: Fan Du <fan.du@intel.com>
> >>
> >> This is a hack to enumerate PMEM as NUMA nodes.
> >> It's necessary for current BIOS that don't yet fill ACPI HMAT table.
> >>
> >> WARNING: take care to backup. It is mutual exclusive with libnvdimm
> >> subsystem and can destroy ndctl managed namespaces.
> >
> >Why depend on firmware to present this "correctly"? It seems to me like
> >less effort all around to have ndctl label some namespaces as being for
> >this kind of use.
>
> Dave Hansen may be more suitable to answer your question. He posted
> patches to make PMEM NUMA node coexist with libnvdimm and ndctl:
>
> [PATCH 0/9] Allow persistent memory to be used like normal RAM
> https://lkml.org/lkml/2018/10/23/9
>
> That depends on future BIOS. So we did this quick hack to test out
> PMEM NUMA node for the existing BIOS.

No, it does not depend on a future BIOS.

Willy, have a look here [1], here [2], and here [3] for the
work-in-progress ndctl takeover approach (actually 'daxctl' in this
case).

[1]: https://lkml.org/lkml/2018/10/23/9
[2]: https://lkml.org/lkml/2018/10/31/243
[3]: https://lists.01.org/pipermail/linux-nvdimm/2018-November/018677.html

On Wed, Dec 26, 2018 at 9:13 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Wed, Dec 26, 2018 at 8:11 PM Fengguang Wu <fengguang.wu@intel.com> wrote:
> >
> > On Wed, Dec 26, 2018 at 07:41:41PM -0800, Matthew Wilcox wrote:
> > >On Wed, Dec 26, 2018 at 09:14:47PM +0800, Fengguang Wu wrote:
> > >> From: Fan Du <fan.du@intel.com>
> > >>
> > >> This is a hack to enumerate PMEM as NUMA nodes.
> > >> It's necessary for current BIOS that don't yet fill ACPI HMAT table.
> > >>
> > >> WARNING: take care to backup. It is mutual exclusive with libnvdimm
> > >> subsystem and can destroy ndctl managed namespaces.
> > >
> > >Why depend on firmware to present this "correctly"? It seems to me like
> > >less effort all around to have ndctl label some namespaces as being for
> > >this kind of use.
> >
> > Dave Hansen may be more suitable to answer your question. He posted
> > patches to make PMEM NUMA node coexist with libnvdimm and ndctl:
> >
> > [PATCH 0/9] Allow persistent memory to be used like normal RAM
> > https://lkml.org/lkml/2018/10/23/9
> >
> > That depends on future BIOS. So we did this quick hack to test out
> > PMEM NUMA node for the existing BIOS.
>
> No, it does not depend on a future BIOS.

It is correct. We already have Dave's patches + Dan's patch (added
target_node field) work on our machine which has SRAT.

Thanks,
Yang

>
> Willy, have a look here [1], here [2], and here [3] for the
> work-in-progress ndctl takeover approach (actually 'daxctl' in this
> case).
>
> [1]: https://lkml.org/lkml/2018/10/23/9
> [2]: https://lkml.org/lkml/2018/10/31/243
> [3]: https://lists.01.org/pipermail/linux-nvdimm/2018-November/018677.html
>

On Thu, Dec 27, 2018 at 11:32:06AM -0800, Yang Shi wrote:
>On Wed, Dec 26, 2018 at 9:13 PM Dan Williams <dan.j.williams@intel.com> wrote:
>>
>> On Wed, Dec 26, 2018 at 8:11 PM Fengguang Wu <fengguang.wu@intel.com> wrote:
>> >
>> > On Wed, Dec 26, 2018 at 07:41:41PM -0800, Matthew Wilcox wrote:
>> > >On Wed, Dec 26, 2018 at 09:14:47PM +0800, Fengguang Wu wrote:
>> > >> From: Fan Du <fan.du@intel.com>
>> > >>
>> > >> This is a hack to enumerate PMEM as NUMA nodes.
>> > >> It's necessary for current BIOS that don't yet fill ACPI HMAT table.
>> > >>
>> > >> WARNING: take care to backup. It is mutual exclusive with libnvdimm
>> > >> subsystem and can destroy ndctl managed namespaces.
>> > >
>> > >Why depend on firmware to present this "correctly"? It seems to me like
>> > >less effort all around to have ndctl label some namespaces as being for
>> > >this kind of use.
>> >
>> > Dave Hansen may be more suitable to answer your question. He posted
>> > patches to make PMEM NUMA node coexist with libnvdimm and ndctl:
>> >
>> > [PATCH 0/9] Allow persistent memory to be used like normal RAM
>> > https://lkml.org/lkml/2018/10/23/9
>> >
>> > That depends on future BIOS. So we did this quick hack to test out
>> > PMEM NUMA node for the existing BIOS.
>>
>> No, it does not depend on a future BIOS.
>
>It is correct. We already have Dave's patches + Dan's patch (added
>target_node field) work on our machine which has SRAT.

Thanks for the correction. It looks my perception was out of date.
So we can follow Dave+Dan's patches to create the PMEM NUMA nodes.

Thanks,
Fengguang

>>
>> Willy, have a look here [1], here [2], and here [3] for the
>> work-in-progress ndctl takeover approach (actually 'daxctl' in this
>> case).
>>
>> [1]: https://lkml.org/lkml/2018/10/23/9
>> [2]: https://lkml.org/lkml/2018/10/31/243
>> [3]: https://lists.01.org/pipermail/linux-nvdimm/2018-November/018677.html
>>
>

--- linux.orig/arch/x86/kernel/e820.c	2018-12-23 19:20:34.587078783 +0800
+++ linux/arch/x86/kernel/e820.c	2018-12-23 19:20:34.587078783 +0800
@@ -403,7 +403,8 @@ static int __init __append_e820_table(st
 		/* Ignore the entry on 64-bit overflow: */
 		if (start > end && likely(size))
 			return -1;
-
+		if (type == E820_TYPE_PMEM)
+			type = E820_TYPE_RAM;
 		e820__range_add(start, size, type);
 
 		entry++;
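
For anyone who wants to see what the hunk above does without booting a patched kernel, here is a minimal userspace sketch of the same relabelling. It is not kernel code. The two enum values are, as far as I know, the ones the kernel uses for E820_TYPE_RAM and E820_TYPE_PMEM, but struct e820_entry as declared here and the helper append_entries_pmem_as_ram() are hypothetical stand-ins written for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values are meant to mirror the kernel's enum e820_type; everything
 * else in this file is a hypothetical userspace stand-in. */
enum e820_type {
	E820_TYPE_RAM  = 1,
	E820_TYPE_PMEM = 7,
};

/* Simplified firmware memory map entry (assumed layout). */
struct e820_entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/*
 * Copy nr_entries entries from in[] to out[], relabelling PMEM ranges
 * as RAM the way the hack does. Returns the number of entries copied,
 * or -1 if a range wraps past the end of the 64-bit address space.
 */
static int append_entries_pmem_as_ram(const struct e820_entry *in,
				      size_t nr_entries,
				      struct e820_entry *out)
{
	size_t i;

	assert(in != NULL && out != NULL);

	for (i = 0; i < nr_entries; i++) {
		uint64_t start = in[i].addr;
		uint64_t size  = in[i].size;
		uint64_t end   = start + size - 1;
		uint32_t type  = in[i].type;

		/* Same overflow check as in the patched loop. */
		if (start > end && size)
			return -1;

		/* The hack: treat persistent memory as ordinary RAM. */
		if (type == E820_TYPE_PMEM)
			type = E820_TYPE_RAM;

		out[i].addr = start;
		out[i].size = size;
		out[i].type = type;
	}

	return (int)i;
}
```

Feeding it one RAM entry and one PMEM entry should give back two entries that are both typed as RAM, which is why the range stops being visible to libnvdimm as persistent memory once the kernel has taken this path.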