[RFC,0/8] arm64/memblock: Handling of CXL Fixed Memory Windows.

Message ID 20240529171236.32002-1-Jonathan.Cameron@huawei.com

Jonathan Cameron May 29, 2024, 5:12 p.m. UTC
RFC because
- I'm relying heavily on comments people made, back when Dan proposed
  generic memblock-based tracking of NUMA nodes, that his approach
  was fine for arm64 (but not for other architectures).
- Final patch.  I'm hoping someone will explain why the hot remove path
  removes reserved memblocks. That currently breaks this approach as we
  can't re-add memory at an address where we previously removed some.
- I'm not particularly confident in this area so this might be
  stupidly broken in a way I've not considered.
  
On x86, CXL Fixed Memory Windows, as described by the CFMW Structures in
the ACPI CEDT table, either result in a new NUMA node being assigned or
extend an existing NUMA node. Unlike ACPI memory hotplug (where the PXM
is included in the signalling and whatever SRAT told us is largely
ignored), CXL NUMA node assignment is based on static data available
early in boot.  Note that whilst they define a range of physical memory,
until we program various address decoders + hotplug relevant devices
there is no memory there.
The wrinkle is that the firmware may well have configured some CXL
memory and be presenting it as normal system memory (in appropriate
firmware tables etc).

Unfortunately, despite using some nice general-sounding functions, the
solution is somewhat x86 specific. This series is a first attempt to
support NUMA nodes for CXL memory on arm64.  Note I tried or
considered a few different approaches:
- A new MEMBLOCK flag to indicate that the memblock was just for
  NUMA mappings. That turned out to be fiddly as a lot of places
  needed modifying.
- Adding completely separate handling of CFMWS entries.
  That means handling them completely differently to SRAT entries.
- Reparse the CEDT table at time of hotplug and figure out which
  node to use based on something like normal NUMA nodes, + number
  of CEDT CFMWS entry.  This solution looked likely to be messy
  and may be fragile.
So, not seeing a way forwards, I asked on the monthly CXL open
source sync call...

Dan Williams pointed out that a similar discussion was had a few years
ago, but a memblock approach was rejected because only arm64 uses
memblocks as the single source of information about NUMA nodes for
memory. Given I'm looking at arm64 that sounded perfect.
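
For anyone unfamiliar with the idea, here is a rough userspace sketch of
what "memblocks as the source of NUMA node information" could look like.
It is not the code from the patches. The struct layout, the fixed-size
tables and the names table_add(), table_find_nid() and
model_phys_addr_to_target_node() are all made up for illustration, and
the order of the lookup (memory regions first, then reserved regions) is
an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_NODE (-1)
#define MAX_REGIONS  16   /* arbitrary limit for this model */

/* Hypothetical stand-in for a memblock region carrying a node id. */
struct region {
	uint64_t base;
	uint64_t size;
	int nid;
};

/* Hypothetical stand-in for memblock.memory / memblock.reserved. */
struct region_table {
	size_t cnt;
	struct region regions[MAX_REGIONS];
};

/* Append a region; the model does no merging or sorting. */
static void table_add(struct region_table *t, uint64_t base,
		      uint64_t size, int nid)
{
	assert(t->cnt < MAX_REGIONS);
	t->regions[t->cnt++] = (struct region){ base, size, nid };
}

/* Return the nid of the first region containing addr, if any. */
static int table_find_nid(const struct region_table *t, uint64_t addr)
{
	for (size_t i = 0; i < t->cnt; i++) {
		const struct region *r = &t->regions[i];

		/* Written as a subtraction so base + size cannot overflow. */
		if (addr >= r->base && addr - r->base < r->size)
			return r->nid;
	}
	return NUMA_NO_NODE;
}

/*
 * Model of an address to node lookup: present memory is checked first,
 * then reserved ranges that only describe where memory may appear later.
 */
static int model_phys_addr_to_target_node(const struct region_table *memory,
					  const struct region_table *reserved,
					  uint64_t addr)
{
	int nid = table_find_nid(memory, addr);

	if (nid != NUMA_NO_NODE)
		return nid;
	return table_find_nid(reserved, addr);
}
```

In this model a range that has no memory behind it yet can still answer
"which node?" as long as a reserved entry with a nid covers it. An
address covered by neither table gets NUMA_NO_NODE.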

[PATCH v2 00/22] device-dax: Support sub-dividing soft-reserved ranges
https://lore.kernel.org/linux-mm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/

This series leverages two of Dan's patches with minor tweaks.
Very kind of Dan to write nice patches for arm64 support so
I've kept the original authorship as my changes were mainly code
movement.

The remainder of the series deals with the differences between CFMWS
address ranges and soft reserved ones - primarily that there is not necessarily
anything in the EFI memory map or similar so we need to add an entry.

The solution is a little ugly and this isn't an area of the kernel I know
at all well, so I'd love to hear suggestions of a better way to do this!

As I don't have an arm64 system that does the mixture of firmware setup
CXL memory and additional hotplugged memory dealt with by the OS, those
code paths were tested by a dirty hack to create overlapping memblocks.

Dan Williams (2):
  arm64: numa: Introduce a memory_add_physaddr_to_nid()
  arm64: memblock: Introduce a generic phys_addr_to_target_node()

Jonathan Cameron (6):
  mm: memblock: Add a means to add to memblock.reserved
  arch_numa: Avoid onlining empty NUMA nodes
  arch_numa: Make numa_add_memblk() set nid for memblock.reserved
    regions
  arm64: mm: numa_fill_memblks() to add a memblock.reserved region if
    match.
  acpi: srat: cxl: Skip zero length CXL fixed memory windows.
  HACK: mm: memory_hotplug: Drop memblock_phys_free() call in
    try_remove_memory()

 arch/arm64/include/asm/sparsemem.h |  8 ++++
 arch/arm64/mm/init.c               | 77 ++++++++++++++++++++++++++++++
 drivers/acpi/numa/srat.c           |  5 ++
 drivers/base/arch_numa.c           | 12 +++++
 include/linux/memblock.h           | 10 ++++
 include/linux/mm.h                 | 14 ++++++
 mm/memblock.c                      | 33 +++++++++++--
 mm/memory_hotplug.c                |  2 +-
 mm/mm_init.c                       | 29 ++++++++++-
 9 files changed, 185 insertions(+), 5 deletions(-)