[RFC] cxl/region: No need to create dax devices when regions are reserved

Message ID 20241116141218.11945-1-engguopeng@buaa.edu.cn
State New

Commit Message

peng guo Nov. 16, 2024, 2:12 p.m. UTC
When the kernel is booted with 'efi=nosoftreserve memmap=32G$34G', the CXL
memory is treated as conventional memory. The intention is to reserve all
of the CXL memory with memmap: 34G is the starting physical address of the
CXL memory and 32G is its size. However, as the resource tree below shows,
a dax device is still created for this memory region and the memory is
used as System RAM:

        100000000-87effffff : System RAM
        87f000000-67ffffffff : Reserved
          880000000-67ffffffff : CXL Window 0
                880000000-107fffffff : region0
                  880000000-107fffffff : dax0.0
                        880000000-107fffffff : System RAM (kmem)

The NUMA node information is as follows. The CXL memory is node 2 and all
of its memory is online:
        node 0 size: 31656 MB
        node 0 free: 29751 MB
        ...
        node 1 size: 32206 MB
        node 1 free: 29445 MB
        node 2 cpus:
        node 2 size: 32768 MB
        node 2 free: 32768 MB

When the kernel is booted with 'efi=nosoftreserve memmap=32G$35G', which
reserves CXL memory starting at physical address 35G, the resource tree is
as follows. No dax device is created and the reserved range is as
expected:

        87f000000-87fffffff : Reserved
        880000000-67ffffffff : CXL Window 0
          880000000-8bfffffff : System RAM
          8c0000000-10bfffffff : Reserved

The NUMA node information is as follows. Node 2 only uses the 1G of
unreserved memory:
        node 0 size: 31109 MB
        node 0 free: 29364 MB
        ...
        node 1 size: 32206 MB
        node 1 free: 29736 MB
        node 2 cpus:
        node 2 size: 1007 MB
        node 2 free: 945 MB

The above shows that the memmap option takes effect when only part of the
CXL memory is reserved, but not when all of it is reserved.
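
The address ranges in the two resource trees follow from the memmap
arguments. A minimal sketch of the arithmetic, assuming 'G' means GiB
(2^30 bytes) and inclusive end addresses; struct range_incl, gib() and
memmap_reserved() are hypothetical names used only for illustration, not
kernel code:

```c
#include <stdint.h>

/* Hypothetical helper type: an address range with inclusive bounds. */
struct range_incl {
	uint64_t start;
	uint64_t end;
};

/* Convert a count of GiB to bytes (assumes 'G' == 2^30). */
static uint64_t gib(uint64_t n)
{
	return n << 30;
}

/* Range described by memmap=<size>G$<start>G. */
static struct range_incl memmap_reserved(uint64_t size_gib, uint64_t start_gib)
{
	struct range_incl r = {
		.start = gib(start_gib),
		.end = gib(start_gib) + gib(size_gib) - 1,
	};

	return r;
}
```

Under these assumptions, memmap_reserved(32, 34) yields
880000000-107fffffff, the range shown for region0, and
memmap_reserved(32, 35) yields 8c0000000-10bfffffff, the Reserved range in
the second tree.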

Currently, cxl_region_probe() only checks whether part of the CXL memory
is already online as 'System RAM'. When all of the memory is reserved, no
portion of the CXL memory is in use as system memory, so the check passes,
the dax device is created as usual, and the region is managed by CXL.

        if (walk_iomem_res_desc(IORES_DESC_NONE,
                        IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
                        p->res->start, p->res->end, cxlr,
                        is_system_ram) > 0)
                return 0;

Add a range check for reserved resources. If a reserved resource covers
the whole region, skip creation of the dax device. If the reserved
resources do not cover the whole region, part of the CXL memory will be in
use as system memory and the existing 'System RAM' check that follows will
handle it.
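
The coverage test behind the new check can be sketched in userspace C.
This is a minimal illustration, not the kernel code: struct span and
span_covers() are hypothetical names standing in for struct resource and
the comparison made in is_reserve_ram(), and both bounds are assumed to be
inclusive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct resource: inclusive start and end. */
struct span {
	uint64_t start;
	uint64_t end;
};

/* Return true when 'reserved' covers every byte of 'region'. */
static bool span_covers(const struct span *reserved, const struct span *region)
{
	return reserved->start <= region->start &&
	       region->end <= reserved->end;
}
```

With the ranges from the first resource tree, the Reserved range
87f000000-67ffffffff covers region0 (880000000-107fffffff), so the dax
device would be skipped.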

After adding the check for reserved memory resources, booting with
'efi=nosoftreserve memmap=32G$34G' reserves all of the CXL memory. The
resource tree and NUMA node information are as follows; node 2 does not
exist because its memory is fully reserved:

        100000000-87effffff : System RAM
        87f000000-67ffffffff : Reserved
          880000000-67ffffffff : CXL Window 0
                880000000-107fffffff : region0

        node 0 size: 31365 MB
        node 0 free: 29226 MB
        ...
        node 1 size: 32206 MB
        node 1 free: 29703 MB

When no such boot parameter is given, or when booting with
'efi=nosoftreserve memmap=32G$35G', the result is the same as before this
change.

Signed-off-by: peng guo <engguopeng@buaa.edu.cn>
---
 drivers/cxl/core/region.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
Patch

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index dff618c708dc..f484dfea1b0e 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3367,6 +3367,19 @@  int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, CXL);
 
+static int is_reserve_ram(struct resource *res, void *arg)
+{
+	struct cxl_region *cxlr = arg;
+	struct cxl_region_params *p = &cxlr->params;
+
+	if ((p->res->start < res->start) || (res->end < p->res->end))
+		return 0;
+
+	dev_dbg(&cxlr->dev, "%pr has Reserve RAM: %pr\n", p->res, res);
+
+	return 1;
+}
+
 static int is_system_ram(struct resource *res, void *arg)
 {
 	struct cxl_region *cxlr = arg;
@@ -3435,6 +3448,16 @@  static int cxl_region_probe(struct device *dev)
 	case CXL_DECODER_PMEM:
 		return devm_cxl_add_pmem_region(cxlr);
 	case CXL_DECODER_RAM:
+		/*
+		 * The region can not be managed by CXL if all of
+		 * it is already reserved
+		 */
+		if (walk_iomem_res_desc(IORES_DESC_RESERVED,
+					IORESOURCE_MEM,
+					p->res->start, p->res->end, cxlr,
+					is_reserve_ram) > 0)
+			return 0;
+
 		/*
 		 * The region can not be manged by CXL if any portion of
 		 * it is already online as 'System RAM'