From patchwork Wed May 10 18:44:26 2023
X-Patchwork-Submitter: Alison Schofield
X-Patchwork-Id: 13237183
From: alison.schofield@intel.com
To: Dan Williams, Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky
Cc: Alison Schofield, linux-cxl@vger.kernel.org
Subject: [RFC 1/3] x86/numa: Introduce numa_find_node(start, end)
Date: Wed, 10 May 2023 11:44:26 -0700
Message-Id: <6bf1866161446f03105ec50c3a09de194d830bc3.1683742429.git.alison.schofield@intel.com>
X-Mailing-List: linux-cxl@vger.kernel.org

From: Alison Schofield

phys_to_target_node(phys_addr_t start) returns a NUMA node id for a
single physical address. There is no equivalent for discovering whether
a NUMA node is assigned to any address within a range of addresses.
Repeatedly calling phys_to_target_node() across the start/end range is
too expensive to consider; examining the NUMA memblks directly is
cheaper.

Introduce numa_find_node(start, end) to return the first NUMA node
found anywhere in the start/end HPA range.
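For illustration only, the intended caller (added in patch 3 of this
series) uses the helper roughly like this; a simplified sketch, not
part of this patch:

	u64 start = cfmws->base_hpa;
	u64 end = cfmws->base_hpa + cfmws->window_size;
	int node = numa_find_node(start, end);

	if (node != NUMA_NO_NODE) {
		/* SRAT placed at least one memblk in this HPA range */
	}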
Signed-off-by: Alison Schofield
---
 arch/x86/include/asm/numa.h |  1 +
 arch/x86/mm/numa.c          | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index e3bae2b60a0d..5f2b811f1a5f 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -34,6 +34,7 @@ extern nodemask_t numa_nodes_parsed __initdata;
 
 extern int __init numa_add_memblk(int nodeid, u64 start, u64 end);
 extern void __init numa_set_distance(int from, int to, int distance);
+extern int __init numa_find_node(u64 start, u64 end);
 
 static inline void set_apicid_to_node(int apicid, s16 node)
 {
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 2aadb2019b4f..62990977f720 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -225,6 +225,20 @@ static void __init alloc_node_data(int nid)
 	node_set_online(nid);
 }
 
+/* Find a node with any memblk within the start/end range */
+int __init numa_find_node(u64 start, u64 end)
+{
+	struct numa_meminfo *mi = &numa_meminfo;
+
+	for (int i = 0; i < mi->nr_blks; i++) {
+		struct numa_memblk *bi = &mi->blk[i];
+
+		if (start <= bi->start && end >= bi->end)
+			return bi->nid;
+	}
+	return NUMA_NO_NODE;
+}
+
 /**
  * numa_cleanup_meminfo - Cleanup a numa_meminfo
  * @mi: numa_meminfo to clean up

From patchwork Wed May 10 18:44:27 2023
X-Patchwork-Submitter: Alison Schofield
X-Patchwork-Id: 13237182
From: alison.schofield@intel.com
To: Dan Williams, Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky
Cc: Alison Schofield, linux-cxl@vger.kernel.org
Subject: [RFC 2/3] x86/numa: Introduce numa_remove_memblks(node, start, end)
Date: Wed, 10 May 2023 11:44:27 -0700
Message-Id: <57bc9bb8823a295dc71d7e8ff8ae93a3af8c0be3.1683742429.git.alison.schofield@intel.com>
X-Mailing-List: linux-cxl@vger.kernel.org

From: Alison Schofield

Add support for removing memblks from numa_meminfo that belong to a
given node and fall within a start/end range.

numa_add_memblk() allows in-kernel users to add a memblk to a NUMA
node; no method is exposed to remove one. The use case here is to let
the ACPI driver remove redundant memblks when it knows they exist,
rather than implementing a cleanup that needlessly walks all memblks
during a cleanup phase.

numa_cleanup_meminfo() exists for merging memblks; however, it only
considers adjacent memblks, and it moves the memblks to
numa_reserved_meminfo before doing the cleanup.
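For illustration only, a simplified sketch of the intended usage,
mirroring patch 3: pair this helper with numa_find_node() and
numa_add_memblk() to replace any memblks in a range with one covering
memblk:

	int node = numa_find_node(start, end);

	if (node != NUMA_NO_NODE) {
		/* Drop the node's memblks within the range ... */
		numa_remove_memblks(node, start, end);
		/* ... and replace them with one covering memblk */
		numa_add_memblk(node, start, end);
	}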
Signed-off-by: Alison Schofield
---
 arch/x86/include/asm/numa.h |  1 +
 arch/x86/mm/numa.c          | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index 5f2b811f1a5f..cb8b9a8cae32 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -35,6 +35,7 @@ extern nodemask_t numa_nodes_parsed __initdata;
 extern int __init numa_add_memblk(int nodeid, u64 start, u64 end);
 extern void __init numa_set_distance(int from, int to, int distance);
 extern int __init numa_find_node(u64 start, u64 end);
+extern void __init numa_remove_memblks(int node, u64 start, u64 end);
 
 static inline void set_apicid_to_node(int apicid, s16 node)
 {
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 62990977f720..42d70f01ca0a 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -239,6 +239,28 @@ int __init numa_find_node(u64 start, u64 end)
 	return NUMA_NO_NODE;
 }
 
+/**
+ * numa_remove_memblks - Remove memblks from a node
+ * @node: NUMA node id
+ * @start: start addr of memblks to remove
+ * @end: end addr of memblks to remove
+ *
+ * Remove any memblks of @node within the start/end range
+ */
+void __init numa_remove_memblks(int node, u64 start, u64 end)
+{
+	struct numa_meminfo *mi = &numa_meminfo;
+
+	for (int i = 0; i < mi->nr_blks; i++) {
+		struct numa_memblk *bi = &mi->blk[i];
+
+		if (bi->nid != node)
+			continue;
+		if (start <= bi->start && end >= bi->end)
+			numa_remove_memblk_from(i--, &numa_meminfo);
+	}
+}
+
 /**
  * numa_cleanup_meminfo - Cleanup a numa_meminfo
  * @mi: numa_meminfo to clean up
From patchwork Wed May 10 18:44:28 2023
X-Patchwork-Submitter: Alison Schofield
X-Patchwork-Id: 13237184
From: alison.schofield@intel.com
To: Dan Williams, Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky
Cc: Alison Schofield, linux-cxl@vger.kernel.org
Subject: [RFC 3/3] ACPI: NUMA: Apply SRAT proximity domain to entire CFMWS window
Date: Wed, 10 May 2023 11:44:28 -0700
Message-Id: <79a10b7101a8fab56f9ff3f9a4de73bee3156b40.1683742429.git.alison.schofield@intel.com>
X-Mailing-List: linux-cxl@vger.kernel.org

From: Alison Schofield

Commit fd49f99c1809 ("ACPI: NUMA: Add a node and memblk for each CFMWS
not in SRAT") did not account for the case where an SRAT entry only
partially describes a CFMWS window. It assumed an SRAT entry covered
the entire CFMWS HPA range, start through end.

Broaden the search for an SRAT-defined NUMA node by replacing the
previously used phys_to_target_node(start) lookup with the newly
introduced numa_find_node(), which can discover a NUMA node anywhere
in the CFMWS HPA range.

If any NUMA node is discovered, proactively clean up: remove all
memblks, partial or whole, in the HPA range, and add one memblk that
encompasses the entire range. That has the effect of applying the
SRAT-defined proximity domain to the entire range, and of doing the
memblk cleanup at the point the redundancy is created.

Considered and rejected: letting numa_cleanup_meminfo() try to remove
the redundancy. It does not currently address this case, because these
memblks are moved to numa_reserved_meminfo before any numa_meminfo
merge is done. Also, the merge logic in numa_cleanup_meminfo() only
works on adjacent memblks, so it would need to grow in complexity to
search for these potential cases.

Considered and open to reconsidering: allowing an extra memblk for
every CFMWS HPA range that is also described in the SRAT. If the extra
memblk is not a concern, the memblk removal work could be skipped
entirely.
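A worked example of the intended behavior (addresses invented for
illustration):

	CFMWS window:  [0x100000000 - 0x300000000)
	SRAT memblk:   node 2, [0x100000000 - 0x180000000) (partial)

	Before: phys_to_target_node(0x100000000) returns node 2, the
	window is skipped, and [0x180000000 - 0x300000000) is left
	without a memblk.

	After: numa_find_node() finds node 2, numa_remove_memblks()
	drops the partial memblk, and numa_add_memblk() adds a single
	memblk spanning [0x100000000 - 0x300000000) on node 2.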
Fixes: fd49f99c1809 ("ACPI: NUMA: Add a node and memblk for each CFMWS not in SRAT")
Signed-off-by: Alison Schofield
---
 drivers/acpi/numa/srat.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
index 1f4fc5f8a819..f41b65e9b085 100644
--- a/drivers/acpi/numa/srat.c
+++ b/drivers/acpi/numa/srat.c
@@ -301,27 +301,47 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
 static int __init acpi_parse_cfmws(union acpi_subtable_headers *header,
 				   void *arg, const unsigned long table_end)
 {
+	int node, found_node, *fake_pxm = arg;
 	struct acpi_cedt_cfmws *cfmws;
-	int *fake_pxm = arg;
 	u64 start, end;
-	int node;
 
 	cfmws = (struct acpi_cedt_cfmws *)header;
 	start = cfmws->base_hpa;
 	end = cfmws->base_hpa + cfmws->window_size;
 
-	/* Skip if the SRAT already described the NUMA details for this HPA */
-	node = phys_to_target_node(start);
-	if (node != NUMA_NO_NODE)
+	/*
+	 * The SRAT may have already described the NUMA details for
+	 * this CFMWS HPA range, yet it may not have created memblks
+	 * for the entire range. Look for a node with a memblk covering
+	 * any part of the HPA range. Don't bother figuring out if it
+	 * is partially or wholly described. Replace any memblks in the
+	 * range with one single memblk that covers the entire range.
+	 *
+	 * This preserves the SRAT defined node and Proximity Domain.
+	 */
+
+	found_node = numa_find_node(start, end);
+	if (found_node != NUMA_NO_NODE) {
+		numa_remove_memblks(found_node, start, end);
+		if (numa_add_memblk(found_node, start, end) < 0) {
+			/* CXL driver must handle the NUMA_NO_NODE case */
+			pr_warn("ACPI NUMA: failed to add memblk for CFMWS node %d [mem %#llx-%#llx]\n",
+				found_node, start, end);
+		}
 		return 0;
+	}
 
+	/*
+	 * SRAT did not describe this window at all.
+	 * Create a new node with a fake proximity domain. Add a
+	 * memblk covering the entire HPA range.
+	 */
 	node = acpi_map_pxm_to_node(*fake_pxm);
 	if (node == NUMA_NO_NODE) {
 		pr_err("ACPI NUMA: Too many proximity domains while processing CFMWS.\n");
 		return -EINVAL;
 	}
-
 	if (numa_add_memblk(node, start, end) < 0) {
 		/* CXL driver must handle the NUMA_NO_NODE case */
 		pr_warn("ACPI NUMA: Failed to add memblk for CFMWS node %d [mem %#llx-%#llx]\n",