Message ID | 20250403183315.286710-3-terry.bowman@amd.com (mailing list archive) |
---|---|
State | New |
Series | Add managed SOFT RESERVE resource handling |
On Thu, 3 Apr 2025 13:33:13 -0500
Terry Bowman <terry.bowman@amd.com> wrote:

> From: Nathan Fontenot <nathan.fontenot@amd.com>
>
> Update handling of SOFT RESERVE iomem resources that intersect with
> CXL region resources to remove intersections from the SOFT RESERVE
> resources. The current approach of leaving SOFT RESERVE resources as
> is can cause failures during hotplug replace of CXL devices because
> the resource is not available for reuse after teardown of the CXL device.
>
> To accomplish this the cxl acpi driver creates a worker thread at the

Inconsistent in capitalization. I'd just use CXL ACPI here given you used
CXL PCI below.

> end of cxl_acpi_probe(). This worker thread first waits for the CXL PCI
> CXL mem drivers have loaded. The cxl core/suspend.c code is updated to
> add a pci_loaded variable, in addition to the mem_active variable, that
> is updated when the pci driver loads. Remove CONFIG_CXL_SUSPEND Kconfig as
> it is no longer needed. A new cxl_wait_for_pci_mem() routine uses a
> waitqueue for both these driver to be loaded. The need to add this
> additional waitqueue is ensure the CXL PCI and CXL mem drivers have loaded
> before we wait for their probe, without it the cxl acpi probe worker thread
> calls wait_for_device_probe() before these drivers are loaded.
>
> After the CXL PCI and CXL mem drivers load the cxl acpi worker thread

CXL ACPI

> uses wait_for_device_probe() to ensure device probe routines have
> completed.

Does it matter if these drivers go away again? Everything seems to be
one way at the moment.

>
> Once probe completes and regions have been created, find all cxl

CXL

> regions that have been created and trim any SOFT RESERVE resources
> that intersect with the region.
>
> Update cxl_acpi_exit() to cancel pending waitqueue work.
>
> Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>

> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index be8a7dc77719..40835ec692c8 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -858,6 +858,7 @@ bool is_cxl_pmem_region(struct device *dev);
>  struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
>  int cxl_add_to_region(struct cxl_port *root,
>  		      struct cxl_endpoint_decoder *cxled);
> +int cxl_region_srmem_update(void);

As before: srmem is a bit obscure. Maybe spell it out more.

>  struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
>  u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
>  #else
> @@ -902,6 +903,8 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>
>  bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>
> +void cxl_wait_for_pci_mem(void);
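
For anyone trying to follow the "wait for both drivers, with a timeout" step described in the commit message, here is a minimal userspace analogue using pthreads. It is only a sketch and not kernel code: the names mark_loaded() and wait_for_pci_mem() are made up for illustration, and a mutex plus condition variable stand in for the kernel waitqueue. The sketch returns true when both flags were seen and false on timeout. For comparison, the kernel's wait_event_timeout() returns 0 when the timeout elapses with the condition still false, so the check in the patch's cxl_wait_for_pci_mem() (printing "Timeout" on a non-zero return) looks inverted.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for the patch's pci_loaded / mem_active state. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready = PTHREAD_COND_INITIALIZER;
static bool pci_loaded;
static bool mem_active;

/* "Driver" side: set one flag, then wake any waiter so it rechecks. */
static void mark_loaded(bool *flag)
{
	pthread_mutex_lock(&lock);
	*flag = true;
	pthread_cond_broadcast(&ready);
	pthread_mutex_unlock(&lock);
}

/*
 * Wait until both flags are set or timeout_ms elapses.
 * Returns true if both flags were observed set, false on timeout.
 */
static bool wait_for_pci_mem(long timeout_ms)
{
	struct timespec deadline;
	bool ok;
	int rc = 0;

	assert(timeout_ms >= 0);

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&lock);
	/* Recheck the condition after every wakeup, as a waitqueue would. */
	while (!(pci_loaded && mem_active) && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&ready, &lock, &deadline);
	ok = pci_loaded && mem_active;
	pthread_mutex_unlock(&lock);

	return ok;
}
```

A caller would treat a false return as the timeout case and log it there, which is the opposite sense of the `if (wait_event_timeout(...))` test in the posted patch.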
Hi Terry,
kernel test robot noticed the following build errors:
[auto build test ERROR on aae0594a7053c60b82621136257c8b648c67b512]
url: https://github.com/intel-lab-lkp/linux/commits/Terry-Bowman/kernel-resource-Provide-mem-region-release-for-SOFT-RESERVES/20250404-023601
base: aae0594a7053c60b82621136257c8b648c67b512
patch link: https://lore.kernel.org/r/20250403183315.286710-3-terry.bowman%40amd.com
patch subject: [PATCH v3 2/4] cxl: Update Soft Reserved resources upon region creation
config: hexagon-randconfig-001-20250404 (https://download.01.org/0day-ci/archive/20250404/202504042103.wFCRBR7K-lkp@intel.com/config)
compiler: clang version 15.0.7 (https://github.com/llvm/llvm-project 8dfdcc7b7bf66834a761bd8de445840ef68e4d1a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250404/202504042103.wFCRBR7K-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202504042103.wFCRBR7K-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from drivers/cxl/core/suspend.c:7:
>> drivers/cxl/cxlpci.h:126:2: error: call to undeclared function 'pcie_capability_read_word'; ISO C99 and later do not support implicit function declarations [-Werror,-Wimplicit-function-declaration]
pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
^
1 error generated.
vim +/pcie_capability_read_word +126 drivers/cxl/cxlpci.h
e0c818e00443ce Robert Richter 2024-02-16 116
4d07a05397c8c1 Dave Jiang 2023-12-21 117 /*
4d07a05397c8c1 Dave Jiang 2023-12-21 118 * CXL v3.0 6.2.3 Table 6-4
4d07a05397c8c1 Dave Jiang 2023-12-21 119 * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
4d07a05397c8c1 Dave Jiang 2023-12-21 120 * mode, otherwise it's 68B flits mode.
4d07a05397c8c1 Dave Jiang 2023-12-21 121 */
4d07a05397c8c1 Dave Jiang 2023-12-21 122 static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
4d07a05397c8c1 Dave Jiang 2023-12-21 123 {
4d07a05397c8c1 Dave Jiang 2023-12-21 124 u16 lnksta2;
4d07a05397c8c1 Dave Jiang 2023-12-21 125
4d07a05397c8c1 Dave Jiang 2023-12-21 @126 pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
4d07a05397c8c1 Dave Jiang 2023-12-21 127 return lnksta2 & PCI_EXP_LNKSTA2_FLIT;
4d07a05397c8c1 Dave Jiang 2023-12-21 128 }
4d07a05397c8c1 Dave Jiang 2023-12-21 129
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 205547e5543a..c7377956c1d5 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -117,10 +117,6 @@ config CXL_PORT
 	default CXL_BUS
 	tristate
 
-config CXL_SUSPEND
-	def_bool y
-	depends on SUSPEND && CXL_MEM
-
 config CXL_REGION
 	bool "CXL: Region Support"
 	default CXL_BUS
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index cb14829bb9be..94f2d649bb30 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -7,6 +7,8 @@
 #include <linux/acpi.h>
 #include <linux/pci.h>
 #include <linux/node.h>
+#include <linux/pm.h>
+#include <linux/workqueue.h>
 #include <asm/div64.h>
 #include "cxlpci.h"
 #include "cxl.h"
@@ -813,6 +815,27 @@ static int pair_cxl_resource(struct device *dev, void *data)
 	return 0;
 }
 
+static void cxl_srmem_work_fn(struct work_struct *work)
+{
+	/* Wait for CXL PCI and mem drivers to load */
+	cxl_wait_for_pci_mem();
+
+	/*
+	 * Once the CXL PCI and mem drivers have loaded wait
+	 * for the driver probe routines to complete.
+	 */
+	wait_for_device_probe();
+
+	cxl_region_srmem_update();
+}
+
+DECLARE_WORK(cxl_sr_work, cxl_srmem_work_fn);
+
+static void cxl_srmem_update(void)
+{
+	schedule_work(&cxl_sr_work);
+}
+
 static int cxl_acpi_probe(struct platform_device *pdev)
 {
 	int rc;
@@ -887,6 +910,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 
 	/* In case PCI is scanned before ACPI re-trigger memdev attach */
 	cxl_bus_rescan();
+
+	/* Update SOFT RESERVED resources that intersect with CXL regions */
+	cxl_srmem_update();
+
 	return 0;
 }
 
@@ -918,6 +945,7 @@ static int __init cxl_acpi_init(void)
 
 static void __exit cxl_acpi_exit(void)
 {
+	cancel_work_sync(&cxl_sr_work);
 	platform_driver_unregister(&cxl_acpi_driver);
 	cxl_bus_drain();
 }
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 086df97a0fcf..035864db8a32 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += cxl_core.o
-obj-$(CONFIG_CXL_SUSPEND) += suspend.o
+obj-y += suspend.o
 
 ccflags-y += -I$(srctree)/drivers/cxl
 CFLAGS_trace.o = -DTRACE_INCLUDE_PATH=. -I$(src)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c3f4dc244df7..25d70175f204 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -10,6 +10,7 @@
 #include <linux/sort.h>
 #include <linux/idr.h>
 #include <linux/memory-tiers.h>
+#include <linux/ioport.h>
 #include <cxlmem.h>
 #include <cxl.h>
 #include "core.h"
@@ -2333,7 +2334,7 @@ const struct device_type cxl_region_type = {
 
 bool is_cxl_region(struct device *dev)
 {
-	return dev->type == &cxl_region_type;
+	return dev && dev->type == &cxl_region_type;
 }
 EXPORT_SYMBOL_NS_GPL(is_cxl_region, "CXL");
 
@@ -3443,6 +3444,27 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
 
+int cxl_region_srmem_update(void)
+{
+	struct device *dev = NULL;
+	struct cxl_region *cxlr;
+	struct resource *res;
+
+	do {
+		dev = bus_find_next_device(&cxl_bus_type, dev);
+		if (is_cxl_region(dev)) {
+			cxlr = to_cxl_region(dev);
+			res = cxlr->params.res;
+			release_srmem_region_adjustable(res->start,
+							resource_size(res));
+		}
+		put_device(dev);
+	} while (dev);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_region_srmem_update, "CXL");
+
 u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa)
 {
 	struct cxl_region_ref *iter;
diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
index 29aa5cc5e565..4813641e1b7b 100644
--- a/drivers/cxl/core/suspend.c
+++ b/drivers/cxl/core/suspend.c
@@ -2,9 +2,14 @@
 /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
 #include <linux/atomic.h>
 #include <linux/export.h>
+#include <linux/wait.h>
 #include "cxlmem.h"
+#include "cxlpci.h"
 
 static atomic_t mem_active;
+static atomic_t pci_loaded;
+
+static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);
 
 bool cxl_mem_active(void)
 {
@@ -14,6 +19,7 @@ bool cxl_mem_active(void)
 void cxl_mem_active_inc(void)
 {
 	atomic_inc(&mem_active);
+	wake_up(&cxl_wait_queue);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL");
 
@@ -22,3 +28,38 @@ void cxl_mem_active_dec(void)
 	atomic_dec(&mem_active);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
+
+void mark_cxl_pci_loaded(void)
+{
+	atomic_inc(&pci_loaded);
+	wake_up(&cxl_wait_queue);
+}
+EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
+
+static bool cxl_pci_loaded(void)
+{
+	if (IS_ENABLED(CONFIG_CXL_PCI))
+		return atomic_read(&pci_loaded) != 0;
+
+	return true;
+}
+
+static bool cxl_mem_probed(void)
+{
+	if (IS_ENABLED(CONFIG_CXL_MEM))
+		return atomic_read(&mem_active) != 0;
+
+	return true;
+}
+
+void cxl_wait_for_pci_mem(void)
+{
+	if (IS_ENABLED(CONFIG_CXL_PCI) || IS_ENABLED(CONFIG_CXL_MEM))
+		if (wait_event_timeout(cxl_wait_queue,
+				       cxl_pci_loaded() && cxl_mem_probed(),
+				       30 * HZ)) {
+			pr_debug("Timeout waiting for CXL PCI or CXL Memory probing");
+		}
+
+}
+EXPORT_SYMBOL_NS_GPL(cxl_wait_for_pci_mem, "CXL");
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index be8a7dc77719..40835ec692c8 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -858,6 +858,7 @@ bool is_cxl_pmem_region(struct device *dev);
 struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
 int cxl_add_to_region(struct cxl_port *root,
 		      struct cxl_endpoint_decoder *cxled);
+int cxl_region_srmem_update(void);
 struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
 u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
 #else
@@ -902,6 +903,8 @@ void cxl_coordinates_combine(struct access_coordinate *out,
 
 bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
 
+void cxl_wait_for_pci_mem(void);
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 3ec6b906371b..1bd1e88c4cc0 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -853,17 +853,8 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
 
-#ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
-#else
-static inline void cxl_mem_active_inc(void)
-{
-}
-static inline void cxl_mem_active_dec(void)
-{
-}
-#endif
 
 int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
 
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 54e219b0049e..5a811ac63fcf 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -135,4 +135,5 @@ void read_cdat_data(struct cxl_port *port);
 void cxl_cor_error_detected(struct pci_dev *pdev);
 pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
 				    pci_channel_state_t state);
+void mark_cxl_pci_loaded(void);
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 4288f4814cc5..b784008489b3 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1185,6 +1185,8 @@ static int __init cxl_pci_driver_init(void)
 	if (rc)
 		pci_unregister_driver(&cxl_pci_driver);
 
+	mark_cxl_pci_loaded();
+
 	return rc;
 }
 
diff --git a/include/linux/pm.h b/include/linux/pm.h
index 78855d794342..11ff485c9722 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -35,14 +35,7 @@ static inline void pm_vt_switch_unregister(struct device *dev)
 }
 #endif /* CONFIG_VT_CONSOLE_SLEEP */
 
-#ifdef CONFIG_CXL_SUSPEND
 bool cxl_mem_active(void);
-#else
-static inline bool cxl_mem_active(void)
-{
-	return false;
-}
-#endif
 
 /*
  * Device power management
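
To make the "trim any SOFT RESERVE resources that intersect with the region" step concrete, here is a small userspace sketch of the interval arithmetic involved. It is an illustration only: release_srmem_region_adjustable() comes from an earlier patch in the series that is not shown here, and this is not its implementation. The struct range type and trim_range() helper are hypothetical names, and range ends are inclusive as in struct resource.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource span; 'end' is inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Remove 'cut' (e.g. a CXL region) from 'sr' (e.g. a soft reserved span).
 * Writes the leftover pieces to 'out' and returns how many were written:
 *   0 - 'cut' covers 'sr' completely
 *   1 - 'cut' overlaps one edge of 'sr', or does not intersect it at all
 *       (in which case 'sr' is returned unchanged)
 *   2 - 'cut' lies strictly inside 'sr', splitting it in two
 */
static size_t trim_range(struct range sr, struct range cut, struct range out[2])
{
	size_t n = 0;

	assert(sr.start <= sr.end && cut.start <= cut.end);

	/* No intersection: nothing to trim. */
	if (cut.end < sr.start || cut.start > sr.end) {
		out[n++] = sr;
		return n;
	}

	/* Keep whatever lies below the cut. */
	if (cut.start > sr.start)
		out[n++] = (struct range){ sr.start, cut.start - 1 };

	/* Keep whatever lies above the cut. */
	if (cut.end < sr.end)
		out[n++] = (struct range){ cut.end + 1, sr.end };

	return n;
}
```

The three return values correspond to the cases a resource-tree version would presumably have to handle too: dropping the soft reserved entry entirely, shrinking it at one end, or splitting it around the region.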