Message ID | 20250106121017.1620-16-shiju.jose@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers | expand |
shiju.jose@ wrote: > From: Shiju Jose <shiju.jose@huawei.com> > > CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub control > feature. The device patrol scrub proactively locates and makes corrections > to errors in regular cycle. > > Allow specifying the number of hours within which the patrol scrub must be > completed, subject to minimum and maximum limits reported by the device. > Also allow disabling scrub allowing trade-off error rates against > performance. > > Add support for patrol scrub control on CXL memory devices. > Register with the EDAC device driver, which retrieves the scrub attribute > descriptors from EDAC scrub and exposes the sysfs scrub control attributes > to userspace. For example, scrub control for the CXL memory device > "cxl_mem0" is exposed in /sys/bus/edac/devices/cxl_mem0/scrubX/. > > Additionally, add support for region-based CXL memory patrol scrub control. > CXL memory regions may be interleaved across one or more CXL memory > devices. For example, region-based scrub control for "cxl_region1" is > exposed in /sys/bus/edac/devices/cxl_region1/scrubX/. > > Reviewed-by: Dave Jiang <dave.jiang@intel.com> > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Shiju Jose <shiju.jose@huawei.com> > --- > Documentation/edac/scrub.rst | 66 ++++++ > drivers/cxl/Kconfig | 17 ++ > drivers/cxl/core/Makefile | 1 + > drivers/cxl/core/memfeature.c | 392 ++++++++++++++++++++++++++++++++++ > drivers/cxl/core/region.c | 6 + > drivers/cxl/cxlmem.h | 7 + > drivers/cxl/mem.c | 5 + > include/cxl/features.h | 16 ++ > 8 files changed, 510 insertions(+) > create mode 100644 drivers/cxl/core/memfeature.c > diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst > index f86645c7f0af..80e986c57885 100644 > --- a/Documentation/edac/scrub.rst > +++ b/Documentation/edac/scrub.rst > @@ -325,3 +325,69 @@ root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_d > 10800 > > root@localhost:~# echo 0 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background > + > +2. CXL memory device patrol scrubber > + > +2.1 Device based scrubbing > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/min_cycle_duration > + > +3600 > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/max_cycle_duration > + > +918000 > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration > + > +43200 > + > +root@localhost:~# echo 54000 > /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration > + > +54000 > + > +root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background > + > +1 > + > +root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background > + > +0 > + > +2.2. Region based scrubbing > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/min_cycle_duration > + > +3600 > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/max_cycle_duration > + > +918000 > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration > + > +43200 > + > +root@localhost:~# echo 54000 > /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration > + > +54000 > + > +root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/enable_background > + > +1 > + > +root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background > + > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/enable_background What is this content-free blob of cat and echo statements? Please write actual documentation with theory of operation, clarification of assumptions, rationale for defaults, guidance on changing defaults... > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig > index 0bc6a2cb8474..6078f02e883b 100644 > --- a/drivers/cxl/Kconfig > +++ b/drivers/cxl/Kconfig > @@ -154,4 +154,21 @@ config CXL_FEATURES > > If unsure say 'y'. > > +config CXL_RAS_FEATURES > + tristate "CXL: Memory RAS features" > + depends on CXL_PCI What is the build dependency on CXL_PCI? This enabling does not call back into symbols provided by cxl_pci.ko does it? > + depends on CXL_MEM Similar comment, and this also goes away if all of this just moves into the new cxl_features driver. > + depends on EDAC > + help > + The CXL memory RAS feature control is optional and allows host to > + control the RAS features configurations of CXL Type 3 devices. > + > + It registers with the EDAC device subsystem to expose control > + attributes of CXL memory device's RAS features to the user. > + It provides interface functions to support configuring the CXL > + memory device's RAS features. > + Say 'y/m/n' to enable/disable control of the CXL.mem device's RAS features. > + See section 8.2.9.9.11 of CXL 3.1 specification for the detailed > + information of CXL memory device features. Usually the "say X" statement provides a rationale like. "Say y/m if you have an expert need to change default memory scrub rates established by the platform/device, otherwise say n" > + > endif > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile > index 73b6348afd67..54baca513ecb 100644 > --- a/drivers/cxl/core/Makefile > +++ b/drivers/cxl/core/Makefile > @@ -17,3 +17,4 @@ cxl_core-y += cdat.o > cxl_core-y += features.o > cxl_core-$(CONFIG_TRACING) += trace.o > cxl_core-$(CONFIG_CXL_REGION) += region.o > +cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeature.o > diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c > new file mode 100644 > index 000000000000..77d1bf6ce45f > --- /dev/null > +++ b/drivers/cxl/core/memfeature.c > @@ -0,0 +1,392 @@ > +// SPDX-License-Identifier: GPL-2.0-or-later > +/* > + * CXL memory RAS feature driver. > + * > + * Copyright (c) 2024 HiSilicon Limited. > + * > + * - Supports functions to configure RAS features of the > + * CXL memory devices. > + * - Registers with the EDAC device subsystem driver to expose > + * the features sysfs attributes to the user for configuring > + * CXL memory RAS feature. > + */ > + > +#include <linux/cleanup.h> > +#include <linux/edac.h> > +#include <linux/limits.h> > +#include <cxl/features.h> > +#include <cxl.h> > +#include <cxlmem.h> > + > +#define CXL_DEV_NUM_RAS_FEATURES 1 > +#define CXL_DEV_HOUR_IN_SECS 3600 > + > +#define CXL_DEV_NAME_LEN 128 > + > +/* CXL memory patrol scrub control functions */ > +struct cxl_patrol_scrub_context { > + u8 instance; > + u16 get_feat_size; > + u16 set_feat_size; > + u8 get_version; > + u8 set_version; > + u16 effects; > + struct cxl_memdev *cxlmd; > + struct cxl_region *cxlr; > +}; > + > +/** > + * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data structure. > + * @enable: [IN & OUT] enable(1)/disable(0) patrol scrub. > + * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is changeable. > + * @scrub_cycle_hrs: [IN] Requested patrol scrub cycle in hours. > + * [OUT] Current patrol scrub cycle in hours. > + * @min_scrub_cycle_hrs:[OUT] minimum patrol scrub cycle in hours supported. > + */ > +struct cxl_memdev_ps_params { > + bool enable; > + bool scrub_cycle_changeable; > + u8 scrub_cycle_hrs; > + u8 min_scrub_cycle_hrs; > +}; > + > +enum cxl_scrub_param { > + CXL_PS_PARAM_ENABLE, > + CXL_PS_PARAM_SCRUB_CYCLE, > +}; > + > +#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK BIT(0) > +#define CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK BIT(1) > +#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK GENMASK(7, 0) > +#define CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK GENMASK(15, 8) > +#define CXL_MEMDEV_PS_FLAG_ENABLED_MASK BIT(0) > + > +struct cxl_memdev_ps_rd_attrs { > + u8 scrub_cycle_cap; > + __le16 scrub_cycle_hrs; > + u8 scrub_flags; > +} __packed; > + > +struct cxl_memdev_ps_wr_attrs { > + u8 scrub_cycle_hrs; > + u8 scrub_flags; > +} __packed; If these are packed to match specification layout, include a specification reference comment. > + > +static int cxl_mem_ps_get_attrs(struct cxl_mailbox *cxl_mbox, > + struct cxl_memdev_ps_params *params) > +{ > + size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs); > + u16 scrub_cycle_hrs; > + size_t data_size; > + u16 return_code; > + struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) = > + kmalloc(rd_data_size, GFP_KERNEL); I would feel better with kzalloc() if short reads are possible. How big can rd_data_size get? I.e. should this be kvzalloc()? > + if (!rd_attrs) > + return -ENOMEM; > + > + data_size = cxl_get_feature(cxl_mbox->features, CXL_FEAT_PATROL_SCRUB_UUID, > + CXL_GET_FEAT_SEL_CURRENT_VALUE, > + rd_attrs, rd_data_size, 0, &return_code); > + if (!data_size || return_code != CXL_MBOX_CMD_RC_SUCCESS) > + return -EIO; > + > + params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK, > + rd_attrs->scrub_cycle_cap); > + params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK, > + rd_attrs->scrub_flags); > + scrub_cycle_hrs = le16_to_cpu(rd_attrs->scrub_cycle_hrs); > + params->scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, > + scrub_cycle_hrs); > + params->min_scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK, > + scrub_cycle_hrs); > + > + return 0; > +} > + > +static int cxl_ps_get_attrs(struct cxl_patrol_scrub_context *cxl_ps_ctx, > + struct cxl_memdev_ps_params *params) > +{ > + struct cxl_memdev *cxlmd; > + u16 min_scrub_cycle = 0; > + int i, ret; > + > + if (cxl_ps_ctx->cxlr) { > + struct cxl_region *cxlr = cxl_ps_ctx->cxlr; > + struct cxl_region_params *p = &cxlr->params; > + > + for (i = p->interleave_ways - 1; i >= 0; i--) { > + struct cxl_endpoint_decoder *cxled = p->targets[i]; It looks like this is called directly as a callback from EDAC. Where is the locking that keeps cxl_ps_ctx->cxlr valid, or p->targets content stable? > + > + cxlmd = cxled_to_memdev(cxled); > + ret = cxl_mem_ps_get_attrs(&cxlmd->cxlds->cxl_mbox, params); > + if (ret) > + return ret; > + > + if (params->min_scrub_cycle_hrs > min_scrub_cycle) > + min_scrub_cycle = params->min_scrub_cycle_hrs; > + } > + params->min_scrub_cycle_hrs = min_scrub_cycle; > + return 0; > + } > + cxlmd = cxl_ps_ctx->cxlmd; > + > + return cxl_mem_ps_get_attrs(&cxlmd->cxlds->cxl_mbox, params); > +} > + > +static int cxl_mem_ps_set_attrs(struct device *dev, > + struct cxl_patrol_scrub_context *cxl_ps_ctx, > + struct cxl_mailbox *cxl_mbox, > + struct cxl_memdev_ps_params *params, > + enum cxl_scrub_param param_type) > +{ > + struct cxl_memdev_ps_wr_attrs wr_attrs; > + struct cxl_memdev_ps_params rd_params; > + u16 return_code; > + int ret; > + > + ret = cxl_mem_ps_get_attrs(cxl_mbox, &rd_params); > + if (ret) { > + dev_err(dev, "Get cxlmemdev patrol scrub params failed ret=%d\n", > + ret); > + return ret; > + } > + > + switch (param_type) { > + case CXL_PS_PARAM_ENABLE: > + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK, > + params->enable); > + wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, > + rd_params.scrub_cycle_hrs); > + break; > + case CXL_PS_PARAM_SCRUB_CYCLE: > + if (params->scrub_cycle_hrs < rd_params.min_scrub_cycle_hrs) { > + dev_err(dev, "Invalid CXL patrol scrub cycle(%d) to set\n", > + params->scrub_cycle_hrs); > + dev_err(dev, "Minimum supported CXL patrol scrub cycle in hour %d\n", > + rd_params.min_scrub_cycle_hrs); > + return -EINVAL; > + } > + wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, > + params->scrub_cycle_hrs); > + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK, > + rd_params.enable); > + break; > + } > + > + ret = cxl_set_feature(cxl_mbox->features, CXL_FEAT_PATROL_SCRUB_UUID, > + cxl_ps_ctx->set_version, > + &wr_attrs, sizeof(wr_attrs), > + CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET, > + 0, &return_code); > + if (ret || return_code != CXL_MBOX_CMD_RC_SUCCESS) { > + dev_err(dev, "CXL patrol scrub set feature failed ret=%d return_code=%u\n", > + ret, return_code); What can the admin do with this log spam? I would reconsider making all of these dev_dbg() and improving the sysfs documentation on what error codes mean. [..] > + > +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr) Please separate this into a memdev helper and a region helper. It is silly to have two arguments to a function where one is expected to be NULL at all times, and then have an if else statement inside that to effectively turn it back into 2 code paths. If there is code to be shared amongst those, make *that* the shared helper. > +{ > + struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES]; > + char cxl_dev_name[CXL_DEV_NAME_LEN]; > + int num_ras_features = 0; > + u8 scrub_inst = 0; > + int rc; > + > + rc = cxl_memdev_scrub_init(cxlmd, cxlr, &ras_features[num_ras_features], > + scrub_inst); > + if (rc < 0) > + return rc; > + > + scrub_inst++; > + num_ras_features++; > + > + if (cxlr) > + snprintf(cxl_dev_name, sizeof(cxl_dev_name), > + "cxl_region%d", cxlr->id); Why not pass dev_name(&cxlr->dev) directly? > + else > + snprintf(cxl_dev_name, sizeof(cxl_dev_name), > + "%s_%s", "cxl", dev_name(&cxlmd->dev)); Can a "cxl" directory be created so that the raw name can be used? > + > + return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL, > + num_ras_features, ras_features); I'm so confused... a few lines down in this patch we have: rc = cxl_mem_ras_features_init(NULL, cxlr); ...so how can this call to edac_dev_register() unconditionally de-reference @cxlmd? Are there any tests for this? cxl-test is purpose-built for this kind of basic coverage tests. > +EXPORT_SYMBOL_NS_GPL(cxl_mem_ras_features_init, "CXL"); > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index b98b1ccffd1c..c2be70cd87f8 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -3449,6 +3449,12 @@ static int cxl_region_probe(struct device *dev) > p->res->start, p->res->end, cxlr, > is_system_ram) > 0) > return 0; > + > + rc = cxl_mem_ras_features_init(NULL, cxlr); > + if (rc) > + dev_warn(&cxlr->dev, "CXL RAS features init for region_id=%d failed\n", > + cxlr->id); There is more to RAS than EDAC memory scrub so this message is misleading. It is also unnecessary because the driver continues to load and the admin, if they care, will notice that the EDAC attributes are missing. > + > return devm_cxl_add_dax_region(cxlr); > default: > dev_dbg(&cxlr->dev, "unsupported region mode: %d\n", > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h > index 55c55685cb39..2b02e47cd7e7 100644 > --- a/drivers/cxl/cxlmem.h > +++ b/drivers/cxl/cxlmem.h > @@ -800,6 +800,13 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd); > int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa); > int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa); > > +#if IS_ENABLED(CONFIG_CXL_RAS_FEATURES) > +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr); > +#else > +static inline int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr) > +{ return 0; } > +#endif > + > #ifdef CONFIG_CXL_SUSPEND > void cxl_mem_active_inc(void); > void cxl_mem_active_dec(void); > diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c > index 2f03a4d5606e..d236b4b8a93c 100644 > --- a/drivers/cxl/mem.c > +++ b/drivers/cxl/mem.c > @@ -116,6 +116,10 @@ static int cxl_mem_probe(struct device *dev) > if (!cxlds->media_ready) > return -EBUSY; > > + rc = cxl_mem_ras_features_init(cxlmd, NULL); > + if (rc) > + dev_warn(&cxlmd->dev, "CXL RAS features init failed\n"); > + > /* > * Someone is trying to reattach this device after it lost its port > * connection (an endpoint port previously registered by this memdev was > @@ -259,3 +263,4 @@ MODULE_ALIAS_CXL(CXL_DEVICE_MEMORY_EXPANDER); > * endpoint registration. > */ > MODULE_SOFTDEP("pre: cxl_port"); > +MODULE_SOFTDEP("pre: cxl_features"); Why?
On Fri, 24 Jan 2025 12:38:43 -0800 Dan Williams <dan.j.williams@intel.com> wrote: > shiju.jose@ wrote: > > From: Shiju Jose <shiju.jose@huawei.com> > > > > CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub control > > feature. The device patrol scrub proactively locates and makes corrections > > to errors in regular cycle. > > > > Allow specifying the number of hours within which the patrol scrub must be > > completed, subject to minimum and maximum limits reported by the device. > > Also allow disabling scrub allowing trade-off error rates against > > performance. > > > > Add support for patrol scrub control on CXL memory devices. > > Register with the EDAC device driver, which retrieves the scrub attribute > > descriptors from EDAC scrub and exposes the sysfs scrub control attributes > > to userspace. For example, scrub control for the CXL memory device > > "cxl_mem0" is exposed in /sys/bus/edac/devices/cxl_mem0/scrubX/. > > > > Additionally, add support for region-based CXL memory patrol scrub control. > > CXL memory regions may be interleaved across one or more CXL memory > > devices. For example, region-based scrub control for "cxl_region1" is > > exposed in /sys/bus/edac/devices/cxl_region1/scrubX/. > > > > Reviewed-by: Dave Jiang <dave.jiang@intel.com> > > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Hi Dan, A few specific replies in line. I've left the detail stuff to Shiju to address. Definitely a few things in there I'd missed! Thanks, Jonathan > > --- > > Documentation/edac/scrub.rst | 66 ++++++ > > drivers/cxl/Kconfig | 17 ++ > > drivers/cxl/core/Makefile | 1 + > > drivers/cxl/core/memfeature.c | 392 ++++++++++++++++++++++++++++++++++ > > drivers/cxl/core/region.c | 6 + > > drivers/cxl/cxlmem.h | 7 + > > drivers/cxl/mem.c | 5 + > > include/cxl/features.h | 16 ++ > > 8 files changed, 510 insertions(+) > > create mode 100644 drivers/cxl/core/memfeature.c > > diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst > > index f86645c7f0af..80e986c57885 100644 > > --- a/Documentation/edac/scrub.rst > > +++ b/Documentation/edac/scrub.rst > > @@ -325,3 +325,69 @@ root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_d > > 10800 > > > > root@localhost:~# echo 0 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background > > + > > +2. CXL memory device patrol scrubber > > + > > +2.1 Device based scrubbing > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/min_cycle_duration > > + > > +3600 > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/max_cycle_duration > > + > > +918000 > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration > > + > > +43200 > > + > > +root@localhost:~# echo 54000 > /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration > > + > > +54000 > > + > > +root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background > > + > > +1 > > + > > +root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background > > + > > +0 > > + > > +2.2. Region based scrubbing > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/min_cycle_duration > > + > > +3600 > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/max_cycle_duration > > + > > +918000 > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration > > + > > +43200 > > + > > +root@localhost:~# echo 54000 > /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration > > + > > +54000 > > + > > +root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/enable_background > > + > > +1 > > + > > +root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background > > + > > +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/enable_background > > What is this content-free blob of cat and echo statements? Please write actual > documentation with theory of operation, clarification of assumptions, > rationale for defaults, guidance on changing defaults... Note this is a continuation of existing documentation, but sure some inline comments talking more about it would be fine. The rational and top level discussion is meant to be described in patch 2 as it is not CXL specific. Defaults are a device thing, there are no software driven defaults. So I'd suggest the above just adds a few comments along the lines of what the actions of each block does. Something like: Check current parameters and program the scrubbing for a region to repeat every X seconds (% of day) > > > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig > > index 0bc6a2cb8474..6078f02e883b 100644 > > --- a/drivers/cxl/Kconfig > > +++ b/drivers/cxl/Kconfig > > @@ -154,4 +154,21 @@ config CXL_FEATURES > > > > If unsure say 'y'. > > > > +config CXL_RAS_FEATURES > > + tristate "CXL: Memory RAS features" > > + depends on CXL_PCI > > What is the build dependency on CXL_PCI? This enabling does not call back into > symbols provided by cxl_pci.ko does it? > > > + depends on CXL_MEM > > Similar comment, and this also goes away if all of this just moves into > the new cxl_features driver. I'm not sure moving to be a child of cxl_features makes sense. Can probably do it but it's making the spiders web of connections even harder to relate to the underlying hardware. In my mental model, this stuff takes services from 'features' and 'mailbox' parts of the CXL driver. Take repair. Less than half of each of those drivers is feature related (a few 'what can I do' type aspects). The control plane goes via maintenance commands. Obviously we can get to those by adding additional infrastructure to the features driver, but that seems likely to be ugly and where do we stop? It won't scale to likely future cases where the feature part is a tiny tweak on some much larger chunk of infrastructure (which is mostly what spec defined features seem to be for). I don't think we want to support the complexity of device built-in test in the 3.2 spec necessarily (haven't really thought about it yet!) but it is an example of what would be an EDAC feature but has no dependence on features. We could register the patrol scrub stuff from features, and the rest separately but that seems even more confusing. So to me, nesting under features is an ugly solution but I'm not that attached to current approach. So in my view this stuff should be dependent on CXL_FEATURES but not a child of it. (lots skipped - I'll leave the detailed stuff to Shiju!) > > + else > > + snprintf(cxl_dev_name, sizeof(cxl_dev_name), > > + "%s_%s", "cxl", dev_name(&cxlmd->dev)); > > Can a "cxl" directory be created so that the raw name can be used? I'd like feedback from Boris on that. It is a mess to instantiate devices in subdirectories under a bus (that's kind of the big issue with the EDAC usage of the device model already). I'd say no it can't. >
Hi Dan, Thanks for the comments. Please find reply inline. Thanks, Shiju >-----Original Message----- >From: Dan Williams <dan.j.williams@intel.com> >Sent: 24 January 2025 20:39 >To: Shiju Jose <shiju.jose@huawei.com>; linux-edac@vger.kernel.org; linux- >cxl@vger.kernel.org; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux- >kernel@vger.kernel.org >Cc: bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org; >mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan >Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com; >alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com; >david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com; >Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com; >Jon.Grimm@amd.com; dave.hansen@linux.intel.com; >naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com; >somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; >duenwen@google.com; gthelen@google.com; >wschwartz@amperecomputing.com; dferguson@amperecomputing.com; >wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei ><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto >Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com; >wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm ><linuxarm@huawei.com>; Shiju Jose <shiju.jose@huawei.com> >Subject: Re: [PATCH v18 15/19] cxl/memfeature: Add CXL memory device patrol >scrub control feature > >shiju.jose@ wrote: >> From: Shiju Jose <shiju.jose@huawei.com> >> >> CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub >> control feature. The device patrol scrub proactively locates and makes >> corrections to errors in regular cycle. >> >> Allow specifying the number of hours within which the patrol scrub >> must be completed, subject to minimum and maximum limits reported by the >device. >> Also allow disabling scrub allowing trade-off error rates against >> performance. >> >> Add support for patrol scrub control on CXL memory devices. >> Register with the EDAC device driver, which retrieves the scrub >> attribute descriptors from EDAC scrub and exposes the sysfs scrub >> control attributes to userspace. For example, scrub control for the >> CXL memory device "cxl_mem0" is exposed in >/sys/bus/edac/devices/cxl_mem0/scrubX/. >> >> Additionally, add support for region-based CXL memory patrol scrub control. >> CXL memory regions may be interleaved across one or more CXL memory >> devices. For example, region-based scrub control for "cxl_region1" is >> exposed in /sys/bus/edac/devices/cxl_region1/scrubX/. >> >> Reviewed-by: Dave Jiang <dave.jiang@intel.com> >> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> >> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> >> --- >> Documentation/edac/scrub.rst | 66 ++++++ >> drivers/cxl/Kconfig | 17 ++ >> drivers/cxl/core/Makefile | 1 + >> drivers/cxl/core/memfeature.c | 392 >++++++++++++++++++++++++++++++++++ >> drivers/cxl/core/region.c | 6 + >> drivers/cxl/cxlmem.h | 7 + >> drivers/cxl/mem.c | 5 + >> include/cxl/features.h | 16 ++ >> 8 files changed, 510 insertions(+) >> create mode 100644 drivers/cxl/core/memfeature.c diff --git >> a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst index >> f86645c7f0af..80e986c57885 100644 >> --- a/Documentation/edac/scrub.rst >> +++ b/Documentation/edac/scrub.rst >> @@ -325,3 +325,69 @@ root@localhost:~# cat >> /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_d >> 10800 >> >> root@localhost:~# echo 0 > >> /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background >> + >> +2. CXL memory device patrol scrubber >> + >> +2.1 Device based scrubbing >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_mem0/scrub0/min_cycle_duration >> + >> +3600 >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_mem0/scrub0/max_cycle_duration >> + >> +918000 >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration >> + >> +43200 >> + >> +root@localhost:~# echo 54000 > >> +/sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration >> + >> +54000 >> + >> +root@localhost:~# echo 1 > >> +/sys/bus/edac/devices/cxl_mem0/scrub0/enable_background >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_mem0/scrub0/enable_background >> + >> +1 >> + >> +root@localhost:~# echo 0 > >> +/sys/bus/edac/devices/cxl_mem0/scrub0/enable_background >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_mem0/scrub0/enable_background >> + >> +0 >> + >> +2.2. Region based scrubbing >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_region0/scrub0/min_cycle_duration >> + >> +3600 >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_region0/scrub0/max_cycle_duration >> + >> +918000 >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration >> + >> +43200 >> + >> +root@localhost:~# echo 54000 > >> +/sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration >> + >> +54000 >> + >> +root@localhost:~# echo 1 > >> +/sys/bus/edac/devices/cxl_region0/scrub0/enable_background >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_region0/scrub0/enable_background >> + >> +1 >> + >> +root@localhost:~# echo 0 > >> +/sys/bus/edac/devices/cxl_region0/scrub0/enable_background >> + >> +root@localhost:~# cat >> +/sys/bus/edac/devices/cxl_region0/scrub0/enable_background > >What is this content-free blob of cat and echo statements? Please write actual >documentation with theory of operation, clarification of assumptions, rationale >for defaults, guidance on changing defaults... Jonathan already replied. > >> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index >> 0bc6a2cb8474..6078f02e883b 100644 >> --- a/drivers/cxl/Kconfig >> +++ b/drivers/cxl/Kconfig >> @@ -154,4 +154,21 @@ config CXL_FEATURES >> >> If unsure say 'y'. >> >> +config CXL_RAS_FEATURES >> + tristate "CXL: Memory RAS features" >> + depends on CXL_PCI > >What is the build dependency on CXL_PCI? This enabling does not call back into >symbols provided by cxl_pci.ko does it? Will remove, which is not required. Initially cxl_mem_ras_features_init() was called from the pci.c > >> + depends on CXL_MEM > >Similar comment, and this also goes away if all of this just moves into the new >cxl_features driver. Agree with Jonathan told in reply. These are RAS specific features for CXL memory devices and thus added in memfeature.c > >> + depends on EDAC >> + help >> + The CXL memory RAS feature control is optional and allows host to >> + control the RAS features configurations of CXL Type 3 devices. >> + >> + It registers with the EDAC device subsystem to expose control >> + attributes of CXL memory device's RAS features to the user. >> + It provides interface functions to support configuring the CXL >> + memory device's RAS features. >> + Say 'y/m/n' to enable/disable control of the CXL.mem device's RAS >features. >> + See section 8.2.9.9.11 of CXL 3.1 specification for the detailed >> + information of CXL memory device features. > >Usually the "say X" statement provides a rationale like. > >"Say y/m if you have an expert need to change default memory scrub rates >established by the platform/device, otherwise say n" Will change. > >> + >> endif >> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile >> index 73b6348afd67..54baca513ecb 100644 >> --- a/drivers/cxl/core/Makefile >> +++ b/drivers/cxl/core/Makefile >> @@ -17,3 +17,4 @@ cxl_core-y += cdat.o cxl_core-y += features.o >> cxl_core-$(CONFIG_TRACING) += trace.o >> cxl_core-$(CONFIG_CXL_REGION) += region.o >> +cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeature.o >> diff --git a/drivers/cxl/core/memfeature.c >> b/drivers/cxl/core/memfeature.c new file mode 100644 index >> 000000000000..77d1bf6ce45f >> --- /dev/null >> +++ b/drivers/cxl/core/memfeature.c >> @@ -0,0 +1,392 @@ >> +// SPDX-License-Identifier: GPL-2.0-or-later >> +/* >> + * CXL memory RAS feature driver. >> + * >> + * Copyright (c) 2024 HiSilicon Limited. >> + * >> + * - Supports functions to configure RAS features of the >> + * CXL memory devices. >> + * - Registers with the EDAC device subsystem driver to expose >> + * the features sysfs attributes to the user for configuring >> + * CXL memory RAS feature. >> + */ >> + >> +#include <linux/cleanup.h> >> +#include <linux/edac.h> >> +#include <linux/limits.h> >> +#include <cxl/features.h> >> +#include <cxl.h> >> +#include <cxlmem.h> >> + >> +#define CXL_DEV_NUM_RAS_FEATURES 1 >> +#define CXL_DEV_HOUR_IN_SECS 3600 >> + >> +#define CXL_DEV_NAME_LEN 128 >> + >> +/* CXL memory patrol scrub control functions */ struct >> +cxl_patrol_scrub_context { >> + u8 instance; >> + u16 get_feat_size; >> + u16 set_feat_size; >> + u8 get_version; >> + u8 set_version; >> + u16 effects; >> + struct cxl_memdev *cxlmd; >> + struct cxl_region *cxlr; >> +}; >> + >> +/** >> + * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data >structure. >> + * @enable: [IN & OUT] enable(1)/disable(0) patrol scrub. >> + * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is >changeable. >> + * @scrub_cycle_hrs: [IN] Requested patrol scrub cycle in hours. >> + * [OUT] Current patrol scrub cycle in hours. >> + * @min_scrub_cycle_hrs:[OUT] minimum patrol scrub cycle in hours >supported. >> + */ >> +struct cxl_memdev_ps_params { >> + bool enable; >> + bool scrub_cycle_changeable; >> + u8 scrub_cycle_hrs; >> + u8 min_scrub_cycle_hrs; >> +}; >> + >> +enum cxl_scrub_param { >> + CXL_PS_PARAM_ENABLE, >> + CXL_PS_PARAM_SCRUB_CYCLE, >> +}; >> + >> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK BIT(0) >> +#define > CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK > BIT(1) >> +#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK GENMASK(7, 0) >> +#define CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK GENMASK(15, >8) >> +#define CXL_MEMDEV_PS_FLAG_ENABLED_MASK BIT(0) >> + >> +struct cxl_memdev_ps_rd_attrs { >> + u8 scrub_cycle_cap; >> + __le16 scrub_cycle_hrs; >> + u8 scrub_flags; >> +} __packed; >> + >> +struct cxl_memdev_ps_wr_attrs { >> + u8 scrub_cycle_hrs; >> + u8 scrub_flags; >> +} __packed; > >If these are packed to match specification layout, include a specification >reference comment. Will add specification reference comment. Added same for memory repair features, but missed here. > >> + >> +static int cxl_mem_ps_get_attrs(struct cxl_mailbox *cxl_mbox, >> + struct cxl_memdev_ps_params *params) { >> + size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs); >> + u16 scrub_cycle_hrs; >> + size_t data_size; >> + u16 return_code; >> + struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) = >> + kmalloc(rd_data_size, >GFP_KERNEL); > >I would feel better with kzalloc() if short reads are possible. Will change to kzalloc(). > >How big can rd_data_size get? I.e. should this be kvzalloc()? rd_data_size is 4 bytes for the patrol scrub feature. > >> + if (!rd_attrs) >> + return -ENOMEM; >> + >> + data_size = cxl_get_feature(cxl_mbox->features, >CXL_FEAT_PATROL_SCRUB_UUID, >> + CXL_GET_FEAT_SEL_CURRENT_VALUE, >> + rd_attrs, rd_data_size, 0, &return_code); >> + if (!data_size || return_code != CXL_MBOX_CMD_RC_SUCCESS) >> + return -EIO; >> + >> + params->scrub_cycle_changeable = >FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK, >> + rd_attrs->scrub_cycle_cap); >> + params->enable = >FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK, >> + rd_attrs->scrub_flags); >> + scrub_cycle_hrs = le16_to_cpu(rd_attrs->scrub_cycle_hrs); >> + params->scrub_cycle_hrs = >FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, >> + scrub_cycle_hrs); >> + params->min_scrub_cycle_hrs = >FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK, >> + scrub_cycle_hrs); >> + >> + return 0; >> +} >> + >> +static int cxl_ps_get_attrs(struct cxl_patrol_scrub_context *cxl_ps_ctx, >> + struct cxl_memdev_ps_params *params) { >> + struct cxl_memdev *cxlmd; >> + u16 min_scrub_cycle = 0; >> + int i, ret; >> + >> + if (cxl_ps_ctx->cxlr) { >> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr; >> + struct cxl_region_params *p = &cxlr->params; >> + >> + for (i = p->interleave_ways - 1; i >= 0; i--) { >> + struct cxl_endpoint_decoder *cxled = p->targets[i]; > >It looks like this is called directly as a callback from EDAC. Where is the locking >that keeps cxl_ps_ctx->cxlr valid, or p->targets content stable? Jonathan already replied. > >> + >> + cxlmd = cxled_to_memdev(cxled); >> + ret = cxl_mem_ps_get_attrs(&cxlmd->cxlds->cxl_mbox, >params); >> + if (ret) >> + return ret; >> + >> + if (params->min_scrub_cycle_hrs > min_scrub_cycle) >> + min_scrub_cycle = params- >>min_scrub_cycle_hrs; >> + } >> + params->min_scrub_cycle_hrs = min_scrub_cycle; >> + return 0; >> + } >> + cxlmd = cxl_ps_ctx->cxlmd; >> + >> + return cxl_mem_ps_get_attrs(&cxlmd->cxlds->cxl_mbox, params); } >> + >> +static int cxl_mem_ps_set_attrs(struct device *dev, >> + struct cxl_patrol_scrub_context *cxl_ps_ctx, >> + struct cxl_mailbox *cxl_mbox, >> + struct cxl_memdev_ps_params *params, >> + enum cxl_scrub_param param_type) >> +{ >> + struct cxl_memdev_ps_wr_attrs wr_attrs; >> + struct cxl_memdev_ps_params rd_params; >> + u16 return_code; >> + int ret; >> + >> + ret = cxl_mem_ps_get_attrs(cxl_mbox, &rd_params); >> + if (ret) { >> + dev_err(dev, "Get cxlmemdev patrol scrub params failed >ret=%d\n", >> + ret); >> + return ret; >> + } >> + >> + switch (param_type) { >> + case CXL_PS_PARAM_ENABLE: >> + wr_attrs.scrub_flags = >FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK, >> + params->enable); >> + wr_attrs.scrub_cycle_hrs = >FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, >> + >rd_params.scrub_cycle_hrs); >> + break; >> + case CXL_PS_PARAM_SCRUB_CYCLE: >> + if (params->scrub_cycle_hrs < rd_params.min_scrub_cycle_hrs) >{ >> + dev_err(dev, "Invalid CXL patrol scrub cycle(%d) to >set\n", >> + params->scrub_cycle_hrs); >> + dev_err(dev, "Minimum supported CXL patrol scrub >cycle in hour %d\n", >> + rd_params.min_scrub_cycle_hrs); >> + return -EINVAL; >> + } >> + wr_attrs.scrub_cycle_hrs = >FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, >> + params->scrub_cycle_hrs); >> + wr_attrs.scrub_flags = >FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK, >> + rd_params.enable); >> + break; >> + } >> + >> + ret = cxl_set_feature(cxl_mbox->features, >CXL_FEAT_PATROL_SCRUB_UUID, >> + cxl_ps_ctx->set_version, >> + &wr_attrs, sizeof(wr_attrs), >> + CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET, >> + 0, &return_code); >> + if (ret || return_code != CXL_MBOX_CMD_RC_SUCCESS) { >> + dev_err(dev, "CXL patrol scrub set feature failed ret=%d >return_code=%u\n", >> + ret, return_code); > >What can the admin do with this log spam? I would reconsider making all of >these dev_dbg() and improving the sysfs documentation on what error codes >mean. Sure will change. > >[..] >> + >> +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct >> +cxl_region *cxlr) > >Please separate this into a memdev helper and a region helper. It is silly to have >two arguments to a function where one is expected to be NULL at all times, and >then have an if else statement inside that to effectively turn it back into 2 code >paths. > >If there is code to be shared amongst those, make *that* the shared helper. I added single function cxl_mem_ras_features_init() for both memdev and region based scrubbing to reduce code size as there were feedbacks try reduce code size. > >> +{ >> + struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES]; >> + char cxl_dev_name[CXL_DEV_NAME_LEN]; >> + int num_ras_features = 0; >> + u8 scrub_inst = 0; >> + int rc; >> + >> + rc = cxl_memdev_scrub_init(cxlmd, cxlr, >&ras_features[num_ras_features], >> + scrub_inst); >> + if (rc < 0) >> + return rc; >> + >> + scrub_inst++; >> + num_ras_features++; >> + >> + if (cxlr) >> + snprintf(cxl_dev_name, sizeof(cxl_dev_name), >> + "cxl_region%d", cxlr->id); > >Why not pass dev_name(&cxlr->dev) directly? Jonathan already replied. > >> + else >> + snprintf(cxl_dev_name, sizeof(cxl_dev_name), >> + "%s_%s", "cxl", dev_name(&cxlmd->dev)); > >Can a "cxl" directory be created so that the raw name can be used? > >> + >> + return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL, >> + num_ras_features, ras_features); > >I'm so confused... a few lines down in this patch we have: > > rc = cxl_mem_ras_features_init(NULL, cxlr); > >...so how can this call to edac_dev_register() unconditionally de-reference >@cxlmd? Thanks for spotting this. It is a bug, need to fix , cxlmd inited for region based scrubbing was done inside cxl_mem_ras_features_init() previously, which now moved to inside cxl_memdev_scrub_init(). Region based scrubbing required better testing because of some difficulty in running this use case in my test setup. Will check with Jonathan how to do. > >Are there any tests for this? cxl-test is purpose-built for this kind of basic >coverage tests. Will check this. > >> +EXPORT_SYMBOL_NS_GPL(cxl_mem_ras_features_init, "CXL"); >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c >> index b98b1ccffd1c..c2be70cd87f8 100644 >> --- a/drivers/cxl/core/region.c >> +++ b/drivers/cxl/core/region.c >> @@ -3449,6 +3449,12 @@ static int cxl_region_probe(struct device *dev) >> p->res->start, p->res->end, cxlr, >> is_system_ram) > 0) >> return 0; >> + >> + rc = cxl_mem_ras_features_init(NULL, cxlr); >> + if (rc) >> + dev_warn(&cxlr->dev, "CXL RAS features init for >region_id=%d failed\n", >> + cxlr->id); > >There is more to RAS than EDAC memory scrub so this message is misleading. It >is also unnecessary because the driver continues to load and the admin, if they >care, will notice that the EDAC attributes are missing. This message was added for the debugging purpose in CXL driver. I will change to dev_dbg(). > >> + >> return devm_cxl_add_dax_region(cxlr); >> default: >> dev_dbg(&cxlr->dev, "unsupported region mode: %d\n", diff -- >git >> a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index >> 55c55685cb39..2b02e47cd7e7 100644 >> --- a/drivers/cxl/cxlmem.h >> +++ b/drivers/cxl/cxlmem.h >> @@ -800,6 +800,13 @@ int cxl_trigger_poison_list(struct cxl_memdev >> *cxlmd); int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa); >> int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa); >> >> +#if IS_ENABLED(CONFIG_CXL_RAS_FEATURES) >> +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct >> +cxl_region *cxlr); #else static inline int >> +cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region >> +*cxlr) { return 0; } #endif >> + >> #ifdef CONFIG_CXL_SUSPEND >> void cxl_mem_active_inc(void); >> void cxl_mem_active_dec(void); >> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c index >> 2f03a4d5606e..d236b4b8a93c 100644 >> --- a/drivers/cxl/mem.c >> +++ b/drivers/cxl/mem.c >> @@ -116,6 +116,10 @@ static int cxl_mem_probe(struct device *dev) >> if (!cxlds->media_ready) >> return -EBUSY; >> >> + rc = cxl_mem_ras_features_init(cxlmd, NULL); >> + if (rc) >> + dev_warn(&cxlmd->dev, "CXL RAS features init failed\n"); >> + >> /* >> * Someone is trying to reattach this device after it lost its port >> * connection (an endpoint port previously registered by this memdev >> was @@ -259,3 +263,4 @@ >MODULE_ALIAS_CXL(CXL_DEVICE_MEMORY_EXPANDER); >> * endpoint registration. >> */ >> MODULE_SOFTDEP("pre: cxl_port"); >> +MODULE_SOFTDEP("pre: cxl_features"); > >Why? This dependency is no more required. During integration testing, this was added when cxl_features found was not getting initialized when CXL memdev RAS features are getting initialized, which calls features command function, cxl_get_supported_feature_entry, for the RAS features. The reason was different from this and got fixed. Thanks, Shiju
Shiju Jose wrote: > Hi Dan, > > Thanks for the comments. > > Please find reply inline. > > Thanks, > Shiju > >-----Original Message----- > >From: Dan Williams <dan.j.williams@intel.com> > >Sent: 24 January 2025 20:39 > >To: Shiju Jose <shiju.jose@huawei.com>; linux-edac@vger.kernel.org; linux- > >cxl@vger.kernel.org; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux- > >kernel@vger.kernel.org > >Cc: bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org; > >mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan > >Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com; > >alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com; > >david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com; > >Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com; > >Jon.Grimm@amd.com; dave.hansen@linux.intel.com; > >naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com; > >somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; > >duenwen@google.com; gthelen@google.com; > >wschwartz@amperecomputing.com; dferguson@amperecomputing.com; > >wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei > ><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto > >Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com; > >wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm > ><linuxarm@huawei.com>; Shiju Jose <shiju.jose@huawei.com> > >Subject: Re: [PATCH v18 15/19] cxl/memfeature: Add CXL memory device patrol > >scrub control feature > > > >shiju.jose@ wrote: > >> From: Shiju Jose <shiju.jose@huawei.com> > >> > >> CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub > >> control feature. The device patrol scrub proactively locates and makes > >> corrections to errors in regular cycle. > >> > >> Allow specifying the number of hours within which the patrol scrub > >> must be completed, subject to minimum and maximum limits reported by the > >device. > >> Also allow disabling scrub allowing trade-off error rates against > >> performance. > >> > >> Add support for patrol scrub control on CXL memory devices. > >> Register with the EDAC device driver, which retrieves the scrub > >> attribute descriptors from EDAC scrub and exposes the sysfs scrub > >> control attributes to userspace. For example, scrub control for the > >> CXL memory device "cxl_mem0" is exposed in > >/sys/bus/edac/devices/cxl_mem0/scrubX/. > >> > >> Additionally, add support for region-based CXL memory patrol scrub control. > >> CXL memory regions may be interleaved across one or more CXL memory > >> devices. For example, region-based scrub control for "cxl_region1" is > >> exposed in /sys/bus/edac/devices/cxl_region1/scrubX/. > >> > >> Reviewed-by: Dave Jiang <dave.jiang@intel.com> > >> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > >> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> > >> --- > >> Documentation/edac/scrub.rst | 66 ++++++ > >> drivers/cxl/Kconfig | 17 ++ > >> drivers/cxl/core/Makefile | 1 + > >> drivers/cxl/core/memfeature.c | 392 > >++++++++++++++++++++++++++++++++++ > >> drivers/cxl/core/region.c | 6 + > >> drivers/cxl/cxlmem.h | 7 + > >> drivers/cxl/mem.c | 5 + > >> include/cxl/features.h | 16 ++ > >> 8 files changed, 510 insertions(+) > >> create mode 100644 drivers/cxl/core/memfeature.c diff --git > >> a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst index > >> f86645c7f0af..80e986c57885 100644 > >> --- a/Documentation/edac/scrub.rst > >> +++ b/Documentation/edac/scrub.rst [..] > > > >What is this content-free blob of cat and echo statements? Please write actual > >documentation with theory of operation, clarification of assumptions, rationale > >for defaults, guidance on changing defaults... > > Jonathan already replied. I disagree that any of that is useful to include without rationale, and if the rationale is already somewhere else then delete the multiple lines of showing how 'cat' and 'echo' work with sysfs. [..] > >> + depends on CXL_MEM > > > >Similar comment, and this also goes away if all of this just moves into the new > >cxl_features driver. > > Agree with Jonathan told in reply. These are RAS specific features for CXL memory devices and > thus added in memfeature.c Apoligies for this comment, I had meant to delete it along with some other commentary along this theme after thinking it through. I am now advocating that Dave drop his cxl_features driver altogether and mirror your approach. I.e. EDAC is registered from existing CXL drivers, and FWCTL can be registered against a cxl_memdev just like the fw_upload ABI. There was a concern that CXL needed a separate FWCTL driver in case distributions wanted to have a policy against FWCTL, but given CXL already has CONFIG_CXL_MEM_RAW_COMMANDS at compile-time and a wide array of CXL bus devices, a cxl_features device is an awkward fit. [..] > >> +static int cxl_ps_get_attrs(struct cxl_patrol_scrub_context *cxl_ps_ctx, > >> + struct cxl_memdev_ps_params *params) { > >> + struct cxl_memdev *cxlmd; > >> + u16 min_scrub_cycle = 0; > >> + int i, ret; > >> + > >> + if (cxl_ps_ctx->cxlr) { > >> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr; > >> + struct cxl_region_params *p = &cxlr->params; > >> + > >> + for (i = p->interleave_ways - 1; i >= 0; i--) { > >> + struct cxl_endpoint_decoder *cxled = p->targets[i]; > > > >It looks like this is called directly as a callback from EDAC. Where is the locking > >that keeps cxl_ps_ctx->cxlr valid, or p->targets content stable? > Jonathan already replied. I could not find that comment? I *think* it's ok because when the region is in the probe state changes will not be made to this list, but it would be useful to at least have commentary to that effect. Protect against someone copying this code in isolation and not consider the context. [..] > >> + > >> +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct > >> +cxl_region *cxlr) > > > >Please separate this into a memdev helper and a region helper. It is silly to have > >two arguments to a function where one is expected to be NULL at all times, and > >then have an if else statement inside that to effectively turn it back into 2 code > >paths. > > > >If there is code to be shared amongst those, make *that* the shared helper. > I added single function cxl_mem_ras_features_init() for both memdev and region based > scrubbing to reduce code size as there were feedbacks try reduce code size. "Succint" and "concise" does not necessarily mean less lines. I would greatly prefer a few more lines if it mines not outsourcing complexity to the calling context. Readable code means I do not need to wonder what: cxl_mem_ras_features_init(NULL, cxlr) ...means. I can just read devm_cxl_region_edac_register(cxlr), and know exactly what is happening without needing to lose my train of thought to go read what semantics cxl_mem_ras_features_init() is implementing. Note that all the other _init() calls in drivers/cxl/ (outside of module_init callbacks), are just purely init work, not object registration. Please keep that local style. > >> +{ > >> + struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES]; > >> + char cxl_dev_name[CXL_DEV_NAME_LEN]; > >> + int num_ras_features = 0; > >> + u8 scrub_inst = 0; > >> + int rc; > >> + > >> + rc = cxl_memdev_scrub_init(cxlmd, cxlr, > >&ras_features[num_ras_features], > >> + scrub_inst); > >> + if (rc < 0) > >> + return rc; > >> + > >> + scrub_inst++; > >> + num_ras_features++; > >> + > >> + if (cxlr) > >> + snprintf(cxl_dev_name, sizeof(cxl_dev_name), > >> + "cxl_region%d", cxlr->id); > > > >Why not pass dev_name(&cxlr->dev) directly? > Jonathan already replied. That was purely with the cxl_mem observation, cxlr can be passed directly. > > > >> + else > >> + snprintf(cxl_dev_name, sizeof(cxl_dev_name), > >> + "%s_%s", "cxl", dev_name(&cxlmd->dev)); > > > >Can a "cxl" directory be created so that the raw name can be used? In fact we already do something similar for CONFIG_HMEM_REPORTING (i.e. an "access%d" device to create a nameed directory of attributes) so it is a question for Boris if he wants to tolerate a parent "cxl" device to parent all CXL objects in EDAC. > > > >> + > >> + return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL, > >> + num_ras_features, ras_features); > > > >I'm so confused... a few lines down in this patch we have: > > > > rc = cxl_mem_ras_features_init(NULL, cxlr); > > > >...so how can this call to edac_dev_register() unconditionally de-reference > >@cxlmd? > Thanks for spotting this. It is a bug, need to fix. [..] > >> +EXPORT_SYMBOL_NS_GPL(cxl_mem_ras_features_init, "CXL"); > >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > >> index b98b1ccffd1c..c2be70cd87f8 100644 > >> --- a/drivers/cxl/core/region.c > >> +++ b/drivers/cxl/core/region.c > >> @@ -3449,6 +3449,12 @@ static int cxl_region_probe(struct device *dev) > >> p->res->start, p->res->end, cxlr, > >> is_system_ram) > 0) > >> return 0; > >> + > >> + rc = cxl_mem_ras_features_init(NULL, cxlr); > >> + if (rc) > >> + dev_warn(&cxlr->dev, "CXL RAS features init for > >region_id=%d failed\n", > >> + cxlr->id); > > > >There is more to RAS than EDAC memory scrub so this message is misleading. It > >is also unnecessary because the driver continues to load and the admin, if they > >care, will notice that the EDAC attributes are missing. > This message was added for the debugging purpose in CXL driver. I will change to dev_dbg(). ...but also stop calling this functionality with the blanket term "RAS". It is "EDAC scrub and repair extensions to all the other RAS functionality the CXL subsystem handles directly", name it accordingly.
>-----Original Message----- >From: Dan Williams <dan.j.williams@intel.com> >Sent: 27 January 2025 23:17 >To: Shiju Jose <shiju.jose@huawei.com>; Dan Williams ><dan.j.williams@intel.com>; linux-edac@vger.kernel.org; linux- >cxl@vger.kernel.org; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux- >kernel@vger.kernel.org >Cc: bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org; >mchehab@kernel.org; dave@stgolabs.net; Jonathan Cameron ><jonathan.cameron@huawei.com>; dave.jiang@intel.com; >alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com; >david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com; >Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com; >Jon.Grimm@amd.com; dave.hansen@linux.intel.com; >naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com; >somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; >duenwen@google.com; gthelen@google.com; >wschwartz@amperecomputing.com; dferguson@amperecomputing.com; >wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei ><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto >Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com; >wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm ><linuxarm@huawei.com> >Subject: RE: [PATCH v18 15/19] cxl/memfeature: Add CXL memory device patrol >scrub control feature > >Shiju Jose wrote: >> Hi Dan, >> >> Thanks for the comments. >> >> Please find reply inline. >> >> Thanks, >> Shiju >> >-----Original Message----- >> >From: Dan Williams <dan.j.williams@intel.com> >> >Sent: 24 January 2025 20:39 >> >To: Shiju Jose <shiju.jose@huawei.com>; linux-edac@vger.kernel.org; >> >linux- cxl@vger.kernel.org; linux-acpi@vger.kernel.org; >> >linux-mm@kvack.org; linux- kernel@vger.kernel.org >> >Cc: bp@alien8.de; tony.luck@intel.com; rafael@kernel.org; >> >lenb@kernel.org; mchehab@kernel.org; dan.j.williams@intel.com; >> >dave@stgolabs.net; Jonathan Cameron <jonathan.cameron@huawei.com>; >> >dave.jiang@intel.com; alison.schofield@intel.com; >> >vishal.l.verma@intel.com; ira.weiny@intel.com; david@redhat.com; >> >Vilas.Sridharan@amd.com; leo.duran@amd.com; >Yazen.Ghannam@amd.com; >> >rientjes@google.com; jiaqiyan@google.com; Jon.Grimm@amd.com; >> >dave.hansen@linux.intel.com; naoya.horiguchi@nec.com; >> >james.morse@arm.com; jthoughton@google.com; >somasundaram.a@hpe.com; >> >erdemaktas@google.com; pgonda@google.com; duenwen@google.com; >> >gthelen@google.com; wschwartz@amperecomputing.com; >> >dferguson@amperecomputing.com; wbs@os.amperecomputing.com; >> >nifan.cxl@gmail.com; tanxiaofei <tanxiaofei@huawei.com>; Zengtao (B) >> ><prime.zeng@hisilicon.com>; Roberto Sassu <roberto.sassu@huawei.com>; >> >kangkang.shen@futurewei.com; wanghuiqiang ><wanghuiqiang@huawei.com>; >> >Linuxarm <linuxarm@huawei.com>; Shiju Jose <shiju.jose@huawei.com> >> >Subject: Re: [PATCH v18 15/19] cxl/memfeature: Add CXL memory device >> >patrol scrub control feature >> > >> >shiju.jose@ wrote: >> >> From: Shiju Jose <shiju.jose@huawei.com> >> >> >> >> CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub >> >> control feature. The device patrol scrub proactively locates and >> >> makes corrections to errors in regular cycle. >> >> >> >> Allow specifying the number of hours within which the patrol scrub >> >> must be completed, subject to minimum and maximum limits reported >> >> by the >> >device. >> >> Also allow disabling scrub allowing trade-off error rates against >> >> performance. >> >> >> >> Add support for patrol scrub control on CXL memory devices. >> >> Register with the EDAC device driver, which retrieves the scrub >> >> attribute descriptors from EDAC scrub and exposes the sysfs scrub >> >> control attributes to userspace. For example, scrub control for the >> >> CXL memory device "cxl_mem0" is exposed in >> >/sys/bus/edac/devices/cxl_mem0/scrubX/. >> >> >> >> Additionally, add support for region-based CXL memory patrol scrub >control. >> >> CXL memory regions may be interleaved across one or more CXL memory >> >> devices. For example, region-based scrub control for "cxl_region1" >> >> is exposed in /sys/bus/edac/devices/cxl_region1/scrubX/. >> >> >> >> Reviewed-by: Dave Jiang <dave.jiang@intel.com> >> >> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> >> >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> >> >> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> >> >> --- >> >> Documentation/edac/scrub.rst | 66 ++++++ >> >> drivers/cxl/Kconfig | 17 ++ >> >> drivers/cxl/core/Makefile | 1 + >> >> drivers/cxl/core/memfeature.c | 392 >> >++++++++++++++++++++++++++++++++++ >> >> drivers/cxl/core/region.c | 6 + >> >> drivers/cxl/cxlmem.h | 7 + >> >> drivers/cxl/mem.c | 5 + >> >> include/cxl/features.h | 16 ++ >> >> 8 files changed, 510 insertions(+) create mode 100644 >> >> drivers/cxl/core/memfeature.c diff --git >> >> a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst index >> >> f86645c7f0af..80e986c57885 100644 >> >> --- a/Documentation/edac/scrub.rst >> >> +++ b/Documentation/edac/scrub.rst >[..] >> > >> >What is this content-free blob of cat and echo statements? Please >> >write actual documentation with theory of operation, clarification of >> >assumptions, rationale for defaults, guidance on changing defaults... >> >> Jonathan already replied. > >I disagree that any of that is useful to include without rationale, and if the >rationale is already somewhere else then delete the multiple lines of showing >how 'cat' and 'echo' work with sysfs. I will discuss with Jonathan on this how to modify. > >[..] >> >> + depends on CXL_MEM >> > >> >Similar comment, and this also goes away if all of this just moves >> >into the new cxl_features driver. >> >> Agree with Jonathan told in reply. These are RAS specific features >> for CXL memory devices and thus added in memfeature.c > >Apoligies for this comment, I had meant to delete it along with some other >commentary along this theme after thinking it through. > >I am now advocating that Dave drop his cxl_features driver altogether and >mirror your approach. I.e. EDAC is registered from existing CXL drivers, and >FWCTL can be registered against a cxl_memdev just like the fw_upload ABI. > >There was a concern that CXL needed a separate FWCTL driver in case >distributions wanted to have a policy against FWCTL, but given CXL already has >CONFIG_CXL_MEM_RAW_COMMANDS at compile-time and a wide array of CXL >bus devices, a cxl_features device is an awkward fit. Ok. > >[..] >> >> +static int cxl_ps_get_attrs(struct cxl_patrol_scrub_context *cxl_ps_ctx, >> >> + struct cxl_memdev_ps_params *params) { >> >> + struct cxl_memdev *cxlmd; >> >> + u16 min_scrub_cycle = 0; >> >> + int i, ret; >> >> + >> >> + if (cxl_ps_ctx->cxlr) { >> >> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr; >> >> + struct cxl_region_params *p = &cxlr->params; >> >> + >> >> + for (i = p->interleave_ways - 1; i >= 0; i--) { >> >> + struct cxl_endpoint_decoder *cxled = p->targets[i]; >> > >> >It looks like this is called directly as a callback from EDAC. Where >> >is the locking that keeps cxl_ps_ctx->cxlr valid, or p->targets content stable? >> Jonathan already replied. > >I could not find that comment? I *think* it's ok because when the region is in the >probe state changes will not be made to this list, but it would be useful to at >least have commentary to that effect. Protect against someone copying this >code in isolation and not consider the context. Sure. Will do. > >[..] >> >> + >> >> +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct >> >> +cxl_region *cxlr) >> > >> >Please separate this into a memdev helper and a region helper. It is >> >silly to have two arguments to a function where one is expected to be >> >NULL at all times, and then have an if else statement inside that to >> >effectively turn it back into 2 code paths. >> > >> >If there is code to be shared amongst those, make *that* the shared helper. >> I added single function cxl_mem_ras_features_init() for both memdev >> and region based scrubbing to reduce code size as there were feedbacks try >reduce code size. > >"Succint" and "concise" does not necessarily mean less lines. I would greatly >prefer a few more lines if it mines not outsourcing complexity to the calling >context. Readable code means I do not need to wonder >what: > > cxl_mem_ras_features_init(NULL, cxlr) > >...means. I can just read devm_cxl_region_edac_register(cxlr), and know exactly >what is happening without needing to lose my train of thought to go read what >semantics cxl_mem_ras_features_init() is implementing. > >Note that all the other _init() calls in drivers/cxl/ (outside of module_init >callbacks), are just purely init work, not object registration. Please keep that >local style. Sure. Will add separate functions for region based edac registration. > >> >> +{ >> >> + struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES]; >> >> + char cxl_dev_name[CXL_DEV_NAME_LEN]; >> >> + int num_ras_features = 0; >> >> + u8 scrub_inst = 0; >> >> + int rc; >> >> + >> >> + rc = cxl_memdev_scrub_init(cxlmd, cxlr, >> >&ras_features[num_ras_features], >> >> + scrub_inst); >> >> + if (rc < 0) >> >> + return rc; >> >> + >> >> + scrub_inst++; >> >> + num_ras_features++; >> >> + >> >> + if (cxlr) >> >> + snprintf(cxl_dev_name, sizeof(cxl_dev_name), >> >> + "cxl_region%d", cxlr->id); >> > >> >Why not pass dev_name(&cxlr->dev) directly? >> Jonathan already replied. > >That was purely with the cxl_mem observation, cxlr can be passed directly. Will check. > >> > >> >> + else >> >> + snprintf(cxl_dev_name, sizeof(cxl_dev_name), >> >> + "%s_%s", "cxl", dev_name(&cxlmd->dev)); >> > >> >Can a "cxl" directory be created so that the raw name can be used? > >In fact we already do something similar for CONFIG_HMEM_REPORTING (i.e. >an "access%d" device to create a nameed directory of attributes) so it is a >question for Boris if he wants to tolerate a parent "cxl" device to parent all CXL >objects in EDAC. > >> > >> >> + >> >> + return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL, >> >> + num_ras_features, ras_features); >> > >> >I'm so confused... a few lines down in this patch we have: >> > >> > rc = cxl_mem_ras_features_init(NULL, cxlr); >> > >> >...so how can this call to edac_dev_register() unconditionally >> >de-reference @cxlmd? >> Thanks for spotting this. It is a bug, need to fix. > > >[..] >> >> +EXPORT_SYMBOL_NS_GPL(cxl_mem_ras_features_init, "CXL"); >> >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c >> >> index b98b1ccffd1c..c2be70cd87f8 100644 >> >> --- a/drivers/cxl/core/region.c >> >> +++ b/drivers/cxl/core/region.c >> >> @@ -3449,6 +3449,12 @@ static int cxl_region_probe(struct device *dev) >> >> p->res->start, p->res->end, cxlr, >> >> is_system_ram) > 0) >> >> return 0; >> >> + >> >> + rc = cxl_mem_ras_features_init(NULL, cxlr); >> >> + if (rc) >> >> + dev_warn(&cxlr->dev, "CXL RAS features init for >> >region_id=%d failed\n", >> >> + cxlr->id); >> > >> >There is more to RAS than EDAC memory scrub so this message is >> >misleading. It is also unnecessary because the driver continues to >> >load and the admin, if they care, will notice that the EDAC attributes are >missing. >> This message was added for the debugging purpose in CXL driver. I will change >to dev_dbg(). > >...but also stop calling this functionality with the blanket term "RAS". >It is "EDAC scrub and repair extensions to all the other RAS functionality the CXL >subsystem handles directly", name it accordingly. Sure. Will change. Thanks, Shiju
diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst index f86645c7f0af..80e986c57885 100644 --- a/Documentation/edac/scrub.rst +++ b/Documentation/edac/scrub.rst @@ -325,3 +325,69 @@ root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_d 10800 root@localhost:~# echo 0 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background + +2. CXL memory device patrol scrubber + +2.1 Device based scrubbing + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/min_cycle_duration + +3600 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/max_cycle_duration + +918000 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration + +43200 + +root@localhost:~# echo 54000 > /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration + +54000 + +root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background + +1 + +root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background + +0 + +2.2. Region based scrubbing + +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/min_cycle_duration + +3600 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/max_cycle_duration + +918000 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration + +43200 + +root@localhost:~# echo 54000 > /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration + +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration + +54000 + +root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background + +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/enable_background + +1 + +root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background + +root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/enable_background + +0 diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 0bc6a2cb8474..6078f02e883b 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -154,4 +154,21 @@ config CXL_FEATURES If unsure say 'y'. +config CXL_RAS_FEATURES + tristate "CXL: Memory RAS features" + depends on CXL_PCI + depends on CXL_MEM + depends on EDAC + help + The CXL memory RAS feature control is optional and allows host to + control the RAS features configurations of CXL Type 3 devices. + + It registers with the EDAC device subsystem to expose control + attributes of CXL memory device's RAS features to the user. + It provides interface functions to support configuring the CXL + memory device's RAS features. + Say 'y/m/n' to enable/disable control of the CXL.mem device's RAS features. + See section 8.2.9.9.11 of CXL 3.1 specification for the detailed + information of CXL memory device features. + endif diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile index 73b6348afd67..54baca513ecb 100644 --- a/drivers/cxl/core/Makefile +++ b/drivers/cxl/core/Makefile @@ -17,3 +17,4 @@ cxl_core-y += cdat.o cxl_core-y += features.o cxl_core-$(CONFIG_TRACING) += trace.o cxl_core-$(CONFIG_CXL_REGION) += region.o +cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeature.o diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c new file mode 100644 index 000000000000..77d1bf6ce45f --- /dev/null +++ b/drivers/cxl/core/memfeature.c @@ -0,0 +1,392 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * CXL memory RAS feature driver. + * + * Copyright (c) 2024 HiSilicon Limited. + * + * - Supports functions to configure RAS features of the + * CXL memory devices. + * - Registers with the EDAC device subsystem driver to expose + * the features sysfs attributes to the user for configuring + * CXL memory RAS feature. + */ + +#include <linux/cleanup.h> +#include <linux/edac.h> +#include <linux/limits.h> +#include <cxl/features.h> +#include <cxl.h> +#include <cxlmem.h> + +#define CXL_DEV_NUM_RAS_FEATURES 1 +#define CXL_DEV_HOUR_IN_SECS 3600 + +#define CXL_DEV_NAME_LEN 128 + +/* CXL memory patrol scrub control functions */ +struct cxl_patrol_scrub_context { + u8 instance; + u16 get_feat_size; + u16 set_feat_size; + u8 get_version; + u8 set_version; + u16 effects; + struct cxl_memdev *cxlmd; + struct cxl_region *cxlr; +}; + +/** + * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data structure. + * @enable: [IN & OUT] enable(1)/disable(0) patrol scrub. + * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is changeable. + * @scrub_cycle_hrs: [IN] Requested patrol scrub cycle in hours. + * [OUT] Current patrol scrub cycle in hours. + * @min_scrub_cycle_hrs:[OUT] minimum patrol scrub cycle in hours supported. + */ +struct cxl_memdev_ps_params { + bool enable; + bool scrub_cycle_changeable; + u8 scrub_cycle_hrs; + u8 min_scrub_cycle_hrs; +}; + +enum cxl_scrub_param { + CXL_PS_PARAM_ENABLE, + CXL_PS_PARAM_SCRUB_CYCLE, +}; + +#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK BIT(0) +#define CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK BIT(1) +#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK GENMASK(7, 0) +#define CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK GENMASK(15, 8) +#define CXL_MEMDEV_PS_FLAG_ENABLED_MASK BIT(0) + +struct cxl_memdev_ps_rd_attrs { + u8 scrub_cycle_cap; + __le16 scrub_cycle_hrs; + u8 scrub_flags; +} __packed; + +struct cxl_memdev_ps_wr_attrs { + u8 scrub_cycle_hrs; + u8 scrub_flags; +} __packed; + +static int cxl_mem_ps_get_attrs(struct cxl_mailbox *cxl_mbox, + struct cxl_memdev_ps_params *params) +{ + size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs); + u16 scrub_cycle_hrs; + size_t data_size; + u16 return_code; + struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) = + kmalloc(rd_data_size, GFP_KERNEL); + if (!rd_attrs) + return -ENOMEM; + + data_size = cxl_get_feature(cxl_mbox->features, CXL_FEAT_PATROL_SCRUB_UUID, + CXL_GET_FEAT_SEL_CURRENT_VALUE, + rd_attrs, rd_data_size, 0, &return_code); + if (!data_size || return_code != CXL_MBOX_CMD_RC_SUCCESS) + return -EIO; + + params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK, + rd_attrs->scrub_cycle_cap); + params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK, + rd_attrs->scrub_flags); + scrub_cycle_hrs = le16_to_cpu(rd_attrs->scrub_cycle_hrs); + params->scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, + scrub_cycle_hrs); + params->min_scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK, + scrub_cycle_hrs); + + return 0; +} + +static int cxl_ps_get_attrs(struct cxl_patrol_scrub_context *cxl_ps_ctx, + struct cxl_memdev_ps_params *params) +{ + struct cxl_memdev *cxlmd; + u16 min_scrub_cycle = 0; + int i, ret; + + if (cxl_ps_ctx->cxlr) { + struct cxl_region *cxlr = cxl_ps_ctx->cxlr; + struct cxl_region_params *p = &cxlr->params; + + for (i = p->interleave_ways - 1; i >= 0; i--) { + struct cxl_endpoint_decoder *cxled = p->targets[i]; + + cxlmd = cxled_to_memdev(cxled); + ret = cxl_mem_ps_get_attrs(&cxlmd->cxlds->cxl_mbox, params); + if (ret) + return ret; + + if (params->min_scrub_cycle_hrs > min_scrub_cycle) + min_scrub_cycle = params->min_scrub_cycle_hrs; + } + params->min_scrub_cycle_hrs = min_scrub_cycle; + return 0; + } + cxlmd = cxl_ps_ctx->cxlmd; + + return cxl_mem_ps_get_attrs(&cxlmd->cxlds->cxl_mbox, params); +} + +static int cxl_mem_ps_set_attrs(struct device *dev, + struct cxl_patrol_scrub_context *cxl_ps_ctx, + struct cxl_mailbox *cxl_mbox, + struct cxl_memdev_ps_params *params, + enum cxl_scrub_param param_type) +{ + struct cxl_memdev_ps_wr_attrs wr_attrs; + struct cxl_memdev_ps_params rd_params; + u16 return_code; + int ret; + + ret = cxl_mem_ps_get_attrs(cxl_mbox, &rd_params); + if (ret) { + dev_err(dev, "Get cxlmemdev patrol scrub params failed ret=%d\n", + ret); + return ret; + } + + switch (param_type) { + case CXL_PS_PARAM_ENABLE: + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK, + params->enable); + wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, + rd_params.scrub_cycle_hrs); + break; + case CXL_PS_PARAM_SCRUB_CYCLE: + if (params->scrub_cycle_hrs < rd_params.min_scrub_cycle_hrs) { + dev_err(dev, "Invalid CXL patrol scrub cycle(%d) to set\n", + params->scrub_cycle_hrs); + dev_err(dev, "Minimum supported CXL patrol scrub cycle in hour %d\n", + rd_params.min_scrub_cycle_hrs); + return -EINVAL; + } + wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK, + params->scrub_cycle_hrs); + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK, + rd_params.enable); + break; + } + + ret = cxl_set_feature(cxl_mbox->features, CXL_FEAT_PATROL_SCRUB_UUID, + cxl_ps_ctx->set_version, + &wr_attrs, sizeof(wr_attrs), + CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET, + 0, &return_code); + if (ret || return_code != CXL_MBOX_CMD_RC_SUCCESS) { + dev_err(dev, "CXL patrol scrub set feature failed ret=%d return_code=%u\n", + ret, return_code); + return ret; + } + + return 0; +} + +static int cxl_ps_set_attrs(struct device *dev, + struct cxl_patrol_scrub_context *cxl_ps_ctx, + struct cxl_memdev_ps_params *params, + enum cxl_scrub_param param_type) +{ + struct cxl_memdev *cxlmd; + int ret, i; + + if (cxl_ps_ctx->cxlr) { + struct cxl_region *cxlr = cxl_ps_ctx->cxlr; + struct cxl_region_params *p = &cxlr->params; + + for (i = p->interleave_ways - 1; i >= 0; i--) { + struct cxl_endpoint_decoder *cxled = p->targets[i]; + + cxlmd = cxled_to_memdev(cxled); + ret = cxl_mem_ps_set_attrs(dev, cxl_ps_ctx, &cxlmd->cxlds->cxl_mbox, + params, param_type); + if (ret) + return ret; + } + return 0; + } + cxlmd = cxl_ps_ctx->cxlmd; + + return cxl_mem_ps_set_attrs(dev, cxl_ps_ctx, &cxlmd->cxlds->cxl_mbox, + params, param_type); +} + +static int cxl_patrol_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled) +{ + struct cxl_patrol_scrub_context *ctx = drv_data; + struct cxl_memdev_ps_params params; + int ret; + + ret = cxl_ps_get_attrs(ctx, ¶ms); + if (ret) + return ret; + + *enabled = params.enable; + + return 0; +} + +static int cxl_patrol_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable) +{ + struct cxl_patrol_scrub_context *ctx = drv_data; + struct cxl_memdev_ps_params params = { + .enable = enable, + }; + + return cxl_ps_set_attrs(dev, ctx, ¶ms, CXL_PS_PARAM_ENABLE); +} + +static int cxl_patrol_scrub_read_min_scrub_cycle(struct device *dev, void *drv_data, + u32 *min) +{ + struct cxl_patrol_scrub_context *ctx = drv_data; + struct cxl_memdev_ps_params params; + int ret; + + ret = cxl_ps_get_attrs(ctx, ¶ms); + if (ret) + return ret; + *min = params.min_scrub_cycle_hrs * CXL_DEV_HOUR_IN_SECS; + + return 0; +} + +static int cxl_patrol_scrub_read_max_scrub_cycle(struct device *dev, void *drv_data, + u32 *max) +{ + *max = U8_MAX * CXL_DEV_HOUR_IN_SECS; /* Max set by register size */ + + return 0; +} + +static int cxl_patrol_scrub_read_scrub_cycle(struct device *dev, void *drv_data, + u32 *scrub_cycle_secs) +{ + struct cxl_patrol_scrub_context *ctx = drv_data; + struct cxl_memdev_ps_params params; + int ret; + + ret = cxl_ps_get_attrs(ctx, ¶ms); + if (ret) + return ret; + + *scrub_cycle_secs = params.scrub_cycle_hrs * CXL_DEV_HOUR_IN_SECS; + + return 0; +} + +static int cxl_patrol_scrub_write_scrub_cycle(struct device *dev, void *drv_data, + u32 scrub_cycle_secs) +{ + struct cxl_patrol_scrub_context *ctx = drv_data; + struct cxl_memdev_ps_params params = { + .scrub_cycle_hrs = scrub_cycle_secs / CXL_DEV_HOUR_IN_SECS, + }; + + return cxl_ps_set_attrs(dev, ctx, ¶ms, CXL_PS_PARAM_SCRUB_CYCLE); +} + +static const struct edac_scrub_ops cxl_ps_scrub_ops = { + .get_enabled_bg = cxl_patrol_scrub_get_enabled_bg, + .set_enabled_bg = cxl_patrol_scrub_set_enabled_bg, + .get_min_cycle = cxl_patrol_scrub_read_min_scrub_cycle, + .get_max_cycle = cxl_patrol_scrub_read_max_scrub_cycle, + .get_cycle_duration = cxl_patrol_scrub_read_scrub_cycle, + .set_cycle_duration = cxl_patrol_scrub_write_scrub_cycle, +}; + +static int cxl_memdev_scrub_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr, + struct edac_dev_feature *ras_feature, u8 scrub_inst) +{ + struct cxl_patrol_scrub_context *cxl_ps_ctx; + struct cxl_feat_entry *feat_entry; + struct cxl_mailbox *cxl_mbox; + struct cxl_dev_state *cxlds; + int i; + + if (cxlr) { + struct cxl_region_params *p = &cxlr->params; + + for (i = p->interleave_ways - 1; i >= 0; i--) { + struct cxl_endpoint_decoder *cxled = p->targets[i]; + + cxlmd = cxled_to_memdev(cxled); + cxlds = cxlmd->cxlds; + cxl_mbox = &cxlds->cxl_mbox; + feat_entry = cxl_get_supported_feature_entry(cxl_mbox->features, + &CXL_FEAT_PATROL_SCRUB_UUID); + if (IS_ERR(feat_entry)) + return -EOPNOTSUPP; + + if (!(le32_to_cpu(feat_entry->flags) & CXL_FEAT_ENTRY_FLAG_CHANGABLE)) + return -EOPNOTSUPP; + } + } else { + cxlds = cxlmd->cxlds; + cxl_mbox = &cxlds->cxl_mbox; + feat_entry = cxl_get_supported_feature_entry(cxl_mbox->features, + &CXL_FEAT_PATROL_SCRUB_UUID); + if (IS_ERR(feat_entry)) + return -EOPNOTSUPP; + + if (!(le32_to_cpu(feat_entry->flags) & CXL_FEAT_ENTRY_FLAG_CHANGABLE)) + return -EOPNOTSUPP; + } + + cxl_ps_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL); + if (!cxl_ps_ctx) + return -ENOMEM; + + *cxl_ps_ctx = (struct cxl_patrol_scrub_context) { + .get_feat_size = le16_to_cpu(feat_entry->get_feat_size), + .set_feat_size = le16_to_cpu(feat_entry->set_feat_size), + .get_version = feat_entry->get_feat_ver, + .set_version = feat_entry->set_feat_ver, + .effects = le16_to_cpu(feat_entry->effects), + .instance = scrub_inst, + }; + if (cxlr) + cxl_ps_ctx->cxlr = cxlr; + else + cxl_ps_ctx->cxlmd = cxlmd; + + ras_feature->ft_type = RAS_FEAT_SCRUB; + ras_feature->instance = cxl_ps_ctx->instance; + ras_feature->scrub_ops = &cxl_ps_scrub_ops; + ras_feature->ctx = cxl_ps_ctx; + + return 0; +} + +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr) +{ + struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES]; + char cxl_dev_name[CXL_DEV_NAME_LEN]; + int num_ras_features = 0; + u8 scrub_inst = 0; + int rc; + + rc = cxl_memdev_scrub_init(cxlmd, cxlr, &ras_features[num_ras_features], + scrub_inst); + if (rc < 0) + return rc; + + scrub_inst++; + num_ras_features++; + + if (cxlr) + snprintf(cxl_dev_name, sizeof(cxl_dev_name), + "cxl_region%d", cxlr->id); + else + snprintf(cxl_dev_name, sizeof(cxl_dev_name), + "%s_%s", "cxl", dev_name(&cxlmd->dev)); + + return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL, + num_ras_features, ras_features); +} +EXPORT_SYMBOL_NS_GPL(cxl_mem_ras_features_init, "CXL"); diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index b98b1ccffd1c..c2be70cd87f8 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -3449,6 +3449,12 @@ static int cxl_region_probe(struct device *dev) p->res->start, p->res->end, cxlr, is_system_ram) > 0) return 0; + + rc = cxl_mem_ras_features_init(NULL, cxlr); + if (rc) + dev_warn(&cxlr->dev, "CXL RAS features init for region_id=%d failed\n", + cxlr->id); + return devm_cxl_add_dax_region(cxlr); default: dev_dbg(&cxlr->dev, "unsupported region mode: %d\n", diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 55c55685cb39..2b02e47cd7e7 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -800,6 +800,13 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd); int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa); int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa); +#if IS_ENABLED(CONFIG_CXL_RAS_FEATURES) +int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr); +#else +static inline int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr) +{ return 0; } +#endif + #ifdef CONFIG_CXL_SUSPEND void cxl_mem_active_inc(void); void cxl_mem_active_dec(void); diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c index 2f03a4d5606e..d236b4b8a93c 100644 --- a/drivers/cxl/mem.c +++ b/drivers/cxl/mem.c @@ -116,6 +116,10 @@ static int cxl_mem_probe(struct device *dev) if (!cxlds->media_ready) return -EBUSY; + rc = cxl_mem_ras_features_init(cxlmd, NULL); + if (rc) + dev_warn(&cxlmd->dev, "CXL RAS features init failed\n"); + /* * Someone is trying to reattach this device after it lost its port * connection (an endpoint port previously registered by this memdev was @@ -259,3 +263,4 @@ MODULE_ALIAS_CXL(CXL_DEVICE_MEMORY_EXPANDER); * endpoint registration. */ MODULE_SOFTDEP("pre: cxl_port"); +MODULE_SOFTDEP("pre: cxl_features"); diff --git a/include/cxl/features.h b/include/cxl/features.h index adff3496b4be..d1d1c5b7efc1 100644 --- a/include/cxl/features.h +++ b/include/cxl/features.h @@ -60,6 +60,22 @@ struct cxl_mbox_get_sup_feats_in { u8 reserved[2]; } __packed; +/* Supported Feature Entry : Payload out attribute flags */ +#define CXL_FEAT_ENTRY_FLAG_CHANGABLE BIT(0) +#define CXL_FEAT_ENTRY_FLAG_DEEPEST_RESET_PERSISTENCE_MASK GENMASK(3, 1) +#define CXL_FEAT_ENTRY_FLAG_PERSIST_ACROSS_FIRMWARE_UPDATE BIT(4) +#define CXL_FEAT_ENTRY_FLAG_SUPPORT_DEFAULT_SELECTION BIT(5) +#define CXL_FEAT_ENTRY_FLAG_SUPPORT_SAVED_SELECTION BIT(6) + +enum cxl_feat_attr_value_persistence { + CXL_FEAT_ATTR_VALUE_PERSISTENCE_NONE, + CXL_FEAT_ATTR_VALUE_PERSISTENCE_CXL_RESET, + CXL_FEAT_ATTR_VALUE_PERSISTENCE_HOT_RESET, + CXL_FEAT_ATTR_VALUE_PERSISTENCE_WARM_RESET, + CXL_FEAT_ATTR_VALUE_PERSISTENCE_COLD_RESET, + CXL_FEAT_ATTR_VALUE_PERSISTENCE_MAX +}; + struct cxl_feat_entry { uuid_t uuid; __le16 id;