From patchwork Wed Nov 22 01:22:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13463826 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C83E117CE for ; Wed, 22 Nov 2023 01:22:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="RIV1C2O2" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700616130; x=1732152130; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hZ0XT3zGZ13N2t/d/jXTzV4fQS2xTLRryKNuiKfor1c=; b=RIV1C2O2D9jckO/GiLK73+bmXCQPPz5DEPlPZbyu6SMXNSye9cf8X838 /HVNftcPqcdHtAcc4UA/Esgn+Pwgb2CdVu8KGbQxYp5EKLYrFp/3K49iV bugC2xD1ZfmcmucMlLV7wDyyd+dUMGItA6gjLyokMcSKD34V6+CB5aQ3k je9OTURU3kMn6uCQmsxfmogH/kxlsYEezk2ktORCjUr/K8gDY9aR1cxWY IpcARIgkpEpjbc1BbHdobbOpVf9SyXSoQ7mXiHAknqNTxMWhiqC7jj/4Q r3BrowHsh7311fwsgMwIi/pp+gCGfmJ/dDSJiF3z7WGb4fFlWOwzmtt/Y A==; X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="376988162" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="376988162" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="760270762" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="760270762" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.90.75]) by orsmga007-auth.jf.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:08 -0800 From: alison.schofield@intel.com To: Vishal Verma Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org Subject: [ndctl PATCH v5 1/5] libcxl: add interfaces for GET_POISON_LIST mailbox commands Date: Tue, 21 Nov 2023 17:22:02 -0800 Message-Id: <22d01bd1af9af5370d1e35094176dbd66ef20dac.1700615159.git.alison.schofield@intel.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Alison Schofield CXL devices maintain a list of locations that are poisoned or result in poison if the addresses are accessed by the host. Per the spec (CXL 3.1 8.2.9.9.4.1), the device returns the Poison List as a set of Media Error Records that include the source of the error, the starting device physical address and length. Trigger the retrieval of the poison list by writing to the memory device sysfs attribute: trigger_poison_list. The CXL driver only offers triggering per memdev, so the trigger by region interface offered here is a convenience API that triggers a poison list retrieval for each memdev contributing to a region. int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev); int cxl_region_trigger_poison_list(struct cxl_region *region); The resulting poison records are logged as kernel trace events named 'cxl_poison'. 
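
For reference, the memdev trigger amounts to a bounded path build followed by a one-line write to the attribute. A minimal standalone sketch of that pattern is shown here. It is not the libcxl implementation: trigger_poison_list_at() and its fixed 256-byte buffer are hypothetical names chosen for illustration, and plain stdio stands in for the library's sysfs_write_attr() helper.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of libcxl: build the attribute path
 * under dev_path and write "1\n" to it.
 * Returns 0 on success or a negative errno value on failure.
 */
static int trigger_poison_list_at(const char *dev_path)
{
	char path[256];
	int len = sizeof(path);
	FILE *f;

	/* snprintf() returns the untruncated length; >= len means overflow */
	if (snprintf(path, len, "%s/trigger_poison_list", dev_path) >= len)
		return -ENXIO;

	f = fopen(path, "w");
	if (!f)
		return -errno;

	if (fputs("1\n", f) == EOF) {
		fclose(f);
		return -EIO;
	}

	/* Buffered data reaches the attribute when the stream is closed */
	if (fclose(f) == EOF)
		return -EIO;

	return 0;
}
```

A caller would pass the memdev's sysfs directory. The negative errno return follows the convention used by the two new interfaces.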
Signed-off-by: Alison Schofield
---
 cxl/lib/libcxl.c   | 47 ++++++++++++++++++++++++++++++++++++++++++++++
 cxl/lib/libcxl.sym |  6 ++++++
 cxl/libcxl.h       |  2 ++
 3 files changed, 55 insertions(+)

diff --git a/cxl/lib/libcxl.c b/cxl/lib/libcxl.c
index af4ca44eae19..cc95c2d7c94a 100644
--- a/cxl/lib/libcxl.c
+++ b/cxl/lib/libcxl.c
@@ -1647,6 +1647,53 @@ CXL_EXPORT int cxl_memdev_disable_invalidate(struct cxl_memdev *memdev)
 	return 0;
 }
 
+CXL_EXPORT int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev)
+{
+	struct cxl_ctx *ctx = cxl_memdev_get_ctx(memdev);
+	char *path = memdev->dev_buf;
+	int len = memdev->buf_len, rc;
+
+	if (snprintf(path, len, "%s/trigger_poison_list",
+		     memdev->dev_path) >= len) {
+		err(ctx, "%s: buffer too small\n",
+		    cxl_memdev_get_devname(memdev));
+		return -ENXIO;
+	}
+	rc = sysfs_write_attr(ctx, path, "1\n");
+	if (rc < 0) {
+		fprintf(stderr,
+			"%s: Failed to write sysfs attr trigger_poison_list\n",
+			cxl_memdev_get_devname(memdev));
+		return rc;
+	}
+	return 0;
+}
+
+CXL_EXPORT int cxl_region_trigger_poison_list(struct cxl_region *region)
+{
+	struct cxl_memdev_mapping *mapping;
+	int rc;
+
+	cxl_mapping_foreach(region, mapping) {
+		struct cxl_decoder *decoder;
+		struct cxl_memdev *memdev;
+
+		decoder = cxl_mapping_get_decoder(mapping);
+		if (!decoder)
+			continue;
+
+		memdev = cxl_decoder_get_memdev(decoder);
+		if (!memdev)
+			continue;
+
+		rc = cxl_memdev_trigger_poison_list(memdev);
+		if (rc)
+			return rc;
+	}
+
+	return 0;
+}
+
 CXL_EXPORT int cxl_memdev_enable(struct cxl_memdev *memdev)
 {
 	struct cxl_ctx *ctx = cxl_memdev_get_ctx(memdev);
diff --git a/cxl/lib/libcxl.sym b/cxl/lib/libcxl.sym
index 8fa1cca3d0d7..277b7e21d6a6 100644
--- a/cxl/lib/libcxl.sym
+++ b/cxl/lib/libcxl.sym
@@ -264,3 +264,9 @@ global:
 	cxl_memdev_update_fw;
 	cxl_memdev_cancel_fw_update;
 } LIBCXL_5;
+
+LIBCXL_7 {
+global:
+	cxl_memdev_trigger_poison_list;
+	cxl_region_trigger_poison_list;
+} LIBCXL_6;
diff --git a/cxl/libcxl.h b/cxl/libcxl.h
index 0f4f4b2648fb..ecdffe36df2c 100644
--- a/cxl/libcxl.h
+++ b/cxl/libcxl.h
@@ -460,6 +460,8 @@ enum cxl_setpartition_mode {
 int cxl_cmd_partition_set_mode(struct cxl_cmd *cmd,
 		enum cxl_setpartition_mode mode);
 
+int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev);
+int cxl_region_trigger_poison_list(struct cxl_region *region);
 
 #ifdef __cplusplus
 } /* extern "C" */

From patchwork Wed Nov 22 01:22:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13463827 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3F1B017CD for ; Wed, 22 Nov 2023 01:22:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="UwCeSnUf" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700616132; x=1732152132; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qw6ErE+N/Zs5mIaQY8NIOuKAGUcYqneo1GoEkWUjLLw=; b=UwCeSnUfsAlUDjx9xG99FedqGhFvqG991vCa0QxavMDkYFHuLG+5QFVS W2nE3zeu/JVeswufsNEwNIiez9xChsR0Nkt8MQ2+Ndt0RkIJN3tmkU+Cm Qjv/gP61gxgl0YoRDOfVFKtDreSoLLCn/9sxI4Ldh8iUMuFeLNW4w+/xH 2jZrtsAXj5JAx4+9cRZD/Yh0O6UWdIqK8W1ZU9GqlBsPk4NnWGq9CY5Qb S0uu/nAQt3uigvtUtSUAQLOT7FSP7ES8nVmlBCjtpGy/7lbtFdCPRuyTP XRwnh15eKK+3RrMh+zGp/tKMWOa/Y7o2mydWPl+owMIbKC2FfCdafCTDx Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="376988164" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="376988164" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="760270765" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="760270765" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.90.75]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:09 -0800 From: alison.schofield@intel.com To: Vishal Verma Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, Jonathan Cameron Subject: [ndctl PATCH v5 2/5] cxl: add an optional pid check to event parsing Date: Tue, 21 Nov 2023 17:22:03 -0800 Message-Id: X-Mailer: git-send-email 2.40.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Alison Schofield When parsing CXL events, callers may only be interested in events that originate from the current process. Introduce an optional argument to the event trace context: event_pid. When event_pid is present, only include events with a matching pid in the returned JSON list. It is not a failure to see other, non matching results. Simply skip those. The initial use case for this is device poison listings where only the poison error records requested by this process are wanted. 
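
The intended semantics of the optional field can be sketched in isolation. The following is an illustration only, under the assumption that a zero event_pid means "no filter". struct pid_filter and both helpers are hypothetical names, and the record pids are plain integers rather than values read from trace records.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct event_ctx; only the pid field is modeled */
struct pid_filter {
	int event_pid; /* optional: 0 accepts records from every pid */
};

/* Return true when a record produced by record_pid should be kept */
static bool pid_filter_match(const struct pid_filter *f, int record_pid)
{
	if (!f->event_pid)
		return true;
	return f->event_pid == record_pid;
}

/* Count the records that would be passed on for JSON conversion */
static size_t pid_filter_count(const struct pid_filter *f,
			       const int *record_pids, size_t n)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++) {
		/* A non-matching record is skipped, not treated as an error */
		if (pid_filter_match(f, record_pids[i]))
			kept++;
	}
	return kept;
}
```

With record pids {100, 200, 100, 300}, a filter set to 100 keeps two records, while an unset filter keeps all four.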
Signed-off-by: Alison Schofield
Reviewed-by: Jonathan Cameron
---
 cxl/event_trace.c | 5 +++++
 cxl/event_trace.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/cxl/event_trace.c b/cxl/event_trace.c
index db8cc85f0b6f..269060898118 100644
--- a/cxl/event_trace.c
+++ b/cxl/event_trace.c
@@ -208,6 +208,11 @@ static int cxl_event_parse(struct tep_event *event, struct tep_record *record,
 		return 0;
 	}
 
+	if (event_ctx->event_pid) {
+		if (event_ctx->event_pid != tep_data_pid(event->tep, record))
+			return 0;
+	}
+
 	if (event_ctx->parse_event)
 		return event_ctx->parse_event(event, record,
 					      &event_ctx->jlist_head);
diff --git a/cxl/event_trace.h b/cxl/event_trace.h
index ec6267202c8b..7f7773b2201f 100644
--- a/cxl/event_trace.h
+++ b/cxl/event_trace.h
@@ -15,6 +15,7 @@ struct event_ctx {
 	const char *system;
 	struct list_head jlist_head;
 	const char *event_name; /* optional */
+	int event_pid; /* optional */
 	int (*parse_event)(struct tep_event *event, struct tep_record *record,
 			   struct list_head *jlist_head); /* optional */
 };

From patchwork Wed Nov 22 01:22:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13463828 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D8AE917CB for ; Wed, 22 Nov 2023 01:22:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="jT+31iJy" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700616132; x=1732152132;
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1HiFS6FENpqAFJKMqcAIx2DJoKRQ3U0cKHV50rxrN7o=; b=jT+31iJy2dvj6qERVP4MHw03ANFC6qZxCehw3gTScCFkYCq/ISgObl/S KaRn4dgHuM32pMlEBGGuITY9WsuypKox3wvt5WW91xZSBLVFwsvC//1mA oCUFGYxu65z0h6gzUGvZTrYDu8gBS55zNbVAAAWyIQWK9YiPBxAPEvI8K rG3DFWqpSJd4ndVNAdcsFVXsrr6vfnvfm7BXvnnlSNhduIhUR9DRG56XK XbOg6bGxIVay3RBYknj9+NFaVjULhof9Mv2mnKJ2m09UYbphUaEKdS8Hj AOGgPTtVL0uS4a1Npb1Hz6+xzUlB5vzThAS3Q+dy8KYRC8RKhmvnAoyDr w==; X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="376988170" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="376988170" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="760270769" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="760270769" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.90.75]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:09 -0800 From: alison.schofield@intel.com To: Vishal Verma Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org Subject: [ndctl PATCH v5 3/5] cxl/list: collect and parse the poison list records Date: Tue, 21 Nov 2023 17:22:04 -0800 Message-Id: X-Mailer: git-send-email 2.40.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Alison Schofield Poison list records are logged as events in the kernel tracing subsystem. To prepare the poison list for cxl list, enable tracing, trigger the poison list read, and parse the generated cxl_poison events into a json representation. 
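
As an illustration of the decode step, the sketch below maps a record's source value and flag bits to display strings. It is a standalone approximation, not the code in this patch: the function names are hypothetical, the source values mirror the definitions this patch adds to cxl/json.c, and, unlike the patch, the flag names are joined without a trailing comma.

```c
#include <stdio.h>

/* Bit positions follow the Get Poison List output payload flags */
#define POISON_FLAG_MORE	(1u << 0)
#define POISON_FLAG_OVERFLOW	(1u << 1)
#define POISON_FLAG_SCANNING	(1u << 2)

/* Map a Media Error Record source value to a display string */
static const char *poison_source_name(int source)
{
	switch (source) {
	case 0: return "Unknown";
	case 1: return "External";
	case 2: return "Internal";
	case 3: return "Injected";
	case 7: return "Vendor";
	default: return "Reserved";
	}
}

/* Join the names of all set flags with commas, without a trailing comma */
static void poison_flags_str(unsigned int flags, char *buf, size_t len)
{
	static const struct {
		unsigned int bit;
		const char *name;
	} names[] = {
		{ POISON_FLAG_MORE, "More" },
		{ POISON_FLAG_OVERFLOW, "Overflow" },
		{ POISON_FLAG_SCANNING, "Scanning" },
	};
	size_t i, used = 0;

	if (!len)
		return;
	buf[0] = '\0';

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		int n;

		if (!(flags & names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "," : "", names[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0'; /* drop the partially written name */
			break;
		}
		used += n;
	}
}
```

Bounding each append with snprintf() keeps the buffer safe if more flag names are added later. Whether the listing should carry a trailing comma is a formatting choice left to the patch.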
Signed-off-by: Alison Schofield
---
 cxl/json.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 211 insertions(+)

diff --git a/cxl/json.c b/cxl/json.c
index 7678d02020b6..6fb17582a1cb 100644
--- a/cxl/json.c
+++ b/cxl/json.c
@@ -2,15 +2,19 @@
 // Copyright (C) 2015-2021 Intel Corporation. All rights reserved.
 #include
 #include
+#include
 #include
 #include
 #include
 #include
 #include
+#include
+#include
 
 #include "filter.h"
 #include "json.h"
 #include "../daxctl/json.h"
+#include "event_trace.h"
 
 #define CXL_FW_VERSION_STR_LEN 16
 #define CXL_FW_MAX_SLOTS 4
@@ -571,6 +575,201 @@ err_jobj:
 	return NULL;
 }
 
+/* CXL Spec 3.1 Table 8-140 Media Error Record */
+#define CXL_POISON_SOURCE_UNKNOWN 0
+#define CXL_POISON_SOURCE_EXTERNAL 1
+#define CXL_POISON_SOURCE_INTERNAL 2
+#define CXL_POISON_SOURCE_INJECTED 3
+#define CXL_POISON_SOURCE_VENDOR 7
+
+/* CXL Spec 3.1 Table 8-139 Get Poison List Output Payload */
+#define CXL_POISON_FLAG_MORE BIT(0)
+#define CXL_POISON_FLAG_OVERFLOW BIT(1)
+#define CXL_POISON_FLAG_SCANNING BIT(2)
+
+static struct json_object *
+util_cxl_poison_events_to_json(struct tracefs_instance *inst,
+			       const char *region_name, unsigned long flags)
+{
+	struct json_object *jerrors, *jpoison, *jobj = NULL;
+	struct jlist_node *jnode, *next;
+	struct event_ctx ectx = {
+		.event_name = "cxl_poison",
+		.event_pid = getpid(),
+		.system = "cxl",
+	};
+	int rc, count = 0;
+
+	list_head_init(&ectx.jlist_head);
+	rc = cxl_parse_events(inst, &ectx);
+	if (rc < 0) {
+		fprintf(stderr, "Failed to parse events: %d\n", rc);
+		return NULL;
+	}
+	/* Add nr_records:0 to json */
+	if (list_empty(&ectx.jlist_head))
+		goto out;
+
+	jerrors = json_object_new_array();
+	if (!jerrors)
+		return NULL;
+
+	list_for_each_safe(&ectx.jlist_head, jnode, next, list) {
+		struct json_object *jp, *jval;
+		int source, pflags = 0;
+		u64 addr, len;
+
+		jp = json_object_new_object();
+		if (!jp)
+			return NULL;
+
+		/* Skip records not in this region when listing by region */
+		if (json_object_object_get_ex(jnode->jobj, "region", &jval)) {
+			const char *name;
+
+			name = json_object_get_string(jval);
+			if ((region_name) && (strcmp(region_name, name) != 0))
+				continue;
+
+			if (strlen(name))
+				json_object_object_add(jp, "region", jval);
+		}
+
+		/* Memdev name is only needed when listing by region */
+		if (region_name) {
+			if (json_object_object_get_ex(jnode->jobj, "memdev",
+						      &jval))
+				json_object_object_add(jp, "memdev", jval);
+		}
+
+		/*
+		 * When listing by memdev, region names and valid HPAs
+		 * will appear if the poisoned address is part of a region.
+		 * Pick up those valid region names and HPAs and ignore
+		 * any empties and invalids.
+		 */
+
+		if (json_object_object_get_ex(jnode->jobj, "hpa", &jval)) {
+			addr = json_object_get_uint64(jval);
+			if (addr != ULLONG_MAX) {
+				jobj = util_json_object_hex(addr, flags);
+				json_object_object_add(jp, "hpa", jobj);
+			}
+		}
+		if (json_object_object_get_ex(jnode->jobj, "dpa", &jval)) {
+			addr = json_object_get_int64(jval);
+			jobj = util_json_object_hex(addr, flags);
+			json_object_object_add(jp, "dpa", jobj);
+		}
+		if (json_object_object_get_ex(jnode->jobj, "dpa_length", &jval)) {
+			len = json_object_get_int64(jval);
+			jobj = util_json_object_size(len, flags);
+			json_object_object_add(jp, "dpa_length", jobj);
+		}
+		if (json_object_object_get_ex(jnode->jobj, "source", &jval)) {
+			source = json_object_get_int(jval);
+			switch (source) {
+			case CXL_POISON_SOURCE_UNKNOWN:
+				jobj = json_object_new_string("Unknown");
+				break;
+			case CXL_POISON_SOURCE_EXTERNAL:
+				jobj = json_object_new_string("External");
+				break;
+			case CXL_POISON_SOURCE_INTERNAL:
+				jobj = json_object_new_string("Internal");
+				break;
+			case CXL_POISON_SOURCE_INJECTED:
+				jobj = json_object_new_string("Injected");
+				break;
+			case CXL_POISON_SOURCE_VENDOR:
+				jobj = json_object_new_string("Vendor");
+				break;
+			default:
+				jobj = json_object_new_string("Reserved");
+			}
+			json_object_object_add(jp, "source", jobj);
+		}
+		if (json_object_object_get_ex(jnode->jobj, "flags", &jval))
+			pflags = json_object_get_int(jval);
+
+		if (pflags) {
+			char flag_str[32] = { '\0' };
+
+			if (pflags & CXL_POISON_FLAG_MORE)
+				strcat(flag_str, "More,");
+			if (pflags & CXL_POISON_FLAG_OVERFLOW)
+				strcat(flag_str, "Overflow,");
+			if (pflags & CXL_POISON_FLAG_SCANNING)
+				strcat(flag_str, "Scanning,");
+			jobj = json_object_new_string(flag_str);
+			if (jobj)
+				json_object_object_add(jp, "flags", jobj);
+		}
+		if (json_object_object_get_ex(jnode->jobj, "overflow_t", &jval))
+			json_object_object_add(jp, "overflow_time", jval);
+
+		json_object_array_add(jerrors, jp);
+		count++;
+	} /* list_for_each_safe */
+
+out:
+	jpoison = json_object_new_object();
+	if (!jpoison)
+		return NULL;
+
+	/* Always include the count. If count is zero, no records follow. */
+	jobj = json_object_new_int(count);
+	if (jobj)
+		json_object_object_add(jpoison, "nr_records", jobj);
+	if (count)
+		json_object_object_add(jpoison, "records", jerrors);
+
+	return jpoison;
+}
+
+static struct json_object *
+util_cxl_poison_list_to_json(struct cxl_region *region,
+			     struct cxl_memdev *memdev,
+			     unsigned long flags)
+{
+	struct json_object *jpoison = NULL;
+	struct tracefs_instance *inst;
+	const char *region_name;
+	int rc;
+
+	inst = tracefs_instance_create("cxl list");
+	if (!inst) {
+		fprintf(stderr, "tracefs_instance_create() failed\n");
+		return NULL;
+	}
+
+	rc = cxl_event_tracing_enable(inst, "cxl", "cxl_poison");
+	if (rc < 0) {
+		fprintf(stderr, "Failed to enable trace: %d\n", rc);
+		goto err_free;
+	}
+
+	if (region)
+		rc = cxl_region_trigger_poison_list(region);
+	else
+		rc = cxl_memdev_trigger_poison_list(memdev);
+	if (rc)
+		goto err_free;
+
+	rc = cxl_event_tracing_disable(inst);
+	if (rc < 0) {
+		fprintf(stderr, "Failed to disable trace: %d\n", rc);
+		goto err_free;
+	}
+
+	region_name = region ? cxl_region_get_devname(region) : NULL;
+	jpoison = util_cxl_poison_events_to_json(inst, region_name, flags);
+
+err_free:
+	tracefs_instance_free(inst);
+	return jpoison;
+}
+
 struct json_object *util_cxl_memdev_to_json(struct cxl_memdev *memdev,
 		unsigned long flags)
 {
@@ -649,6 +848,12 @@ struct json_object *util_cxl_memdev_to_json(struct cxl_memdev *memdev,
 			json_object_object_add(jdev, "firmware", jobj);
 	}
 
+	if (flags & UTIL_JSON_MEDIA_ERRORS) {
+		jobj = util_cxl_poison_list_to_json(NULL, memdev, flags);
+		if (jobj)
+			json_object_object_add(jdev, "poison", jobj);
+	}
+
 	json_object_set_userdata(jdev, memdev, NULL);
 	return jdev;
 }
@@ -987,6 +1192,12 @@ struct json_object *util_cxl_region_to_json(struct cxl_region *region,
 			json_object_object_add(jregion, "state", jobj);
 	}
 
+	if (flags & UTIL_JSON_MEDIA_ERRORS) {
+		jobj = util_cxl_poison_list_to_json(region, NULL, flags);
+		if (jobj)
+			json_object_object_add(jregion, "poison", jobj);
+	}
+
 	util_cxl_mappings_append_json(jregion, region, flags);
 
 	if (flags & UTIL_JSON_DAX) {

From patchwork Wed Nov 22 01:22:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13463829 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F304917D2 for ; Wed, 22 Nov 2023 01:22:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="XFqABM1g" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700616133; x=1732152133;
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hCQ70vdYjxDHwuJdyELr8v4z/AkpiRccZ5YJoLFptiU=; b=XFqABM1gmajP/ch8hpkKuTeZW4tEeYmS123Ome3giccy/Lo9ZjQB9Zuc STTZtyqGOBkUuPLa7sY4xvmxiVx4R/HB11KKG/l09cWltovesGhzigpz6 jZf7/vEiGA5wIC8vg7qhBeig0Cj62HQJIrh5tVPxTjCv+cpn9Z5ydck9R QrVg64nuBAPhEp3jqfizvIN5RiyGTV1fUCvDsX4a4GWNGjsHkyuunTWQI HkxKCPNZFJCC9IFWyQTXNZ2Y+mSnUSymKT2drEnWnxmbeAWEkCf6bYCvf Iigh+mANgrA6Uu5xociZfYNZvS5yQ7S3WHD7D3dUHgt+xsV2ycooPkvn6 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="376988174" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="376988174" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="760270772" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="760270772" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.90.75]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:10 -0800 From: alison.schofield@intel.com To: Vishal Verma Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org Subject: [ndctl PATCH v5 4/5] cxl/list: add --poison option to cxl list Date: Tue, 21 Nov 2023 17:22:05 -0800 Message-Id: <216ab396ab0c34fc391d1c3d3797a0d832a8d563.1700615159.git.alison.schofield@intel.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Alison Schofield The --poison option to 'cxl list' retrieves poison lists from memory devices supporting the capability and displays the returned poison records in the cxl list json. This option can apply to memdevs or regions. Example usage in the Documentation/cxl/cxl-list.txt update. 
Signed-off-by: Alison Schofield
---
 Documentation/cxl/cxl-list.txt | 58 ++++++++++++++++++++++++++++++++++
 cxl/filter.h                   |  3 ++
 cxl/list.c                     |  2 ++
 3 files changed, 63 insertions(+)

diff --git a/Documentation/cxl/cxl-list.txt b/Documentation/cxl/cxl-list.txt
index 838de4086678..ee2f1b2d9fae 100644
--- a/Documentation/cxl/cxl-list.txt
+++ b/Documentation/cxl/cxl-list.txt
@@ -415,6 +415,64 @@ OPTIONS
 --region::
 	Specify CXL region device name(s), or device id(s), to filter the listing.
 
+-L::
+--poison::
+	Include poison information. The poison list is retrieved from the
+	device(s) and poison records are added to the listing. Apply this
+	option to memdevs and regions where devices support the poison
+	list capability.
+
+----
+# cxl list -m mem11 --poison
+[
+  {
+    "memdev":"mem11",
+    "pmem_size":268435456,
+    "ram_size":0,
+    "serial":0,
+    "host":"0000:37:00.0",
+    "poison":{
+      "nr_records":1,
+      "records":[
+        {
+          "dpa":0,
+          "dpa_length":64,
+          "source":"Internal"
+        }
+      ]
+    }
+  }
+]
+# cxl list -r region5 --poison
+[
+  {
+    "region":"region5",
+    "resource":1035623989248,
+    "size":2147483648,
+    "interleave_ways":2,
+    "interleave_granularity":4096,
+    "decode_state":"commit",
+    "poison":{
+      "nr_records":2,
+      "records":[
+        {
+          "memdev":"mem2",
+          "dpa":0,
+          "dpa_length":64,
+          "source":"Internal"
+        },
+        {
+          "memdev":"mem5",
+          "dpa":0,
+          "dpa_length":512,
+          "source":"Vendor"
+        }
+      ]
+    }
+  }
+]
+----
+
 -v::
 --verbose::
 	Increase verbosity of the output. This can be specified
diff --git a/cxl/filter.h b/cxl/filter.h
index 3f65990f835a..1241f72ccf62 100644
--- a/cxl/filter.h
+++ b/cxl/filter.h
@@ -30,6 +30,7 @@ struct cxl_filter_params {
 	bool fw;
 	bool alert_config;
 	bool dax;
+	bool poison;
 	int verbose;
 	struct log_ctx ctx;
 };
@@ -88,6 +89,8 @@ static inline unsigned long cxl_filter_to_flags(struct cxl_filter_params *param)
 		flags |= UTIL_JSON_ALERT_CONFIG;
 	if (param->dax)
 		flags |= UTIL_JSON_DAX | UTIL_JSON_DAX_DEVS;
+	if (param->poison)
+		flags |= UTIL_JSON_MEDIA_ERRORS;
 	return flags;
 }
 
diff --git a/cxl/list.c b/cxl/list.c
index 93ba51ef895c..13fef8569340 100644
--- a/cxl/list.c
+++ b/cxl/list.c
@@ -57,6 +57,8 @@ static const struct option options[] = {
 		    "include memory device firmware information"),
 	OPT_BOOLEAN('A', "alert-config", &param.alert_config,
 		    "include alert configuration information"),
+	OPT_BOOLEAN('L', "poison", &param.poison,
+		    "include poison information"),
 	OPT_INCR('v', "verbose", &param.verbose, "increase output detail"),
 #ifdef ENABLE_DEBUG
 	OPT_BOOLEAN(0, "debug", &debug, "debug list walk"),

From patchwork Wed Nov 22 01:22:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13463830 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E50EB1FB6 for ; Wed, 22 Nov 2023 01:22:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="XL7MIqfe" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700616134; x=1732152134;
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3+2yS6/E7bkwNUoNoxnhBZrwU9cuGIK24iZRY1XGgZY=; b=XL7MIqfeMf5uNiPRMIlhRMtvFA5ZZXNrIXGOUTBRnd74YM80nxGZeCJl 4C/EmSLi8gYIPRWJGzRCvkvXzklnfzzaViIqI5CGGMzkfSLRyWTms5yyg EdS7kdXTs60OHAocjnQp+JEOeivP/ctUOkwrSRCSfmnsRtQzJquv+imAT D3Gc0x+cYjQPX/S0vhv7uwB86JzCeA/hOB0fkCwsnvC/zs+iy4RtRM8hk adI3CqjNVpdY+HL6j+LNePbml7mgLWpb+Z0Zr4x7DVecLAAT/mF19VUZy 367wia1ZFWWN9iipI+UbLHqUqXb/T04v5MdHAPXly+Y7s/NdFSahHV+SZ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="376988177" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="376988177" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:11 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10901"; a="760270775" X-IronPort-AV: E=Sophos;i="6.04,217,1695711600"; d="scan'208";a="760270775" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.90.75]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Nov 2023 17:22:10 -0800 From: alison.schofield@intel.com To: Vishal Verma Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org Subject: [ndctl PATCH v5 5/5] cxl/test: add cxl-poison.sh unit test Date: Tue, 21 Nov 2023 17:22:06 -0800 Message-Id: X-Mailer: git-send-email 2.40.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Alison Schofield Exercise cxl list, libcxl, and driver pieces of the get poison list pathway. Inject and clear poison using debugfs and use cxl-cli to read the poison list by memdev and by region. 
Signed-off-by: Alison Schofield
---
 test/cxl-poison.sh | 158 +++++++++++++++++++++++++++++++++++++++++++++
 test/meson.build   |   2 +
 2 files changed, 160 insertions(+)
 create mode 100644 test/cxl-poison.sh

diff --git a/test/cxl-poison.sh b/test/cxl-poison.sh
new file mode 100644
index 000000000000..8747ffe8cff7
--- /dev/null
+++ b/test/cxl-poison.sh
@@ -0,0 +1,158 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2023 Intel Corporation. All rights reserved.
+
+. "$(dirname "$0")"/common
+
+rc=77
+
+set -ex
+
+trap 'err $LINENO' ERR
+
+check_prereq "jq"
+
+modprobe -r cxl_test
+modprobe cxl_test
+
+rc=1
+
+# THEORY OF OPERATION: Exercise cxl-cli and cxl driver ability to
+# inject, clear, and get the poison list. Do it by memdev and by region.
+# Based on current cxl-test topology.
+
+find_memdev()
+{
+	readarray -t capable_mems < <("$CXL" list -b "$CXL_TEST_BUS" -M |
+		jq -r ".[] | select(.pmem_size != null) |
+		select(.ram_size != null) | .memdev")
+
+	if [ ${#capable_mems[@]} == 0 ]; then
+		echo "no memdevs found for test"
+		err "$LINENO"
+	fi
+
+	memdev=${capable_mems[0]}
+}
+
+create_x2_region()
+{
+	# Find an x2 decoder
+	decoder="$($CXL list -b "$CXL_TEST_BUS" -D -d root | jq -r ".[] |
+		select(.pmem_capable == true) |
+		select(.nr_targets == 2) |
+		.decoder")"
+
+	# Find a memdev for each host-bridge interleave position
+	port_dev0="$($CXL list -T -d "$decoder" | jq -r ".[] |
+		.targets | .[] | select(.position == 0) | .target")"
+	port_dev1="$($CXL list -T -d "$decoder" | jq -r ".[] |
+		.targets | .[] | select(.position == 1) | .target")"
+	mem0="$($CXL list -M -p "$port_dev0" | jq -r ".[0].memdev")"
+	mem1="$($CXL list -M -p "$port_dev1" | jq -r ".[0].memdev")"
+
+	region="$($CXL create-region -d "$decoder" -m "$mem0" "$mem1" |
+		jq -r ".region")"
+	if [[ ! $region ]]; then
+		echo "create-region failed for $decoder"
+		err "$LINENO"
+	fi
+	echo "$region"
+}
+
+# When cxl-cli support for inject and clear arrives, replace
+# the writes to /sys/kernel/debug with the new cxl commands.
+
+inject_poison_sysfs()
+{
+	memdev="$1"
+	addr="$2"
+
+	echo "$addr" > /sys/kernel/debug/cxl/"$memdev"/inject_poison
+}
+
+clear_poison_sysfs()
+{
+	memdev="$1"
+	addr="$2"
+
+	echo "$addr" > /sys/kernel/debug/cxl/"$memdev"/clear_poison
+}
+
+validate_region_poison()
+{
+	region="$1"
+	nr_expect="$2"
+
+	poison_list="$($CXL list -r "$region" --poison | jq -r '.[].poison')"
+
+	nr_found="$(jq -r ".nr_records" <<< "$poison_list")"
+	if [ "$nr_found" -ne "$nr_expect" ]; then
+		echo "$nr_expect poison records expected, $nr_found found"
+		err "$LINENO"
+	fi
+
+	if [[ "$nr_expect" == 0 ]]; then
+		return
+	fi
+
+	# Make sure region name format stays sane
+	region_found="$(jq -r ".records | .[0] | .region" <<< "$poison_list")"
+	if [[ "$region_found" != "$region" ]]; then
+		echo "$region expected, $region_found found"
+		err "$LINENO"
+	fi
+}
+
+validate_memdev_poison()
+{
+	memdev="$1"
+	nr_expect="$2"
+
+	nr_found="$("$CXL" list -m "$memdev" --poison |
+		jq -r '.[].poison.nr_records')"
+	if [ "$nr_found" -ne "$nr_expect" ]; then
+		echo "$nr_expect poison records expected, $nr_found found"
+		err "$LINENO"
+	fi
+}
+
+test_poison_by_memdev()
+{
+	find_memdev
+	inject_poison_sysfs "$memdev" "0x40000000"
+	inject_poison_sysfs "$memdev" "0x40001000"
+	inject_poison_sysfs "$memdev" "0x600"
+	inject_poison_sysfs "$memdev" "0x0"
+	validate_memdev_poison "$memdev" 4
+
+	clear_poison_sysfs "$memdev" "0x40000000"
+	clear_poison_sysfs "$memdev" "0x40001000"
+	clear_poison_sysfs "$memdev" "0x600"
+	clear_poison_sysfs "$memdev" "0x0"
+	validate_memdev_poison "$memdev" 0
+}
+
+test_poison_by_region()
+{
+	create_x2_region
+	inject_poison_sysfs "$mem0" "0x40000000"
+	inject_poison_sysfs "$mem1" "0x40000000"
+	validate_region_poison "$region" 2
+
+	clear_poison_sysfs "$mem0" "0x40000000"
+	clear_poison_sysfs "$mem1" "0x40000000"
+	validate_region_poison "$region" 0
+}
+
+# Turn tracing on. Note that 'cxl list --poison' does toggle the tracing.
+# Turning it on here allows the test user to also view inject and clear
+# trace events.
+echo 1 > /sys/kernel/tracing/events/cxl/cxl_poison/enable
+
+test_poison_by_memdev
+test_poison_by_region
+
+check_dmesg "$LINENO"
+
+modprobe -r cxl-test
diff --git a/test/meson.build b/test/meson.build
index 224adaf41fcc..2706fa5d633c 100644
--- a/test/meson.build
+++ b/test/meson.build
@@ -157,6 +157,7 @@ cxl_create_region = find_program('cxl-create-region.sh')
 cxl_xor_region = find_program('cxl-xor-region.sh')
 cxl_update_firmware = find_program('cxl-update-firmware.sh')
 cxl_events = find_program('cxl-events.sh')
+cxl_poison = find_program('cxl-poison.sh')
 
 tests = [
   [ 'libndctl', libndctl, 'ndctl' ],
@@ -186,6 +187,7 @@ tests = [
   [ 'cxl-create-region.sh', cxl_create_region, 'cxl' ],
   [ 'cxl-xor-region.sh', cxl_xor_region, 'cxl' ],
   [ 'cxl-events.sh', cxl_events, 'cxl' ],
+  [ 'cxl-poison.sh', cxl_poison, 'cxl' ],
 ]
 
 if get_option('destructive').enabled()