From patchwork Fri Nov 11 03:20:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13039541 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D760C433FE for ; Fri, 11 Nov 2022 03:20:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232399AbiKKDUP (ORCPT ); Thu, 10 Nov 2022 22:20:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56250 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231777AbiKKDUO (ORCPT ); Thu, 10 Nov 2022 22:20:14 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E27864B9A2 for ; Thu, 10 Nov 2022 19:20:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1668136812; x=1699672812; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2L3oHReSuQCpClXWNtrrvwtAMs5qrMUwLJZ+JgDDIRs=; b=mju0M6v1/M14HHevnyXAAFJlz1JmkIZ5Mng5b7v426tzyUHPQ0NA162L tgK4IQNex5PK+XuxxXvsGOwF0QbgtlAMSvwQyXz7rlvZJmxyzAin7Htba VnGPbU4N5lWevhOs+bUbwX815kAH/iv0o5mK1PnTVDunIuY/nS1Lk3IqA aFXAQD7u6TgLUjf2vF8KkHH1iA5EKNvr1uqrP99O6YMb8UtEfxIEWCpq+ tfVwN1x5f69LBATYXSN5OgmD6Gg7I8VXszHIRKKs+Ix7MPtexydAQFlxh 3gXIhFhisZsByPVl32ExYvbRyNDjPomoxpYRWzssXnzEfnuHSHg9ipXGu A==; X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="397807330" X-IronPort-AV: E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="397807330" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:12 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="743129956" X-IronPort-AV: 
E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="743129956" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.161.45]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:12 -0800 From: alison.schofield@intel.com To: Dan Williams , Ira Weiny , Vishal Verma , Dave Jiang , Ben Widawsky Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org Subject: [ndctl PATCH 1/5] libcxl: add interfaces for GET_POISON_LIST mailbox commands Date: Thu, 10 Nov 2022 19:20:04 -0800 Message-Id: <73b2edf5ded979cb3164bcf2b76c4f300cdf2250.1668133294.git.alison.schofield@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org From: Alison Schofield CXL devices maintain a list of locations that are poisoned or result in poison if the addresses are accessed by the host. Per the spec (CXL 3.0 8.2.9.8.4.1), the device returns this Poison list as a set of Media Error Records that include the source of the error, the starting device physical address and length. Trigger the retrieval of the poison list by writing to the device sysfs attribute: trigger_poison_list. Retrieval is offered by memdev or by region: int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev); int cxl_region_trigger_poison_list(struct cxl_region *region); This interface triggers the retrieval of the poison list from the devices and logs the error records as kernel trace events named 'cxl_poison'. 
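For illustration only, the trigger described above can be sketched without libcxl. The helper names `poison_attr_path()` and `poison_trigger()` below are hypothetical and are not part of this patch or of libcxl; the sketch assumes only that the device directory exposes a writable `trigger_poison_list` attribute, and it mirrors the patch's snprintf() truncation check and its "1\n" write.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper, not part of libcxl: format the attribute path. */
static int poison_attr_path(char *buf, size_t len, const char *dev_path)
{
	int n = snprintf(buf, len, "%s/trigger_poison_list", dev_path);

	/* snprintf() returns the untruncated length; >= len means no fit */
	if (n < 0 || (size_t)n >= len)
		return -ENXIO;
	return 0;
}

/* Hypothetical helper: write "1\n" to the attribute to start retrieval. */
static int poison_trigger(const char *dev_path)
{
	char path[256];
	FILE *f;
	int rc;

	rc = poison_attr_path(path, sizeof(path), dev_path);
	if (rc)
		return rc;

	f = fopen(path, "w");
	if (!f)
		return -errno;

	rc = fputs("1\n", f) == EOF ? -EIO : 0;
	if (fclose(f) == EOF && rc == 0)
		rc = -EIO;
	return rc;
}
```

A caller would pass the sysfs directory of a memdev or region. The resulting records are not returned by the write; they appear as 'cxl_poison' trace events.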
Signed-off-by: Alison Schofield Reviewed-by: Jonathan Cameron --- cxl/lib/libcxl.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ cxl/lib/libcxl.sym | 6 ++++++ cxl/libcxl.h | 2 ++ 3 files changed, 52 insertions(+) diff --git a/cxl/lib/libcxl.c b/cxl/lib/libcxl.c index e8c5d4444dd0..1a8a8eb0ffcb 100644 --- a/cxl/lib/libcxl.c +++ b/cxl/lib/libcxl.c @@ -1331,6 +1331,50 @@ CXL_EXPORT int cxl_memdev_disable_invalidate(struct cxl_memdev *memdev) return 0; } +CXL_EXPORT int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev) +{ + struct cxl_ctx *ctx = cxl_memdev_get_ctx(memdev); + char *path = memdev->dev_buf; + int len = memdev->buf_len, rc; + + if (snprintf(path, len, "%s/trigger_poison_list", memdev->dev_path) >= + len) { + err(ctx, "%s: buffer too small\n", + cxl_memdev_get_devname(memdev)); + return -ENXIO; + } + rc = sysfs_write_attr(ctx, path, "1\n"); + if (rc < 0) { + fprintf(stderr, + "%s: Failed write sysfs attr trigger_poison_list\n", + cxl_memdev_get_devname(memdev)); + return rc; + } + return 0; +} + +CXL_EXPORT int cxl_region_trigger_poison_list(struct cxl_region *region) +{ + struct cxl_ctx *ctx = cxl_region_get_ctx(region); + char *path = region->dev_buf; + int len = region->buf_len, rc; + + if (snprintf(path, len, "%s/trigger_poison_list", region->dev_path) >= + len) { + err(ctx, "%s: buffer too small\n", + cxl_region_get_devname(region)); + return -ENXIO; + } + rc = sysfs_write_attr(ctx, path, "1\n"); + if (rc < 0) { + fprintf(stderr, + "%s: Failed write sysfs attr trigger_poison_list\n", + cxl_region_get_devname(region)); + return rc; + } + return 0; +} + CXL_EXPORT int cxl_memdev_enable(struct cxl_memdev *memdev) { struct cxl_ctx *ctx = cxl_memdev_get_ctx(memdev); diff --git a/cxl/lib/libcxl.sym b/cxl/lib/libcxl.sym index 8bb91e05638b..ecf98e6c7af2 100644 --- a/cxl/lib/libcxl.sym +++ b/cxl/lib/libcxl.sym @@ -217,3 +217,9 @@ global: cxl_decoder_get_max_available_extent; cxl_decoder_get_region; } LIBCXL_2; + +LIBCXL_4 { +global: + 
cxl_memdev_trigger_poison_list; + cxl_region_trigger_poison_list; +} LIBCXL_3; diff --git a/cxl/libcxl.h b/cxl/libcxl.h index 9fe4e99263dd..5ebdf0879325 100644 --- a/cxl/libcxl.h +++ b/cxl/libcxl.h @@ -375,6 +375,8 @@ enum cxl_setpartition_mode { int cxl_cmd_partition_set_mode(struct cxl_cmd *cmd, enum cxl_setpartition_mode mode); +int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev); +int cxl_region_trigger_poison_list(struct cxl_region *region); #ifdef __cplusplus } /* extern "C" */ From patchwork Fri Nov 11 03:20:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13039542 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1A4CC43217 for ; Fri, 11 Nov 2022 03:20:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231777AbiKKDUQ (ORCPT ); Thu, 10 Nov 2022 22:20:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56254 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232231AbiKKDUO (ORCPT ); Thu, 10 Nov 2022 22:20:14 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 17C5A4D5CB for ; Thu, 10 Nov 2022 19:20:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1668136814; x=1699672814; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0lO6qjHWv+IrV/yrUBy/sYYUodJwHEStQS1iikUQicw=; b=dF3XMBRI+Xwwg2EHb9egthH+Uif+4Dl78VkbZ9dlsu498Ctwfh0E1dmR 3FDcKSDE/7QDuzv5A050wDg+GIWqstS6H3kO3whDKNlq8SZ2mABcfn967 TCe+66jEkpHyU+hFyd6BgnVG4i+gITRdfzl6CdFphEf2vb/B1KiqRaaMy 
DohoI99JxoLAKsDaFKWM2OnpFakNrzbnVutnyRI04DdQNlSNSFG5u98gp fCErjeeVhu7z96il4swdAFIUNsVO1ZP40bEuiW0Wqo/1y3JZTTUMomhyV yvIGBcl/NXYsBbsXrSN5vpFizJOiV5F5/TXJTw/eYgtLYHjWdXQplkulY w==; X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="397807332" X-IronPort-AV: E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="397807332" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:13 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="743129961" X-IronPort-AV: E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="743129961" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.161.45]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:13 -0800 From: alison.schofield@intel.com To: Dan Williams , Ira Weiny , Vishal Verma , Dave Jiang , Ben Widawsky Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org Subject: [ndctl PATCH 2/5] cxl: add an optional pid check to event parsing Date: Thu, 10 Nov 2022 19:20:05 -0800 Message-Id: X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org From: Alison Schofield When parsing CXL events, callers may only be interested in events that originate from the current process. Introduce an optional argument to the event trace context: event_pid. When event_pid is present, only include events with a matching pid in the returned JSON list. It is not a failure to see other, non matching results. Simply skip those. The initial use case for this is the listing of media errors, where only the media errors requested by this process are wanted. 
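The filter rule above can be shown in isolation. This is a minimal sketch, not the patch itself: `struct pid_filter` and `pid_filter_match()` are hypothetical stand-ins for the `event_pid` field and the check added to cxl_event_parse(), and the sketch assumes that a zero pid means "no filter requested".

```c
#include <stdbool.h>

/* Hypothetical stand-in for the event_pid field of struct event_ctx. */
struct pid_filter {
	int event_pid; /* 0: filter not requested */
};

/* Return true when a record logged by record_pid should be parsed. */
static bool pid_filter_match(const struct pid_filter *f, int record_pid)
{
	if (!f->event_pid)
		return true; /* optional filter unset: accept everything */
	return f->event_pid == record_pid;
}
```

Records that do not match are skipped rather than reported as errors, so unrelated events in the same trace instance do not fail the parse.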
Signed-off-by: Alison Schofield Reviewed-by: Jonathan Cameron --- cxl/event_trace.c | 5 +++++ cxl/event_trace.h | 1 + 2 files changed, 6 insertions(+) diff --git a/cxl/event_trace.c b/cxl/event_trace.c index a973a1f62d35..70ab892bbfcb 100644 --- a/cxl/event_trace.c +++ b/cxl/event_trace.c @@ -207,6 +207,11 @@ static int cxl_event_parse(struct tep_event *event, struct tep_record *record, return 0; } + if (event_ctx->event_pid) { + if (event_ctx->event_pid != tep_data_pid(event->tep, record)) + return 0; + } + if (event_ctx->parse_event) return event_ctx->parse_event(event, record, &event_ctx->jlist_head); diff --git a/cxl/event_trace.h b/cxl/event_trace.h index ec6267202c8b..7f7773b2201f 100644 --- a/cxl/event_trace.h +++ b/cxl/event_trace.h @@ -15,6 +15,7 @@ struct event_ctx { const char *system; struct list_head jlist_head; const char *event_name; /* optional */ + int event_pid; /* optional */ int (*parse_event)(struct tep_event *event, struct tep_record *record, struct list_head *jlist_head); /* optional */ }; From patchwork Fri Nov 11 03:20:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13039543 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 337E7C43219 for ; Fri, 11 Nov 2022 03:20:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231791AbiKKDUQ (ORCPT ); Thu, 10 Nov 2022 22:20:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56260 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229757AbiKKDUP (ORCPT ); Thu, 10 Nov 2022 22:20:15 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CDE447301 for ; Thu, 10 
Nov 2022 19:20:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1668136814; x=1699672814; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bRrw9/sMALUjvJ5/z2OV7l7YJ62K7WCyf7LWsqcrRRI=; b=Zd+E09DGOhQl1AAztfTQFw6kY0+KSofXmJZVtAEQ5gVdmOeVPdF04l+O c13Z9szizKnSK6nCfXvBgHkaVyGKMU4mYCorsvZgVKHsu3UmdizOLEe8d QJzsRNz08BxrMQDxNwUkXkQ1TstKCWKg6BDDZoOR0zEFXhoGvjq/VcHls Ad0FUqOKK93qgagZGi/PhE2/EE9OAnk+qYmQGM67otMvTif2uI5cBinIy pILyFBIkrYJyKPGqIbmM8ATcrUstbvPDZa/N+SiIlseRimtKQ/YujBtSS lCHDpyUfolzwZCaZa5Udn0iwivAfoaAAo/SbPtRlFvF9HWHQk+L3syfEB A==; X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="397807334" X-IronPort-AV: E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="397807334" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:14 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="743129966" X-IronPort-AV: E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="743129966" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.161.45]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:14 -0800 From: alison.schofield@intel.com To: Dan Williams , Ira Weiny , Vishal Verma , Dave Jiang , Ben Widawsky Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org Subject: [ndctl PATCH 3/5] cxl/list: collect and parse the poison list records Date: Thu, 10 Nov 2022 19:20:06 -0800 Message-Id: X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org From: Alison Schofield When triggered, poison list error records are logged as events in the kernel tracing subsystem. Trace, trigger, and parse the events when the --media-errors option is selected in cxl list. Include the total number of media errors, even when zero. 
The media-error records match the definition in the CXL 3.0 Spec Table 8.107. Signed-off-by: Alison Schofield --- cxl/json.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 185 insertions(+) diff --git a/cxl/json.c b/cxl/json.c index 63c17519aba1..1b3c0bda6bda 100644 --- a/cxl/json.c +++ b/cxl/json.c @@ -2,14 +2,18 @@ // Copyright (C) 2015-2021 Intel Corporation. All rights reserved. #include #include +#include #include #include #include #include #include +#include +#include #include "filter.h" #include "json.h" +#include "event_trace.h" static struct json_object *util_cxl_memdev_health_to_json( struct cxl_memdev *memdev, unsigned long flags) @@ -300,6 +304,167 @@ err_jobj: return NULL; } +/* CXL 8.2.9.5.4.1 Get Poison List: Poison Source */ +#define CXL_POISON_SOURCE_UNKNOWN 0 +#define CXL_POISON_SOURCE_EXTERNAL 1 +#define CXL_POISON_SOURCE_INTERNAL 2 +#define CXL_POISON_SOURCE_INJECTED 3 +#define CXL_POISON_SOURCE_VENDOR 7 + +/* CXL 8.2.9.5.4.1 Get Poison List: Payload out flags */ +#define CXL_POISON_FLAG_MORE BIT(0) +#define CXL_POISON_FLAG_OVERFLOW BIT(1) +#define CXL_POISON_FLAG_SCANNING BIT(2) + +static struct json_object * +util_cxl_poison_events_to_json(struct tracefs_instance *inst, bool is_region, + unsigned long flags) +{ + struct json_object *jerrors, *jmedia, *jobj = NULL; + struct jlist_node *jnode, *next; + struct event_ctx ectx = { + .event_name = "cxl_poison", + .event_pid = getpid(), + .system = "cxl", + }; + int rc, count = 0; + + list_head_init(&ectx.jlist_head); + rc = cxl_parse_events(inst, &ectx); + if (rc < 0) { + fprintf(stderr, "Failed to parse events: %d\n", rc); + return NULL; + } + if (list_empty(&ectx.jlist_head)) + return NULL; + + jerrors = json_object_new_array(); + if (!jerrors) + return NULL; + + list_for_each_safe (&ectx.jlist_head, jnode, next, list) { + struct json_object *jval = NULL; + struct json_object *jp = NULL; + int source, pflags; + u64 addr, len; + + jp = json_object_new_object(); + if 
(!jp) + return NULL; + + if (is_region) { + /* Per-region JSON includes memdev names */ + if (json_object_object_get_ex(jnode->jobj, "memdev", + &jval)) + json_object_object_add(jp, "memdev", jval); + } + if (json_object_object_get_ex(jnode->jobj, "dpa", &jval)) { + addr = json_object_get_int64(jval); + jobj = util_json_object_hex(addr, flags); + json_object_object_add(jp, "dpa", jobj); + } + if (json_object_object_get_ex(jnode->jobj, "length", &jval)) { + len = json_object_get_int64(jval); + jobj = util_json_object_size(len, flags); + json_object_object_add(jp, "length", jobj); + } + if (json_object_object_get_ex(jnode->jobj, "source", &jval)) { + source = json_object_get_int(jval); + if (source == CXL_POISON_SOURCE_UNKNOWN) + jobj = json_object_new_string("Unknown"); + else if (source == CXL_POISON_SOURCE_EXTERNAL) + jobj = json_object_new_string("External"); + else if (source == CXL_POISON_SOURCE_INTERNAL) + jobj = json_object_new_string("Internal"); + else if (source == CXL_POISON_SOURCE_INJECTED) + jobj = json_object_new_string("Injected"); + else if (source == CXL_POISON_SOURCE_VENDOR) + jobj = json_object_new_string("Vendor"); + else + jobj = json_object_new_string("Reserved"); + json_object_object_add(jp, "source", jobj); + } + if (json_object_object_get_ex(jnode->jobj, "flags", &jval)) { + char flag_str[32] = { '\0' }; + + pflags = json_object_get_int(jval); + if (pflags & CXL_POISON_FLAG_MORE) + strcat(flag_str, "More,"); + if (pflags & CXL_POISON_FLAG_OVERFLOW) + strcat(flag_str, "Overflow,"); + if (pflags & CXL_POISON_FLAG_SCANNING) + strcat(flag_str, "Scanning,"); + jobj = json_object_new_string(flag_str); + if (jobj) + json_object_object_add(jp, "flags", jobj); + } + if (json_object_object_get_ex(jnode->jobj, "overflow_t", &jval)) + json_object_object_add(jp, "overflow_time", jval); + + json_object_array_add(jerrors, jp); + count++; + } /* list_for_each_safe */ + + jmedia = json_object_new_object(); + if (!jmedia) + return NULL; + + /* Always return 
the count. If count is zero, no records follow. */ + jobj = json_object_new_int(count); + if (jobj) + json_object_object_add(jmedia, "nr_media_errors", jobj); + if (count) + json_object_object_add(jmedia, "media_error_records", jerrors); + + return jmedia; +} + +struct cxl_media_err_ctx { + void *dev; + bool is_region; +}; + +static struct json_object * +util_cxl_media_errors_to_json(struct cxl_media_err_ctx *mectx, + unsigned long flags) +{ + struct json_object *jmedia = NULL; + struct tracefs_instance *inst; + int rc; + + inst = tracefs_instance_create("cxl list"); + if (!inst) { + fprintf(stderr, "tracefs_instance_create() failed\n"); + return NULL; + } + + rc = cxl_event_tracing_enable(inst, "cxl", "cxl_poison"); + if (rc < 0) { + fprintf(stderr, "Failed to enable trace: %d\n", rc); + goto err_free; + } + + if (mectx->is_region) + rc = cxl_region_trigger_poison_list(mectx->dev); + else + rc = cxl_memdev_trigger_poison_list(mectx->dev); + if (rc) { + fprintf(stderr, "Failed write of sysfs attribute: %d\n", rc); + goto err_free; + } + + rc = cxl_event_tracing_disable(inst); + if (rc < 0) { + fprintf(stderr, "Failed to disable trace: %d\n", rc); + goto err_free; + } + + jmedia = util_cxl_poison_events_to_json(inst, mectx->is_region, flags); +err_free: + tracefs_instance_free(inst); + return jmedia; +} + struct json_object *util_cxl_memdev_to_json(struct cxl_memdev *memdev, unsigned long flags) { @@ -359,6 +524,16 @@ struct json_object *util_cxl_memdev_to_json(struct cxl_memdev *memdev, if (jobj) json_object_object_add(jdev, "partition_info", jobj); } + + if (flags & UTIL_JSON_MEDIA_ERRORS) { + struct cxl_media_err_ctx mectx = { + .dev = memdev, + .is_region = false, + }; + jobj = util_cxl_media_errors_to_json(&mectx, flags); + if (jobj) + json_object_object_add(jdev, "media_errors", jobj); + } return jdev; } @@ -678,6 +853,16 @@ struct json_object *util_cxl_region_to_json(struct cxl_region *region, json_object_object_add(jregion, "state", jobj); } + if (flags & 
UTIL_JSON_MEDIA_ERRORS) { + struct cxl_media_err_ctx mectx = { + .dev = region, + .is_region = true, + }; + jobj = util_cxl_media_errors_to_json(&mectx, flags); + if (jobj) + json_object_object_add(jregion, "media_errors", jobj); + } + util_cxl_mappings_append_json(jregion, region, flags); return jregion; From patchwork Fri Nov 11 03:20:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13039544 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18010C4321E for ; Fri, 11 Nov 2022 03:20:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229757AbiKKDUR (ORCPT ); Thu, 10 Nov 2022 22:20:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56266 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232231AbiKKDUQ (ORCPT ); Thu, 10 Nov 2022 22:20:16 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 22E284D5CB for ; Thu, 10 Nov 2022 19:20:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1668136816; x=1699672816; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=I+nvXp12J+biWDdGcdaPbaU5FE0X69XrhRSTjEwW9A4=; b=eD1N0O1wlGQy7CR1wFqEsTIzOBtggPqL+dA6KHIQMs6eqbMCUsUJaGZW 14OG6bOqO8iESL24f75mCRpSd+V5iBTb2S9qrtv1ZhJuHkeqTDxWfXy31 HcwVc3vwiCWdAqteJCmV7TF+IsdznsNHqRNOxTalj7cpS0XUB9pTULIGy vmho+IQlTuYJNUD5PbBAb8EurZ7LrS8yxCZytlwwZ1qc9ESHVTTYP4E6O Gfp3pdGywOcMwmteJy0uKRc8OCPTz6I2Nt7Xwo345DE01l3sLG9snbCor USiadS9/jvax3pGgUd+bEjsdTR2jF7uJFGFxkXkg4oAlyxpWDtB4/TxEd A==; X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="397807338" 
X-IronPort-AV: E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="397807338" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:15 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="743129970" X-IronPort-AV: E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="743129970" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.161.45]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:15 -0800 From: alison.schofield@intel.com To: Dan Williams , Ira Weiny , Vishal Verma , Dave Jiang , Ben Widawsky Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org Subject: [ndctl PATCH 4/5] cxl/list: add --media-errors option to cxl list Date: Thu, 10 Nov 2022 19:20:07 -0800 Message-Id: <762edeab529125d3048cf13721360b1a07260531.1668133294.git.alison.schofield@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org From: Alison Schofield The --media-errors option to 'cxl list' retrieves poison lists from memory devices (supporting the capability) and displays the returned media-error records in the cxl list json. This option can apply to memdevs or regions. Signed-off-by: Alison Schofield --- Documentation/cxl/cxl-list.txt | 64 ++++++++++++++++++++++++++++++++++ cxl/filter.c | 2 ++ cxl/filter.h | 1 + cxl/list.c | 2 ++ 4 files changed, 69 insertions(+) diff --git a/Documentation/cxl/cxl-list.txt b/Documentation/cxl/cxl-list.txt index 14a2b4bb5c2a..24a0cf97cef2 100644 --- a/Documentation/cxl/cxl-list.txt +++ b/Documentation/cxl/cxl-list.txt @@ -344,6 +344,70 @@ OPTIONS --region:: Specify CXL region device name(s), or device id(s), to filter the listing. +-a:: +--media-errors:: + Include media-error information. The poison list is retrieved + from the device(s) and media error records are added to the + listing. 
This option applies to memdevs and regions where + devices support the poison list capability. + +---- +# cxl list -m mem11 --media-errors +[ + { + "memdev":"mem11", + "pmem_size":268435456, + "ram_size":0, + "serial":0, + "host":"0000:37:00.0", + "media_errors":{ + "nr_media_errors":1, + "media_error_records":[ + { + "dpa":0, + "length":64, + "source":"Internal", + "flags":"", + "overflow_time":0 + } + ] + } + } +] +# cxl list -r region5 --media-errors +[ + { + "region":"region5", + "resource":1035623989248, + "size":2147483648, + "interleave_ways":2, + "interleave_granularity":4096, + "decode_state":"commit", + "media_errors":{ + "nr_media_errors":2, + "media_error_records":[ + { + "memdev":"mem2", + "dpa":0, + "length":64, + "source":"Internal", + "flags":"", + "overflow_time":0 + }, + { + "memdev":"mem5", + "dpa":0, + "length":512, + "source":"Vendor", + "flags":"", + "overflow_time":0 + } + ] + } + } +] +---- + -v:: --verbose:: Increase verbosity of the output. This can be specified diff --git a/cxl/filter.c b/cxl/filter.c index 56c659965891..fe6c29148fb4 100644 --- a/cxl/filter.c +++ b/cxl/filter.c @@ -686,6 +686,8 @@ static unsigned long params_to_flags(struct cxl_filter_params *param) flags |= UTIL_JSON_TARGETS; if (param->partition) flags |= UTIL_JSON_PARTITION; + if (param->media_errors) + flags |= UTIL_JSON_MEDIA_ERRORS; return flags; } diff --git a/cxl/filter.h b/cxl/filter.h index 256df49c3d0c..a92295fe2511 100644 --- a/cxl/filter.h +++ b/cxl/filter.h @@ -26,6 +26,7 @@ struct cxl_filter_params { bool human; bool health; bool partition; + bool media_errors; int verbose; struct log_ctx ctx; }; diff --git a/cxl/list.c b/cxl/list.c index 8c48fbbaaec3..df2ae5a3fec0 100644 --- a/cxl/list.c +++ b/cxl/list.c @@ -52,6 +52,8 @@ static const struct option options[] = { "include memory device health information"), OPT_BOOLEAN('I', "partition", &param.partition, "include memory device partition information"), + OPT_BOOLEAN('a', "media-errors", &param.media_errors, + 
"include media error information "), OPT_INCR('v', "verbose", &param.verbose, "increase output detail"), #ifdef ENABLE_DEBUG From patchwork Fri Nov 11 03:20:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alison Schofield X-Patchwork-Id: 13039545 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E298EC433FE for ; Fri, 11 Nov 2022 03:20:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232241AbiKKDUS (ORCPT ); Thu, 10 Nov 2022 22:20:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232231AbiKKDUR (ORCPT ); Thu, 10 Nov 2022 22:20:17 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36CE5528A8 for ; Thu, 10 Nov 2022 19:20:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1668136817; x=1699672817; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=sQeRo3g4sdBJ0pWvvRj+9tOdBa8K1tS5T7mi8mKysLQ=; b=OgppwqP0/MLMmTNLi/zNDqzy/wIwoUkDcm8bvNxBZVwkjWE60y5OxiPz +o6r/1y16CFldge4PwZnhreMyaPa/8No8WcoV9uzz/Zho0OLdHDtG/E+d gWMNICxSvmfDNC5fPwRJlqUy9Ti1Gl1M8Olqukp8pMjPmM/NI4+rkW8qO fESXqx9euIQee30fp0Swdw92T32yhXmf7N2EHV0ZKxBqIwFi5uTqgQUwe 1Oebwv0aZ+0FDjXsPPnOYbuNrMygMrQ4rKIsqDkpIsLMj14alp2VuTAfH tvg1Sk1XKJI3eOkATS5QZ4KaPKg/RGyOYUf0KEXEPo5srhcjMw5LY+W8G A==; X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="397807340" X-IronPort-AV: E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="397807340" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga105.fm.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:17 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10527"; a="743129974" X-IronPort-AV: E=Sophos;i="5.96,155,1665471600"; d="scan'208";a="743129974" Received: from aschofie-mobl2.amr.corp.intel.com (HELO localhost) ([10.209.161.45]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Nov 2022 19:20:16 -0800 From: alison.schofield@intel.com To: Dan Williams , Ira Weiny , Vishal Verma , Dave Jiang , Ben Widawsky Cc: Alison Schofield , nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org Subject: [ndctl PATCH 5/5] test: add a cxl-get-poison test Date: Thu, 10 Nov 2022 19:20:08 -0800 Message-Id: X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org From: Alison Schofield Exercise cxl list, libcxl, and driver pieces of the get poison pathway. The poison records themselves are mocked by cxl_test, but the work of triggering the poison read, logging as trace events, and then collecting and parsing is all for real. Signed-off-by: Alison Schofield --- test/cxl-get-poison.sh | 78 ++++++++++++++++++++++++++++++++++++++++++ test/meson.build | 2 ++ 2 files changed, 80 insertions(+) create mode 100644 test/cxl-get-poison.sh diff --git a/test/cxl-get-poison.sh b/test/cxl-get-poison.sh new file mode 100644 index 000000000000..fe93a67a5240 --- /dev/null +++ b/test/cxl-get-poison.sh @@ -0,0 +1,78 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2022 Intel Corporation. All rights reserved. + +. $(dirname $0)/common + +rc=1 + +set -ex + +trap 'err $LINENO' ERR + +check_prereq "jq" + +modprobe -r cxl_test +modprobe cxl_test +udevadm settle + +# The number of errors that cxl_test mocks is subject to change. +NR_ERRS=2 + +# THEORY OF OPERATION: Exercise cxl-cli and cxl driver capabilities wrt +# retrieving poison lists. The poison list is maintained by the device. 
+# It may be requested per memdev or per region. + +create_region() +{ + region=$($CXL create-region -d $decoder -m $memdevs | jq -r ".region") + + if [[ ! $region ]]; then + echo "create-region failed for $decoder" + err "$LINENO" + fi +} + +setup_x2_region() +{ + # Find an x2 decoder + decoder=$($CXL list -b cxl_test -D -d root | jq -r ".[] | + select(.pmem_capable == true) | + select(.nr_targets == 2) | + .decoder") + + # Find a memdev for each host-bridge interleave position + port_dev0=$($CXL list -T -d $decoder | jq -r ".[] | + .targets | .[] | select(.position == 0) | .target") + port_dev1=$($CXL list -T -d $decoder | jq -r ".[] | + .targets | .[] | select(.position == 1) | .target") + mem0=$($CXL list -M -p $port_dev0 | jq -r ".[0].memdev") + mem1=$($CXL list -M -p $port_dev1 | jq -r ".[0].memdev") + memdevs="$mem0 $mem1" +} + +find_media_errors() +{ + nr=$(echo $json | jq -r ".nr_media_errors") + if [[ $nr -ne $NR_ERRS ]]; then + echo "$mem: $NR_ERRS media errors expected, $nr found" + err "$LINENO" + fi +} + +# Read poison from each available memdev +readarray -t mems < <("$CXL" list -b cxl_test -Mi | jq -r '.[].memdev') +for mem in ${mems[@]}; do + json=$("$CXL" list -m "$mem" --media-errors | jq -r '.[].media_errors') + find_media_errors +done + +# Read poison from one region +setup_x2_region +create_region +json=$("$CXL" list -r "$region" --media-errors | jq -r '.[].media_errors') +find_media_errors +$CXL disable-region $region +$CXL destroy-region $region + +modprobe -r cxl_test diff --git a/test/meson.build b/test/meson.build index 5953c286d13f..721c69e79f5e 100644 --- a/test/meson.build +++ b/test/meson.build @@ -154,6 +154,7 @@ cxl_topo = find_program('cxl-topology.sh') cxl_sysfs = find_program('cxl-region-sysfs.sh') cxl_labels = find_program('cxl-labels.sh') cxl_create_region = find_program('cxl-create-region.sh') +cxl_get_poison = find_program('cxl-get-poison.sh') tests = [ [ 'libndctl', libndctl, 'ndctl' ], @@ -182,6 +183,7 @@ tests = [ [ 
'cxl-region-sysfs.sh', cxl_sysfs, 'cxl' ], [ 'cxl-labels.sh', cxl_labels, 'cxl' ], [ 'cxl-create-region.sh', cxl_create_region, 'cxl' ], + [ 'cxl-get-poison.sh', cxl_get_poison, 'cxl' ], ] if get_option('destructive').enabled()