From patchwork Thu Mar 20 18:04:37 2025
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 14024288
From: Shiju Jose
Subject: [PATCH v2 0/8] cxl: support CXL memory RAS features
Date: Thu, 20 Mar 2025 18:04:37 +0000
Message-ID: <20250320180450.539-1-shiju.jose@huawei.com>

Support for CXL memory RAS features: patrol scrub, ECS, soft-PPR and
memory sparing.

This CXL series was part of the EDAC series [1].

The code is based on cxl.git next branch [2] merged with ras.git edac-cxl
branch [3].

[1]: https://lore.kernel.org/linux-cxl/20250212143654.1893-1-shiju.jose@huawei.com/
[2]: https://web.git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=next
[3]: https://web.git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git/log/?h=edac-cxl

Userspace code for CXL memory repair features [4] and sample boot-script
for CXL memory repair [5].

[4]: https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.com/
[5]: https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.com/

Changes
=======
v1 -> v2:
1. Feedback from Dan Williams on v1,
   https://lore.kernel.org/linux-mm/20250307091137.00006a0a@huawei.com/T/
  - Fixed lock issues in region scrubbing, added local cxl_acquire()
    and cxl_unlock.
  - Replaced the CXL examples using cat and echo in the EDAC .rst docs
    with a short description and a reference to the ABI docs. Also
    corrected existing descriptions as suggested by Dan.
  - Added a policy description for the scrub control feature.
    However, this may require input from CXL experts.
  - Replaced CONFIG_CXL_RAS_FEATURES with CONFIG_CXL_EDAC_MEM_FEATURES.
  - A few changes to the depends part of CONFIG_CXL_EDAC_MEM_FEATURES.
  - Renamed drivers/cxl/core/memfeatures.c to drivers/cxl/core/edac.c.
  - snprintf() -> kasprintf() in a few places.

2. Feedback from Alison on v1,
  - In cxl_get_feature_entry() (patch 1), return NULL on failures and
    reintroduced checks in cxl_get_feature_entry().
  - Changed the for-loop logic in the region-based scrubbing code.
  - Replaced cxl_are_decoders_committed() with
    cxl_is_memdev_memory_online() and added it as a local function in
    drivers/cxl/core/edac.c.
  - Changed a few multiline comments to single-line comments.
  - Removed unnecessary comments from the code.
  - Reduced the line length of a few macros in the ECS and memory repair
    code.
  - In new files, changed "GPL-2.0-or-later" -> "GPL-2.0-only".
  - Ran clang-format on the new files and updated them.

3. Feedback from Jonathan on v1,
  - Changed a few multiline comments to single-line comments.
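As an illustration of the "return NULL on failures" convention mentioned
for cxl_get_feature_entry(), the lookup pattern can be sketched in plain
userspace C. This is only a sketch under stated assumptions, not the code
from patch 1: the struct layouts, the names get_feature_entry(),
demo_table and demo_*_uuid, and the UUID byte values are all hypothetical
stand-ins chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's real structures. */
struct feat_entry {
	uint8_t uuid[16];	/* feature UUID as raw bytes */
	uint16_t id;		/* illustrative payload field */
};

struct feat_table {
	size_t num_entries;
	const struct feat_entry *entries;
};

/*
 * Look up a feature entry by UUID.
 * Returns a pointer to the matching entry, or NULL when the table is
 * missing or empty, the UUID pointer is NULL, or no entry matches.
 */
static const struct feat_entry *
get_feature_entry(const struct feat_table *tbl, const uint8_t *uuid)
{
	if (!tbl || !tbl->entries || !tbl->num_entries || !uuid)
		return NULL;

	for (size_t i = 0; i < tbl->num_entries; i++) {
		if (!memcmp(tbl->entries[i].uuid, uuid, 16))
			return &tbl->entries[i];
	}

	return NULL;
}

/* Made-up UUIDs for the example only; not real CXL feature UUIDs. */
static const uint8_t demo_scrub_uuid[16] = { 0x01, 0x02, 0x03, 0x04 };
static const uint8_t demo_unknown_uuid[16] = { 0xff, 0xee, 0xdd, 0xcc };

static const struct feat_entry demo_entries[] = {
	{ .uuid = { 0x01, 0x02, 0x03, 0x04 }, .id = 1 },
	{ .uuid = { 0x0a, 0x0b, 0x0c, 0x0d }, .id = 2 },
};

static const struct feat_table demo_table = {
	.num_entries = sizeof(demo_entries) / sizeof(demo_entries[0]),
	.entries = demo_entries,
};
```

With this shape, a caller only needs a single NULL check, for example
"if (!get_feature_entry(&demo_table, demo_unknown_uuid))", to treat the
feature as unsupported, instead of decoding an error-pointer value.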

Shiju Jose (8):
  cxl: Add helper function to retrieve a feature entry
  EDAC: Update documentation for the CXL memory patrol scrub control
    feature
  cxl/edac: Add CXL memory device patrol scrub control feature
  cxl/edac: Add CXL memory device ECS control feature
  cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command
  cxl: Support for finding memory operation attributes from the current
    boot
  cxl/memfeature: Add CXL memory device soft PPR control feature
  cxl/memfeature: Add CXL memory device memory sparing control feature

 Documentation/edac/memory_repair.rst |   31 +
 Documentation/edac/scrub.rst         |   47 +
 drivers/cxl/Kconfig                  |   27 +
 drivers/cxl/core/Makefile            |    1 +
 drivers/cxl/core/core.h              |    2 +
 drivers/cxl/core/edac.c              | 1730 ++++++++++++++++++++++++++
 drivers/cxl/core/features.c          |   23 +
 drivers/cxl/core/mbox.c              |   45 +-
 drivers/cxl/core/memdev.c            |    9 +
 drivers/cxl/core/ras.c               |  145 +++
 drivers/cxl/core/region.c            |    5 +
 drivers/cxl/cxlmem.h                 |   73 ++
 drivers/cxl/mem.c                    |    4 +
 drivers/cxl/pci.c                    |    3 +
 drivers/edac/mem_repair.c            |    9 +
 include/linux/edac.h                 |    7 +
 16 files changed, 2159 insertions(+), 2 deletions(-)
 create mode 100644 drivers/cxl/core/edac.c