From patchwork Fri Nov 1 09:17:33 2024
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 13858971
Subject: [PATCH v15 15/15] EDAC: Add documentation for RAS feature control
Date: Fri, 1 Nov 2024 09:17:33 +0000
Message-ID: <20241101091735.1465-16-shiju.jose@huawei.com>
In-Reply-To: <20241101091735.1465-1-shiju.jose@huawei.com>
References: <20241101091735.1465-1-shiju.jose@huawei.com>
From: Shiju Jose

Add Documentation for expansion of EDAC for controlling RAS features.

Signed-off-by: Shiju Jose
---
 Documentation/edac/features.rst      | 102 +++++++
 Documentation/edac/index.rst         |  12 +
 Documentation/edac/memory_repair.rst | 230 ++++++++++++++++
 Documentation/edac/scrub.rst         | 393 +++++++++++++++++++++++++++
 4 files changed, 737 insertions(+)
 create mode 100644 Documentation/edac/features.rst
 create mode 100644 Documentation/edac/index.rst
 create mode 100644 Documentation/edac/memory_repair.rst
 create mode 100644 Documentation/edac/scrub.rst

diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst
new file mode 100644
index 000000000000..5e855952136b
--- /dev/null
+++ b/Documentation/edac/features.rst
@@ -0,0 +1,102 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================================
+Augmenting EDAC for controlling RAS features
+============================================
+
+Copyright (c) 2024 HiSilicon Limited.
+
+:Author: Shiju Jose
+:License: The GNU Free Documentation License, Version 1.2
+          (dual licensed under the GPL v2)
+:Original Reviewers:
+
+- Written for: 6.13
+
+Introduction
+------------
+EDAC is extended to control RAS features and to expose the feature
+control attributes to userspace via sysfs. Some examples:
+
+* Scrub control
+
+* Error Check Scrub (ECS) control
+
+* ACPI RAS2 features
+
+* Post Package Repair (PPR) control
+
+* Memory Sparing Repair control etc.
+
+High level design is illustrated in the following diagram::
+
+         _______________________________________________
+        |   Userspace - Rasdaemon                       |
+        |  _____________                                |
+        | | RAS CXL mem |      _______________          |
+        | |error handler|---->|               |         |
+        | |_____________|     | RAS dynamic   |         |
+        |  _____________      | scrub, memory |         |
+        | | RAS memory  |---->| repair control|         |
+        | |error handler|     |_______________|         |
+        | |_____________|          |                    |
+        |__________________________|____________________|
+                                   |
+                                   |
+    _______________________________|______________________________
+   |     Kernel EDAC extension for | controlling RAS Features     |
+   | ______________________________|____________________________  |
+   || EDAC Core          Sysfs EDAC| Bus                        | |
+   ||    __________________________|_________     _____________ | |
+   ||   |/sys/bus/edac/devices//scrubX/      |   | EDAC device || |
+   ||   |/sys/bus/edac/devices//ecsX/        |<->| EDAC MC     || |
+   ||   |/sys/bus/edac/devices//repairX      |   | EDAC sysfs  || |
+   ||   |____________________________________|   |_____________|| |
+   ||                          EDAC|Bus                         | |
+   ||                              |                            | |
+   ||   __________ Get feature     |       Get feature          | |
+   ||  |          |desc   _________|______ desc  __________     | |
+   ||  |EDAC scrub|<-----| EDAC device    |     |          |    | |
+   ||  |__________|      | driver- RAS    |---->| EDAC mem |    | |
+   ||   __________       | feature control|     | repair   |    | |
+   ||  |          |<-----|________________|     |__________|    | |
+   ||  |EDAC ECS  |    Register RAS|features                    | |
+   ||  |__________|                |                            | |
+   ||        ______________________|_____________               | |
+   ||_________|_______________|__________________|______________| |
+   |   _______|____    _______|_______       ____|__________      |
+   |  |            |  | CXL mem driver|     | Client driver |     |
+   |  | ACPI RAS2  |  | scrub, ECS,   |     | memory repair |     |
+   |  | driver     |  | sparing, PPR  |     | features      |     |
+   |  |____________|  |_______________|     |_______________|     |
+   |        |                 |                    |              |
+   |________|_________________|____________________|______________|
+            |                 |                    |
+    ________|_________________|____________________|______________
+   |     ___|_________________|____________________|_______       |
+   |    |                                                  |      |
+   |    |            Platform HW and Firmware              |      |
+   |    |__________________________________________________|      |
+   |______________________________________________________________|
+
+
+1. EDAC features components - Create feature-specific descriptors, for
+example, EDAC scrub, EDAC ECS, and EDAC memory repair in the above
+diagram.
+
+2. EDAC device driver for controlling RAS features - Gets the feature's
+attribute descriptors from the EDAC RAS feature component, registers the
+device's RAS features with the EDAC bus, and exposes the feature control
+attributes via the sysfs EDAC bus. For example, /sys/bus/edac/devices//X/
+
+3. RAS dynamic feature controller - Userspace sample modules in rasdaemon for
+dynamic scrub/repair control, which issue scrubbing/repair when an excess
+number of corrected memory errors is reported in a short span of time.
+
+RAS features
+------------
+1. Memory Scrub
+Memory scrub features are documented in `Documentation/edac/scrub.rst`.
+
+2. Memory Repair
+Memory repair features are documented in `Documentation/edac/memory_repair.rst`.
diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst
new file mode 100644
index 000000000000..d6778f4562dd
--- /dev/null
+++ b/Documentation/edac/index.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+EDAC Subsystem
+==============
+
+.. toctree::
+   :maxdepth: 1
+
+   features
+   memory_repair
+   scrub
diff --git a/Documentation/edac/memory_repair.rst b/Documentation/edac/memory_repair.rst
new file mode 100644
index 000000000000..ad7f869e0b15
--- /dev/null
+++ b/Documentation/edac/memory_repair.rst
@@ -0,0 +1,230 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+EDAC Memory Repair Control
+==========================
+
+Copyright (c) 2024 HiSilicon Limited.
+
+:Author: Shiju Jose
+:License: The GNU Free Documentation License, Version 1.2
+          (dual licensed under the GPL v2)
+:Original Reviewers:
+
+- Written for: 6.13
+
+Introduction
+------------
+Memory devices may support memory repair and maintenance operations to
+perform repairs on faulty memory media. Various types of memory repair
+features are available, such as Post Package Repair (PPR) and memory
+sparing.
+
+Post Package Repair (PPR)
+~~~~~~~~~~~~~~~~~~~~~~~~~
+A PPR maintenance operation requests the memory device to perform a repair
+operation on its media, if supported. A memory device may support two types
+of PPR: Hard PPR (hPPR), for a permanent row repair, and Soft PPR (sPPR),
+for a temporary row repair. sPPR is much faster than hPPR, but the repair
+is lost with a power cycle. During the execution of a PPR maintenance
+operation, a memory device may or may not retain data and may or may not
+be able to process memory requests correctly. An sPPR maintenance operation
+may be executed at runtime if data is retained and memory requests are
+correctly processed. An hPPR maintenance operation may be executed only at
+boot because data would not be retained. In CXL devices, sPPR and hPPR
+repair operations may be supported (CXL spec rev 3.1 sections 8.2.9.7.1.2
+and 8.2.9.7.1.3).
+
+Memory Sparing
+~~~~~~~~~~~~~~
+Memory sparing is defined as a repair function that replaces a portion of
+memory with a portion of functional memory at that same DPA. A userspace
+tool, e.g. rasdaemon, may request the sparing operation for a given
+address for which an uncorrectable error is reported. In CXL
+(CXL spec 3.1 section 8.2.9.7.1.4), subclasses of the sparing operation vary
+in terms of the scope of the sparing being performed. The cacheline sparing
+subclass refers to a sparing action that can replace a full cacheline.
+Row sparing is provided as an alternative to PPR sparing functions and its
+scope is that of a single DDR row. Bank sparing allows an entire bank to
+be replaced. Rank sparing is defined as an operation in which an entire
+DDR rank is replaced.
+
+Use cases of generic memory repair features control
+---------------------------------------------------
+
+1. The Soft PPR (sPPR), Hard PPR (hPPR), and memory-sparing features share
+similar control interfaces. Therefore, there is a need for a standardized,
+generic sysfs repair control that is exposed to userspace and used by
+administrators, scripts, and tools.
+
+2. When a CXL device detects a failure in a memory component, it may inform
+the host of the need for a repair maintenance operation by using an event
+record where the "maintenance needed" flag is set. The event record
+specifies the DPA that requires repair. The kernel reports the corresponding
+CXL general media or DRAM trace event to userspace, and userspace tools
+(e.g., rasdaemon) initiate a repair maintenance operation in response to
+the device request using the sysfs repair control.
+
+3. Userspace tools, such as rasdaemon, may request a PPR/sparing on a memory
+region when an uncorrected memory error or an excess of corrected memory
+errors is reported on that memory.
+
+4. Multiple PPR/sparing instances may be present per memory device.
+
+The File System
+---------------
+
+The control attributes of a registered memory repair instance can be
+accessed under
+
+/sys/bus/edac/devices//mem_repairX/
+
+sysfs
+-----
+
+Sysfs files are documented in
+
+`Documentation/ABI/testing/sysfs-edac-memory-repair`.
+
+Example
+-------
+
+The usage takes the form shown in this example:
+
+1. CXL memory device sPPR
+
+# read capabilities
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/dpa_support
+
+1
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/nibble_mask
+
+0x0
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/persist_mode_avail
+
+0
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/persist_mode
+
+0
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair_type
+
+0
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair_safe_when_in_use
+
+1
+
+# set and readback attributes
+
+root@localhost:~# echo 0x8a2d > /sys/bus/edac/devices/cxl_mem0/mem_repair0/nibble_mask
+
+root@localhost:~# echo 0x300000 > /sys/bus/edac/devices/cxl_mem0/mem_repair0/dpa
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/dpa
+
+0x300000
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/nibble_mask
+
+0x8a2d
+
+# issue repair operations
+
+# query and repair return an error if unsupported/failed.
+
+root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair0/query
+
+root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair
+
+2. CXL memory sparing
+
+# read capabilities
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/repair_type
+
+2
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/dpa_support
+
+1
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/persist_mode_avail
+
+0,1
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/persist_mode
+
+0
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/repair_safe_when_in_use
+
+1
+
+# set and readback attributes
+
+root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/bank_group
+
+root@localhost:~# echo 3 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/bank
+
+root@localhost:~# echo 2 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/channel
+
+root@localhost:~# echo 7 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/rank
+
+root@localhost:~# echo 0x4fb9 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/row
+
+root@localhost:~# echo 5 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/sub_channel
+
+root@localhost:~# echo 11 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/column
+
+root@localhost:~# echo 0x85c2 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/nibble_mask
+
+root@localhost:~# echo 0x700000 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/dpa
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/bank_group
+
+1
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/bank
+
+3
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/channel
+
+2
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/rank
+
+7
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/row
+
+0x4fb9
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/sub_channel
+
+5
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/column
+
+11
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/nibble_mask
+
+0x85c2
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/dpa
+
+0x700000
+
+# issue repair operations
+
+# query and repair return an error if unsupported/failed.
+
+root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/query
+
+root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/repair
diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
new file mode 100644
index 000000000000..d316f98604ad
--- /dev/null
+++ b/Documentation/edac/scrub.rst
@@ -0,0 +1,393 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+EDAC Scrub Control
+===================
+
+Copyright (c) 2024 HiSilicon Limited.
+
+:Author: Shiju Jose
+:License: The GNU Free Documentation License, Version 1.2
+          (dual licensed under the GPL v2)
+:Original Reviewers:
+
+- Written for: 6.13
+
+Introduction
+------------
+Increasing DRAM size and cost have made memory subsystem reliability an
+important concern. These modules are used where potentially corrupted data
+could cause expensive or fatal issues. Memory errors are among the top
+hardware failures that cause server and workload crashes.
+
+Memory scrubbing is a feature where an ECC (Error-Correcting Code) engine
+reads data from each memory media location, corrects it with an ECC if
+necessary, and writes the corrected data back to the same memory media
+location.
+
+The memory DIMMs can be scrubbed at a configurable rate to detect
+uncorrected memory errors and attempt recovery from detected errors,
+providing the following benefits:
+
+* Proactively scrubbing memory DIMMs reduces the chance of a correctable error becoming uncorrectable.
+
+* When detected, uncorrected errors caught in unallocated memory pages are isolated and prevented from being allocated to an application or the OS.
+
+* This reduces the likelihood of software or hardware products encountering memory errors.
+
+There are 2 types of memory scrubbing:
+
+1. Background (patrol) scrubbing of the RAM while the RAM is otherwise
+idle.
+
+2. On-demand scrubbing for a specific address range or region of memory.
+
+Several types of interfaces to hardware memory scrubbers have been
+identified, such as CXL memory device patrol scrub, CXL DDR5 ECS, ACPI
+RAS2 memory scrubbing, and ACPI NVDIMM ARS (Address Range Scrub).
+
+The scrub control varies between different memory scrubbers. To allow
+for standard userspace tooling there is a need to present these controls
+with a standard ABI.
+
+The control mechanisms vary across different memory scrubbers. To enable
+standardized userspace tooling, there is a need to present these controls
+through a standardized ABI.
+
+The generic EDAC memory scrub control allows users to manage the
+underlying scrubbers in the system through a standardized sysfs scrub
+control interface. This common sysfs scrub control interface abstracts the
+management of various scrubbing functionalities into a unified set of
+functions.
+
+Use cases of common scrub control feature
+-----------------------------------------
+1. Several types of interfaces for hardware (HW) memory scrubbers have
+been identified, including the CXL memory device patrol scrub, CXL DDR5
+ECS, ACPI RAS2 memory scrubbing features, ACPI NVDIMM ARS (Address Range
+Scrub), and software-based memory scrubbers. Some of these scrubbers
+support control over patrol (background) scrubbing (e.g., ACPI RAS2, CXL)
+and/or on-demand scrubbing (e.g., ACPI RAS2, ACPI ARS). However, the scrub
+control interfaces vary between memory scrubbers, highlighting the need for
+a standardized, generic sysfs scrub control interface that is accessible to
+userspace for administration and use by scripts/tools.
+
+2. Userspace scrub controls allow users to disable scrubbing if necessary,
+for example, to disable background patrol scrubbing or adjust the scrub
+rate for performance-aware operations where background activities need to
+be minimized or disabled.
+
+3. Userspace tools enable on-demand scrubbing for specific address ranges,
+provided that the scrubber supports this functionality.
+
+4. Userspace tools can also control memory DIMM scrubbing at a configurable
+scrub rate via sysfs scrub controls. This approach offers several benefits:
+
+* Detects uncorrectable memory errors early, before user access to affected memory, helping facilitate recovery.
+
+* Reduces the likelihood of correctable errors developing into uncorrectable errors.
+
+5. Policy control for hotplugged memory is necessary because there may not
+be a system-wide BIOS or similar control to manage scrub settings for a CXL
+device added after boot. Determining these settings is a policy decision,
+balancing reliability against performance, so userspace should control it.
+Therefore, a unified interface is recommended for handling this function in
+a way that aligns with other similar interfaces, rather than creating a
+separate one.
+
+Scrubbing features
+------------------
+Comparison of various scrubbing features::
+
+  ................................................................
+  .              . ACPI      . CXL patrol. CXL ECS   . ARS       .
+  . Name         . RAS2      . scrub     .           .           .
+  ................................................................
+  .              .           .           .           .           .
+  . On-demand    . Supported . No        . No        . Supported .
+  . Scrubbing    .           .           .           .           .
+  .              .           .           .           .           .
+  ................................................................
+  .              .           .           .           .           .
+  . Background   . Supported . Supported . Supported . No        .
+  . scrubbing    .           .           .           .           .
+  .              .           .           .           .           .
+  ................................................................
+  .              .           .           .           .           .
+  . Mode of      . Scrub ctrl. per device. per memory. Unknown   .
+  . scrubbing    . per NUMA  .           . media     .           .
+  .              . domain.   .           .           .           .
+  ................................................................
+  .              .           .           .           .           .
+  . Query scrub  . Supported . Supported . Supported . Supported .
+  . capabilities .           .           .           .           .
+  .              .           .           .           .           .
+  ................................................................
+  .              .           .           .           .           .
+  . Setting      . Supported . No        . No        . Supported .
+  . address range.           .           .           .           .
+  .              .           .           .           .           .
+  ................................................................
+  .              .           .           .           .           .
+  . Setting      . Supported . Supported . No        . No        .
+  . scrub rate   .           .           .           .           .
+  .              .           .           .           .           .
+  ................................................................
+  .              .           .           .           .           .
+  . Unit for     . Not       . in hours  . No        . No        .
+  . scrub rate   . Defined   .           .           .           .
+  .              .           .           .           .           .
+  ................................................................
+  .              . Supported .           .           .           .
+  . Scrub        . on-demand . No        . No        . Supported .
+  . status/      . scrubbing .           .           .           .
+  . Completion   . only      .           .           .           .
+  ................................................................
+  . UC error     .           .CXL general.CXL general. ACPI UCE  .
+  . reporting    . Exception .media/DRAM .media/DRAM . notify and.
+  .              .           .event/media.event/media. query     .
+  .              .           .scan?      .scan?      . ARS status.
+  ................................................................
+  .              .           .           .           .           .
+  . Support for  . Supported . Supported . Supported . No        .
+  . EDAC control .           .           .           .           .
+  .              .           .           .           .           .
+  ................................................................
+
+CXL Memory Scrubbing features
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+CXL spec r3.1 section 8.2.9.9.11.1 describes the memory device patrol scrub
+control feature. The device patrol scrub proactively locates and corrects
+errors on a regular cycle. The patrol scrub control allows the requester
+to configure the patrol scrubber's input configurations.
+
+The patrol scrub control allows the requester to specify the number of
+hours in which the patrol scrub cycles must be completed, provided that
+the requested number is not less than the minimum number of hours for the
+patrol scrub cycle that the device is capable of. In addition, the patrol
+scrub controls allow the host to disable and enable the feature in case
+disabling of the feature is needed for other purposes such as
+performance-aware operations which require the background operations to be
+turned off.
+
+Error Check Scrub (ECS)
+~~~~~~~~~~~~~~~~~~~~~~~
+CXL spec r3.1 section 8.2.9.9.11.2 describes Error Check Scrub (ECS), a
+feature defined in the JEDEC DDR5 SDRAM Specification (JESD79-5) that
+allows the DRAM to internally read, correct single-bit errors, and write
+back corrected data bits to the DRAM array while providing transparency
+to error counts.
+
+A DDR5 device contains a number of memory media FRUs. The DDR5 ECS
+feature, and thus the ECS control driver, supports configuring
+the ECS parameters per FRU.
+
+ACPI RAS2 Hardware-based Memory Scrubbing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ACPI spec 6.5 section 5.2.21 describes the ACPI RAS2 table, which
+provides interfaces for platform RAS features and supports independent
+RAS controls and capabilities for a given RAS feature for multiple
+instances of the same component in a given system.
+Memory RAS features apply to RAS capabilities, controls and operations
+that are specific to memory. RAS2 PCC sub-spaces for memory-specific RAS
+features have a Feature Type of 0x00 (Memory).
+
+The platform can use the hardware-based memory scrubbing feature to expose
+controls and capabilities associated with hardware-based memory scrub
+engines. The RAS2 memory scrubbing feature supports the following as per spec:
+
+* Independent memory scrubbing controls for each NUMA domain, identified using its proximity domain.
+
+* Provision for background (patrol) scrubbing of the entire memory system, as well as on-demand scrubbing for a specific region of memory.
+
+ACPI Address Range Scrubbing (ARS)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ACPI spec 6.5 section 9.19.7.2 describes Address Range Scrubbing (ARS).
+ARS allows the platform to communicate memory errors to system software.
+This capability allows system software to prevent accesses to addresses
+with uncorrectable errors in memory. ARS functions manage all NVDIMMs
+present in the system. Only one scrub can be in progress system wide
+at any given time.
+The following functions are supported as per the specification:
+
+1. Query ARS Capabilities for a given address range, which indicates that the
+platform supports the ACPI NVDIMM Root Device Unconsumed Error Notification.
+
+2. Start ARS triggers an Address Range Scrub for the given memory range.
+Address scrubbing can be done for volatile memory, persistent memory, or both.
+
+3. The Query ARS Status command allows software to get the status of ARS,
+including the progress of ARS and the ARS error record.
+
+4. Clear Uncorrectable Error.
+
+5. Translate SPA.
+
+6. ARS Error Inject etc.
+
+The kernel supports an existing control for ARS, and ARS is currently not
+supported in EDAC.
+
+The File System
+---------------
+
+The control attributes of a registered scrubber instance can be
+accessed under
+
+/sys/bus/edac/devices//scrubX/
+
+sysfs
+-----
+
+Sysfs files are documented in
+
+`Documentation/ABI/testing/sysfs-edac-scrub`.
+
+`Documentation/ABI/testing/sysfs-edac-ecs`.
+
+Example
+-------
+
+The usage takes the form shown in this example:
+
+1. CXL memory device patrol scrubber
+
+1.1. Device based
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/min_cycle_duration
+
+3600
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/max_cycle_duration
+
+918000
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration
+
+43200
+
+root@localhost:~# echo 54000 > /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration
+
+54000
+
+root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background
+
+1
+
+root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background
+
+0
+
+1.2. Region based
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/min_cycle_duration
+
+3600
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/max_cycle_duration
+
+918000
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration
+
+43200
+
+root@localhost:~# echo 54000 > /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration
+
+54000
+
+root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/enable_background
+
+1
+
+root@localhost:~# echo 0 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background
+
+root@localhost:~# cat /sys/bus/edac/devices/cxl_region0/scrub0/enable_background
+
+0
+
+2. RAS2
+
+2.1. On-demand scrubbing for a specific memory region.
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/min_cycle_duration
+
+3600
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/max_cycle_duration
+
+86400
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+36000
+
+# Readback 'addr', non-zero - demand scrub is in progress, zero - scrub is finished.
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+0
+
+root@localhost:~# echo 54000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+root@localhost:~# echo 0x150000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/size
+
+# Writing 'addr' starts demand scrubbing; make sure the other attributes are set before that.
+
+root@localhost:~# echo 0x120000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+54000
+
+# Readback 'addr', non-zero - demand scrub is in progress, zero - scrub is finished.
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+0x120000
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+0
+
+2.2. Background scrubbing of the entire memory
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/min_cycle_duration
+
+3600
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/max_cycle_duration
+
+86400
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+36000
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
+
+0
+
+root@localhost:~# echo 10800 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+root@localhost:~# echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
+
+1
+
+root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+10800
+
+root@localhost:~# echo 0 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
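
The RAS2 sequences in the scrub.rst examples (on-demand scrubbing and background scrubbing) could be wrapped in small shell helpers along the following lines. This is only a sketch, assuming a POSIX shell and the attribute file names shown in those examples. The function names are hypothetical, and the up-front range check against min_cycle_duration/max_cycle_duration is an assumption of this sketch, not something the patch states the kernel requires of callers.

```shell
#!/bin/sh
# Sketch only: wraps the sysfs writes shown in the scrub.rst examples.
# Function names and the range check are assumptions, not kernel API.

# Set the cycle duration, then enable background scrubbing.
# $1: scrub directory, e.g. /sys/bus/edac/devices/acpi_ras_mem0/scrub0
# $2: requested cycle duration, in the unit used by the *_cycle_duration files
enable_background_scrub() {
    dir=$1
    want=$2
    min=$(cat "$dir/min_cycle_duration") || return 1
    max=$(cat "$dir/max_cycle_duration") || return 1
    # Assumption: refuse values outside the advertised range up front.
    if [ "$want" -lt "$min" ] || [ "$want" -gt "$max" ]; then
        echo "cycle duration $want is outside [$min, $max]" >&2
        return 1
    fi
    echo "$want" > "$dir/current_cycle_duration" || return 1
    echo 1 > "$dir/enable_background"
}

# Start an on-demand scrub. 'size' is written before 'addr' because,
# per the example comments, writing 'addr' is what starts the scrub.
# $1: scrub directory, $2: start address, $3: size
start_demand_scrub() {
    echo "$3" > "$1/size" || return 1
    echo "$2" > "$1/addr"
}

# Returns 0 while 'addr' reads back non-zero (demand scrub in progress),
# 1 when it reads back zero, and 2 if the attribute cannot be read.
demand_scrub_in_progress() {
    val=$(cat "$1/addr") || return 2
    case $val in
        0|0x0) return 1 ;;
        *) return 0 ;;
    esac
}
```

For instance, `enable_background_scrub /sys/bus/edac/devices/acpi_ras_mem0/scrub0 10800` would perform the same two writes as the background scrubbing example, and `start_demand_scrub` followed by polling `demand_scrub_in_progress` mirrors the on-demand example. Taking the directory as a parameter keeps the helpers usable for the cxl_mem0 and cxl_region0 scrub directories as well.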