From patchwork Fri Jan 20 03:46:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 13109101
Date: Fri, 20 Jan 2023 03:46:20 +0000
In-Reply-To: <20230120034622.2698268-1-jiaqiyan@google.com>
References: <20230120034622.2698268-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.39.0.246.g2a6d74b583-goog
Message-ID: <20230120034622.2698268-2-jiaqiyan@google.com>
Subject: [PATCH v2 1/3] mm: memory-failure: Add memory failure stats to sysfs
From: Jiaqi Yan
To: tony.luck@intel.com, naoya.horiguchi@nec.com
Cc: jiaqiyan@google.com, duenwen@google.com, rientjes@google.com,
 linux-mm@kvack.org, shy828301@gmail.com, akpm@linux-foundation.org,
 wangkefeng.wang@huawei.com

Today the kernel provides the following memory error info to userspace,
but each has its own disadvantage:
* HardwareCorrupted in /proc/meminfo: number of bytes poisoned in total,
  with no per NUMA node breakdown
* ras:memory_failure_event: only available after being explicitly enabled
* /dev/mcelog provides much useful info about the MCEs, but
  doesn't capture how memory_failure recovered memory MCEs
* kernel logs: userspace needs to process log text

Expose per NUMA node memory error stats as sysfs entries:

  /sys/devices/system/node/node${X}/memory_failure/total
  /sys/devices/system/node/node${X}/memory_failure/recovered
  /sys/devices/system/node/node${X}/memory_failure/ignored
  /sys/devices/system/node/node${X}/memory_failure/failed
  /sys/devices/system/node/node${X}/memory_failure/delayed

These counters describe how many raw pages are poisoned and, after the
recoveries attempted by the kernel, their resolutions: how many are
recovered, ignored, failed, or delayed respectively.
The following math holds for the statistics:
* total = recovered + ignored + failed + delayed

Acked-by: David Rientjes
Signed-off-by: Jiaqi Yan
Acked-by: Naoya Horiguchi
---
 drivers/base/node.c    |  3 +++
 include/linux/mm.h     |  5 +++++
 include/linux/mmzone.h | 28 ++++++++++++++++++++++++++++
 mm/memory-failure.c    | 35 +++++++++++++++++++++++++++++++++++
 4 files changed, 71 insertions(+)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index faf3597a96da..b46db17124f3 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -586,6 +586,9 @@ static const struct attribute_group *node_dev_groups[] = {
 	&node_dev_group,
 #ifdef CONFIG_HAVE_ARCH_NODE_DEV_GROUP
 	&arch_node_dev_group,
+#endif
+#ifdef CONFIG_MEMORY_FAILURE
+	&memory_failure_attr_group,
 #endif
 	NULL
 };
diff --git a/include/linux/mm.h b/include/linux/mm.h
index f3f196e4d66d..888576884eb9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3521,6 +3521,11 @@ enum mf_action_page_type {
 	MF_MSG_UNKNOWN,
 };
 
+/*
+ * Sysfs entries for memory failure handling statistics.
+ */
+extern const struct attribute_group memory_failure_attr_group;
+
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
 extern void clear_huge_page(struct page *page,
 			    unsigned long addr_hint,
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index cd28a100d9e4..2c537b31fa7b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1110,6 +1110,31 @@ struct deferred_split {
 };
 #endif
 
+#ifdef CONFIG_MEMORY_FAILURE
+/*
+ * Per NUMA node memory failure handling statistics.
+ */
+struct memory_failure_stats {
+	/*
+	 * Number of raw pages poisoned.
+	 * Cases not accounted: memory outside kernel control, offline page,
+	 * arch-specific memory_failure (SGX), hwpoison_filter() filtered
+	 * error events, and unpoison actions from hwpoison_unpoison.
+	 */
+	unsigned long total;
+	/*
+	 * Recovery results of poisoned raw pages handled by memory_failure,
+	 * in sync with mf_result.
+	 * total = ignored + failed + delayed + recovered.
+	 * total * PAGE_SIZE * #nodes = /proc/meminfo/HardwareCorrupted.
+	 */
+	unsigned long ignored;
+	unsigned long failed;
+	unsigned long delayed;
+	unsigned long recovered;
+};
+#endif
+
 /*
  * On NUMA machines, each NUMA node would have a pg_data_t to describe
  * it's memory layout. On UMA machines there is a single pglist_data which
@@ -1253,6 +1278,9 @@ typedef struct pglist_data {
 #ifdef CONFIG_NUMA
 	struct memory_tier __rcu *memtier;
 #endif
+#ifdef CONFIG_MEMORY_FAILURE
+	struct memory_failure_stats mf_stats;
+#endif
 } pg_data_t;
 
 #define node_present_pages(nid)	(NODE_DATA(nid)->node_present_pages)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index c77a9e37e27e..c628f1db3a4d 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -87,6 +87,41 @@ inline void num_poisoned_pages_sub(unsigned long pfn, long i)
 		memblk_nr_poison_sub(pfn, i);
 }
 
+/**
+ * MF_ATTR_RO - Create sysfs entry for each memory failure statistics.
+ * @_name: name of the file in the per NUMA sysfs directory.
+ */
+#define MF_ATTR_RO(_name)					\
+static ssize_t _name##_show(struct device *dev,			\
+			    struct device_attribute *attr,	\
+			    char *buf)				\
+{								\
+	struct memory_failure_stats *mf_stats =			\
+		&NODE_DATA(dev->id)->mf_stats;			\
+	return sprintf(buf, "%lu\n", mf_stats->_name);		\
+}								\
+static DEVICE_ATTR_RO(_name)
+
+MF_ATTR_RO(total);
+MF_ATTR_RO(ignored);
+MF_ATTR_RO(failed);
+MF_ATTR_RO(delayed);
+MF_ATTR_RO(recovered);
+
+static struct attribute *memory_failure_attr[] = {
+	&dev_attr_total.attr,
+	&dev_attr_ignored.attr,
+	&dev_attr_failed.attr,
+	&dev_attr_delayed.attr,
+	&dev_attr_recovered.attr,
+	NULL,
+};
+
+const struct attribute_group memory_failure_attr_group = {
+	.name = "memory_failure",
+	.attrs = memory_failure_attr,
+};
+
 /*
  * Return values:
  *   1:   the page is dissolved (if needed) and taken off from buddy,
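
Not part of the patch: a minimal userspace sketch of how the five
per-node counters described in the commit message might be read and
checked against the stated relation total = recovered + ignored +
failed + delayed. The struct mf_stats type, the mf_* helper names, the
MF_STATS_DEMO_MAIN guard and the "base" parameter are hypothetical and
exist only for illustration; only the file names under
node${X}/memory_failure/ come from the patch. Each counter is read from
its own file, so the five values are not sampled atomically and the
relation may not hold for a snapshot taken while an error is being
handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace mirror of the five counters exposed in sysfs. */
struct mf_stats {
	unsigned long total;
	unsigned long recovered;
	unsigned long ignored;
	unsigned long failed;
	unsigned long delayed;
};

/* Read one counter file; returns 0 on success, -1 on any error. */
static int mf_read_counter(const char *base, int nid, const char *name,
			   unsigned long *val)
{
	char path[256];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/node%d/memory_failure/%s",
		     base, nid, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fscanf(f, "%lu", val);
	fclose(f);
	return n == 1 ? 0 : -1;
}

/*
 * Read all five counters for node @nid. @base would normally be
 * "/sys/devices/system/node"; it is a parameter (an assumption of this
 * sketch) so the code can be pointed at a test directory.
 */
static int mf_read_node_stats(const char *base, int nid, struct mf_stats *out)
{
	assert(base && out);

	if (mf_read_counter(base, nid, "total", &out->total) ||
	    mf_read_counter(base, nid, "recovered", &out->recovered) ||
	    mf_read_counter(base, nid, "ignored", &out->ignored) ||
	    mf_read_counter(base, nid, "failed", &out->failed) ||
	    mf_read_counter(base, nid, "delayed", &out->delayed))
		return -1;
	return 0;
}

/* Check the relation stated in the commit message for one snapshot. */
static bool mf_stats_consistent(const struct mf_stats *s)
{
	return s->total ==
	       s->recovered + s->ignored + s->failed + s->delayed;
}

#ifdef MF_STATS_DEMO_MAIN
int main(void)
{
	struct mf_stats s;

	if (mf_read_node_stats("/sys/devices/system/node", 0, &s)) {
		fprintf(stderr, "node0: memory_failure stats unavailable\n");
		return 1;
	}
	printf("node0: total=%lu recovered=%lu ignored=%lu failed=%lu delayed=%lu (%s)\n",
	       s.total, s.recovered, s.ignored, s.failed, s.delayed,
	       mf_stats_consistent(&s) ? "consistent" : "inconsistent");
	return 0;
}
#endif
```

Building with -DMF_STATS_DEMO_MAIN gives a small command-line reader
for node0. On a kernel without CONFIG_MEMORY_FAILURE the files are
absent and the read helper simply reports failure.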