From patchwork Fri Sep 23 14:12:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12986614
Date: Fri, 23 Sep 2022 23:12:04 +0900
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, Miaohe Lin, David Hildenbrand, Mike Kravetz, Yang Shi,
 Oscar Salvador, Muchun Song, Jane Chu, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [PATCH v5 4/4] mm/hwpoison: introduce per-memory_block hwpoison counter
Message-ID: <20220923141204.GA1484969@ik1-406-35019.vs.sakura.ne.jp>
References: <20220921091359.25889-1-naoya.horiguchi@linux.dev>
 <20220921091359.25889-5-naoya.horiguchi@linux.dev>
 <20220923082613.GB1357512@ik1-406-35019.vs.sakura.ne.jp>
Content-Disposition: inline
In-Reply-To: <20220923082613.GB1357512@ik1-406-35019.vs.sakura.ne.jp>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

There seems to be another build error on aarch64 with MEMORY_HOTPLUG disabled
(https://lore.kernel.org/lkml/20220923110144.GA1413812@ik1-406-35019.vs.sakura.ne.jp/),
so let me revise this patch again to handle it.

- Naoya Horiguchi

Reviewed-by: Miaohe Lin
---
From: Naoya Horiguchi
Date: Fri, 23 Sep 2022 22:51:20 +0900
Subject: [PATCH v5 4/4] mm/hwpoison: introduce per-memory_block hwpoison counter

Currently, the PageHWPoison flag does not behave well across memory
hotremove/hotplug. Any data field in struct page is unreliable when the
associated memory is offlined, and the current mechanism can't tell
whether a memory section is onlined because a new memory device is
installed or because previous failed offline operations are undone.
Especially if there is hwpoisoned memory, it's unclear what the best
option is.

So introduce a new mechanism to make struct memory_block remember that
a memory block has hwpoisoned memory inside it, and make any online
event fail if the onlined memory block contains hwpoison.

struct memory_block is freed and reallocated over ACPI-based
hotremove/hotplug, but not over sysfs-based hotremove/hotplug. So it's
desirable to implement the hwpoison counter in this struct.

Note that clear_hwpoisoned_pages() is relocated to be called earlier
than now, just before unregistering struct memory_block.
Otherwise, the per-memory_block hwpoison counter is freed and we fail to
adjust the global hwpoison counter properly.

Signed-off-by: Naoya Horiguchi
Reported-by: kernel test robot
---
ChangeLog v4 -> v5:
- add Reported-by of the lkp bot,
- check both CONFIG_MEMORY_FAILURE and CONFIG_MEMORY_HOTPLUG in the
  introduced #ifdefs, intending to fix "undefined reference" errors on
  aarch64.

ChangeLog v3 -> v4:
- fix build error (https://lore.kernel.org/linux-mm/202209231134.tnhKHRfg-lkp@intel.com/)
  by using memblk_nr_poison() to access the member ->nr_hwpoison
---
 drivers/base/memory.c  | 34 ++++++++++++++++++++++++++++++++++
 include/linux/memory.h |  3 +++
 include/linux/mm.h     | 24 ++++++++++++++++++++++++
 mm/internal.h          |  8 --------
 mm/memory-failure.c    | 31 ++++++++++---------------------
 mm/sparse.c            |  2 --
 6 files changed, 71 insertions(+), 31 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 9aa0da991cfb..99e0e789616c 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -183,6 +183,9 @@ static int memory_block_online(struct memory_block *mem)
 	struct zone *zone;
 	int ret;
 
+	if (memblk_nr_poison(start_pfn))
+		return -EHWPOISON;
+
 	zone = zone_for_pfn_range(mem->online_type, mem->nid, mem->group,
 				  start_pfn, nr_pages);
 
@@ -864,6 +867,7 @@ void remove_memory_block_devices(unsigned long start, unsigned long size)
 		mem = find_memory_block_by_id(block_id);
 		if (WARN_ON_ONCE(!mem))
 			continue;
+		clear_hwpoisoned_pages(memblk_nr_poison(start));
 		unregister_memory_block_under_nodes(mem);
 		remove_memory_block(mem);
 	}
@@ -1164,3 +1168,33 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
 	}
 	return ret;
 }
+
+#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_MEMORY_HOTPLUG)
+void memblk_nr_poison_inc(unsigned long pfn)
+{
+	const unsigned long block_id = pfn_to_block_id(pfn);
+	struct memory_block *mem = find_memory_block_by_id(block_id);
+
+	if (mem)
+		atomic_long_inc(&mem->nr_hwpoison);
+}
+
+void memblk_nr_poison_sub(unsigned long pfn, long i)
+{
+	const unsigned long block_id = pfn_to_block_id(pfn);
+	struct memory_block *mem = find_memory_block_by_id(block_id);
+
+	if (mem)
+		atomic_long_sub(i, &mem->nr_hwpoison);
+}
+
+unsigned long memblk_nr_poison(unsigned long pfn)
+{
+	const unsigned long block_id = pfn_to_block_id(pfn);
+	struct memory_block *mem = find_memory_block_by_id(block_id);
+
+	if (mem)
+		return atomic_long_read(&mem->nr_hwpoison);
+	return 0;
+}
+#endif
diff --git a/include/linux/memory.h b/include/linux/memory.h
index aa619464a1df..ad8cd9bb3239 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -85,6 +85,9 @@ struct memory_block {
 	unsigned long nr_vmemmap_pages;
 	struct memory_group *group;	/* group (if any) for this block */
 	struct list_head group_next;	/* next block inside memory group */
+#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_MEMORY_HOTPLUG)
+	atomic_long_t nr_hwpoison;
+#endif
 };
 
 int arch_get_memory_phys_device(unsigned long start_pfn);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2bb5d1596041..936864d6f8be 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3280,6 +3280,7 @@ extern int soft_offline_page(unsigned long pfn, int flags);
 #ifdef CONFIG_MEMORY_FAILURE
 extern int __get_huge_page_for_hwpoison(unsigned long pfn, int flags);
 extern void num_poisoned_pages_inc(unsigned long pfn);
+extern void clear_hwpoisoned_pages(long nr_poison);
 #else
 static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
 {
@@ -3289,6 +3290,29 @@ static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
 static inline void num_poisoned_pages_inc(unsigned long pfn)
 {
 }
+
+static inline void clear_hwpoisoned_pages(long nr_poison)
+{
+}
+#endif
+
+#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_MEMORY_HOTPLUG)
+extern void memblk_nr_poison_inc(unsigned long pfn);
+extern void memblk_nr_poison_sub(unsigned long pfn, long i);
+extern unsigned long memblk_nr_poison(unsigned long pfn);
+#else
+static inline void memblk_nr_poison_inc(unsigned long pfn)
+{
+}
+
+static inline void memblk_nr_poison_sub(unsigned long pfn, long i)
+{
+}
+
+static inline unsigned long memblk_nr_poison(unsigned long pfn)
+{
+	return 0;
+}
 #endif
 
 #ifndef arch_memory_failure
diff --git a/mm/internal.h b/mm/internal.h
index b3002e03c28f..42ba8b96cab5 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -708,14 +708,6 @@ extern u64 hwpoison_filter_flags_value;
 extern u64 hwpoison_filter_memcg;
 extern u32 hwpoison_filter_enable;
 
-#ifdef CONFIG_MEMORY_FAILURE
-void clear_hwpoisoned_pages(struct page *memmap, int nr_pages);
-#else
-static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
-{
-}
-#endif
-
 extern unsigned long __must_check vm_mmap_pgoff(struct file *, unsigned long,
         unsigned long, unsigned long,
         unsigned long, unsigned long);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a069d43bc87f..03479895086d 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -74,14 +74,17 @@ atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
 static bool hw_memory_failure __read_mostly = false;
 
-static inline void num_poisoned_pages_inc(unsigned long pfn)
+void num_poisoned_pages_inc(unsigned long pfn)
 {
 	atomic_long_inc(&num_poisoned_pages);
+	memblk_nr_poison_inc(pfn);
 }
 
 static inline void num_poisoned_pages_sub(unsigned long pfn, long i)
 {
 	atomic_long_sub(i, &num_poisoned_pages);
+	if (pfn != -1UL)
+		memblk_nr_poison_sub(pfn, i);
 }
 
 /*
@@ -2414,6 +2417,10 @@ int unpoison_memory(unsigned long pfn)
 unlock_mutex:
 	mutex_unlock(&mf_mutex);
 	if (!ret || freeit) {
+		/*
+		 * TODO: per-memory_block counter might break when the page
+		 * size to be unpoisoned is larger than a memory_block.
+		 */
 		num_poisoned_pages_sub(pfn, count);
 		unpoison_pr_info("Unpoison: Software-unpoisoned page %#lx\n",
 				 page_to_pfn(p), &unpoison_rs);
@@ -2618,25 +2625,7 @@ int soft_offline_page(unsigned long pfn, int flags)
 	return ret;
 }
 
-void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
+void clear_hwpoisoned_pages(long nr_poison)
 {
-	int i, total = 0;
-
-	/*
-	 * A further optimization is to have per section refcounted
-	 * num_poisoned_pages. But that would need more space per memmap, so
-	 * for now just do a quick global check to speed up this routine in the
-	 * absence of bad pages.
-	 */
-	if (atomic_long_read(&num_poisoned_pages) == 0)
-		return;
-
-	for (i = 0; i < nr_pages; i++) {
-		if (PageHWPoison(&memmap[i])) {
-			total++;
-			ClearPageHWPoison(&memmap[i]);
-		}
-	}
-	if (total)
-		num_poisoned_pages_sub(total);
+	num_poisoned_pages_sub(-1UL, nr_poison);
 }
diff --git a/mm/sparse.c b/mm/sparse.c
index e5a8a3a0edd7..2779b419ef2a 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -926,8 +926,6 @@ void sparse_remove_section(struct mem_section *ms, unsigned long pfn,
 		unsigned long nr_pages, unsigned long map_offset,
 		struct vmem_altmap *altmap)
 {
-	clear_hwpoisoned_pages(pfn_to_page(pfn) + map_offset,
-			nr_pages - map_offset);
 	section_deactivate(pfn, nr_pages, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
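
For readers who want to follow the counter bookkeeping without a kernel
tree at hand, the following is a rough userspace model of the idea in the
commit message. It is only a sketch under assumptions, not the kernel code:
find_block(), poison_page(), poisoned_pages_sub(), block_online(),
block_remove(), NR_BLOCKS, PAGES_PER_BLOCK and NO_PFN are all hypothetical
names made up for illustration, and C11 atomics stand in for atomic_long_t.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
#define NR_BLOCKS	4UL
#define PAGES_PER_BLOCK	32768UL		/* assumed block size in pages */
#define NO_PFN		(~0UL)		/* plays the role of -1UL in the patch */
#ifndef EHWPOISON
#define EHWPOISON	133		/* fallback for non-Linux libcs */
#endif

struct memory_block {
	int present;			/* 0 once the block has been removed */
	atomic_long nr_hwpoison;	/* poisoned pages inside this block */
};

static struct memory_block blocks[NR_BLOCKS] = {
	{ .present = 1 }, { .present = 1 }, { .present = 1 }, { .present = 1 },
};
static atomic_long num_poisoned_pages;	/* global counter */

/* Map a pfn to its block, or NULL if the block does not exist. */
static struct memory_block *find_block(unsigned long pfn)
{
	unsigned long id = pfn / PAGES_PER_BLOCK;

	if (id >= NR_BLOCKS || !blocks[id].present)
		return NULL;
	return &blocks[id];
}

/* Record one poisoned page in both the global and the per-block counter. */
static void poison_page(unsigned long pfn)
{
	struct memory_block *mem = find_block(pfn);

	atomic_fetch_add(&num_poisoned_pages, 1);
	if (mem)
		atomic_fetch_add(&mem->nr_hwpoison, 1);
}

/* Drop i pages from the global counter; NO_PFN skips the per-block part. */
static void poisoned_pages_sub(unsigned long pfn, long i)
{
	struct memory_block *mem;

	assert(i >= 0);
	atomic_fetch_sub(&num_poisoned_pages, i);
	if (pfn == NO_PFN)
		return;
	mem = find_block(pfn);
	if (mem)
		atomic_fetch_sub(&mem->nr_hwpoison, i);
}

/* Onlining is refused while the block still remembers poisoned pages. */
static int block_online(unsigned long id)
{
	if (atomic_load(&blocks[id].nr_hwpoison))
		return -EHWPOISON;
	return 0;
}

/*
 * Removal settles the global counter before the block (and its counter)
 * goes away, which is the ordering the commit message calls out.
 */
static void block_remove(unsigned long id)
{
	poisoned_pages_sub(NO_PFN, atomic_load(&blocks[id].nr_hwpoison));
	atomic_store(&blocks[id].nr_hwpoison, 0);
	blocks[id].present = 0;
}
```

In this model, poisoning a page in block 0 makes block_online(0) return
-EHWPOISON, and block_remove(0) subtracts that block's share from the
global counter before the per-block counter is lost. If the two steps in
block_remove() were swapped, the global counter would stay too high, which
is the failure the relocation of clear_hwpoisoned_pages() is meant to avoid.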