From patchwork Sat Oct 26 13:51:52 2024
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13852242
From: Baolin Wang
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com,
 shy828301@gmail.com, dhowells@redhat.com, baolin.wang@linux.alibaba.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2] mm: shmem: fallback to page size splice if large folio
 has poisoned pages
Date: Sat, 26 Oct 2024 21:51:52 +0800
X-Mailer: git-send-email 2.39.3

tmpfs already supports PMD-sized large folios, and splice() cannot read
any pages if the large folio has a poisoned page, which is not good, as
Matthew pointed out in a previous email [1]:

"
so if we have hwpoison set on one page in a folio, we now can't read
bytes from any page in the folio?
That seems like we've made a bad situation worse.
"

Thus adding a fallback to the PAGE_SIZE splice() still allows reading
normal pages if the large folio has hwpoisoned pages.

[1] https://lore.kernel.org/all/Zw_d0EVAJkpNJEbA@casper.infradead.org/

Signed-off-by: Baolin Wang
---
Changes from v1:
 - Use 'pages' instead of 'subpages' in the commit message, per Matthew.
 - Include the relevant information from previous discussion, per Andrew.
---
 mm/shmem.c | 39 +++++++++++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 1bef6e32a1fa..44282a296c33 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3291,11 +3291,16 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 	len = min_t(size_t, len, npages * PAGE_SIZE);
 
 	do {
+		bool fallback_page_splice = false;
+		struct page *page = NULL;
+		pgoff_t index;
+		size_t size;
+
 		if (*ppos >= i_size_read(inode))
 			break;
 
-		error = shmem_get_folio(inode, *ppos / PAGE_SIZE, 0, &folio,
-					SGP_READ);
+		index = *ppos >> PAGE_SHIFT;
+		error = shmem_get_folio(inode, index, 0, &folio, SGP_READ);
 		if (error) {
 			if (error == -EINVAL)
 				error = 0;
@@ -3304,12 +3309,15 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 		if (folio) {
 			folio_unlock(folio);
 
-			if (folio_test_hwpoison(folio) ||
-			    (folio_test_large(folio) &&
-			     folio_test_has_hwpoisoned(folio))) {
+			page = folio_file_page(folio, index);
+			if (PageHWPoison(page)) {
 				error = -EIO;
 				break;
 			}
+
+			if (folio_test_large(folio) &&
+			    folio_test_has_hwpoisoned(folio))
+				fallback_page_splice = true;
 		}
 
 		/*
@@ -3323,7 +3331,18 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 		isize = i_size_read(inode);
 		if (unlikely(*ppos >= isize))
 			break;
-		part = min_t(loff_t, isize - *ppos, len);
+		/*
+		 * Fallback to PAGE_SIZE splice if the large folio has hwpoisoned
+		 * pages.
+		 */
+		if (likely(!fallback_page_splice)) {
+			size = len;
+		} else {
+			size_t offset = *ppos & ~PAGE_MASK;
+
+			size = min_t(loff_t, PAGE_SIZE - offset, len);
+		}
+		part = min_t(loff_t, isize - *ppos, size);
 
 		if (folio) {
 			/*
@@ -3331,8 +3350,12 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 			 * virtual addresses, take care about potential aliasing
 			 * before reading the page on the kernel side.
 			 */
-			if (mapping_writably_mapped(mapping))
-				flush_dcache_folio(folio);
+			if (mapping_writably_mapped(mapping)) {
+				if (likely(!fallback_page_splice))
+					flush_dcache_folio(folio);
+				else
+					flush_dcache_page(page);
+			}
 			folio_mark_accessed(folio);
 			/*
 			 * Ok, we have the page, and it's up-to-date, so we can