From patchwork Fri Oct 18 03:00:28 2024
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13841044
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com,
	shy828301@gmail.com, baolin.wang@linux.alibaba.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/2] mm: shmem: improve the tmpfs large folio read performance
Date: Fri, 18 Oct 2024 11:00:28 +0800
Message-Id: 
<2129a21a5b9f77d3bb7ddec152c009ce7c5653c4.1729218573.git.baolin.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.39.3
In-Reply-To: 
References: 
MIME-Version: 1.0

tmpfs already supports PMD-sized large folios, but the tmpfs read
operation still performs copying at PAGE_SIZE granularity, which is
suboptimal.
Change to copy data at folio granularity, which can improve the read
performance, and switch to the folio-based helper functions. Moreover,
if a large folio has a hwpoisoned subpage, fall back to page-granularity
copying.

Using 'fio bs=64k' to read a 1G tmpfs file populated with 2M THPs, I
see about a 20% performance improvement, and no regression with bs=4k.

Before the patch:
READ: bw=10.0GiB/s

After the patch:
READ: bw=12.0GiB/s

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 93642aa8d1aa..cbefd9801f6b 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3107,13 +3107,13 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	int error = 0;
 	ssize_t retval = 0;
 
-	offset = iocb->ki_pos & ~PAGE_MASK;
-
 	for (;;) {
 		struct folio *folio = NULL;
 		struct page *page = NULL;
 		unsigned long nr, ret;
 		loff_t end_offset, i_size = i_size_read(inode);
+		bool fallback_page_copy = false;
+		size_t fsize;
 
 		if (unlikely(iocb->ki_pos >= i_size))
 			break;
@@ -3134,6 +3134,10 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 				error = -EIO;
 				break;
 			}
+
+			if (folio_test_large(folio) &&
+			    folio_test_has_hwpoisoned(folio))
+				fallback_page_copy = true;
 		}
 
 		/*
@@ -3147,7 +3151,12 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			break;
 		}
 		end_offset = min_t(loff_t, i_size, iocb->ki_pos + to->count);
-		nr = min_t(loff_t, end_offset - iocb->ki_pos, PAGE_SIZE - offset);
+		if (folio && likely(!fallback_page_copy))
+			fsize = folio_size(folio);
+		else
+			fsize = PAGE_SIZE;
+		offset = iocb->ki_pos & (fsize - 1);
+		nr = min_t(loff_t, end_offset - iocb->ki_pos, fsize - offset);
 
 		if (folio) {
 			/*
@@ -3155,10 +3164,15 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			 * virtual addresses, take care about potential aliasing
 			 * before reading the page on the kernel side.
 			 */
-			if (mapping_writably_mapped(mapping))
-				flush_dcache_page(page);
+			if (mapping_writably_mapped(mapping)) {
+				if (likely(!fallback_page_copy))
+					flush_dcache_folio(folio);
+				else
+					flush_dcache_page(page);
+			}
+
 			/*
-			 * Mark the page accessed if we read the beginning.
+			 * Mark the folio accessed if we read the beginning.
 			 */
 			if (!offset)
 				folio_mark_accessed(folio);
@@ -3166,9 +3180,11 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			 * Ok, we have the page, and it's up-to-date, so
 			 * now we can copy it to user space...
 			 */
-			ret = copy_page_to_iter(page, offset, nr, to);
+			if (likely(!fallback_page_copy))
+				ret = copy_folio_to_iter(folio, offset, nr, to);
+			else
+				ret = copy_page_to_iter(page, offset, nr, to);
 			folio_put(folio);
-
 		} else if (user_backed_iter(to)) {
 			/*
 			 * Copy to user tends to be so well optimized, but
@@ -3186,8 +3202,6 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		}
 		retval += ret;
-		offset += ret;
-		offset &= ~PAGE_MASK;
 		iocb->ki_pos += ret;
 
 		if (!iov_iter_count(to))