From: Baolin Wang
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, shy828301@gmail.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 0/2] Improve the tmpfs large folio read performance
Date: Fri, 18 Oct 2024 11:00:26 +0800
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13841043

tmpfs already supports PMD-sized large folios, but the tmpfs read
operation still copies data at PAGE_SIZE granularity, which is
suboptimal. This series changes the read path to copy data at folio
granularity, which improves read performance.

Using 'fio bs=64k' to read a 1G tmpfs file populated with 2M THPs, I see
about a 20% performance improvement, with no regression at bs=4k.
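To illustrate why the copy granularity matters, here is a minimal userspace sketch of the per-iteration length calculation. It is a model under stated assumptions, not the code from mm/shmem.c: the names model_chunk_len() and model_count_copies() and the MODEL_* constants are hypothetical, and a 4K base page with a 2M PMD-sized folio is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for this model only: 4K base page, 2M PMD-sized folio. */
#define MODEL_PAGE_SIZE 4096u
#define MODEL_PMD_SIZE  (2u * 1024u * 1024u)

/*
 * Bytes that one copy iteration may move when the copy unit is `unit`
 * bytes (a power of two): the copy stops at the end of the current unit
 * or at the end of the read, whichever comes first.
 */
size_t model_chunk_len(uint64_t pos, uint64_t end, size_t unit)
{
    assert(unit != 0 && (unit & (unit - 1)) == 0);
    assert(pos < end);

    size_t offset = (size_t)(pos & (unit - 1)); /* offset inside the unit */
    size_t room = unit - offset;                /* bytes left in the unit */
    uint64_t remaining = end - pos;             /* bytes left in the read */

    return remaining < room ? (size_t)remaining : room;
}

/* Number of copy iterations needed to read the byte range [pos, end). */
unsigned long model_count_copies(uint64_t pos, uint64_t end, size_t unit)
{
    unsigned long n = 0;

    while (pos < end) {
        pos += model_chunk_len(pos, end, unit);
        n++;
    }
    return n;
}
```

In this model, a 64K read that lies inside one 2M folio takes 16 iterations with a page-sized unit and a single iteration with a folio-sized unit. The sketch only counts iterations; it makes no claim about where the measured improvement comes from.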

I also ran functional tests with the xfstests suite and found no
regressions with the following xfstests config:

  FSTYP=tmpfs
  export TEST_DIR=/mnt/tempfs_mnt
  export TEST_DEV=/mnt/tempfs_mnt
  export SCRATCH_MNT=/mnt/scratchdir
  export SCRATCH_DEV=/mnt/scratchdir

Changes from v1:
 - Move index calculation to the appropriate place, per Kefeng.
 - Fall back to page copy if a large folio has poisoned subpages, as
   suggested by Matthew and Yang.

Baolin Wang (2):
  mm: shmem: update iocb->ki_pos directly to simplify tmpfs read logic
  mm: shmem: improve the tmpfs large folio read performance

 mm/shmem.c | 65 +++++++++++++++++++++++++++---------------------------
 1 file changed, 33 insertions(+), 32 deletions(-)
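
The fallback listed under "Changes from v1" can be sketched in userspace as a choice of copy unit. This is an illustrative model only, not the kernel implementation: struct model_folio, its per-subpage poisoned flags, and both helper functions are hypothetical names introduced here, and a 4K base page is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed base page size for this model only. */
#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for a folio: a page count plus one flag per subpage. */
struct model_folio {
    size_t nr_pages;
    const bool *poisoned; /* poisoned[i] is true if subpage i is poisoned */
};

/* Return true if any subpage of the folio is marked poisoned. */
bool model_has_poisoned_subpage(const struct model_folio *f)
{
    assert(f != NULL && f->poisoned != NULL);

    for (size_t i = 0; i < f->nr_pages; i++) {
        if (f->poisoned[i])
            return true;
    }
    return false;
}

/*
 * Pick the copy unit in bytes: the whole folio when it is clean, or a
 * single page when a large folio contains a poisoned subpage, so that
 * healthy subpages can still be copied one page at a time.
 */
size_t model_copy_unit(const struct model_folio *f)
{
    if (f->nr_pages > 1 && model_has_poisoned_subpage(f))
        return MODEL_PAGE_SIZE;

    return f->nr_pages * MODEL_PAGE_SIZE;
}
```

The model does not cover how a read that lands on the poisoned subpage itself is handled; it only shows the choice between the two granularities.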