From patchwork Wed Oct 16 10:09:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13838103
From: Baolin Wang
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, baolin.wang@linux.alibaba.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 0/2] Improve the tmpfs large folio read performance
Date: Wed, 16 Oct 2024 18:09:28 +0800
X-Mailer: git-send-email 2.39.3
Sender: owner-linux-mm@kvack.org
Precedence: bulk

tmpfs already supports PMD-sized large folios, but the tmpfs read
operation still copies data at PAGE_SIZE granularity, which is
suboptimal. This series changes the read path to copy data at folio
granularity, which can improve read performance.

Using 'fio bs=64k' to read a 1G tmpfs file populated with 2M THPs, I see
about a 20% performance improvement, and no regression with bs=4k.

I also ran functional tests with the xfstests suite and found no
regressions with the following xfstests config:

FSTYP=tmpfs
export TEST_DIR=/mnt/tempfs_mnt
export TEST_DEV=/mnt/tempfs_mnt
export SCRATCH_MNT=/mnt/scratchdir
export SCRATCH_DEV=/mnt/scratchdir

Baolin Wang (2):
  mm: shmem: update iocb->ki_pos directly to simplify tmpfs read logic
  mm: shmem: improve the tmpfs large folio read performance

 mm/shmem.c | 54 ++++++++++++++++++++++--------------------------------
 1 file changed, 22 insertions(+), 32 deletions(-)
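
To illustrate the granularity change described in the cover letter, here is a minimal userspace sketch. It is not the mm/shmem.c code. It only counts how many copy steps a read needs when each step may not cross a boundary of a given unit size. The names SKETCH_PAGE_SIZE, SKETCH_PMD_SIZE and count_copy_steps() are hypothetical, and the 4K page and 2M folio sizes are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for illustration only: 4K base pages, 2M PMD-sized folios. */
#define SKETCH_PAGE_SIZE (4096UL)
#define SKETCH_PMD_SIZE  (2UL * 1024 * 1024)

/*
 * Hypothetical helper: count the copy steps needed to read `len` bytes
 * starting at file offset `pos`, when a single step may not cross a
 * `unit`-aligned boundary. `unit` must be a non-zero power of two.
 */
static unsigned long count_copy_steps(unsigned long pos, unsigned long len,
                                      unsigned long unit)
{
    unsigned long steps = 0;

    assert(unit != 0 && (unit & (unit - 1)) == 0);

    while (len > 0) {
        /* Bytes left before the next unit boundary. */
        unsigned long offset = pos & (unit - 1);
        unsigned long chunk = unit - offset;

        /* Never copy more than the caller asked for. */
        if (chunk > len)
            chunk = len;

        pos += chunk;
        len -= chunk;
        steps++;
    }

    return steps;
}
```

With these assumptions, a 64K read at offset 0 takes 16 steps when the unit is SKETCH_PAGE_SIZE and a single step when the unit is SKETCH_PMD_SIZE, while a 4K read takes one step either way. The sketch only counts iterations; it makes no claim about where the measured improvement comes from.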