From patchwork Mon Feb 6 14:06:35 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13129994
From: Yin Fengwei
To: willy@infradead.org, david@redhat.com, linux-mm@kvack.org
Cc: dave.hansen@intel.com, tim.c.chen@intel.com, ying.huang@intel.com,
 fengwei.yin@intel.com
Subject: [RFC PATCH v4 0/4] folio based filemap_map_pages()
Date: Mon, 6 Feb 2023 22:06:35 +0800
Message-Id: <20230206140639.538867-1-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.30.2
Sender: owner-linux-mm@kvack.org

Current
filemap_map_pages() uses page granularity even when the underlying folio
is a large folio. Making it work at folio granularity allows batched
refcount, rmap and mm counter updates, which brings a performance gain.

This series tries to bring batched refcount, rmap and mm counter updates
to filemap_map_pages(). Testing with a micro benchmark like
will-it-scale.pagefault on a 48C/96T IceLake box showed:
   - batched rmap brings around 15% performance gain
   - batched refcount brings around 2% performance gain

Patch 1 updates filemap_map_pages() to map at folio granularity and to
batch the refcount update.

Patches 2, 3 and 4 enable batched rmap and mm counter updates.

Change from v3:
  Patch 1:
    - Delete folio_more_pages() as no one calls it, as suggested by
      Matthew
    - Make xas.xa_index point to end_pgoff when the loop ends, as Kirill
      pointed out
    - Style fixes as pointed out by Matthew and Kirill

  Patch 2:
    - Typo fixes and better comments as Matthew and Kirill suggested

  Patch 3:
    - Warn on the cow case in do_set_pte_range() and add a comment that
      cow can't handle large folios yet

Change from v2:
  - Drop patch 1 because it misses ->page_mkwrite(), as Kirill pointed
    out

  Patch 2:
    - Change page_add_file_rmap_range() to folio_add_file_rmap_range()
      as Matthew suggested
    - Remove "sub" as Matthew suggested

  Patch 3:
    - Only handle the !cow case in do_set_pte_range() as David pointed
      out
    - Add a pte parameter and avoid changing vmf->pte in
      do_set_pte_range()
    - Drop do_set_pte_entry()

  Patch 4:
    - Adopt the suggestion from Matthew to avoid subtracting from and
      adding to vmf->pte. filemap_map_folio_range() doesn't update
      vmf->pte now, which makes it easy to fit into filemap_map_pages()
      and leaves no chance of pointing to the wrong page table.

Change from v1:
  - Update the struct page * parameter of *_range() to an index in the
    folio as Matthew suggested
  - Fix indentation problems as Matthew pointed out
  - Add back the function comment as Matthew pointed out
  - Use nr_pages rather than len as Matthew pointed out
  - Add do_set_pte_range() as Matthew suggested
  - Add function comment as Ying suggested
  - Add performance test results to the patch 1/2/5 commit messages

  Patch 1:
    - Adapt commit message as Matthew suggested
    - Add Reviewed-by from Matthew

  Patch 3:
    - Restore the general logic of page_add_file_rmap_range() to make
      patch review easier, as Matthew suggested

  Patch 5:
    - Add perf data collected to understand the reason for the
      performance gain

Yin Fengwei (4):
  filemap: add function filemap_map_folio_range()
  rmap: add folio_add_file_rmap_range()
  mm: add do_set_pte_range()
  filemap: batched update mm counter,rmap when map file folio

 include/linux/mm.h   |   3 ++
 include/linux/rmap.h |   2 +
 mm/filemap.c         | 112 ++++++++++++++++++++++++++-----------------
 mm/memory.c          |  66 +++++++++++++++++--------
 mm/rmap.c            |  66 +++++++++++++++++++------
 5 files changed, 171 insertions(+), 78 deletions(-)
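
For readers who want a feel for why batching helps, here is a minimal
user-space sketch. It is not code from this series: struct toy_folio,
struct toy_mm and the toy_*() helpers are hypothetical names made up for
illustration, and the model only covers the refcount and one mm counter
(rmap accounting is not modelled). It compares one atomic update per page
against one atomic update per mapped range.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-in for a large folio; NOT the kernel's struct folio. */
struct toy_folio {
	atomic_long refcount;	/* models the folio reference count */
	size_t nr_pages;	/* number of pages in the folio */
};

/* Toy stand-in for the per-mm file page counter. */
struct toy_mm {
	atomic_long file_pages;
	unsigned long atomic_ops;	/* atomic updates issued so far */
};

static void toy_init(struct toy_folio *f, struct toy_mm *mm, size_t nr_pages)
{
	atomic_init(&f->refcount, 1);	/* one reference held by the cache */
	f->nr_pages = nr_pages;
	atomic_init(&mm->file_pages, 0);
	mm->atomic_ops = 0;
}

/* Page granularity: two atomic updates for every page mapped. */
static void toy_map_per_page(struct toy_folio *f, struct toy_mm *mm,
			     size_t start, size_t nr)
{
	assert(start + nr <= f->nr_pages);
	for (size_t i = 0; i < nr; i++) {
		atomic_fetch_add(&f->refcount, 1);
		atomic_fetch_add(&mm->file_pages, 1);
		mm->atomic_ops += 2;
	}
}

/* Folio granularity: one atomic update per counter for the whole range. */
static void toy_map_batched(struct toy_folio *f, struct toy_mm *mm,
			    size_t start, size_t nr)
{
	assert(start + nr <= f->nr_pages);
	if (nr == 0)
		return;
	atomic_fetch_add(&f->refcount, (long)nr);
	atomic_fetch_add(&mm->file_pages, (long)nr);
	mm->atomic_ops += 2;
}
```

Both helpers leave the counters at the same final values. Only the number
of atomic read-modify-write operations differs, which is the cost the
batching is meant to reduce. The sketch says nothing about how large the
gain is in the kernel; the measured numbers are the ones quoted above.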