From patchwork Sun Jun 2 00:45:02 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13682574
From: Barry Song <21cnbao@gmail.com>
To: david@redhat.com, akpm@linux-foundation.org, linux-mm@kvack.org
Cc: chrisl@kernel.org, kasong@tencent.com, linux-kernel@vger.kernel.org,
 minchan@kernel.org, ryan.roberts@arm.com, surenb@google.com,
 v-songbaohua@oppo.com, willy@infradead.org
Subject: [PATCH v2] mm: swap: reuse exclusive folio directly instead of wp page faults
Date: Sun, 2 Jun 2024 12:45:02 +1200
Message-Id: <20240602004502.26895-1-21cnbao@gmail.com>
From: Barry Song <21cnbao@gmail.com>

After swapping out, we perform a swap-in operation. If we first read and
then write, we encounter a major fault in do_swap_page for reading, along
with additional minor faults in do_wp_page for writing.
However, the latter appears to be unnecessary and inefficient. Instead,
we can directly reuse the swapped-in folio in do_swap_page and completely
eliminate the need for do_wp_page.

This patch achieves that optimization specifically for exclusive folios.
The following microbenchmark demonstrates the significant reduction in
minor faults.

 #include <stdio.h>
 #include <string.h>
 #include <sys/mman.h>
 #include <sys/resource.h>

 #define DATA_SIZE (2UL * 1024 * 1024)
 #define PAGE_SIZE (4UL * 1024)

 static void read_write_data(char *addr)
 {
         char tmp;

         for (int i = 0; i < DATA_SIZE; i += PAGE_SIZE) {
                 tmp = *(volatile char *)(addr + i);
                 *(volatile char *)(addr + i) = tmp;
         }
 }

 int main(int argc, char **argv)
 {
         struct rusage ru;
         char *addr = mmap(NULL, DATA_SIZE, PROT_READ | PROT_WRITE,
                           MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

         memset(addr, 0x11, DATA_SIZE);
         do {
                 long old_ru_minflt, old_ru_majflt;
                 long new_ru_minflt, new_ru_majflt;

                 madvise(addr, DATA_SIZE, MADV_PAGEOUT);

                 getrusage(RUSAGE_SELF, &ru);
                 old_ru_minflt = ru.ru_minflt;
                 old_ru_majflt = ru.ru_majflt;

                 read_write_data(addr);

                 getrusage(RUSAGE_SELF, &ru);
                 new_ru_minflt = ru.ru_minflt;
                 new_ru_majflt = ru.ru_majflt;

                 printf("minor faults:%ld major faults:%ld\n",
                         new_ru_minflt - old_ru_minflt,
                         new_ru_majflt - old_ru_majflt);
         } while(0);

         return 0;
 }

w/o patch,
/ # ~/a.out
minor faults:512 major faults:512

w/ patch,
/ # ~/a.out
minor faults:0 major faults:512

Minor faults decrease to 0!

Signed-off-by: Barry Song
Acked-by: David Hildenbrand
---
-v2:
 * don't set the dirty flag for read fault, per David;
 * make write-protect of uffd_wp clear and remove confusion (it used to
   be "wrprotected->writable->wrprotected"), per David;
Thank you for reviewing, David.
-v1: https://lore.kernel.org/linux-mm/20240531104819.140218-1-21cnbao@gmail.com/

 mm/memory.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index eef4e482c0c2..9696c7397b85 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4316,6 +4316,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
 	add_mm_counter(vma->vm_mm, MM_SWAPENTS, -nr_pages);
 	pte = mk_pte(page, vma->vm_page_prot);
+	if (pte_swp_soft_dirty(vmf->orig_pte))
+		pte = pte_mksoft_dirty(pte);
+	if (pte_swp_uffd_wp(vmf->orig_pte))
+		pte = pte_mkuffd_wp(pte);
 
 	/*
 	 * Same logic as in do_wp_page(); however, optimize for pages that are
@@ -4325,18 +4329,18 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	 */
 	if (!folio_test_ksm(folio) &&
 	    (exclusive || folio_ref_count(folio) == 1)) {
-		if (vmf->flags & FAULT_FLAG_WRITE) {
-			pte = maybe_mkwrite(pte_mkdirty(pte), vma);
-			vmf->flags &= ~FAULT_FLAG_WRITE;
+		if ((vma->vm_flags & VM_WRITE) && !userfaultfd_pte_wp(vma, pte) &&
+		    !vma_soft_dirty_enabled(vma)) {
+			pte = pte_mkwrite(pte, vma);
+			if (vmf->flags & FAULT_FLAG_WRITE) {
+				pte = pte_mkdirty(pte);
+				vmf->flags &= ~FAULT_FLAG_WRITE;
+			}
 		}
 		rmap_flags |= RMAP_EXCLUSIVE;
 	}
 	folio_ref_add(folio, nr_pages - 1);
 	flush_icache_pages(vma, page, nr_pages);
-	if (pte_swp_soft_dirty(vmf->orig_pte))
-		pte = pte_mksoft_dirty(pte);
-	if (pte_swp_uffd_wp(vmf->orig_pte))
-		pte = pte_mkuffd_wp(pte);
 	vmf->orig_pte = pte_advance_pfn(pte, page_idx);
 
 	/* ksm created a completely new copy */