From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Matthew Wilcox,
    Ryan Roberts, Russell King, Catalin Marinas, Will Deacon, Dinh Nguyen,
    Michael Ellerman, Nicholas Piggin, Christophe Leroy,
    "Aneesh Kumar K.V", "Naveen N. Rao", Paul Walmsley, Palmer Dabbelt,
    Albert Ou, Alexander Gordeev, Gerald Schaefer, Heiko Carstens,
    Vasily Gorbik, Christian Borntraeger, Sven Schnelle, "David S. Miller",
    linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
    linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
    sparclinux@vger.kernel.org
Subject: [PATCH v1 00/11] mm/memory: optimize fork() with PTE-mapped THP
Date: Mon, 22 Jan 2024 20:41:49 +0100
Message-ID: <20240122194200.381241-1-david@redhat.com>

Now that the rmap overhaul[1], which provides a clean interface for rmap
batching, is upstream, let's implement PTE batching during fork when
processing PTE-mapped THPs.

This series is partially based on Ryan's previous work[2] to implement
cont-pte support on arm64, but it's a complete rewrite based on [1] to
optimize all architectures independent of any such PTE bits, and to
use the new rmap batching functions that simplify the code and prepare
for further rmap accounting changes.

We collect consecutive PTEs that map consecutive pages of the same large
folio, making sure that the other PTE bits are compatible, and (a) adjust
the refcount only once per batch, (b) call rmap handling functions only
once per batch and (c) perform batch PTE setting/updates.

While this series should be beneficial for adding cont-pte support on
arm64[2], it's also one of the requirements for maintaining a total
mapcount[3] for large folios with minimal added overhead, and for further
changes[4] that build on top of the total mapcount.

Independent of all that, this series results in a speedup during fork with
PTE-mapped THP, which is the default with THPs that are smaller than a PMD
(for example, 16KiB to 1024KiB mTHPs for anonymous memory[5]).

On an Intel Xeon Silver 4210R CPU, fork'ing with 1GiB of PTE-mapped folios
of the same size (stddev < 1%) results in the following runtimes
for fork() (shorter is better):

Folio Size | v6.8-rc1 |      New | Change
------------------------------------------
      4KiB | 0.014328 | 0.014265 |     0%
     16KiB | 0.014263 | 0.013293 |   - 7%
     32KiB | 0.014334 | 0.012355 |   -14%
     64KiB | 0.014046 | 0.011837 |   -16%
    128KiB | 0.014011 | 0.011536 |   -18%
    256KiB | 0.013993 |  0.01134 |   -19%
    512KiB | 0.013983 | 0.011311 |   -19%
   1024KiB | 0.013986 | 0.011282 |   -19%
   2048KiB | 0.014305 | 0.011496 |   -20%

Next up is PTE batching when unmapping, which I'll probably send out based
on this series this/next week.

Only tested on x86-64. Compile-tested on most other architectures.
Will do more testing and double-check the arch changes while this is
getting some review.

[1] https://lkml.kernel.org/r/20231220224504.646757-1-david@redhat.com
[2] https://lkml.kernel.org/r/20231218105100.172635-1-ryan.roberts@arm.com
[3] https://lkml.kernel.org/r/20230809083256.699513-1-david@redhat.com
[4] https://lkml.kernel.org/r/20231124132626.235350-1-david@redhat.com
[5] https://lkml.kernel.org/r/20231207161211.2374093-1-ryan.roberts@arm.com

Cc: Andrew Morton
Cc: Matthew Wilcox (Oracle)
Cc: Ryan Roberts
Cc: Russell King
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Dinh Nguyen
Cc: Michael Ellerman
Cc: Nicholas Piggin
Cc: Christophe Leroy
Cc: "Aneesh Kumar K.V"
Cc: "Naveen N. Rao"
Cc: Paul Walmsley
Cc: Palmer Dabbelt
Cc: Albert Ou
Cc: Alexander Gordeev
Cc: Gerald Schaefer
Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Christian Borntraeger
Cc: Sven Schnelle
Cc: "David S. Miller"
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: sparclinux@vger.kernel.org

David Hildenbrand (11):
  arm/pgtable: define PFN_PTE_SHIFT on arm and arm64
  nios2/pgtable: define PFN_PTE_SHIFT
  powerpc/pgtable: define PFN_PTE_SHIFT
  risc: pgtable: define PFN_PTE_SHIFT
  s390/pgtable: define PFN_PTE_SHIFT
  sparc/pgtable: define PFN_PTE_SHIFT
  mm/memory: factor out copying the actual PTE in copy_present_pte()
  mm/memory: pass PTE to copy_present_pte()
  mm/memory: optimize fork() with PTE-mapped THP
  mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()
  mm/memory: ignore writable bit in folio_pte_batch()

 arch/arm/include/asm/pgtable.h      |   2 +
 arch/arm64/include/asm/pgtable.h    |   2 +
 arch/nios2/include/asm/pgtable.h    |   2 +
 arch/powerpc/include/asm/pgtable.h  |   2 +
 arch/riscv/include/asm/pgtable.h    |   2 +
 arch/s390/include/asm/pgtable.h     |   2 +
 arch/sparc/include/asm/pgtable_64.h |   2 +
 include/linux/pgtable.h             |  17 ++-
 mm/memory.c                         | 188 +++++++++++++++++++++-------
 9 files changed, 173 insertions(+), 46 deletions(-)

base-commit: 6613476e225e090cc9aad49be7fa504e290dd33d
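
For readers skimming the cover letter, the batching idea described above
(collecting consecutive PTEs that map consecutive pages of the same large
folio) can be illustrated with a small userspace sketch. This is not code
from the series: toy_pte_t, toy_pte_batch() and toy_count_batches() are
hypothetical names, the PTE is reduced to a pfn plus an opaque flags word,
and flags are compared for strict equality. The series itself uses a more
relaxed notion of "compatible" bits (the last two patches ignore the
dirty/accessed/soft-dirty and writable bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified PTE; real PTEs are architecture-specific. */
typedef struct {
	uint64_t pfn;   /* page frame number this entry maps */
	uint64_t flags; /* all remaining bits (permissions, dirty, ...) */
} toy_pte_t;

/*
 * Count how many entries, starting at ptes[0], can be handled as one batch.
 * Every further entry must map the pfn directly after the previous one,
 * stay inside the folio [folio_start_pfn, folio_start_pfn + folio_nr_pages)
 * and carry the same flags as the first entry. Always returns at least 1.
 */
static size_t toy_pte_batch(const toy_pte_t *ptes, size_t max_nr,
			    uint64_t folio_start_pfn, uint64_t folio_nr_pages)
{
	uint64_t folio_end_pfn = folio_start_pfn + folio_nr_pages;
	size_t nr = 1;

	assert(max_nr >= 1);
	assert(ptes[0].pfn >= folio_start_pfn && ptes[0].pfn < folio_end_pfn);

	while (nr < max_nr) {
		uint64_t expected_pfn = ptes[0].pfn + nr;

		if (expected_pfn >= folio_end_pfn)
			break; /* next page belongs to another folio */
		if (ptes[nr].pfn != expected_pfn)
			break; /* not physically consecutive */
		if (ptes[nr].flags != ptes[0].flags)
			break; /* other bits differ (strict in this sketch) */
		nr++;
	}
	return nr;
}

/*
 * Walk n entries that all map pages of one folio and count the batches.
 * Each batch stands for one refcount adjustment, one rmap call and one
 * batched PTE update, instead of one of each per PTE.
 */
static size_t toy_count_batches(const toy_pte_t *ptes, size_t n,
				uint64_t folio_start_pfn,
				uint64_t folio_nr_pages)
{
	size_t batches = 0;

	for (size_t i = 0; i < n; batches++)
		i += toy_pte_batch(&ptes[i], n - i, folio_start_pfn,
				   folio_nr_pages);
	return batches;
}
```

With four entries mapping pfns 100..103 of a 4-page folio, where only the
last entry has different flags, toy_pte_batch() reports a batch of 3 and
toy_count_batches() reports 2, so the per-batch work would run twice
rather than four times.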