From patchwork Tue Oct 16 13:13:41 2018
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 10643537
From: Nicholas Piggin <npiggin@gmail.com>
To: Andrew Morton
Cc: Nicholas Piggin, Linus Torvalds, linux-mm, linux-arch,
 Linux Kernel Mailing List, ppc-dev, Ley Foon Tan
Subject: [PATCH v2 3/5] mm/cow: optimise pte accessed bit handling in fork
Date: Tue, 16 Oct 2018 23:13:41 +1000
Message-Id: <20181016131343.20556-4-npiggin@gmail.com>
In-Reply-To: <20181016131343.20556-1-npiggin@gmail.com>
References: <20181016131343.20556-1-npiggin@gmail.com>

fork clears dirty/accessed bits from new ptes in the child. This logic
has existed since mapped page reclaim was done by scanning ptes, when it
may have been quite important. Today, with physical-based pte scanning,
there is less reason to clear these bits, so this patch avoids clearing
the accessed bit in the child.

One accessed bit is treated much the same as many: the only difference
today is that more than one referenced bit causes the page to be
activated, while a single bit causes it to be kept. This patch causes
pages shared by fork(2) to be more readily activated, but this heuristic
is very fuzzy anyway -- a page can be accessed by multiple threads via a
single pte and be just as important as one that is accessed via multiple
ptes, for example. In the end, I don't believe fork(2) is a significant
enough driver of page reclaim behaviour for this to matter much.
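The activation heuristic referred to above can be sketched as follows. This is a deliberately simplified illustration of the policy, not the kernel's actual page_check_references() code: the enum values mirror those in mm/vmscan.c, but the function name and its single referenced-pte-count input are hypothetical.

```c
/* Simplified sketch of the reclaim policy described above: how many
 * ptes reference a page decides whether it is activated or merely
 * kept on the inactive list. Not the kernel's exact logic. */
enum page_references {
	PAGEREF_RECLAIM,	/* no references: reclaim candidate */
	PAGEREF_KEEP,		/* one reference: spare it one trip */
	PAGEREF_ACTIVATE,	/* many references: move to active list */
};

static enum page_references check_references(int referenced_ptes)
{
	if (referenced_ptes > 1)
		return PAGEREF_ACTIVATE;
	if (referenced_ptes == 1)
		return PAGEREF_KEEP;
	return PAGEREF_RECLAIM;
}
```

With fork() no longer clearing accessed bits, a page shared into a child starts with its referenced count effectively inflated, which is why such pages become more readily activated under this policy.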
This and the following change eliminate a major source of the faults
that powerpc/radix requires in order to set dirty/accessed bits in
ptes, speeding up a fork/exit microbenchmark by about 5% on POWER9
(16600 -> 17500 fork/execs per second).

Skylake appears to have a micro-fault overhead too, measured with a
test that allocates 4GB of anonymous memory, reads each page, then
forks and times the child reading a byte from each page. The first
pass over the pages takes about 1000 cycles per page; the second pass
takes about 27 cycles (a TLB miss). With no additional minor faults
measured in either child pass, and the page array well exceeding TLB
capacity, the large cost must come from the micro-faults taken to set
the accessed bit.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 mm/huge_memory.c | 2 --
 mm/memory.c      | 1 -
 mm/vmscan.c      | 8 ++++++++
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0fb0e3025f98..1f43265204d4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -977,7 +977,6 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pmdp_set_wrprotect(src_mm, addr, src_pmd);
 		pmd = pmd_wrprotect(pmd);
 	}
-	pmd = pmd_mkold(pmd);
 	set_pmd_at(dst_mm, addr, dst_pmd, pmd);
 
 	ret = 0;
@@ -1071,7 +1070,6 @@ int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pudp_set_wrprotect(src_mm, addr, src_pud);
 		pud = pud_wrprotect(pud);
 	}
-	pud = pud_mkold(pud);
 	set_pud_at(dst_mm, addr, dst_pud, pud);
 
 	ret = 0;
diff --git a/mm/memory.c b/mm/memory.c
index c467102a5cbc..0387ee1e3582 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1033,7 +1033,6 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 */
 	if (vm_flags & VM_SHARED)
 		pte = pte_mkclean(pte);
-	pte = pte_mkold(pte);
 
 	page = vm_normal_page(vma, addr, pte);
 	if (page) {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c5ef7240cbcb..e72d5b3336a0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1031,6 +1031,14 @@ static enum page_references page_check_references(struct page *page,
 		 * to look twice if a mapped file page is used more
 		 * than once.
 		 *
+		 * fork() will set referenced bits in child ptes despite
+		 * not having been accessed, to avoid micro-faults of
+		 * setting accessed bits. This heuristic is not perfectly
+		 * accurate in other ways -- multiple map/unmap in the
+		 * same time window would be treated as multiple references
+		 * despite same number of actual memory accesses made by
+		 * the program.
+		 *
 		 * Mark it and spare it for another trip around the
 		 * inactive list. Another page table reference will
 		 * lead to its activation.