From patchwork Sun Dec 20 04:55:35 2020
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 11983783
From: Nicholas Piggin
To: linux-mm@kvack.org
Cc: Nicholas Piggin, Linus Torvalds, Bibo Mao
Subject: [PATCH v3 3/3] mm: optimise pte dirty/accessed bit setting by demand based pte insertion
Date: Sun, 20 Dec 2020 14:55:35 +1000
Message-Id: <20201220045535.848591-4-npiggin@gmail.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20201220045535.848591-1-npiggin@gmail.com>
References: <20201220045535.848591-1-npiggin@gmail.com>

Similarly to the previous patch, this tries to optimise dirty/accessed
bits in ptes to avoid access costs of hardware setting them. This tidies
up a few last cases where dirty/accessed faults can be seen, and subsumes
the pte_sw_mkyoung helper -- it's not just architectures with explicit
software dirty/accessed bits that take expensive faults to modify ptes.

The vast majority of the remaining dirty/accessed faults on kbuild
workloads after this patch are from NUMA migration, due to
remove_migration_pte inserting old/clean ptes.

Signed-off-by: Nicholas Piggin
---
 arch/mips/include/asm/pgtable.h |  2 --
 include/linux/pgtable.h         | 16 ----------------
 mm/huge_memory.c                |  4 ++--
 mm/memory.c                     | 14 +++++++-------
 mm/migrate.c                    |  1 +
 mm/shmem.c                      |  1 +
 mm/userfaultfd.c                |  2 +-
 7 files changed, 12 insertions(+), 28 deletions(-)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 4f9c37616d42..3275495adccb 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -406,8 +406,6 @@ static inline pte_t pte_mkyoung(pte_t pte)
 	return pte;
 }
 
-#define pte_sw_mkyoung	pte_mkyoung
-
 #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
 static inline int pte_huge(pte_t pte)	{ return pte_val(pte) & _PAGE_HUGE; }
 
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 8fcdfa52eb4b..70d04931dff4 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -424,22 +424,6 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 }
 #endif
 
-/*
- * On some architectures hardware does not set page access bit when accessing
- * memory page, it is responsibilty of software setting this bit. It brings
- * out extra page fault penalty to track page access bit. For optimization page
- * access bit can be set during all page fault flow on these arches.
- * To be differentiate with macro pte_mkyoung, this macro is used on platforms
- * where software maintains page access bit.
- */
-#ifndef pte_sw_mkyoung
-static inline pte_t pte_sw_mkyoung(pte_t pte)
-{
-	return pte;
-}
-#define pte_sw_mkyoung	pte_sw_mkyoung
-#endif
-
 #ifndef pte_savedwrite
 #define pte_savedwrite pte_write
 #endif
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f2ca0326b5af..f6719312dc27 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2151,8 +2151,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			entry = maybe_mkwrite(entry, vma);
 			if (!write)
 				entry = pte_wrprotect(entry);
-			if (!young)
-				entry = pte_mkold(entry);
+			if (young)
+				entry = pte_mkyoung(entry);
 			if (soft_dirty)
 				entry = pte_mksoft_dirty(entry);
 			if (uffd_wp)
diff --git a/mm/memory.c b/mm/memory.c
index dd1f364d8ca3..4cebba596660 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1639,7 +1639,7 @@ static int insert_page_into_pte_locked(struct mm_struct *mm, pte_t *pte,
 	get_page(page);
 	inc_mm_counter_fast(mm, mm_counter_file(page));
 	page_add_file_rmap(page, false);
-	set_pte_at(mm, addr, pte, mk_pte(page, prot));
+	set_pte_at(mm, addr, pte, pte_mkyoung(mk_pte(page, prot)));
 	return 0;
 }
 
@@ -1954,10 +1954,9 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 	else
 		entry = pte_mkspecial(pfn_t_pte(pfn, prot));
 
-	if (mkwrite) {
-		entry = pte_mkyoung(entry);
+	entry = pte_mkyoung(entry);
+	if (mkwrite)
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-	}
 
 	set_pte_at(mm, addr, pte, entry);
 	update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */
@@ -2889,7 +2888,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		}
 		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 		entry = mk_pte(new_page, vma->vm_page_prot);
-		entry = pte_sw_mkyoung(entry);
+		entry = pte_mkyoung(entry);
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 		/*
 		 * Clear the pte entry and flush it first, before updating the
@@ -3402,6 +3401,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
 	pte = mk_pte(page, vma->vm_page_prot);
+	pte = pte_mkyoung(pte);
 	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
 		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 		vmf->flags &= ~FAULT_FLAG_WRITE;
@@ -3545,7 +3545,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	__SetPageUptodate(page);
 
 	entry = mk_pte(page, vma->vm_page_prot);
-	entry = pte_sw_mkyoung(entry);
+	entry = pte_mkyoung(entry);
 	if (vma->vm_flags & VM_WRITE)
 		entry = pte_mkwrite(pte_mkdirty(entry));
 
@@ -3821,7 +3821,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page)
 
 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vma->vm_page_prot);
-	entry = pte_sw_mkyoung(entry);
+	entry = pte_mkyoung(entry);
 	if (write)
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/* copy-on-write page */
diff --git a/mm/migrate.c b/mm/migrate.c
index ee5e612b4cd8..d33b2bfc846b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2963,6 +2963,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 		}
 	} else {
 		entry = mk_pte(page, vma->vm_page_prot);
+		entry = pte_mkyoung(entry);
 		if (vma->vm_flags & VM_WRITE)
 			entry = pte_mkwrite(pte_mkdirty(entry));
 	}
diff --git a/mm/shmem.c b/mm/shmem.c
index 7c6b6d8f6c39..4f23b16d6baf 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2420,6 +2420,7 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release;
 
 	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
+	_dst_pte = pte_mkyoung(_dst_pte);
 	if (dst_vma->vm_flags & VM_WRITE)
 		_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
 	else {
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 9a3d451402d7..56c44aa06a7e 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -99,7 +99,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 	if (mem_cgroup_charge(page, dst_mm, GFP_KERNEL))
 		goto out_release;
 
-	_dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot));
+	_dst_pte = pte_mkdirty(pte_mkyoung(mk_pte(page, dst_vma->vm_page_prot)));
 	if (dst_vma->vm_flags & VM_WRITE) {
 		if (wp_copy)
 			_dst_pte = pte_mkuffd_wp(_dst_pte);
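
For readers without the kernel tree handy, the idea the series leans on can be
sketched outside the kernel: build the pte value with the accessed bit (and,
for a write fault, the dirty bit) already set, then install it with a single
store, so no follow-up fault is needed just to mark the entry young or dirty.
The sketch below is a standalone model, not kernel code -- the pte layout, bit
names and the access_page() helper are made up purely for illustration.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative pte layout: low bits carry present/young/dirty/write flags. */
#define PTE_PRESENT	(1u << 0)
#define PTE_YOUNG	(1u << 1)
#define PTE_DIRTY	(1u << 2)
#define PTE_WRITE	(1u << 3)

typedef uint64_t pte_t;

static pte_t mk_pte(uint64_t pfn)	{ return (pfn << 12) | PTE_PRESENT; }
static pte_t pte_mkyoung(pte_t pte)	{ return pte | PTE_YOUNG; }
static pte_t pte_mkdirty(pte_t pte)	{ return pte | PTE_DIRTY; }
static pte_t pte_mkwrite(pte_t pte)	{ return pte | PTE_WRITE; }

/*
 * Model of an access: if the young (or, for a write, dirty) bit is missing,
 * an extra micro-fault is taken purely to set it -- the cost the patch avoids.
 */
static int access_page(pte_t *pte, bool write)
{
	int faults = 0;

	if (!(*pte & PTE_YOUNG)) {
		*pte = pte_mkyoung(*pte);
		faults++;
	}
	if (write && !(*pte & PTE_DIRTY)) {
		*pte = pte_mkdirty(*pte);
		faults++;
	}
	return faults;
}

int main(void)
{
	/* Old style: insert an old/clean pte, pay extra faults on first use. */
	pte_t old_style = mk_pte(0x1234);
	/* New style: the fault handler knows the access is happening now. */
	pte_t new_style = pte_mkwrite(pte_mkdirty(pte_mkyoung(mk_pte(0x1234))));

	printf("old style follow-up faults: %d\n", access_page(&old_style, true));
	printf("new style follow-up faults: %d\n", access_page(&new_style, true));
	return 0;
}

Compiled with any C compiler, the first line reports 2 follow-up faults and
the second 0, which is the effect the demand-based pte insertion aims for.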