From patchwork Tue Jan 4 20:22:20 2022
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 12703831
Date: Tue, 4 Jan 2022 13:22:20 -0700
In-Reply-To: <20220104202227.2903605-1-yuzhao@google.com>
Message-Id: <20220104202227.2903605-2-yuzhao@google.com>
References: <20220104202227.2903605-1-yuzhao@google.com>
X-Mailer: git-send-email 2.34.1.448.ga2b2bfdf31-goog
Subject: [PATCH v6 1/9] mm: x86, arm64: add arch_has_hw_pte_young()
From: Yu Zhao
To: Andrew Morton,
    Linus Torvalds
Cc: Andi Kleen, Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe,
    Jesse Barnes, Johannes Weiner, Jonathan Corbet, Matthew Wilcox,
    Mel Gorman, Michael Larabel, Michal Hocko, Rik van Riel,
    Vlastimil Babka, Will Deacon, Ying Huang,
    linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    page-reclaim@google.com, x86@kernel.org, Yu Zhao,
    Konstantin Kharlamov

Some architectures automatically set the accessed bit in PTEs, e.g.,
x86 and arm64 v8.2. On architectures that don't have this capability,
clearing the accessed bit in a PTE usually triggers a page fault
following the TLB miss of this PTE.

Being aware of this capability can help make better decisions, e.g.,
whether to spread the work out over a period of time to avoid bursty
page faults when trying to clear the accessed bit in a large number
of PTEs.

Signed-off-by: Yu Zhao
Tested-by: Konstantin Kharlamov
---
 arch/arm64/include/asm/cpufeature.h |  5 +++++
 arch/arm64/include/asm/pgtable.h    | 13 ++++++++-----
 arch/arm64/kernel/cpufeature.c      | 19 +++++++++++++++++++
 arch/arm64/tools/cpucaps            |  1 +
 arch/x86/include/asm/pgtable.h      |  6 +++---
 include/linux/pgtable.h             | 13 +++++++++++++
 mm/memory.c                         | 14 +-------------
 7 files changed, 50 insertions(+), 21 deletions(-)
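As a usage illustration (a sketch only, not part of the diff below): a
caller that needs to clear the accessed bit in many PTEs could key its
batching policy off arch_has_hw_pte_young(). The walker below is
hypothetical -- clear_accessed_bits() and the batch size are made up
for illustration; only arch_has_hw_pte_young() is introduced by this
patch.

    /* Hypothetical: clears the accessed bit in @nr PTEs starting at @start. */
    void clear_accessed_bits(unsigned long start, unsigned long nr);

    static void age_ptes(unsigned long nr_ptes)
    {
            unsigned long i;
            /*
             * With hardware-set accessed bits (e.g., x86, arm64 v8.2),
             * clearing the bit costs no extra page faults, so one large
             * batch is fine. Without them, every cleared bit implies a
             * future page fault, so spread the work out to avoid a burst
             * of faults.
             */
            unsigned long batch = arch_has_hw_pte_young(false) ? nr_ptes : 64;

            for (i = 0; i < nr_ptes; i += batch) {
                    clear_accessed_bits(i, min(batch, nr_ptes - i));
                    cond_resched();
            }
    }
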
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index ef6be92b1921..99518b4b2a9e 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -779,6 +779,11 @@ static inline bool system_supports_tlb_range(void)
 		cpus_have_const_cap(ARM64_HAS_TLB_RANGE);
 }
 
+static inline bool system_has_hw_af(void)
+{
+	return IS_ENABLED(CONFIG_ARM64_HW_AFDBM) && cpus_have_const_cap(ARM64_HW_AF);
+}
+
 extern int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 
 static inline u32 id_aa64mmfr0_parange_to_phys_shift(int parange)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index c4ba047a82d2..e736f47436c7 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -999,13 +999,16 @@ static inline void update_mmu_cache(struct vm_area_struct *vma,
  * page after fork() + CoW for pfn mappings. We don't always have a
  * hardware-managed access flag on arm64.
  */
-static inline bool arch_faults_on_old_pte(void)
+static inline bool arch_has_hw_pte_young(bool local)
 {
-	WARN_ON(preemptible());
+	if (local) {
+		WARN_ON(preemptible());
+		return cpu_has_hw_af();
+	}
 
-	return !cpu_has_hw_af();
+	return system_has_hw_af();
 }
-#define arch_faults_on_old_pte arch_faults_on_old_pte
+#define arch_has_hw_pte_young arch_has_hw_pte_young
 
 /*
  * Experimentally, it's cheap to set the access flag in hardware and we
@@ -1013,7 +1016,7 @@ static inline bool arch_faults_on_old_pte(void)
  */
 static inline bool arch_wants_old_prefaulted_pte(void)
 {
-	return !arch_faults_on_old_pte();
+	return arch_has_hw_pte_young(true);
 }
 #define arch_wants_old_prefaulted_pte arch_wants_old_prefaulted_pte
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6f3e677d88f1..5bb553ee2c0e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2171,6 +2171,25 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_hw_dbm,
 		.cpu_enable = cpu_enable_hw_dbm,
 	},
+	{
+		/*
+		 * __cpu_setup always enables this capability. But if the boot
+		 * CPU has it and a late CPU doesn't, the absent
+		 * ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU will prevent this late CPU
+		 * from going online. There is neither known hardware that does
+		 * that nor an obvious reason to design hardware that works that
+		 * way, hence no point leaving the door open here. If the need
+		 * arises, a new weak system feature flag should do the trick.
+		 */
+		.desc = "Hardware update of the Access flag",
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.capability = ARM64_HW_AF,
+		.sys_reg = SYS_ID_AA64MMFR1_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64MMFR1_HADBS_SHIFT,
+		.min_field_value = 1,
+		.matches = has_cpuid_feature,
+	},
 #endif
 	{
 		.desc = "CRC32 instructions",
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 870c39537dd0..56e4ef5d95fa 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -36,6 +36,7 @@ HAS_STAGE2_FWB
 HAS_SYSREG_GIC_CPUIF
 HAS_TLB_RANGE
 HAS_VIRT_HOST_EXTN
+HW_AF
 HW_DBM
 KVM_PROTECTED_MODE
 MISMATCHED_CACHE_TYPE
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 448cd01eb3ec..c60b16f8b741 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1397,10 +1397,10 @@ static inline bool arch_has_pfn_modify_check(void)
 	return boot_cpu_has_bug(X86_BUG_L1TF);
 }
 
-#define arch_faults_on_old_pte arch_faults_on_old_pte
-static inline bool arch_faults_on_old_pte(void)
+#define arch_has_hw_pte_young arch_has_hw_pte_young
+static inline bool arch_has_hw_pte_young(bool local)
 {
-	return false;
+	return true;
 }
 
 #endif	/* __ASSEMBLY__ */
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index e24d2c992b11..53bd6a26918f 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -258,6 +258,19 @@ static inline int pmdp_clear_flush_young(struct vm_area_struct *vma,
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif
 
+#ifndef arch_has_hw_pte_young
+/*
+ * Return whether the accessed bit is supported by the local CPU or system-wide.
+ *
+ * This stub assumes accessing through an old PTE triggers a page fault.
+ * Architectures that automatically set the accessed bit should override it.
+ */
+static inline bool arch_has_hw_pte_young(bool local)
+{
+	return false;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index 8f1de811a1dc..ead6c7d4b9a1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -121,18 +121,6 @@ int randomize_va_space __read_mostly =
 					2;
 #endif
 
-#ifndef arch_faults_on_old_pte
-static inline bool arch_faults_on_old_pte(void)
-{
-	/*
-	 * Those arches which don't have hw access flag feature need to
-	 * implement their own helper. By default, "true" means pagefault
-	 * will be hit on old pte.
-	 */
-	return true;
-}
-#endif
-
 #ifndef arch_wants_old_prefaulted_pte
 static inline bool arch_wants_old_prefaulted_pte(void)
 {
@@ -2755,7 +2743,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	 * On architectures with software "accessed" bits, we would
 	 * take a double page fault, so mark it accessed here.
 	 */
-	if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {
+	if (!arch_has_hw_pte_young(true) && !pte_young(vmf->orig_pte)) {
 		pte_t entry;
 
 		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
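
For readers mapping the old interface onto the new one: the mm/memory.c
conversion above relies on the equivalence sketched below.
arch_faults_on_old_pte() is removed by this patch; the stub is shown
only to spell out the relationship, not as real code.

    /* Old helper expressed via the new one -- illustration only. */
    static inline bool arch_faults_on_old_pte(void)
    {
            /* local == true: ask about the CPU we are currently running on. */
            return !arch_has_hw_pte_young(true);
    }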