From patchwork Mon Oct 14 10:58:12 2024
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13834806
From: Ryan Roberts
To: Andrew Morton, Anshuman Khandual, Ard Biesheuvel, Catalin Marinas,
 David Hildenbrand, Greg Marsden, Ivan Ivanov, Kalesh Singh, Marc Zyngier,
 Mark Rutland, Matthias Brugger, Miroslav Benes, Will Deacon
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 05/57] mm: Avoid split pmd ptl if pmd level is run-time folded
Date: Mon, 14 Oct 2024 11:58:12 +0100
Message-ID: <20241014105912.3207374-5-ryan.roberts@arm.com>
In-Reply-To: <20241014105912.3207374-1-ryan.roberts@arm.com>
References: <20241014105514.3206191-1-ryan.roberts@arm.com>
 <20241014105912.3207374-1-ryan.roberts@arm.com>

If there are only 2 levels of translation, the first level (pgd) may not
be an entire page and so does not have a ptdesc backing it (this may be
true on arm64 depending on the VA size and page size). Even if it is an
entire page and does therefore have an entire ptdesc,
pagetable_pmd_ctor() won't be called for the ptdesc (since it's a pgd
not pmd table) and so the per-ptdesc ptl fields won't be initialised.

To date this has been fine; the arch knows at compile time if it needs
to fold the pmd level and in this case does not select
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK.
However, if the number of levels is not known at compile time (as is the
case for boot-time page size selection), we want to be able to choose at
boot whether to use split pmd ptls in the pmd's ptdesc or simply fall
back to the lock in the mm_struct. So let's make that change; when
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK is selected, determine if it should
be used at run-time based on mm_pmd_folded().

This sets us up for arm64 to support boot-time page size selection.

Signed-off-by: Ryan Roberts
---

***NOTE***
Any confused maintainers may want to read the cover note here for context:
https://lore.kernel.org/all/20241014105514.3206191-1-ryan.roberts@arm.com/

 include/linux/mm.h       | 15 ++++++++++++++-
 include/linux/mm_types.h |  2 +-
 kernel/fork.c            |  4 ++--
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1470736017168..09a840517c23a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3037,6 +3037,8 @@ static inline struct ptdesc *pmd_ptdesc(pmd_t *pmd)
 
 static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
 {
+	if (mm_pmd_folded(mm))
+		return &mm->page_table_lock;
 	return ptlock_ptr(pmd_ptdesc(pmd));
 }
 
@@ -3056,7 +3058,18 @@ static inline void pmd_ptlock_free(struct ptdesc *ptdesc)
 	ptlock_free(ptdesc);
 }
 
-#define pmd_huge_pte(mm, pmd) (pmd_ptdesc(pmd)->pmd_huge_pte)
+static inline pgtable_t *__pmd_huge_pte(struct mm_struct *mm, pmd_t *pmd)
+{
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	if (mm_pmd_folded(mm))
+		return &mm->pmd_huge_pte;
+	return &pmd_ptdesc(pmd)->pmd_huge_pte;
+#else
+	return NULL;
+#endif
+}
+
+#define pmd_huge_pte(mm, pmd) (*__pmd_huge_pte(mm, pmd))
 
 #else
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 0844ed7cfaa53..87dc6de7b7baf 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -946,7 +946,7 @@ struct mm_struct {
 #ifdef CONFIG_MMU_NOTIFIER
 		struct mmu_notifier_subscriptions *notifier_subscriptions;
 #endif
-#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE)
 		pgtable_t pmd_huge_pte; /* protected by page_table_lock */
 #endif
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/kernel/fork.c b/kernel/fork.c
index cc760491f2012..ea472566d4fcc 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -832,7 +832,7 @@ static void check_mm(struct mm_struct *mm)
 		pr_alert("BUG: non-zero pgtables_bytes on freeing mm: %ld\n",
 				mm_pgtables_bytes(mm));
 
-#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE)
 	VM_BUG_ON_MM(mm->pmd_huge_pte, mm);
 #endif
 }
@@ -1276,7 +1276,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	RCU_INIT_POINTER(mm->exe_file, NULL);
 	mmu_notifier_subscriptions_init(mm);
 	init_tlb_flush_pending(mm);
-#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE)
 	mm->pmd_huge_pte = NULL;
 #endif
 	mm_init_uprobes_state(mm);
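
For readers who want to see the run-time selection in isolation, the
following is a minimal user-space sketch of the pattern used in the mm.h
hunks above. It is not kernel code and is only an illustration under
stated assumptions: every model_* name is hypothetical, the lock is a
plain struct rather than a spinlock_t, the folded state is a simple bool
field instead of whatever mm_pmd_folded() evaluates on a real arch, and
the ptdesc is passed in directly rather than derived from a pmd_t
pointer via pmd_ptdesc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
typedef struct { int locked; } model_lock_t;
typedef void *model_pgtable_t;

struct model_ptdesc {
	model_lock_t ptl;              /* split lock, one per pmd table */
	model_pgtable_t pmd_huge_pte;  /* per-table deposit slot */
};

struct model_mm {
	bool pmd_folded;               /* assumption: fixed once at boot */
	model_lock_t page_table_lock;  /* fallback lock in the mm */
	model_pgtable_t pmd_huge_pte;  /* fallback deposit slot in the mm */
};

static inline bool model_mm_pmd_folded(const struct model_mm *mm)
{
	return mm->pmd_folded;
}

/* Pick the lock at run time, mirroring the shape of the new pmd_lockptr(). */
static inline model_lock_t *model_pmd_lockptr(struct model_mm *mm,
					      struct model_ptdesc *pt)
{
	assert(mm != NULL);
	if (model_mm_pmd_folded(mm))
		return &mm->page_table_lock;
	assert(pt != NULL);
	return &pt->ptl;
}

/* Return the address of the slot so the macro below stays assignable. */
static inline model_pgtable_t *model_pmd_huge_pte_ptr(struct model_mm *mm,
						      struct model_ptdesc *pt)
{
	assert(mm != NULL);
	if (model_mm_pmd_folded(mm))
		return &mm->pmd_huge_pte;
	assert(pt != NULL);
	return &pt->pmd_huge_pte;
}

/* Dereferencing the helper's result yields an lvalue, as in the patch. */
#define model_pmd_huge_pte(mm, pt) (*model_pmd_huge_pte_ptr(mm, pt))
```

With pmd_folded set, both the lock and the deposit slot resolve to the
fields in the model mm; with it clear, they resolve to the per-table
fields. The pointer-returning helper plus dereferencing macro is what
lets existing callers keep writing "pmd_huge_pte(mm, pmd) = pgtable"
unchanged, which is presumably why the patch keeps the macro rather than
converting callers.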