From patchwork Fri Mar 28 09:13:39 2025
From: Alexander Gordeev
To: Andrey Ryabinin, Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    kasan-dev@googlegroups.com, sparclinux@vger.kernel.org,
    xen-devel@lists.xenproject.org, linuxppc-dev@lists.ozlabs.org,
    linux-s390@vger.kernel.org, Hugh Dickins, Nicholas Piggin,
    Guenter Roeck, Juergen Gross, Jeremy Fitzhardinge
Subject: [PATCH 1/4] kasan: Avoid sleepable page allocation from atomic context
Date: Fri, 28 Mar 2025 10:13:39 +0100

apply_to_page_range() enters lazy MMU mode and then invokes the
kasan_populate_vmalloc_pte() callback on each page table walk
iteration. The lazy MMU mode may only be entered under protection of
the page table lock. However, the callback can sleep when trying to
allocate a single page.

Change the __get_free_page() allocation mode from GFP_KERNEL to
GFP_ATOMIC to avoid scheduling out while in atomic context.
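For reference, the code path the message describes, condensed from the
mainline apply_to_pte_range() (user-mm branch shown; error handling
elided -- a sketch, not the verbatim source):

	pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);	/* spinlock taken */
	arch_enter_lazy_mmu_mode();	/* atomic context from here on */
	do {
		/* kasan_populate_vmalloc_pte() runs here and must not sleep */
		err = fn(pte++, addr, data);
		if (err)
			break;
	} while (addr += PAGE_SIZE, addr != end);
	arch_leave_lazy_mmu_mode();
	pte_unmap_unlock(mapped_pte, ptl);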
Signed-off-by: Alexander Gordeev
---
 mm/kasan/shadow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 88d1c9dcb507..edfa77959474 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -301,7 +301,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 	if (likely(!pte_none(ptep_get(ptep))))
 		return 0;
 
-	page = __get_free_page(GFP_KERNEL);
+	page = __get_free_page(GFP_ATOMIC);
 	if (!page)
 		return -ENOMEM;
 
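For context, the call that reaches this callback, condensed from
kasan_populate_vmalloc() in mm/kasan/shadow.c (a sketch of the mainline
caller; local variable names abridged). Every PTE-level step of the walk
runs inside the lazy MMU section, hence the need for a non-sleeping
allocation:

	ret = apply_to_page_range(&init_mm, shadow_start,
				  shadow_end - shadow_start,
				  kasan_populate_vmalloc_pte, NULL);

Note the trade-off: GFP_ATOMIC cannot enter reclaim and may fail under
memory pressure, which the callback already handles by returning -ENOMEM.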
From patchwork Fri Mar 28 09:13:40 2025
From: Alexander Gordeev
To: Andrey Ryabinin, Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    kasan-dev@googlegroups.com, sparclinux@vger.kernel.org,
    xen-devel@lists.xenproject.org, linuxppc-dev@lists.ozlabs.org,
    linux-s390@vger.kernel.org, Hugh Dickins, Nicholas Piggin,
    Guenter Roeck, Juergen Gross, Jeremy Fitzhardinge
Subject: [PATCH 2/4] mm: Allow detection of wrong arch_enter_lazy_mmu_mode() context
Date: Fri, 28 Mar 2025 10:13:40 +0100
Message-ID: <07d6b41231658e66a570b21f3ed6fc8cf57ee171.1743079720.git.agordeev@linux.ibm.com>

Lazy MMU batching may only be entered and left under the protection of
the page table locks for all page tables which may be modified. Yet,
there were cases where arch_enter_lazy_mmu_mode() was called without
the locks taken, e.g. commit b9ef323ea168 ("powerpc/64s: Disable
preemption in hash lazy mmu mode").

Make the default arch_enter|leave|flush_lazy_mmu_mode() callbacks
complain, at least when preemption is enabled, to detect wrong
contexts. Most platforms do not implement the callbacks, so to avoid a
performance impact the complaint is only enabled together with the
CONFIG_DEBUG_VM option.

Signed-off-by: Alexander Gordeev
---
 include/linux/pgtable.h | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 94d267d02372..6669f977e368 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -228,9 +228,18 @@ static inline int pmd_dirty(pmd_t pmd)
  * it must synchronize the delayed page table writes properly on other CPUs.
  */
 #ifndef __HAVE_ARCH_ENTER_LAZY_MMU_MODE
-#define arch_enter_lazy_mmu_mode()	do {} while (0)
-#define arch_leave_lazy_mmu_mode()	do {} while (0)
-#define arch_flush_lazy_mmu_mode()	do {} while (0)
+static inline void arch_enter_lazy_mmu_mode(void)
+{
+	VM_WARN_ON(preemptible());
+}
+static inline void arch_leave_lazy_mmu_mode(void)
+{
+	VM_WARN_ON(preemptible());
+}
+static inline void arch_flush_lazy_mmu_mode(void)
+{
+	VM_WARN_ON(preemptible());
+}
 #endif
 
 #ifndef pte_batch_hint
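For illustration, the usage contract the new checks encode (hypothetical
caller; mm and the update loop are placeholders). Note that preemptible()
is constant false on kernels without CONFIG_PREEMPTION, so the warning is
only meaningful on preemptible CONFIG_DEBUG_VM builds:

	spin_lock(&mm->page_table_lock);	/* disables preemption */
	arch_enter_lazy_mmu_mode();		/* ok: !preemptible() */
	/* ... batched page table updates ... */
	arch_leave_lazy_mmu_mode();
	spin_unlock(&mm->page_table_lock);

	arch_enter_lazy_mmu_mode();	/* no lock held: VM_WARN_ON fires */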
From patchwork Fri Mar 28 09:13:41 2025
From: Alexander Gordeev
To: Andrey Ryabinin, Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    kasan-dev@googlegroups.com, sparclinux@vger.kernel.org,
    xen-devel@lists.xenproject.org, linuxppc-dev@lists.ozlabs.org,
    linux-s390@vger.kernel.org, Hugh Dickins, Nicholas Piggin,
    Guenter Roeck, Juergen Gross, Jeremy Fitzhardinge
Subject: [PATCH 3/4] mm: Cleanup apply_to_pte_range() routine
Date: Fri, 28 Mar 2025 10:13:41 +0100
Message-ID: <6a4580bd289b4c62b09f8ef656d242c027c0c866.1743079720.git.agordeev@linux.ibm.com>

Reverse the 'create' vs 'mm == &init_mm' conditions and move the page
table mask modification out of the atomic context.

Signed-off-by: Alexander Gordeev
---
 mm/memory.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index fb7b8dc75167..00f253404db5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2884,24 +2884,28 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 			      pte_fn_t fn, void *data, bool create,
 			      pgtbl_mod_mask *mask)
 {
+	int err = create ? -ENOMEM : -EINVAL;
 	pte_t *pte, *mapped_pte;
-	int err = 0;
 	spinlock_t *ptl;
 
-	if (create) {
-		mapped_pte = pte = (mm == &init_mm) ?
-			pte_alloc_kernel_track(pmd, addr, mask) :
-			pte_alloc_map_lock(mm, pmd, addr, &ptl);
+	if (mm == &init_mm) {
+		if (create)
+			pte = pte_alloc_kernel_track(pmd, addr, mask);
+		else
+			pte = pte_offset_kernel(pmd, addr);
 		if (!pte)
-			return -ENOMEM;
+			return err;
 	} else {
-		mapped_pte = pte = (mm == &init_mm) ?
-			pte_offset_kernel(pmd, addr) :
-			pte_offset_map_lock(mm, pmd, addr, &ptl);
+		if (create)
+			pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
+		else
+			pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
 		if (!pte)
-			return -EINVAL;
+			return err;
+		mapped_pte = pte;
 	}
+	err = 0;
 
 	arch_enter_lazy_mmu_mode();
 
 	if (fn) {
@@ -2913,12 +2917,14 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 			}
 		} while (addr += PAGE_SIZE, addr != end);
 	}
-	*mask |= PGTBL_PTE_MODIFIED;
 
 	arch_leave_lazy_mmu_mode();
 
 	if (mm != &init_mm)
 		pte_unmap_unlock(mapped_pte, ptl);
+
+	*mask |= PGTBL_PTE_MODIFIED;
+
 	return err;
 }
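The net effect, as a condensed sketch of the resulting function (the
fn() loop and lazy MMU section are unchanged and elided; the if/else
chains of the actual patch are compressed into ternaries here):

	int err = create ? -ENOMEM : -EINVAL;

	if (mm == &init_mm) {
		pte = create ? pte_alloc_kernel_track(pmd, addr, mask) :
			       pte_offset_kernel(pmd, addr);
		if (!pte)
			return err;
	} else {
		pte = create ? pte_alloc_map_lock(mm, pmd, addr, &ptl) :
			       pte_offset_map_lock(mm, pmd, addr, &ptl);
		if (!pte)
			return err;
		mapped_pte = pte;
	}
	err = 0;
	/* ... walk ... */
	*mask |= PGTBL_PTE_MODIFIED;	/* now after the atomic section */
	return err;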
From patchwork Fri Mar 28 09:13:42 2025
From: Alexander Gordeev
To: Andrey Ryabinin, Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    kasan-dev@googlegroups.com, sparclinux@vger.kernel.org,
    xen-devel@lists.xenproject.org, linuxppc-dev@lists.ozlabs.org,
    linux-s390@vger.kernel.org, Hugh Dickins, Nicholas Piggin,
    Guenter Roeck, Juergen Gross, Jeremy Fitzhardinge
Subject: [PATCH 4/4] mm: Protect kernel pgtables in apply_to_pte_range()
Date: Fri, 28 Mar 2025 10:13:42 +0100

The lazy MMU mode can only be entered and left under the protection of
the page table locks for all page tables which may be modified. Yet,
when it comes to kernel mappings, apply_to_pte_range() does not take
any locks. That does not conform to the
arch_enter|leave_lazy_mmu_mode() semantics and could potentially lead
to re-scheduling a process while in lazy MMU mode, or to races on
kernel page table updates.

Signed-off-by: Alexander Gordeev
---
 mm/kasan/shadow.c | 7 ++-----
 mm/memory.c       | 5 ++++-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index edfa77959474..6531a7aa8562 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -308,14 +308,14 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 	__memset((void *)page, KASAN_VMALLOC_INVALID, PAGE_SIZE);
 	pte = pfn_pte(PFN_DOWN(__pa(page)), PAGE_KERNEL);
 
-	spin_lock(&init_mm.page_table_lock);
 	if (likely(pte_none(ptep_get(ptep)))) {
 		set_pte_at(&init_mm, addr, ptep, pte);
 		page = 0;
 	}
-	spin_unlock(&init_mm.page_table_lock);
+
 	if (page)
 		free_page(page);
+
 	return 0;
 }
 
@@ -401,13 +401,10 @@ static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 
 	page = (unsigned long)__va(pte_pfn(ptep_get(ptep)) << PAGE_SHIFT);
 
-	spin_lock(&init_mm.page_table_lock);
-
 	if (likely(!pte_none(ptep_get(ptep)))) {
 		pte_clear(&init_mm, addr, ptep);
 		free_page(page);
 	}
-	spin_unlock(&init_mm.page_table_lock);
 
 	return 0;
 }
diff --git a/mm/memory.c b/mm/memory.c
index 00f253404db5..c000377cad0c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2895,6 +2895,7 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 			pte = pte_offset_kernel(pmd, addr);
 		if (!pte)
 			return err;
+		spin_lock(&init_mm.page_table_lock);
 	} else {
@@ -2920,7 +2921,9 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 
 	arch_leave_lazy_mmu_mode();
 
-	if (mm != &init_mm)
+	if (mm == &init_mm)
+		spin_unlock(&init_mm.page_table_lock);
+	else
 		pte_unmap_unlock(mapped_pte, ptl);
 
 	*mask |= PGTBL_PTE_MODIFIED;
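Taken together with patches 1 and 3, the locking protocol for kernel
mappings after this series looks as follows (illustrative sketch of the
apply_to_pte_range() init_mm branch; fn is any pte_fn_t callback, e.g.
kasan_populate_vmalloc_pte(), which is why the callback's own
page_table_lock handling above became redundant):

	spin_lock(&init_mm.page_table_lock);	/* serializes init_mm updates */
	arch_enter_lazy_mmu_mode();		/* now entered under the lock */
	err = fn(pte++, addr, data);		/* must not sleep (GFP_ATOMIC) */
	arch_leave_lazy_mmu_mode();
	spin_unlock(&init_mm.page_table_lock);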