From patchwork Mon Jun 26 17:14:21 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13293244
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox (Oracle), Kirill A. Shutemov, Yin Fengwei,
 David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
 Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, H. Peter Anvin
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
 linux-s390@vger.kernel.org
Subject: [PATCH v1 01/10] mm: Expose clear_huge_page() unconditionally
Date: Mon, 26 Jun 2023 18:14:21 +0100
Message-Id: <20230626171430.3167004-2-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

In preparation for extending vma_alloc_zeroed_movable_folio() to allocate an
arbitrary order folio, expose clear_huge_page() unconditionally, so that it
can be used to zero the allocated folio in the generic implementation of
vma_alloc_zeroed_movable_folio().

Signed-off-by: Ryan Roberts
---
 include/linux/mm.h | 3 ++-
 mm/memory.c        | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7f1741bd870a..7e3bf45e6491 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3684,10 +3684,11 @@ enum mf_action_page_type {
  */
 extern const struct attribute_group memory_failure_attr_group;
 
-#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
 extern void clear_huge_page(struct page *page,
 			    unsigned long addr_hint,
 			    unsigned int pages_per_huge_page);
+
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
 int copy_user_large_folio(struct folio *dst, struct folio *src,
 			  unsigned long addr_hint,
 			  struct vm_area_struct *vma);
diff --git a/mm/memory.c b/mm/memory.c
index fb30f7523550..3d4ea668c4d1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5741,7 +5741,6 @@ void __might_fault(const char *file, int line)
 EXPORT_SYMBOL(__might_fault);
 #endif
 
-#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
 /*
  * Process all subpages of the specified huge page with the specified
 * operation.
The target subpage will be processed last to keep its @@ -5839,6 +5838,7 @@ void clear_huge_page(struct page *page, process_huge_page(addr_hint, pages_per_huge_page, clear_subpage, page); } +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS) static int copy_user_gigantic_page(struct folio *dst, struct folio *src, unsigned long addr, struct vm_area_struct *vma, From patchwork Mon Jun 26 17:14:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13293248 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2D069EB64DA for ; Mon, 26 Jun 2023 17:15:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=T/KjLXB3nGtp2jZOXtkxFhDCL6+8aSza+SjX+qzkE/o=; b=XV5gTtQVWy5ID9 BJ/6kbpfEqmzd+4YlrFtdO/ma5fCZSnbBzuKXQuCoYz4qoX633WY3hql3gIB4dpXv35RpF92V0iFh YHO0OSXwhjnqm+vbCF1PswUAC/tvYD+8/bqXkIBIZvVc+shFoKakgua3vwc6lj0+L9ncZaFJxWM/N XdumovlHmQ4vc2+ifwEXqXH0HMXu6nvdSfItSetRYpMF7Cj7traCHaUZO9dqq5lXxfVqRN6AOk8I4 fRuujjyFuw5iwQHAlQfNaWY1DsNf5taW9gfA/uPLPHCeoGO2IBo7B2J2+3dB5IbV/yjTVqRgialMY fCbXUIwlaZT9Cn+WBGpA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpns-00Aieh-28; Mon, 26 Jun 2023 17:15:04 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpnd-00AiVw-23 for linux-arm-kernel@lists.infradead.org; Mon, 26 Jun 2023 17:14:51 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5BD8D143D; Mon, 26 Jun 2023 10:15:31 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 577523F663; Mon, 26 Jun 2023 10:14:44 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , "Kirill A. Shutemov" , Yin Fengwei , David Hildenbrand , Yu Zhao , Catalin Marinas , Will Deacon , Geert Uytterhoeven , Christian Borntraeger , Sven Schnelle , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. 
Peter Anvin" Cc: Ryan Roberts , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 02/10] mm: pass gfp flags and order to vma_alloc_zeroed_movable_folio() Date: Mon, 26 Jun 2023 18:14:22 +0100 Message-Id: <20230626171430.3167004-3-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230626_101449_781645_3E81E350 X-CRM114-Status: GOOD ( 20.86 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Allow allocation of large folios with vma_alloc_zeroed_movable_folio(). This prepares the ground for large anonymous folios. The generic implementation of vma_alloc_zeroed_movable_folio() now uses clear_huge_page() to zero the allocated folio since it may now be a non-0 order. Currently the function is always called with order 0 and no extra gfp flags, so no functional change intended. But a subsequent commit will take advantage of the new parameters to allocate large folios. The extra gfp flags will be used to control the reclaim policy. Signed-off-by: Ryan Roberts --- arch/alpha/include/asm/page.h | 5 +++-- arch/arm64/include/asm/page.h | 3 ++- arch/arm64/mm/fault.c | 7 ++++--- arch/ia64/include/asm/page.h | 5 +++-- arch/m68k/include/asm/page_no.h | 7 ++++--- arch/s390/include/asm/page.h | 5 +++-- arch/x86/include/asm/page.h | 5 +++-- include/linux/highmem.h | 23 +++++++++++++---------- mm/memory.c | 5 +++-- 9 files changed, 38 insertions(+), 27 deletions(-) diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h index 4db1ebc0ed99..6fc7fe91b6cb 100644 --- a/arch/alpha/include/asm/page.h +++ b/arch/alpha/include/asm/page.h @@ -17,8 +17,9 @@ extern void clear_page(void *page); #define clear_user_page(page, vaddr, pg) clear_page(page) -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false) +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ + vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false) extern void copy_page(void * _to, void * _from); #define copy_user_page(to, from, vaddr, pg) copy_page(to, from) diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index 2312e6ee595f..47710852f872 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -30,7 +30,8 @@ void copy_highpage(struct page *to, struct page *from); #define __HAVE_ARCH_COPY_HIGHPAGE struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, - unsigned long vaddr); + unsigned long vaddr, + gfp_t gfp, int order); #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio void tag_clear_highpage(struct page *to); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 6045a5117ac1..0a43c3b3f190 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -961,9 +961,10 @@ NOKPROBE_SYMBOL(do_debug_exception); * Used during anonymous page fault handling. 
*/ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, - unsigned long vaddr) + unsigned long vaddr, + gfp_t gfp, int order) { - gfp_t flags = GFP_HIGHUSER_MOVABLE | __GFP_ZERO; + gfp_t flags = GFP_HIGHUSER_MOVABLE | __GFP_ZERO | gfp; /* * If the page is mapped with PROT_MTE, initialise the tags at the @@ -973,7 +974,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, if (vma->vm_flags & VM_MTE) flags |= __GFP_ZEROTAGS; - return vma_alloc_folio(flags, 0, vma, vaddr, false); + return vma_alloc_folio(flags, order, vma, vaddr, false); } void tag_clear_highpage(struct page *page) diff --git a/arch/ia64/include/asm/page.h b/arch/ia64/include/asm/page.h index 310b09c3342d..ebdf04274023 100644 --- a/arch/ia64/include/asm/page.h +++ b/arch/ia64/include/asm/page.h @@ -82,10 +82,11 @@ do { \ } while (0) -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ ({ \ struct folio *folio = vma_alloc_folio( \ - GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false); \ + GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false); \ if (folio) \ flush_dcache_folio(folio); \ folio; \ diff --git a/arch/m68k/include/asm/page_no.h b/arch/m68k/include/asm/page_no.h index 060e4c0e7605..4a2fe57fef5e 100644 --- a/arch/m68k/include/asm/page_no.h +++ b/arch/m68k/include/asm/page_no.h @@ -3,7 +3,7 @@ #define _M68K_PAGE_NO_H #ifndef __ASSEMBLY__ - + extern unsigned long memory_start; extern unsigned long memory_end; @@ -13,8 +13,9 @@ extern unsigned long memory_end; #define clear_user_page(page, vaddr, pg) clear_page(page) #define copy_user_page(to, from, vaddr, pg) copy_page(to, from) -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false) +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ + vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false) #define __pa(vaddr) ((unsigned long)(vaddr)) #define __va(paddr) ((void *)((unsigned long)(paddr))) diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h index 8a2a3b5d1e29..b749564140f1 100644 --- a/arch/s390/include/asm/page.h +++ b/arch/s390/include/asm/page.h @@ -73,8 +73,9 @@ static inline void copy_page(void *to, void *from) #define clear_user_page(page, vaddr, pg) clear_page(page) #define copy_user_page(to, from, vaddr, pg) copy_page(to, from) -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false) +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ + vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false) /* * These are used to make use of C type-checking.. 
diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h index d18e5c332cb9..34deab1a8dae 100644 --- a/arch/x86/include/asm/page.h +++ b/arch/x86/include/asm/page.h @@ -34,8 +34,9 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr, copy_page(to, from); } -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false) +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ + vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false) #ifndef __pa #define __pa(x) __phys_addr((unsigned long)(x)) diff --git a/include/linux/highmem.h b/include/linux/highmem.h index 4de1dbcd3ef6..b9a9b0340557 100644 --- a/include/linux/highmem.h +++ b/include/linux/highmem.h @@ -209,26 +209,29 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr) #ifndef vma_alloc_zeroed_movable_folio /** - * vma_alloc_zeroed_movable_folio - Allocate a zeroed page for a VMA. - * @vma: The VMA the page is to be allocated for. - * @vaddr: The virtual address the page will be inserted into. - * - * This function will allocate a page suitable for inserting into this - * VMA at this virtual address. It may be allocated from highmem or + * vma_alloc_zeroed_movable_folio - Allocate a zeroed folio for a VMA. + * @vma: The start VMA the folio is to be allocated for. + * @vaddr: The virtual address the folio will be inserted into. + * @gfp: Additional gfp falgs to mix in or 0. + * @order: The order of the folio (2^order pages). + * + * This function will allocate a folio suitable for inserting into this + * VMA starting at this virtual address. It may be allocated from highmem or * the movable zone. An architecture may provide its own implementation. * - * Return: A folio containing one allocated and zeroed page or NULL if + * Return: A folio containing 2^order allocated and zeroed pages or NULL if * we are out of memory. */ static inline struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, - unsigned long vaddr) + unsigned long vaddr, gfp_t gfp, int order) { struct folio *folio; - folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vaddr, false); + folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE | gfp, + order, vma, vaddr, false); if (folio) - clear_user_highpage(&folio->page, vaddr); + clear_huge_page(&folio->page, vaddr, 1U << order); return folio; } diff --git a/mm/memory.c b/mm/memory.c index 3d4ea668c4d1..367bbbb29d91 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3073,7 +3073,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) goto oom; if (is_zero_pfn(pte_pfn(vmf->orig_pte))) { - new_folio = vma_alloc_zeroed_movable_folio(vma, vmf->address); + new_folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, + 0, 0); if (!new_folio) goto oom; } else { @@ -4087,7 +4088,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) /* Allocate our own private page. 
*/
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
-	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address);
+	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, 0, 0);
 	if (!folio)
 		goto oom;

From patchwork Mon Jun 26 17:14:23 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13293245
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox (Oracle), Kirill A. Shutemov, Yin Fengwei,
 David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
 Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, H. Peter Anvin
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
 linux-s390@vger.kernel.org
Subject: [PATCH v1 03/10] mm: Introduce try_vma_alloc_movable_folio()
Date: Mon, 26 Jun 2023 18:14:23 +0100
Message-Id: <20230626171430.3167004-4-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

Opportunistically attempt to allocate high-order folios in highmem, optionally
zeroed. Retry with lower orders all the way to order-0, until success. Note
that order-1 allocations are skipped, since a large folio must be at least
order-2 to work with the THP machinery. The user must check what they got with
folio_order().

This will be used to opportunistically allocate large folios for anonymous
memory with a sensible fallback under memory pressure.

For attempts to allocate non-0 orders, we set __GFP_NORETRY to prevent high
latency due to reclaim, instead preferring to just try for a lower order. The
same approach is used by the readahead code when allocating large folios.

Signed-off-by: Ryan Roberts
---
 mm/memory.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 367bbbb29d91..53896d46e686 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3001,6 +3001,39 @@ static vm_fault_t fault_dirty_shared_page(struct vm_fault *vmf)
 	return 0;
 }
 
+static inline struct folio *vma_alloc_movable_folio(struct vm_area_struct *vma,
+				unsigned long vaddr, int order, bool zeroed)
+{
+	gfp_t gfp = order > 0 ? __GFP_NORETRY | __GFP_NOWARN : 0;
+
+	if (zeroed)
+		return vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order);
+	else
+		return vma_alloc_folio(GFP_HIGHUSER_MOVABLE | gfp, order, vma,
+							vaddr, false);
+}
+
+/*
+ * Opportunistically attempt to allocate high-order folios, retrying with lower
+ * orders all the way to order-0, until success. order-1 allocations are skipped
+ * since a folio must be at least order-2 to work with the THP machinery. The
+ * user must check what they got with folio_order(). vaddr can be any virtual
+ * address that will be mapped by the allocated folio.
+ */
+static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma,
+				unsigned long vaddr, int order, bool zeroed)
+{
+	struct folio *folio;
+
+	for (; order > 1; order--) {
+		folio = vma_alloc_movable_folio(vma, vaddr, order, zeroed);
+		if (folio)
+			return folio;
+	}
+
+	return vma_alloc_movable_folio(vma, vaddr, 0, zeroed);
+}
+
 /*
  * Handle write page faults for pages that can be reused in the current vma
  *

From patchwork Mon Jun 26 17:14:24 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13293246
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox (Oracle), Kirill A. Shutemov, Yin Fengwei,
 David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
 Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, H. Peter Anvin
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
 linux-s390@vger.kernel.org
Subject: [PATCH v1 04/10] mm: Implement folio_add_new_anon_rmap_range()
Date: Mon, 26 Jun 2023 18:14:24 +0100
Message-Id: <20230626171430.3167004-5-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

Like folio_add_new_anon_rmap() but batch-rmaps a range of pages belonging to a
folio, for efficiency savings. All pages are accounted as small pages.

Signed-off-by: Ryan Roberts
---
 include/linux/rmap.h |  2 ++
 mm/rmap.c            | 43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index a3825ce81102..15433a3d0cbf 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -196,6 +196,8 @@ void page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long address);
 void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *,
 		unsigned long address);
+void folio_add_new_anon_rmap_range(struct folio *folio, struct page *page,
+		int nr, struct vm_area_struct *vma, unsigned long address);
 void page_add_file_rmap(struct page *, struct vm_area_struct *,
 		bool compound);
 void folio_add_file_rmap_range(struct folio *, struct page *, unsigned int nr,
diff --git a/mm/rmap.c b/mm/rmap.c
index 1d8369549424..4050bcea7ae7 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1305,6 +1305,49 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 	__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 }
 
+/**
+ * folio_add_new_anon_rmap_range - Add mapping to a set of pages within a new
+ *		anonymous potentially large folio.
+ * @folio: The folio containing the pages to be mapped
+ * @page: First page in the folio to be mapped
+ * @nr: Number of pages to be mapped
+ * @vma: the vm area in which the mapping is added
+ * @address: the user virtual address of the first page to be mapped
+ *
+ * Like folio_add_new_anon_rmap() but batch-maps a range of pages within a folio
+ * using non-THP accounting. Like folio_add_new_anon_rmap(), the inc-and-test is
+ * bypassed and the folio does not have to be locked. All pages in the folio are
+ * individually accounted.
+ *
+ * As the folio is new, it's assumed to be mapped exclusively by a single
+ * process.
+ */ +void folio_add_new_anon_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma, unsigned long address) +{ + int i; + + VM_BUG_ON_VMA(address < vma->vm_start || + address + (nr << PAGE_SHIFT) > vma->vm_end, vma); + __folio_set_swapbacked(folio); + + if (folio_test_large(folio)) { + /* increment count (starts at 0) */ + atomic_set(&folio->_nr_pages_mapped, nr); + } + + for (i = 0; i < nr; i++) { + /* increment count (starts at -1) */ + atomic_set(&page->_mapcount, 0); + __page_set_anon_rmap(folio, page, vma, address, 1); + page++; + address += PAGE_SIZE; + } + + __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr); + +} + /** * folio_add_file_rmap_range - add pte mapping to page range of a folio * @folio: The folio to add the mapping to From patchwork Mon Jun 26 17:14:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13293247 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 34468EB64DC for ; Mon, 26 Jun 2023 17:15:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=7CetpUNqA4OCFrukwBf4TIeOrrxRzLNCG44D2+kuZwE=; b=oZ6uxPAm7xI2Qv 6bhWmRtqEKwqOiRevDiufx2aAZm3M/UHdY4NswNzBV4Jbt3mpvc2MJlyxSrnkJnxuHTyzBEeLybev zIPLMDLOxVujxhKAl9W24HzcTQy1YtHPXtcs5PoHLBcyZ2UkKDbjDUXbNk0TeWR0Mbi0K7WyobGNt o/Fz0JcIjGhcjIOd69Y41SDN2dT2bTe0qmbBU+Kvqae1W06e1JKYk36B0cqXHht+KaLoQTBez4UgU sIf7+dzT9JFCaCPO0m7fFx8IRq1HxLtT7TNNiJbtcKEB6/f5A8qI4FSXx23Zd5/lY9eQMI1YMQ/y4 urXUfiQp1rU07w+Qc5hA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpnu-00AigF-27; Mon, 26 Jun 2023 17:15:06 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpnm-00Aiah-2C for linux-arm-kernel@lists.infradead.org; Mon, 26 Jun 2023 17:15:00 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 864C1150C; Mon, 26 Jun 2023 10:15:40 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 9D3913F663; Mon, 26 Jun 2023 10:14:53 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , "Kirill A. Shutemov" , Yin Fengwei , David Hildenbrand , Yu Zhao , Catalin Marinas , Will Deacon , Geert Uytterhoeven , Christian Borntraeger , Sven Schnelle , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. 
Peter Anvin" Cc: Ryan Roberts , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 05/10] mm: Implement folio_remove_rmap_range() Date: Mon, 26 Jun 2023 18:14:25 +0100 Message-Id: <20230626171430.3167004-6-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230626_101458_826922_AB29BA90 X-CRM114-Status: GOOD ( 17.26 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Like page_remove_rmap() but batch-removes the rmap for a range of pages belonging to a folio, for effciency savings. All pages are accounted as small pages. Signed-off-by: Ryan Roberts --- include/linux/rmap.h | 2 ++ mm/rmap.c | 62 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 15433a3d0cbf..50f50e4cb0f8 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -204,6 +204,8 @@ void folio_add_file_rmap_range(struct folio *, struct page *, unsigned int nr, struct vm_area_struct *, bool compound); void page_remove_rmap(struct page *, struct vm_area_struct *, bool compound); +void folio_remove_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma); void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); diff --git a/mm/rmap.c b/mm/rmap.c index 4050bcea7ae7..ac1d93d43f2b 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1434,6 +1434,68 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma, folio_add_file_rmap_range(folio, page, nr_pages, vma, compound); } +/* + * folio_remove_rmap_range - take down pte mappings from a range of pages + * belonging to a folio. All pages are accounted as small pages. + * @folio: folio that all pages belong to + * @page: first page in range to remove mapping from + * @nr: number of pages in range to remove mapping from + * @vma: the vm area from which the mapping is removed + * + * The caller needs to hold the pte lock. + */ +void folio_remove_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma) +{ + atomic_t *mapped = &folio->_nr_pages_mapped; + int nr_unmapped = 0; + int nr_mapped; + bool last; + enum node_stat_item idx; + + VM_BUG_ON_FOLIO(folio_test_hugetlb(folio), folio); + + if (!folio_test_large(folio)) { + /* Is this the page's last map to be removed? */ + last = atomic_add_negative(-1, &page->_mapcount); + nr_unmapped = last; + } else { + for (; nr != 0; nr--, page++) { + /* Is this the page's last map to be removed? */ + last = atomic_add_negative(-1, &page->_mapcount); + if (last) { + /* Page still mapped if folio mapped entirely */ + nr_mapped = atomic_dec_return_relaxed(mapped); + if (nr_mapped < COMPOUND_MAPPED) + nr_unmapped++; + } + } + } + + if (nr_unmapped) { + idx = folio_test_anon(folio) ? 
NR_ANON_MAPPED : NR_FILE_MAPPED; + __lruvec_stat_mod_folio(folio, idx, -nr_unmapped); + + /* + * Queue anon THP for deferred split if we have just unmapped at + * least 1 page, while at least 1 page remains mapped. + */ + if (folio_test_large(folio) && folio_test_anon(folio)) + if (nr_mapped) + deferred_split_folio(folio); + } + + /* + * It would be tidy to reset folio_test_anon mapping when fully + * unmapped, but that might overwrite a racing page_add_anon_rmap + * which increments mapcount after us but sets mapping before us: + * so leave the reset to free_pages_prepare, and remember that + * it's only reliable while mapped. + */ + + munlock_vma_folio(folio, vma, false); +} + /** * page_remove_rmap - take down pte mapping from a page * @page: page to remove mapping from From patchwork Mon Jun 26 17:14:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13293249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1BA8DEB64D7 for ; Mon, 26 Jun 2023 17:15:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=McWm/dM1SiAAoiKXLDl/quoD+zEDyjYmAQpkVO3vR34=; b=qArhTote9WUXRj REYU39Hv2t5IRHfwljpsPitX+YYvgj7UbZgr+13nBilLt3nVlWT3JtKdRhItV2yMayUmFYI2FVTde Bh8aONahNt7wPmxMxm1v8HXEVaqSJuFyveHx/99yIJ/OLwSdUH4WpCiJ4KFF4aHphLi20SvNcVsE8 xNz3wiP/DQARoYW9oSlYWoAL/9XgbaP4bZ7P2IaKBv/Le8pNPg6MsRVCXb0eTWsUAxheG/f3RP5BB tqU5kcMG6wk3Ze8MN43dGtPqpnU+i7TEY8T/JQle+iyUEeN8LTFARjrNbjQlhPYtVWs1D4jldYAZt qJAJXFMMEdWGWyvGqgiw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpnw-00AihU-0K; Mon, 26 Jun 2023 17:15:08 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpno-00Aibj-2O for linux-arm-kernel@lists.infradead.org; Mon, 26 Jun 2023 17:15:02 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 94AD9152B; Mon, 26 Jun 2023 10:15:43 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id AB7DB3F663; Mon, 26 Jun 2023 10:14:56 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , "Kirill A. Shutemov" , Yin Fengwei , David Hildenbrand , Yu Zhao , Catalin Marinas , Will Deacon , Geert Uytterhoeven , Christian Borntraeger , Sven Schnelle , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. 
Peter Anvin" Cc: Ryan Roberts , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 06/10] mm: Allow deferred splitting of arbitrary large anon folios Date: Mon, 26 Jun 2023 18:14:26 +0100 Message-Id: <20230626171430.3167004-7-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230626_101500_824071_58BD2ADC X-CRM114-Status: GOOD ( 13.63 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org With the introduction of large folios for anonymous memory, we would like to be able to split them when they have unmapped subpages, in order to free those unused pages under memory pressure. So remove the artificial requirement that the large folio needed to be at least PMD-sized. Signed-off-by: Ryan Roberts Reviewed-by: Yu Zhao Reviewed-by: Yin Fengwei --- mm/rmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/rmap.c b/mm/rmap.c index ac1d93d43f2b..3d11c5fb6090 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1567,7 +1567,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, * page of the folio is unmapped and at least one page * is still mapped. */ - if (folio_test_pmd_mappable(folio) && folio_test_anon(folio)) + if (folio_test_large(folio) && folio_test_anon(folio)) if (!compound || nr < nr_pmdmapped) deferred_split_folio(folio); } From patchwork Mon Jun 26 17:14:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13293253 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A3573EB64D7 for ; Mon, 26 Jun 2023 17:16:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=FZqtRXqiPbyKAhDe462Boobvq8Exh4+V+oUGDNz19fw=; b=c8ie2wI/BYoot0 ki4bAZWww8nn7s2K8hY3Ov8YnIJLwD0vCqEG9nz70zOsfw1cRSkYtD/VaVq5qwDpt+1VkNkioRPLG E5kYmisYOYgPrKygZd12ooCRnIaB5egXL78bW3JFdIW0zB42HSXDTsUUFsGD4Iu2CEWPLSGTcifl9 tUZDemOx93+NxgJvCvFx3qDZtMhZUIq4kJJG5g0arSA/CGZvtkdrCDlgIrF+T19XPCJWESA2p/8La pbi0ZG/sw01ESwg6Iy3XNCYbwj5Rvo/ScPiGp5Hjj67BucNNfeUMh/qqjcH2CqpjSEAniePvWiy1Z CH1mBLz4Z+FWdnizS6rw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpoT-00AiyO-1b; Mon, 26 Jun 2023 17:15:41 +0000 
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox (Oracle), Kirill A. Shutemov, Yin Fengwei,
 David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
 Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, H. Peter Anvin
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
 linux-s390@vger.kernel.org
Subject: [PATCH v1 07/10] mm: Batch-zap large anonymous folio PTE mappings
Date: Mon, 26 Jun 2023 18:14:27 +0100
Message-Id: <20230626171430.3167004-8-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

This allows batching the rmap removal with folio_remove_rmap_range(), which
means we avoid spuriously adding a partially unmapped folio to the deferred
split queue in the common case, which reduces split queue lock contention.

Previously each page was removed from the rmap individually with
page_remove_rmap(). If the first page belonged to a large folio, this would
cause page_remove_rmap() to conclude that the folio was now partially mapped
and add the folio to the deferred split queue. But subsequent calls would
cause the folio to become fully unmapped, meaning there is no value in adding
it to the split queue.
Signed-off-by: Ryan Roberts --- mm/memory.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 119 insertions(+) diff --git a/mm/memory.c b/mm/memory.c index 53896d46e686..9165ed1b9fc2 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -914,6 +914,57 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma return 0; } +static inline unsigned long page_addr(struct page *page, + struct page *anchor, unsigned long anchor_addr) +{ + unsigned long offset; + unsigned long addr; + + offset = (page_to_pfn(page) - page_to_pfn(anchor)) << PAGE_SHIFT; + addr = anchor_addr + offset; + + if (anchor > page) { + if (addr > anchor_addr) + return 0; + } else { + if (addr < anchor_addr) + return ULONG_MAX; + } + + return addr; +} + +static int calc_anon_folio_map_pgcount(struct folio *folio, + struct page *page, pte_t *pte, + unsigned long addr, unsigned long end) +{ + pte_t ptent; + int floops; + int i; + unsigned long pfn; + + end = min(page_addr(&folio->page + folio_nr_pages(folio), page, addr), + end); + floops = (end - addr) >> PAGE_SHIFT; + pfn = page_to_pfn(page); + pfn++; + pte++; + + for (i = 1; i < floops; i++) { + ptent = ptep_get(pte); + + if (!pte_present(ptent) || + pte_pfn(ptent) != pfn) { + return i; + } + + pfn++; + pte++; + } + + return floops; +} + /* * Copy one pte. Returns 0 if succeeded, or -EAGAIN if one preallocated page * is required to copy this pte. @@ -1379,6 +1430,44 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); } +static unsigned long zap_anon_pte_range(struct mmu_gather *tlb, + struct vm_area_struct *vma, + struct page *page, pte_t *pte, + unsigned long addr, unsigned long end, + bool *full_out) +{ + struct folio *folio = page_folio(page); + struct mm_struct *mm = tlb->mm; + pte_t ptent; + int pgcount; + int i; + bool full; + + pgcount = calc_anon_folio_map_pgcount(folio, page, pte, addr, end); + + for (i = 0; i < pgcount;) { + ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); + tlb_remove_tlb_entry(tlb, pte, addr); + full = __tlb_remove_page(tlb, page, 0); + + if (unlikely(page_mapcount(page) < 1)) + print_bad_pte(vma, addr, ptent, page); + + i++; + page++; + pte++; + addr += PAGE_SIZE; + + if (unlikely(full)) + break; + } + + folio_remove_rmap_range(folio, page - i, i, vma); + + *full_out = full; + return i; +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1415,6 +1504,36 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, page = vm_normal_page(vma, addr, ptent); if (unlikely(!should_zap_page(details, page))) continue; + + /* + * Batch zap large anonymous folio mappings. This allows + * batching the rmap removal, which means we avoid + * spuriously adding a partially unmapped folio to the + * deferrred split queue in the common case, which + * reduces split queue lock contention. Require the VMA + * to be anonymous to ensure that none of the PTEs in + * the range require zap_install_uffd_wp_if_needed(). 
+ */ + if (page && PageAnon(page) && vma_is_anonymous(vma)) { + bool full; + int pgcount; + + pgcount = zap_anon_pte_range(tlb, vma, + page, pte, addr, end, &full); + + rss[mm_counter(page)] -= pgcount; + pgcount--; + pte += pgcount; + addr += pgcount << PAGE_SHIFT; + + if (unlikely(full)) { + force_flush = 1; + addr += PAGE_SIZE; + break; + } + continue; + } + ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); tlb_remove_tlb_entry(tlb, pte, addr); From patchwork Mon Jun 26 17:14:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13293250 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AF6DBEB64DC for ; Mon, 26 Jun 2023 17:16:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=LnhRx02S2yibEXte2V7Ef+txd9/zsVUciMpVC4wsHaQ=; b=EADo7VjkRsG+gH hoFhGmcBJ7qreWlqRod5SlR/hZIvyon1qZ2Q+YzAuh62yMeIfKyu4ILLsneWfl9/tY/zGuv8v3RmC 2psv6yE0OpFBKawFRrxnp0Du2qOiMwHKDcAIWpnycEPmhV3o09C9qGttrxXXSHSv1RNms+oYaGa62 c6NHLBNw71/OFhy8znMzNKri5LV1HS2JlJO2OZ4fQr2BoY7uzR5+OZy54zacIgyFdxqg8L+Le4dFx xg+4atVLoKnmFvEbIhbCXKpXY0kwVAhmM4F49gWAXiDZGDbHxQwHiOvFYpRIqU5Mdv8iThtRp9jmQ pFabEp47mJxdUStJA6aw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpoU-00Aiyl-0Q; Mon, 26 Jun 2023 17:15:42 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpnu-00Aifl-1i for linux-arm-kernel@lists.infradead.org; Mon, 26 Jun 2023 17:15:08 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B2AB21596; Mon, 26 Jun 2023 10:15:49 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C8DE63F663; Mon, 26 Jun 2023 10:15:02 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , "Kirill A. Shutemov" , Yin Fengwei , David Hildenbrand , Yu Zhao , Catalin Marinas , Will Deacon , Geert Uytterhoeven , Christian Borntraeger , Sven Schnelle , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. 
Peter Anvin" Cc: Ryan Roberts , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 08/10] mm: Kconfig hooks to determine max anon folio allocation order Date: Mon, 26 Jun 2023 18:14:28 +0100 Message-Id: <20230626171430.3167004-9-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230626_101506_685053_FC4D0E46 X-CRM114-Status: GOOD ( 15.34 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org For variable-order anonymous folios, we need to determine the order that we will allocate. From a SW perspective, the higher the order we allocate, the less overhead we will have; fewer faults, fewer folios in lists, etc. But of course there will also be more memory wastage as the order increases. From a HW perspective, there are memory block sizes that can be beneficial to reducing TLB pressure. arm64, for example, has the ability to map "contpte" sized chunks (64K for a 4K base page, 2M for 16K and 64K base pages) such that one of these chunks only uses a single TLB entry. So we let the architecture specify the order of the maximally beneficial mapping unit when PTE-mapped. Furthermore, because in some cases, this order may be quite big (and therefore potentially wasteful of memory), allow the arch to specify 2 values; One is the max order for a mapping that _would not_ use THP if all size and alignment constraints were met, and the other is the max order for a mapping that _would_ use THP if all those constraints were met. Implement this with Kconfig by introducing some new options to allow the architecture to declare that it supports large anonymous folios along with these 2 preferred max order values. Then introduce a user-facing option, LARGE_ANON_FOLIO, which defaults to disabled and can only be enabled if the architecture has declared its support. When disabled, it forces the max order values, LARGE_ANON_FOLIO_NOTHP_ORDER_MAX and LARGE_ANON_FOLIO_THP_ORDER_MAX to 0, meaning only a single page is ever allocated. Signed-off-by: Ryan Roberts --- mm/Kconfig | 39 +++++++++++++++++++++++++++++++++++++++ mm/memory.c | 8 ++++++++ 2 files changed, 47 insertions(+) diff --git a/mm/Kconfig b/mm/Kconfig index 7672a22647b4..f4ba48c37b75 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1208,4 +1208,43 @@ config PER_VMA_LOCK source "mm/damon/Kconfig" +config ARCH_SUPPORTS_LARGE_ANON_FOLIO + def_bool n + help + An arch should select this symbol if wants to allow LARGE_ANON_FOLIO + to be enabled. It must also set the following integer values: + - ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + - ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX + +config ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + int + help + The maximum size of folio to allocate for an anonymous VMA PTE-mapping + that does not have the MADV_HUGEPAGE hint set. + +config ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX + int + help + The maximum size of folio to allocate for an anonymous VMA PTE-mapping + that has the MADV_HUGEPAGE hint set. 
+ +config LARGE_ANON_FOLIO + bool "Allocate large folios for anonymous memory" + depends on ARCH_SUPPORTS_LARGE_ANON_FOLIO + default n + help + Use large (bigger than order-0) folios to back anonymous memory where + possible. This reduces the number of page faults, as well as other + per-page overheads to improve performance for many workloads. + +config LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + int + default 0 if !LARGE_ANON_FOLIO + default ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + +config LARGE_ANON_FOLIO_THP_ORDER_MAX + int + default 0 if !LARGE_ANON_FOLIO + default ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX + endmenu diff --git a/mm/memory.c b/mm/memory.c index 9165ed1b9fc2..a8f7e2b28d7a 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3153,6 +3153,14 @@ static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma, return vma_alloc_movable_folio(vma, vaddr, 0, zeroed); } +static inline int max_anon_folio_order(struct vm_area_struct *vma) +{ + if (hugepage_vma_check(vma, vma->vm_flags, false, true, true)) + return CONFIG_LARGE_ANON_FOLIO_THP_ORDER_MAX; + else + return CONFIG_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX; +} + /* * Handle write page faults for pages that can be reused in the current vma * From patchwork Mon Jun 26 17:14:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13293252 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 79B9AEB64DA for ; Mon, 26 Jun 2023 17:16:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=9LRX4xnTOKVqw3K3F5AfTcUAJPHPeM9Mwo2buC+lhQo=; b=dL1oxNEv3sIgOa wYi2giQQYO0vagNIzRclbFMA6Cp72sDdP5oPPfHou5z7hI4m2CzQowfV9IyA2/dlMrm7wofw94Iye 900UCwMHCj4woepwAvYHHKLAewdjI9KXqCZq8+Q2BKqinZRPHkqhzYK0+j1KaUfssXya3TdYHs4AA CYJhKeGml3FEWnLIAWAaajSQvqZ0OIMu3qpGGf/xIrRy5gB0xuMbv1eol9XsBFfD49TIsCCaS978N uKuILgVok+siRTXK3bOA0aaz6IEee3PSLs4TAIMvvQh19cCWQLdGO6N9RrEKI3396J5Pj86jxzGL8 GqKn/ER2rZ84sbKSrwxw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpoV-00AizR-0B; Mon, 26 Jun 2023 17:15:43 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qDpnx-00AiiX-1j for linux-arm-kernel@lists.infradead.org; Mon, 26 Jun 2023 17:15:11 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C111215A1; Mon, 26 Jun 2023 10:15:52 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id D7EBC3F663; Mon, 26 Jun 2023 10:15:05 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , "Kirill A. 
Shutemov" , Yin Fengwei , David Hildenbrand , Yu Zhao , Catalin Marinas , Will Deacon , Geert Uytterhoeven , Christian Borntraeger , Sven Schnelle , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" Cc: Ryan Roberts , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 09/10] arm64: mm: Declare support for large anonymous folios Date: Mon, 26 Jun 2023 18:14:29 +0100 Message-Id: <20230626171430.3167004-10-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230626_101509_620811_7320F4A5 X-CRM114-Status: GOOD ( 10.34 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org For the unhinted case, when THP is not permitted for the vma, don't allow anything bigger than 64K. This means we don't waste too much memory. Additionally, for 4K pages this is the contpte size, and for 16K, this is (usually) the HPA size when the uarch feature is implemented. For the hinted case, when THP is permitted for the vma, allow the contpte size for all page size configurations; 64K for 4K, 2M for 16K and 2M for 64K. Signed-off-by: Ryan Roberts --- arch/arm64/Kconfig | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 343e1e1cae10..0e91b5bc8cd9 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -243,6 +243,7 @@ config ARM64 select TRACE_IRQFLAGS_SUPPORT select TRACE_IRQFLAGS_NMI_SUPPORT select HAVE_SOFTIRQ_ON_OWN_STACK + select ARCH_SUPPORTS_LARGE_ANON_FOLIO help ARM 64-bit (AArch64) Linux support. 
@@ -281,6 +282,18 @@ config ARM64_CONT_PMD_SHIFT default 5 if ARM64_16K_PAGES default 4 +config ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + int + default 0 if ARM64_64K_PAGES # 64K (1 page) + default 2 if ARM64_16K_PAGES # 64K (4 pages; benefits from HPA where HW supports it) + default 4 if ARM64_4K_PAGES # 64K (16 pages; eligible for contpte-mapping) + +config ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX + int + default 5 if ARM64_64K_PAGES # 2M (32 pages; eligible for contpte-mapping) + default 7 if ARM64_16K_PAGES # 2M (128 pages; eligible for contpte-mapping) + default 4 if ARM64_4K_PAGES # 64K (16 pages; eligible for contpte-mapping) + config ARCH_MMAP_RND_BITS_MIN default 14 if ARM64_64K_PAGES default 16 if ARM64_16K_PAGES From patchwork Mon Jun 26 17:14:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13293251 Return-Path: From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , "Kirill A. Shutemov" , Yin Fengwei , David Hildenbrand , Yu Zhao , Catalin Marinas , Will Deacon , Geert Uytterhoeven , Christian Borntraeger , Sven Schnelle , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H.
Peter Anvin" Cc: Ryan Roberts , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 10/10] mm: Allocate large folios for anonymous memory Date: Mon, 26 Jun 2023 18:14:30 +0100 Message-Id: <20230626171430.3167004-11-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230626_101512_811400_4CE32E98 X-CRM114-Status: GOOD ( 28.02 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org With all of the enabler patches in place, modify the anonymous memory write allocation path so that it opportunistically attempts to allocate a large folio up to `max_anon_folio_order()` size (This value is ultimately configured by the architecture). This reduces the number of page faults, reduces the size of (e.g. LRU) lists, and generally improves performance by batching what were per-page operations into per-(large)-folio operations. If CONFIG_LARGE_ANON_FOLIO is not enabled (the default) then `max_anon_folio_order()` always returns 0, meaning we get the existing allocation behaviour. Signed-off-by: Ryan Roberts --- mm/memory.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 144 insertions(+), 15 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index a8f7e2b28d7a..d23c44cc5092 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3161,6 +3161,90 @@ static inline int max_anon_folio_order(struct vm_area_struct *vma) return CONFIG_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX; } +/* + * Returns index of first pte that is not none, or nr if all are none. + */ +static inline int check_ptes_none(pte_t *pte, int nr) +{ + int i; + + for (i = 0; i < nr; i++) { + if (!pte_none(ptep_get(pte++))) + return i; + } + + return nr; +} + +static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) +{ + /* + * The aim here is to determine what size of folio we should allocate + * for this fault. Factors include: + * - Order must not be higher than `order` upon entry + * - Folio must be naturally aligned within VA space + * - Folio must not breach boundaries of vma + * - Folio must be fully contained inside one pmd entry + * - Folio must not overlap any non-none ptes + * + * Additionally, we do not allow order-1 since this breaks assumptions + * elsewhere in the mm; THP pages must be at least order-2 (since they + * store state up to the 3rd struct page subpage), and these pages must + * be THP in order to correctly use pre-existing THP infrastructure such + * as folio_split(). + * + * As a consequence of relying on the THP infrastructure, if the system + * does not support THP, we always fallback to order-0. + * + * Note that the caller may or may not choose to lock the pte. If + * unlocked, the calculation should be considered an estimate that will + * need to be validated under the lock. 
+ */ + + struct vm_area_struct *vma = vmf->vma; + int nr; + unsigned long addr; + pte_t *pte; + pte_t *first_set = NULL; + int ret; + + if (has_transparent_hugepage()) { + order = min(order, PMD_SHIFT - PAGE_SHIFT); + + for (; order > 1; order--) { + nr = 1 << order; + addr = ALIGN_DOWN(vmf->address, nr << PAGE_SHIFT); + pte = vmf->pte - ((vmf->address - addr) >> PAGE_SHIFT); + + /* Check vma bounds. */ + if (addr < vma->vm_start || + addr + (nr << PAGE_SHIFT) > vma->vm_end) + continue; + + /* Ptes covered by order already known to be none. */ + if (pte + nr <= first_set) + break; + + /* Already found set pte in range covered by order. */ + if (pte <= first_set) + continue; + + /* Need to check if all the ptes are none. */ + ret = check_ptes_none(pte, nr); + if (ret == nr) + break; + + first_set = pte + ret; + } + + if (order == 1) + order = 0; + } else + order = 0; + + return order; +} + /* * Handle write page faults for pages that can be reused in the current vma * @@ -4201,6 +4285,9 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) struct folio *folio; vm_fault_t ret = 0; pte_t entry; + unsigned long addr; + int order = uffd_wp ? 0 : max_anon_folio_order(vma); + int pgcount = BIT(order); /* File mapping without ->vm_ops ? */ if (vma->vm_flags & VM_SHARED) @@ -4242,24 +4329,44 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) pte_unmap_unlock(vmf->pte, vmf->ptl); return handle_userfault(vmf, VM_UFFD_MISSING); } - goto setpte; + if (uffd_wp) + entry = pte_mkuffd_wp(entry); + set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry); + + /* No need to invalidate - it was non-present before */ + update_mmu_cache(vma, vmf->address, vmf->pte); + goto unlock; } - /* Allocate our own private page. */ +retry: + /* + * Estimate the folio order to allocate. We are not under the ptl here + * so this estimate needs to be re-checked later once we have the lock. + */ + vmf->pte = pte_offset_map(vmf->pmd, vmf->address); + order = calc_anon_folio_order_alloc(vmf, order); + pte_unmap(vmf->pte); + + /* Allocate our own private folio. */ if (unlikely(anon_vma_prepare(vma))) goto oom; - folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, 0, 0); + folio = try_vma_alloc_movable_folio(vma, vmf->address, order, true); if (!folio) goto oom; + /* We may have been granted less than we asked for. */ + order = folio_order(folio); + pgcount = BIT(order); + addr = ALIGN_DOWN(vmf->address, pgcount << PAGE_SHIFT); + if (mem_cgroup_charge(folio, vma->vm_mm, GFP_KERNEL)) goto oom_free_page; folio_throttle_swaprate(folio, GFP_KERNEL); /* * The memory barrier inside __folio_mark_uptodate makes sure that - * preceding stores to the page contents become visible before - * the set_pte_at() write. + * preceding stores to the folio contents become visible before + * the set_ptes() write. */ __folio_mark_uptodate(folio); @@ -4268,11 +4375,31 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) if (vma->vm_flags & VM_WRITE) entry = pte_mkwrite(pte_mkdirty(entry)); - vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, - &vmf->ptl); - if (vmf_pte_changed(vmf)) { - update_mmu_tlb(vma, vmf->address, vmf->pte); - goto release; + vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl); + + /* + * Ensure our estimate above is still correct; we could have raced with + * another thread to service a fault in the region.
+ */ + if (order == 0) { + if (vmf_pte_changed(vmf)) { + update_mmu_tlb(vma, vmf->address, vmf->pte); + goto release; + } + } else if (check_ptes_none(vmf->pte, pgcount) != pgcount) { + pte_t *pte = vmf->pte + ((vmf->address - addr) >> PAGE_SHIFT); + + /* If faulting pte was allocated by another, exit early. */ + if (!pte_none(ptep_get(pte))) { + update_mmu_tlb(vma, vmf->address, pte); + goto release; + } + + /* Else try again, with a lower order. */ + pte_unmap_unlock(vmf->pte, vmf->ptl); + folio_put(folio); + order--; + goto retry; } ret = check_stable_address_space(vma->vm_mm); @@ -4286,16 +4413,18 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) return handle_userfault(vmf, VM_UFFD_MISSING); } - inc_mm_counter(vma->vm_mm, MM_ANONPAGES); - folio_add_new_anon_rmap(folio, vma, vmf->address); + folio_ref_add(folio, pgcount - 1); + + add_mm_counter(vma->vm_mm, MM_ANONPAGES, pgcount); + folio_add_new_anon_rmap_range(folio, &folio->page, pgcount, vma, addr); folio_add_lru_vma(folio, vma); -setpte: + if (uffd_wp) entry = pte_mkuffd_wp(entry); - set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry); + set_ptes(vma->vm_mm, addr, vmf->pte, entry, pgcount); /* No need to invalidate - it was non-present before */ - update_mmu_cache(vma, vmf->address, vmf->pte); + update_mmu_cache_range(vma, addr, vmf->pte, pgcount); unlock: pte_unmap_unlock(vmf->pte, vmf->ptl); return ret;
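To illustrate the order-selection scan implemented by calc_anon_folio_order_alloc() and check_ptes_none() above, here is a simplified userspace model (an illustrative sketch only: the pte array is modelled as a bool array, positions are page indices within the enclosing pmd, and the first_set short-cut is omitted):

/*
 * Userspace model of the order-selection scan: find the largest
 * naturally aligned, fully-empty block around the faulting page that
 * also fits inside the vma.
 */
#include <stdbool.h>
#include <stdio.h>

static bool range_empty(const bool *populated, long start, long nr)
{
	for (long i = 0; i < nr; i++)
		if (populated[start + i])
			return false;
	return true;
}

/* All positions are page indices within the vma's enclosing pmd. */
static int pick_order(const bool *populated, long vma_start, long vma_end,
		      long fault, int max_order)
{
	for (int order = max_order; order > 1; order--) {
		long nr = 1L << order;
		long start = fault & ~(nr - 1);	/* natural alignment */

		if (start < vma_start || start + nr > vma_end)
			continue;
		if (range_empty(populated, start, nr))
			return order;
	}
	/* Order-1 is deliberately skipped, mirroring the patch. */
	return 0;
}

int main(void)
{
	bool populated[16] = { [12] = true };	/* one neighbouring pte in use */

	/*
	 * Fault at page 5: the order-4 block [0,16) clashes with page 12,
	 * so the order-3 block [0,8) is chosen.
	 */
	printf("picked order %d\n", pick_order(populated, 0, 16, 5, 4));
	return 0;
}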
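The revalidation performed once the ptl is held can be modelled the same way; this sketch (again illustrative, with hypothetical names, not kernel code) captures the three possible outcomes: map the whole folio, back out because the faulting pte was serviced by another thread, or retry with a smaller order because a neighbouring pte appeared:

/*
 * Model of the re-check under the lock: the order was estimated and the
 * folio allocated without the ptl, so the covered ptes must be
 * re-validated before mapping.
 */
#include <stdbool.h>
#include <stdio.h>

enum outcome { MAP_IT, BAIL_OUT, RETRY_SMALLER };

/*
 * start/nr describe the naturally aligned block chosen for the folio;
 * fault is the index of the faulting page in the same pte array.
 */
static enum outcome revalidate(const bool *populated, long start, long nr,
			       long fault)
{
	/* The faulting pte was populated by someone else: back out. */
	if (populated[fault])
		return BAIL_OUT;

	/* Any other populated pte in the block: retry with a lower order. */
	for (long i = 0; i < nr; i++)
		if (populated[start + i])
			return RETRY_SMALLER;

	return MAP_IT;			/* still empty: map all nr pages */
}

int main(void)
{
	bool ptes[8] = { [6] = true };	/* a neighbour raced with us */

	/* Block [4,8) around fault at page 5: expect RETRY_SMALLER (2). */
	printf("outcome = %d\n", revalidate(ptes, 4, 4, 5));
	return 0;
}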