From patchwork Wed Aug 16 15:11:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 13355388
From: "Matthew Wilcox (Oracle)"
To: Andrew Morton
Cc: "Matthew Wilcox (Oracle)", Jens Axboe, io-uring@vger.kernel.org,
 linux-mm@kvack.org, Yanteng Si
Subject: [PATCH v2 03/13] mm: Convert free_huge_page() to free_huge_folio()
Date: Wed, 16 Aug 2023 16:11:51 +0100
Message-Id: <20230816151201.3655946-4-willy@infradead.org>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20230816151201.3655946-1-willy@infradead.org>
References: <20230816151201.3655946-1-willy@infradead.org>

Pass a folio instead of the head page to save a few instructions.
Update the documentation, at least in English.

Signed-off-by: Matthew Wilcox (Oracle)
Cc: Yanteng Si
Reviewed-by: Sidhartha Kumar
---
 Documentation/mm/hugetlbfs_reserv.rst         | 14 +++---
 .../zh_CN/mm/hugetlbfs_reserv.rst             |  4 +-
 include/linux/hugetlb.h                       |  2 +-
 mm/hugetlb.c                                  | 48 +++++++++----------
 mm/page_alloc.c                               |  2 +-
 5 files changed, 34 insertions(+), 36 deletions(-)

diff --git a/Documentation/mm/hugetlbfs_reserv.rst b/Documentation/mm/hugetlbfs_reserv.rst
index d9c2b0f01dcd..4914fbf07966 100644
--- a/Documentation/mm/hugetlbfs_reserv.rst
+++ b/Documentation/mm/hugetlbfs_reserv.rst
@@ -271,12 +271,12 @@ to the global reservation count (resv_huge_pages).
 Freeing Huge Pages
 ==================
 
-Huge page freeing is performed by the routine free_huge_page(). This routine
-is the destructor for hugetlbfs compound pages. As a result, it is only
-passed a pointer to the page struct. When a huge page is freed, reservation
-accounting may need to be performed. This would be the case if the page was
-associated with a subpool that contained reserves, or the page is being freed
-on an error path where a global reserve count must be restored.
+Huge pages are freed by free_huge_folio(). It is only passed a pointer
+to the folio as it is called from the generic MM code. When a huge page
+is freed, reservation accounting may need to be performed. This would
+be the case if the page was associated with a subpool that contained
+reserves, or the page is being freed on an error path where a global
+reserve count must be restored.
 
 The page->private field points to any subpool associated with the page.
 If the PagePrivate flag is set, it indicates the global reserve count should
@@ -525,7 +525,7 @@ However, there are several instances where errors are encountered after a huge
 page is allocated but before it is instantiated. In this case, the page
 allocation has consumed the reservation and made the appropriate subpool,
 reservation map and global count adjustments. If the page is freed at this
-time (before instantiation and clearing of PagePrivate), then free_huge_page
+time (before instantiation and clearing of PagePrivate), then free_huge_folio
 will increment the global reservation count. However, the reservation map
 indicates the reservation was consumed. This resulting inconsistent state
 will cause the 'leak' of a reserved huge page. The global reserve count will
diff --git a/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst b/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst
index b7a0544224ad..0f7e7fb5ca8c 100644
--- a/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst
+++ b/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst
@@ -219,7 +219,7 @@ vma_commit_reservation()之间,预留映射有可能被改变。如果hugetlb_
 释放巨页
 ========
 
-巨页释放是由函数free_huge_page()执行的。这个函数是hugetlbfs复合页的析构器。因此,它只传
+巨页释放是由函数free_huge_folio()执行的。这个函数是hugetlbfs复合页的析构器。因此,它只传
 递一个指向页面结构体的指针。当一个巨页被释放时,可能需要进行预留计算。如果该页与包含保
 留的子池相关联,或者该页在错误路径上被释放,必须恢复全局预留计数,就会出现这种情况。
 
@@ -387,7 +387,7 @@ region_count()在解除私有巨页映射时被调用。在私有映射中,预
 
 然而,有几种情况是,在一个巨页被分配后,但在它被实例化之前,就遇到了错误。在这种情况下,
 页面分配已经消耗了预留,并进行了适当的子池、预留映射和全局计数调整。如果页面在这个时候被释放
-(在实例化和清除PagePrivate之前),那么free_huge_page将增加全局预留计数。然而,预留映射
+(在实例化和清除PagePrivate之前),那么free_huge_folio将增加全局预留计数。然而,预留映射
 显示报留被消耗了。这种不一致的状态将导致预留的巨页的 “泄漏” 。全局预留计数将比它原本的要高,
 并阻止分配一个预先分配的页面。
 
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5a1dfaffbd80..5b2626063f4f 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -26,7 +26,7 @@ typedef struct { unsigned long pd; } hugepd_t;
 #define __hugepd(x) ((hugepd_t) { (x) })
 #endif
 
-void free_huge_page(struct page *page);
+void free_huge_folio(struct folio *folio);
 
 #ifdef CONFIG_HUGETLB_PAGE
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e327a5a7602c..086eb51bf845 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1706,10 +1706,10 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio,
 	zeroed = folio_put_testzero(folio);
 	if (unlikely(!zeroed))
 		/*
-		 * It is VERY unlikely soneone else has taken a ref on
-		 * the page. In this case, we simply return as the
-		 * hugetlb destructor (free_huge_page) will be called
-		 * when this other ref is dropped.
+		 * It is VERY unlikely someone else has taken a ref
+		 * on the folio. In this case, we simply return as
+		 * free_huge_folio() will be called when this other ref
+		 * is dropped.
 		 */
 		return;
 
@@ -1875,13 +1875,12 @@ struct hstate *size_to_hstate(unsigned long size)
 	return NULL;
 }
 
-void free_huge_page(struct page *page)
+void free_huge_folio(struct folio *folio)
 {
 	/*
 	 * Can't pass hstate in here because it is called from the
 	 * compound page destructor.
 	 */
-	struct folio *folio = page_folio(page);
 	struct hstate *h = folio_hstate(folio);
 	int nid = folio_nid(folio);
 	struct hugepage_subpool *spool = hugetlb_folio_subpool(folio);
@@ -1936,7 +1935,7 @@ void free_huge_page(struct page *page)
 		spin_unlock_irqrestore(&hugetlb_lock, flags);
 		update_and_free_hugetlb_folio(h, folio, true);
 	} else {
-		arch_clear_hugepage_flags(page);
+		arch_clear_hugepage_flags(&folio->page);
 		enqueue_hugetlb_folio(h, folio);
 		spin_unlock_irqrestore(&hugetlb_lock, flags);
 	}
@@ -2246,7 +2245,7 @@ static int alloc_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
 		folio = alloc_fresh_hugetlb_folio(h, gfp_mask, node,
 					nodes_allowed, node_alloc_noretry);
 		if (folio) {
-			free_huge_page(&folio->page); /* free it into the hugepage allocator */
+			free_huge_folio(folio); /* free it into the hugepage allocator */
 			return 1;
 		}
 	}
@@ -2429,13 +2428,13 @@ static struct folio *alloc_surplus_hugetlb_folio(struct hstate *h,
 	 * We could have raced with the pool size change.
 	 * Double check that and simply deallocate the new page
 	 * if we would end up overcommiting the surpluses. Abuse
-	 * temporary page to workaround the nasty free_huge_page
+	 * temporary page to workaround the nasty free_huge_folio
 	 * codeflow
 	 */
 	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
 		folio_set_hugetlb_temporary(folio);
 		spin_unlock_irq(&hugetlb_lock);
-		free_huge_page(&folio->page);
+		free_huge_folio(folio);
 		return NULL;
 	}
 
@@ -2547,8 +2546,7 @@ static int gather_surplus_pages(struct hstate *h, long delta)
 	__must_hold(&hugetlb_lock)
 {
 	LIST_HEAD(surplus_list);
-	struct folio *folio;
-	struct page *page, *tmp;
+	struct folio *folio, *tmp;
 	int ret;
 	long i;
 	long needed, allocated;
@@ -2608,21 +2606,21 @@ static int gather_surplus_pages(struct hstate *h, long delta)
 	ret = 0;
 
 	/* Free the needed pages to the hugetlb pool */
-	list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
+	list_for_each_entry_safe(folio, tmp, &surplus_list, lru) {
 		if ((--needed) < 0)
 			break;
 		/* Add the page to the hugetlb allocator */
-		enqueue_hugetlb_folio(h, page_folio(page));
+		enqueue_hugetlb_folio(h, folio);
 	}
 free:
 	spin_unlock_irq(&hugetlb_lock);
 
 	/*
 	 * Free unnecessary surplus pages to the buddy allocator.
-	 * Pages have no ref count, call free_huge_page directly.
+	 * Pages have no ref count, call free_huge_folio directly.
 	 */
-	list_for_each_entry_safe(page, tmp, &surplus_list, lru)
-		free_huge_page(page);
+	list_for_each_entry_safe(folio, tmp, &surplus_list, lru)
+		free_huge_folio(folio);
 	spin_lock_irq(&hugetlb_lock);
 
 	return ret;
@@ -2836,11 +2834,11 @@ static long vma_del_reservation(struct hstate *h,
  * 2) No reservation was in place for the page, so hugetlb_restore_reserve is
  *    not set. However, alloc_hugetlb_folio always updates the reserve map.
  *
- * In case 1, free_huge_page later in the error path will increment the
- * global reserve count. But, free_huge_page does not have enough context
+ * In case 1, free_huge_folio later in the error path will increment the
+ * global reserve count. But, free_huge_folio does not have enough context
  * to adjust the reservation map. This case deals primarily with private
  * mappings. Adjust the reserve map here to be consistent with global
- * reserve count adjustments to be made by free_huge_page. Make sure the
+ * reserve count adjustments to be made by free_huge_folio. Make sure the
  * reserve map indicates there is a reservation present.
  *
  * In case 2, simply undo reserve map modifications done by alloc_hugetlb_folio.
@@ -2856,7 +2854,7 @@ void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
 			 * Rare out of memory condition in reserve map
 			 * manipulation. Clear hugetlb_restore_reserve so
 			 * that global reserve count will not be incremented
-			 * by free_huge_page. This will make it appear
+			 * by free_huge_folio. This will make it appear
 			 * as though the reservation for this folio was
 			 * consumed. This may prevent the task from
 			 * faulting in the folio at a later time. This
@@ -3232,7 +3230,7 @@ static void __init gather_bootmem_prealloc(void)
 		if (prep_compound_gigantic_folio(folio, huge_page_order(h))) {
 			WARN_ON(folio_test_reserved(folio));
 			prep_new_hugetlb_folio(h, folio, folio_nid(folio));
-			free_huge_page(page); /* add to the hugepage allocator */
+			free_huge_folio(folio); /* add to the hugepage allocator */
 		} else {
 			/* VERY unlikely inflated ref count on a tail page */
 			free_gigantic_folio(folio, huge_page_order(h));
@@ -3264,7 +3262,7 @@ static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
 					&node_states[N_MEMORY], NULL);
 			if (!folio)
 				break;
-			free_huge_page(&folio->page); /* free it into the hugepage allocator */
+			free_huge_folio(folio); /* free it into the hugepage allocator */
 		}
 		cond_resched();
 	}
@@ -3542,7 +3540,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	while (count > persistent_huge_pages(h)) {
 		/*
 		 * If this allocation races such that we no longer need the
-		 * page, free_huge_page will handle it by freeing the page
+		 * page, free_huge_folio will handle it by freeing the page
 		 * and reducing the surplus.
 		 */
 		spin_unlock_irq(&hugetlb_lock);
@@ -3658,7 +3656,7 @@ static int demote_free_hugetlb_folio(struct hstate *h, struct folio *folio)
 			prep_compound_page(subpage, target_hstate->order);
 			folio_change_private(inner_folio, NULL);
 			prep_new_hugetlb_folio(target_hstate, inner_folio, nid);
-			free_huge_page(subpage);
+			free_huge_folio(inner_folio);
 		}
 		mutex_unlock(&target_hstate->resize_lock);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 548c8016190b..b569fd5562aa 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -620,7 +620,7 @@ void destroy_large_folio(struct folio *folio)
 	enum compound_dtor_id dtor = folio->_folio_dtor;
 
 	if (folio_test_hugetlb(folio)) {
-		free_huge_page(&folio->page);
+		free_huge_folio(folio);
 		return;
 	}
 
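
Note for readers: the change in calling convention made by this patch can be
sketched in userspace roughly as below. This is only an illustration under
stated assumptions, not kernel code. The struct layouts, the freed counter and
every mock_* name are hypothetical, and mock_page_folio() is a simple cast that
stands in for the page_folio() conversion the patch removes from the free path.

```c
#include <assert.h>

/* Mock types for illustration only; the real kernel definitions differ. */
struct page { unsigned long flags; };
struct folio {
	struct page page;	/* head page embedded as the first member */
	int freed;		/* hypothetical counter, not a kernel field */
};

/*
 * Hypothetical stand-in for page_folio(). It is valid here only because
 * every mock page passed in is the head page at offset 0 of a mock folio.
 */
static struct folio *mock_page_folio(struct page *page)
{
	return (struct folio *)page;
}

/* Old convention: take the head page and convert it to a folio first. */
static void mock_free_huge_page(struct page *page)
{
	struct folio *folio = mock_page_folio(page);

	folio->freed++;
}

/* New convention: the caller already holds the folio, so no conversion. */
static void mock_free_huge_folio(struct folio *folio)
{
	folio->freed++;
}

/* Exercise both conventions on one mock folio and return the free count. */
static int mock_demo(void)
{
	struct folio f = { .page = { 0 }, .freed = 0 };

	mock_free_huge_page(&f.page);	/* old call shape: &folio->page */
	mock_free_huge_folio(&f);	/* new call shape: folio */
	assert(mock_page_folio(&f.page) == &f);
	return f.freed;
}
```

Both mock functions act on the same object. The only difference is that the
folio variant skips the conversion step, which mirrors the removal of the
page_folio(page) line in the free_huge_folio() hunk above.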