From patchwork Tue Sep 15 12:59:29 2020
X-Patchwork-Submitter: Muchun Song <songmuchun@bytedance.com>
X-Patchwork-Id: 11776495
From: Muchun Song <songmuchun@bytedance.com>
To: corbet@lwn.net, mike.kravetz@oracle.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, viro@zeniv.linux.org.uk, akpm@linux-foundation.org, paulmck@kernel.org, mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com, rdunlap@infradead.org, oneukum@suse.com, anshuman.khandual@arm.com, jroedel@suse.de, almasrymina@google.com, rientjes@google.com
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Muchun Song <songmuchun@bytedance.com>
Subject: [RFC PATCH 06/24] mm/hugetlb: Introduce pgtable allocation/freeing helpers
Date: Tue, 15 Sep 2020 20:59:29 +0800
Message-Id: <20200915125947.26204-7-songmuchun@bytedance.com>
In-Reply-To: <20200915125947.26204-1-songmuchun@bytedance.com>
References: <20200915125947.26204-1-songmuchun@bytedance.com>

On some architectures, the vmemmap area is mapped with huge pages. If
we want to free the unused vmemmap pages, we first have to split the
huge PMD mapping, so we pre-allocate the page tables needed for that
split.
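For example, assuming x86_64 with a 64-byte struct page and 2MB PMD
huge pages (a worked sketch of the arithmetic below; these numbers are
not part of the original posting):

	/*
	 * 2MB HugeTLB page:   512 struct pages ->    8 vmemmap pages (32KB)
	 *	pre-allocated page tables: ALIGN(32KB, 2MB) >> 21 = 1
	 * 1GB HugeTLB page: 262144 struct pages -> 4096 vmemmap pages (16MB)
	 *	pre-allocated page tables: ALIGN(16MB, 2MB) >> 21 = 8
	 */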
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 include/linux/hugetlb.h |  17 ++++++
 mm/hugetlb.c            | 117 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 134 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index eed3dd3bd626..ace304a6196c 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -593,6 +593,23 @@ static inline unsigned int blocks_per_huge_page(struct hstate *h)
 
 #include <asm/hugetlb.h>
 
+#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+#ifndef arch_vmemmap_support_huge_mapping
+static inline bool arch_vmemmap_support_huge_mapping(void)
+{
+	return false;
+}
+#endif
+
+#ifndef VMEMMAP_HPAGE_SHIFT
+#define VMEMMAP_HPAGE_SHIFT	PMD_SHIFT
+#endif
+#define VMEMMAP_HPAGE_ORDER	(VMEMMAP_HPAGE_SHIFT - PAGE_SHIFT)
+#define VMEMMAP_HPAGE_NR	(1 << VMEMMAP_HPAGE_ORDER)
+#define VMEMMAP_HPAGE_SIZE	((1UL) << VMEMMAP_HPAGE_SHIFT)
+#define VMEMMAP_HPAGE_MASK	(~(VMEMMAP_HPAGE_SIZE - 1))
+#endif /* CONFIG_HUGETLB_PAGE_FREE_VMEMMAP */
+
 #ifndef is_hugepage_only_range
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 				unsigned long addr, unsigned long len)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f1b2b733b49b..d6ae9b6876be 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1295,11 +1295,108 @@ static inline void destroy_compound_gigantic_page(struct page *page,
 #ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
 #define RESERVE_VMEMMAP_NR	2U
 
+#define page_huge_pte(page)	((page)->pmd_huge_pte)
+
 static inline unsigned int nr_free_vmemmap(struct hstate *h)
 {
 	return h->nr_free_vmemmap_pages;
 }
 
+static inline unsigned int nr_vmemmap(struct hstate *h)
+{
+	return nr_free_vmemmap(h) + RESERVE_VMEMMAP_NR;
+}
+
+static inline unsigned long nr_vmemmap_size(struct hstate *h)
+{
+	return (unsigned long)nr_vmemmap(h) << PAGE_SHIFT;
+}
+
+static inline unsigned int nr_pgtable(struct hstate *h)
+{
+	unsigned long vmemmap_size = nr_vmemmap_size(h);
+
+	if (!arch_vmemmap_support_huge_mapping())
+		return 0;
+
+	/*
+	 * No need to pre-allocate page tables when there are no vmemmap
+	 * pages to free.
+	 */
+	if (!nr_free_vmemmap(h))
+		return 0;
+
+	return ALIGN(vmemmap_size, VMEMMAP_HPAGE_SIZE) >> VMEMMAP_HPAGE_SHIFT;
+}
+
+static inline void vmemmap_pgtable_init(struct page *page)
+{
+	page_huge_pte(page) = NULL;
+}
+
+static void vmemmap_pgtable_deposit(struct page *page, pte_t *pte_p)
+{
+	pgtable_t pgtable = virt_to_page(pte_p);
+
+	/* FIFO */
+	if (!page_huge_pte(page))
+		INIT_LIST_HEAD(&pgtable->lru);
+	else
+		list_add(&pgtable->lru, &page_huge_pte(page)->lru);
+	page_huge_pte(page) = pgtable;
+}
+
+static pte_t *vmemmap_pgtable_withdraw(struct page *page)
+{
+	pgtable_t pgtable;
+
+	/* FIFO */
+	pgtable = page_huge_pte(page);
+	if (unlikely(!pgtable))
+		return NULL;
+	page_huge_pte(page) = list_first_entry_or_null(&pgtable->lru,
+						       struct page, lru);
+	if (page_huge_pte(page))
+		list_del(&pgtable->lru);
+	return page_to_virt(pgtable);
+}
+
+static int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
+{
+	int i;
+	pte_t *pte_p;
+	unsigned int nr = nr_pgtable(h);
+
+	if (!nr)
+		return 0;
+
+	vmemmap_pgtable_init(page);
+
+	for (i = 0; i < nr; i++) {
+		pte_p = pte_alloc_one_kernel(&init_mm);
+		if (!pte_p)
+			goto out;
+		vmemmap_pgtable_deposit(page, pte_p);
+	}
+
+	return 0;
+out:
+	while (i-- && (pte_p = vmemmap_pgtable_withdraw(page)))
+		pte_free_kernel(&init_mm, pte_p);
+	return -ENOMEM;
+}
+
+static inline void vmemmap_pgtable_free(struct hstate *h, struct page *page)
+{
+	pte_t *pte_p;
+
+	if (!nr_pgtable(h))
+		return;
+
+	while ((pte_p = vmemmap_pgtable_withdraw(page)))
+		pte_free_kernel(&init_mm, pte_p);
+}
+
 static void __init hugetlb_vmemmap_init(struct hstate *h)
 {
 	unsigned int order = huge_page_order(h);
@@ -1323,6 +1420,15 @@ static void __init hugetlb_vmemmap_init(struct hstate *h)
 static inline void hugetlb_vmemmap_init(struct hstate *h)
 {
 }
+
+static inline int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
+{
+	return 0;
+}
+
+static inline void vmemmap_pgtable_free(struct hstate *h, struct page *page)
+{
+}
 #endif
 
 static void update_and_free_page(struct hstate *h, struct page *page)
@@ -1531,6 +1637,9 @@ void free_huge_page(struct page *page)
 
 static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 {
+	/* Must be called before the initialization of @page->lru */
+	vmemmap_pgtable_free(h, page);
+
 	INIT_LIST_HEAD(&page->lru);
 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
 	set_hugetlb_cgroup(page, NULL);
@@ -1783,6 +1892,14 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
 	if (!page)
 		return NULL;
 
+	if (vmemmap_pgtable_prealloc(h, page)) {
+		if (hstate_is_gigantic(h))
+			free_gigantic_page(page, huge_page_order(h));
+		else
+			put_page(page);
+		return NULL;
+	}
+
 	if (hstate_is_gigantic(h))
 		prep_compound_gigantic_page(page, huge_page_order(h));
 	prep_new_huge_page(h, page, page_to_nid(page));
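
For reference, the deposited page tables are meant to be consumed when
the huge PMD mapping of the vmemmap is split later in this series. A
minimal sketch of such a consumer is below; it assumes only the helpers
introduced above, and split_vmemmap_pmd_sketch is a hypothetical name,
not a function from this series:

	/*
	 * Hypothetical sketch (not from this series): remap one
	 * PMD-mapped vmemmap range with base pages, using a PTE page
	 * withdrawn from the HugeTLB head page.
	 */
	static void split_vmemmap_pmd_sketch(struct page *head, pmd_t *pmd,
					     unsigned long start)
	{
		pte_t *pgtable = vmemmap_pgtable_withdraw(head);
		struct page *page = pmd_page(*pmd);
		unsigned long addr = start;
		int i;

		if (!pgtable)
			return;

		/* Mirror the huge mapping with base-page PTEs. */
		for (i = 0; i < VMEMMAP_HPAGE_NR; i++, addr += PAGE_SIZE)
			set_pte_at(&init_mm, addr, pgtable + i,
				   mk_pte(page + i, PAGE_KERNEL));

		/* Make the PTEs visible before installing the table. */
		smp_wmb();
		pmd_populate_kernel(&init_mm, pmd, pgtable);
		flush_tlb_kernel_range(start, start + VMEMMAP_HPAGE_SIZE);
	}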