From patchwork Mon Sep 28 17:54:01 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804445
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A .
Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox, Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko, David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard, David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 03/30] mm: thp: use single linked list for THP page table page deposit.
Date: Mon, 28 Sep 2020 13:54:01 -0400
Message-Id: <20200928175428.4110504-4-zi.yan@sent.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>
Reply-To: Zi Yan
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org

From: Zi Yan

The old design uses the doubly linked list page->lru to chain all
deposited page table pages when creating a THP, and page->pmd_huge_pte
to point to the first page of the list. Because the second pointer in
page->lru overlaps with page->pmd_huge_pte, the design prevents
multi-level page table page deposit, which is useful for PUD and higher
level THPs.

The new design uses a singly linked list: deposit_head points to a
singly linked list of deposited pages, and deposit_node can be used to
deposit the page itself onto another list. For example, this allows one
PUD page to point to a list of PMD pages, each of which points to a list
of PTE pages, to support PUD level THP.
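The multi-level deposit described above can be sketched in userspace C.
This is only an illustrative model, not kernel code: struct pt_page, its
level field, and the helpers pt_page_init(), deposit(), node_to_page()
and count_deposited() are hypothetical names, and the two list structs
are simplified stand-ins for the kernel's struct llist_head and struct
llist_node. Locking is left out and assumed to be handled by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's struct llist_node / llist_head. */
struct llist_node { struct llist_node *next; };
struct llist_head { struct llist_node *first; };

/* Hypothetical, simplified model of a page table page (not struct page). */
struct pt_page {
	struct llist_head deposit_head;	/* pages deposited under this page */
	struct llist_node deposit_node;	/* links this page into a parent's list */
	int level;			/* illustration only: 0 = PTE, 1 = PMD, 2 = PUD */
};

static void pt_page_init(struct pt_page *p, int level)
{
	p->deposit_head.first = NULL;
	p->deposit_node.next = NULL;
	p->level = level;
}

/* Push child onto parent's deposit list (non-atomic; lock assumed held). */
static void deposit(struct pt_page *parent, struct pt_page *child)
{
	child->deposit_node.next = parent->deposit_head.first;
	parent->deposit_head.first = &child->deposit_node;
}

/* Recover the containing pt_page from its deposit_node, like container_of(). */
static struct pt_page *node_to_page(struct llist_node *n)
{
	return (struct pt_page *)((char *)n - offsetof(struct pt_page, deposit_node));
}

/* Count every page deposited below p, at all levels. */
static int count_deposited(const struct pt_page *p)
{
	int total = 0;

	for (struct llist_node *n = p->deposit_head.first; n; n = n->next)
		total += 1 + count_deposited(node_to_page(n));
	return total;
}
```

Because each page carries both a head and a node, a PMD page can hold
its own PTE pages while being linked into a PUD page's list. Depositing
two PTE pages under each of two PMD pages, and then both PMD pages under
one PUD page, leaves six pages reachable from the PUD page.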
Signed-off-by: Zi Yan
---
 include/linux/mm.h       |  9 +++++----
 include/linux/mm_types.h |  8 +++++---
 kernel/fork.c            |  4 ++--
 mm/pgtable-generic.c     | 15 +++++----------
 4 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 17e712207d74..01b62da34794 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -2249,7 +2250,7 @@ static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
 static inline bool pmd_ptlock_init(struct page *page)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	page->pmd_huge_pte = NULL;
+	init_llist_head(&page->deposit_head);
 #endif
 	return ptlock_init(page);
 }
@@ -2257,12 +2258,12 @@ static inline bool pmd_ptlock_init(struct page *page)
 static inline void pmd_ptlock_free(struct page *page)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	VM_BUG_ON_PAGE(page->pmd_huge_pte, page);
+	VM_BUG_ON_PAGE(!llist_empty(&page->deposit_head), page);
 #endif
 	ptlock_free(page);
 }

-#define pmd_huge_pte(mm, pmd) (pmd_to_page(pmd)->pmd_huge_pte)
+#define huge_pmd_deposit_head(mm, pmd) (pmd_to_page(pmd)->deposit_head)

 #else

@@ -2274,7 +2275,7 @@ static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
 static inline bool pmd_ptlock_init(struct page *page) { return true; }
 static inline void pmd_ptlock_free(struct page *page) {}

-#define pmd_huge_pte(mm, pmd) ((mm)->pmd_huge_pte)
+#define huge_pmd_deposit_head(mm, pmd) ((mm)->deposit_head_pmd)

 #endif

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 496c3ff97cce..be842926577a 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -6,6 +6,7 @@

 #include
 #include
+#include
 #include
 #include
 #include
@@ -143,8 +144,8 @@ struct page {
 			struct list_head deferred_list;
 		};
 		struct {	/* Page table pages */
-			unsigned long _pt_pad_1;	/* compound_head */
-			pgtable_t pmd_huge_pte; /* protected by page->ptl */
+			struct llist_head deposit_head; /* pgtable deposit list head */
+			struct llist_node deposit_node; /* pgtable deposit list node */
 			unsigned long _pt_pad_2;	/* mapping */
 			union {
 				struct mm_struct *pt_mm; /* x86 pgds only */
@@ -511,7 +512,8 @@ struct mm_struct {
 		struct mmu_notifier_subscriptions *notifier_subscriptions;
 #endif
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
-		pgtable_t pmd_huge_pte; /* protected by page_table_lock */
+		/* pgtable deposit list head, protected by page_table_lock */
+		struct llist_head deposit_head_pmd;
 #endif
 #ifdef CONFIG_NUMA_BALANCING
 		/*
diff --git a/kernel/fork.c b/kernel/fork.c
index 138cd6ca50da..9c8e880538de 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -661,7 +661,7 @@ static void check_mm(struct mm_struct *mm)
 				 mm_pgtables_bytes(mm));

 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
-	VM_BUG_ON_MM(mm->pmd_huge_pte, mm);
+	VM_BUG_ON_MM(!llist_empty(&mm->deposit_head_pmd), mm);
 #endif
 }

@@ -1022,7 +1022,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mmu_notifier_subscriptions_init(mm);
 	init_tlb_flush_pending(mm);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
-	mm->pmd_huge_pte = NULL;
+	init_llist_head(&mm->deposit_head_pmd);
 #endif
 	mm_init_uprobes_state(mm);

diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 9578db83e312..dbb0154165f1 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -164,11 +164,7 @@ void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
 	assert_spin_locked(pmd_lockptr(mm, pmdp));

 	/* FIFO */
-	if (!pmd_huge_pte(mm, pmdp))
-		INIT_LIST_HEAD(&pgtable->lru);
-	else
-		list_add(&pgtable->lru, &pmd_huge_pte(mm, pmdp)->lru);
-	pmd_huge_pte(mm, pmdp) = pgtable;
+	llist_add(&pgtable->deposit_node, &huge_pmd_deposit_head(mm, pmdp));
 }
 #endif

@@ -180,12 +176,11 @@ pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp)

 	assert_spin_locked(pmd_lockptr(mm, pmdp));

+	/* only withdraw from a non empty list */
+	VM_BUG_ON(llist_empty(&huge_pmd_deposit_head(mm, pmdp)));
 	/* FIFO */
-	pgtable = pmd_huge_pte(mm, pmdp);
-	pmd_huge_pte(mm, pmdp) = list_first_entry_or_null(&pgtable->lru,
-							  struct page, lru);
-	if (pmd_huge_pte(mm, pmdp))
-		list_del(&pgtable->lru);
+	pgtable = llist_entry(llist_del_first(&huge_pmd_deposit_head(mm, pmdp)),
+			struct page, deposit_node);
 	return pgtable;
 }
 #endif
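The deposit and withdraw paths in mm/pgtable-generic.c above can be
modeled in userspace to check the order in which pages come back. This
is a rough sketch under stated assumptions: struct pt_page,
model_deposit() and model_withdraw() are hypothetical names, the list
structs are simplified stand-ins for the kernel ones, and the operations
are plain, non-atomic versions of llist_add() (insert at the head) and
llist_del_first() (remove the first entry).

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's struct llist_node / llist_head. */
struct llist_node { struct llist_node *next; };
struct llist_head { struct llist_node *first; };

/* Hypothetical stand-in for a deposited page table page. */
struct pt_page {
	int id;
	struct llist_node deposit_node;
};

/* Models llist_add(): the new node becomes the first entry of the list. */
static void model_deposit(struct llist_head *head, struct pt_page *page)
{
	page->deposit_node.next = head->first;
	head->first = &page->deposit_node;
}

/* Models llist_del_first() followed by llist_entry(). */
static struct pt_page *model_withdraw(struct llist_head *head)
{
	struct llist_node *n = head->first;

	assert(n != NULL);	/* mirrors the VM_BUG_ON() on an empty list */
	head->first = n->next;
	return (struct pt_page *)((char *)n - offsetof(struct pt_page, deposit_node));
}
```

In this model, depositing pages 1, 2 and 3 and then withdrawing three
times returns them most recent first. So a head insert paired with a
first-entry removal behaves as last-in, first-out, which may be worth
keeping in mind when reading the unchanged /* FIFO */ context comments.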