From patchwork Wed Sep 2 18:06:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zi Yan X-Patchwork-Id: 11751497 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C935B109A for ; Wed, 2 Sep 2020 18:07:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 80743208C7 for ; Wed, 2 Sep 2020 18:07:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=sent.com header.i=@sent.com header.b="K2KV2DVW"; dkim=fail reason="signature verification failed" (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="YPWKC0BX" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 80743208C7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=sent.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B458F900022; Wed, 2 Sep 2020 14:06:36 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 90434900025; Wed, 2 Sep 2020 14:06:36 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 13D36900021; Wed, 2 Sep 2020 14:06:35 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0102.hostedemail.com [216.40.44.102]) by kanga.kvack.org (Postfix) with ESMTP id 7FFA090001D for ; Wed, 2 Sep 2020 14:06:35 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 446C3824805A for ; Wed, 2 Sep 2020 18:06:35 +0000 (UTC) X-FDA: 77218901550.26.sink23_0d12587270a2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin26.hostedemail.com (Postfix) with ESMTP id 15C971804B654 for ; Wed, 2 Sep 2020 18:06:35 +0000 (UTC) X-Spam-Summary: 1,0,0,64f30d7e1618764b,d41d8cd98f00b204,zi.yan@sent.com,,RULES_HIT:2:41:355:379:541:800:960:973:988:989:1260:1261:1311:1314:1345:1359:1437:1515:1535:1605:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3865:3867:3868:3870:4049:4120:4250:4321:4605:5007:6119:6120:6261:6630:6653:7576:7901:8957:9036:9592:10004:10226:11026:11232:11473:11658:11914:12043:12291:12296:12438:12555:12679:12683:12895:13894:14110:21080:21627:21990:30003:30054:30064,0,RBL:66.111.4.25:@sent.com:.lbl8.mailshell.net-62.18.0.100 66.100.201.100;04yfgqq86mmz13bdybo7wnugzfahiycfyzm9o7ourdiqm4hedisud4yzcr97zm6.m3xc9xpwfg58e66mygtpu6r4g3fmnb55we4ppmozh5e5z8kd5zaz6yk5yem3fnk.c-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: sink23_0d12587270a2 X-Filterd-Recvd-Size: 9967 Received: from out1-smtp.messagingengine.com (out1-smtp.messagingengine.com [66.111.4.25]) by imf35.hostedemail.com (Postfix) with ESMTP for ; Wed, 2 Sep 2020 18:06:34 +0000 (UTC) Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 277195C0144; Wed, 2 Sep 2020 14:06:34 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Wed, 02 Sep 2020 14:06:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sent.com; h=from :to:cc:subject:date:message-id:in-reply-to:references:reply-to :mime-version:content-transfer-encoding; s=fm1; bh=9wI5eSYIGtV+C JbXh5SLtYPlAilzT8nVnNMnj6skQGo=; b=K2KV2DVWtRcdX3QiWB3+l4EPkOlif 7hjYtzHbMVjwAb6fUCN65GC1CLfMKL2VCq5F88xbuX4BVVQd4qsWZ/aGyjp6jpPf EKk/7dk+++Htiu9n1t7TUXcntIGgQEUIqTj3qD1ZQqKgsDJVORxcGKCuwnx0b3Ir Y0K7kJjqoOwC9XlMZccZVFqQkWVay+9k9xGaMfIOAp2m3hONQtSHGifldo+NETEO If3C4rsQeW4V9GdTp4OVhRgXLKBlKR4PnMxA/4lG40KuWx5x1TAWtOgct9dwr46v 8havT3dmN9Nd567Yo9Ex2dUOLyRft4Wji4GWs0PU8JWTz5Rby8DpfM5fg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:reply-to:subject :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=9wI5eSYIGtV+CJbXh5SLtYPlAilzT8nVnNMnj6skQGo=; b=YPWKC0BX l4CP15MFklKrY6Uf2JRlHyi/6ltWioVeafteVm7XVZu43P+jd222Cns+lSDQgV7L Uq7Vd+G/DzmS2lZi4+5UIjbCnuS6FyP7HmcdRTkObHxjlMxcPU2R4X80vn10WUbM MmBQXVnDdTbK3CCOEwD+vFfYoYkF9n34cH9Psm1xIq0loXqBoTuysISFfh09URN4 /hA/U8RR7M0sil9DVccoSq86cF0sDKJdbkdMKIvG3rXgFEIS9b5ZOJLag6Se338N gd5ZDzNbFGP5Xg6MS692bceOgFvlRpUuVAuVuofVXfVXp63Y2BVh6XGBJ8sACqmq oNdImj6sgGn/ew== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduiedrudefledguddvudcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpefhvffufffkofgjfhhrggfgsedtkeertdertddtnecuhfhrohhmpegkihcu jggrnhcuoeiiihdrhigrnhesshgvnhhtrdgtohhmqeenucggtffrrghtthgvrhhnpeduhf ffveektdduhfdutdfgtdekkedvhfetuedufedtgffgvdevleehheevjefgtdenucfkphep uddvrdegiedruddtiedrudeigeenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmh epmhgrihhlfhhrohhmpeiiihdrhigrnhesshgvnhhtrdgtohhm X-ME-Proxy: Received: from nvrsysarch6.NVidia.COM (unknown [12.46.106.164]) by mail.messagingengine.com (Postfix) with ESMTPA id 5C90030600B7; Wed, 2 Sep 2020 14:06:33 -0400 (EDT) From: Zi Yan To: linux-mm@kvack.org, Roman Gushchin Cc: Rik van Riel , "Kirill A . Shutemov" , Matthew Wilcox , Shakeel Butt , Yang Shi , David Nellans , linux-kernel@vger.kernel.org, Zi Yan Subject: [RFC PATCH 11/16] mm: thp: 1GB THP follow_p*d_page() support. Date: Wed, 2 Sep 2020 14:06:23 -0400 Message-Id: <20200902180628.4052244-12-zi.yan@sent.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200902180628.4052244-1-zi.yan@sent.com> References: <20200902180628.4052244-1-zi.yan@sent.com> Reply-To: Zi Yan MIME-Version: 1.0 X-Rspamd-Queue-Id: 15C971804B654 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam04 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zi Yan Add follow_page support for 1GB THPs. Signed-off-by: Zi Yan --- include/linux/huge_mm.h | 11 +++++++ mm/gup.c | 60 ++++++++++++++++++++++++++++++++- mm/huge_memory.c | 73 ++++++++++++++++++++++++++++++++++++++++- 3 files changed, 142 insertions(+), 2 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 589e5af5a1c2..c7bc40c4a5e2 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -20,6 +20,10 @@ extern int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm, extern void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud); extern int do_huge_pud_anonymous_page(struct vm_fault *vmf); extern vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf, pud_t orig_pud); +extern struct page *follow_trans_huge_pud(struct vm_area_struct *vma, + unsigned long addr, + pud_t *pud, + unsigned int flags); #else static inline void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud) { @@ -32,6 +36,13 @@ extern vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf, pud_t orig_pud) { return VM_FAULT_FALLBACK; } +struct page *follow_trans_huge_pud(struct vm_area_struct *vma, + unsigned long addr, + pud_t *pud, + unsigned int flags) +{ + return NULL; +} #endif extern vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd); diff --git a/mm/gup.c b/mm/gup.c index bd883a112724..4b32ae3c5fa2 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -698,10 +698,68 @@ static struct page *follow_pud_mask(struct vm_area_struct *vma, if (page) return page; } + +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD + if (likely(!pud_trans_huge(*pud))) { + if (unlikely(pud_bad(*pud))) + return no_page_table(vma, flags); + return follow_pmd_mask(vma, address, pud, flags, ctx); + } + + ptl = pud_lock(mm, pud); + + if (unlikely(!pud_trans_huge(*pud))) { + spin_unlock(ptl); + if (unlikely(pud_bad(*pud))) + return no_page_table(vma, flags); + return follow_pmd_mask(vma, address, pud, flags, ctx); + } + + if (flags & FOLL_SPLIT) { + int ret; + pmd_t *pmd = NULL; + + page = pud_page(*pud); + if (is_huge_zero_page(page)) { + + spin_unlock(ptl); + ret = 0; + split_huge_pud(vma, pud, address); + pmd = pmd_offset(pud, address); + split_huge_pmd(vma, pmd, address); + if (pmd_trans_unstable(pmd)) + ret = -EBUSY; + } else { + get_page(page); + spin_unlock(ptl); + lock_page(page); + ret = split_huge_pud_page(page); + if (!ret) + ret = split_huge_page(page); + else { + unlock_page(page); + put_page(page); + goto out; + } + unlock_page(page); + put_page(page); + if (pud_none(*pud)) + return no_page_table(vma, flags); + pmd = pmd_offset(pud, address); + } +out: + return ret ? ERR_PTR(ret) : + follow_page_pte(vma, address, pmd, flags, &ctx->pgmap); + } + page = follow_trans_huge_pud(vma, address, pud, flags); + spin_unlock(ptl); + ctx->page_mask = HPAGE_PUD_NR - 1; + return page; +#else if (unlikely(pud_bad(*pud))) return no_page_table(vma, flags); - return follow_pmd_mask(vma, address, pud, flags, ctx); +#endif } static struct page *follow_p4d_mask(struct vm_area_struct *vma, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 398f1b52f789..e209c2dfc5b7 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1259,6 +1259,77 @@ struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr, return page; } +/* + * FOLL_FORCE can write to even unwritable pmd's, but only + * after we've gone through a COW cycle and they are dirty. + */ +static inline bool can_follow_write_pud(pud_t pud, unsigned int flags) +{ + return pud_write(pud) || + ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pud_dirty(pud)); +} + +struct page *follow_trans_huge_pud(struct vm_area_struct *vma, + unsigned long addr, + pud_t *pud, + unsigned int flags) +{ + struct mm_struct *mm = vma->vm_mm; + struct page *page = NULL; + + assert_spin_locked(pud_lockptr(mm, pud)); + + if (flags & FOLL_WRITE && !can_follow_write_pud(*pud, flags)) + goto out; + + /* Avoid dumping huge zero page */ + if ((flags & FOLL_DUMP) && is_huge_zero_pud(*pud)) + return ERR_PTR(-EFAULT); + + /* Full NUMA hinting faults to serialise migration in fault paths */ + /*&& pud_protnone(*pmd)*/ + if ((flags & FOLL_NUMA)) + goto out; + + page = pud_page(*pud); + VM_BUG_ON_PAGE(!PageHead(page) && !is_zone_device_page(page), page); + if (flags & FOLL_TOUCH) + touch_pud(vma, addr, pud, flags); + if ((flags & FOLL_MLOCK) && (vma->vm_flags & VM_LOCKED)) { + /* + * We don't mlock() pte-mapped THPs. This way we can avoid + * leaking mlocked pages into non-VM_LOCKED VMAs. + * + * For anon THP: + * + * We do the same thing as PMD-level THP. + * + * For file THP: + * + * No support yet. + * + */ + + if (PageAnon(page) && compound_mapcount(page) != 1) + goto skip_mlock; + if (PagePUDDoubleMap(page) || !page->mapping) + goto skip_mlock; + if (!trylock_page(page)) + goto skip_mlock; + lru_add_drain(); + if (page->mapping && !PagePUDDoubleMap(page)) + mlock_vma_page(page); + unlock_page(page); + } +skip_mlock: + page += (addr & ~HPAGE_PUD_MASK) >> PAGE_SHIFT; + VM_BUG_ON_PAGE(!PageCompound(page) && !is_zone_device_page(page), page); + if (flags & FOLL_GET) + get_page(page); + +out: + return page; +} int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm, pud_t *dst_pud, pud_t *src_pud, unsigned long addr, struct vm_area_struct *vma) @@ -1501,7 +1572,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma, goto out; page = pmd_page(*pmd); - VM_BUG_ON_PAGE(!PageHead(page) && !is_zone_device_page(page), page); + VM_BUG_ON_PAGE(!PageHead(page) && !is_zone_device_page(page) && !PMDPageInPUD(page), page); if (!try_grab_page(page, flags)) return ERR_PTR(-ENOMEM);