From patchwork Thu Nov 5 19:15:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 11884983 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 888F916C1 for ; Thu, 5 Nov 2020 19:15:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1C4BC2151B for ; Thu, 5 Nov 2020 19:15:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1C4BC2151B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C8B476B017F; Thu, 5 Nov 2020 14:15:23 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C3C6D6B0180; Thu, 5 Nov 2020 14:15:23 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B044B6B0181; Thu, 5 Nov 2020 14:15:23 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0157.hostedemail.com [216.40.44.157]) by kanga.kvack.org (Postfix) with ESMTP id 85B3B6B017F for ; Thu, 5 Nov 2020 14:15:23 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 2388D1EF1 for ; Thu, 5 Nov 2020 19:15:23 +0000 (UTC) X-FDA: 77451318126.21.fork80_32098ad272cc Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin21.hostedemail.com (Postfix) with ESMTP id 07369180442C4 for ; Thu, 5 Nov 2020 19:15:23 +0000 (UTC) X-Spam-Summary: 1,0,0,4fc9fbf3818fb559,d41d8cd98f00b204,riel@shelob.surriel.com,,RULES_HIT:41:355:379:541:800:960:966:973:988:989:1260:1311:1314:1345:1359:1437:1515:1535:1543:1711:1730:1747:1777:1792:2196:2198:2199:2200:2393:2559:2562:2693:3138:3139:3140:3141:3142:3355:3865:3866:3867:3868:3870:3874:4250:4385:4605:5007:6117:6119:6120:6261:6742:7901:7903:10004:11026:11232:11233:11473:11658:11914:12043:12050:12296:12297:12438:12517:12519:12555:12895:13161:13229:13894:14096:14181:14394:14721:14819:21080:21451:21611:21627:21740:21990:30012:30054:30075:30091,0,RBL:96.67.55.147:@shelob.surriel.com:.lbl8.mailshell.net-62.14.0.100 64.201.201.201;04ygodpz9z8gfehckjudw58xcw5csocyce4gbtzbdj8sxrztqmsrztjskwyby8s.c1znk3hsge3qo8p8c6tuiuqof18ddhrsqhuypctnbu46wcp83an5g54qbganmyt.n-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fn,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:68,LUA_SUMMARY:none X-HE-Tag: fork80_32098ad272cc X-Filterd-Recvd-Size: 5095 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf20.hostedemail.com (Postfix) with ESMTP for ; Thu, 5 Nov 2020 19:15:22 +0000 (UTC) Received: from imladris.surriel.com ([96.67.55.152]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94) (envelope-from ) id 1kakj1-00048T-Di; Thu, 05 Nov 2020 14:15:11 -0500 From: Rik van Riel To: hughd@google.com Cc: xuyu@linux.alibaba.com, akpm@linux-foundation.org, mgorman@suse.de, aarcange@redhat.com, willy@infradead.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-mm@kvack.org, vbabka@suse.cz, mhocko@suse.com, Rik van Riel Subject: [PATCH 1/2] mm,thp,shmem: limit shmem THP alloc gfp_mask Date: Thu, 5 Nov 2020 14:15:07 -0500 Message-Id: <20201105191508.1961686-2-riel@surriel.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20201105191508.1961686-1-riel@surriel.com> References: <20201105191508.1961686-1-riel@surriel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The allocation flags of anonymous transparent huge pages can be controlled through the files in /sys/kernel/mm/transparent_hugepage/defrag, which can help the system from getting bogged down in the page reclaim and compaction code when many THPs are getting allocated simultaneously. However, the gfp_mask for shmem THP allocations were not limited by those configuration settings, and some workloads ended up with all CPUs stuck on the LRU lock in the page reclaim code, trying to allocate dozens of THPs simultaneously. This patch applies the same configurated limitation of THPs to shmem hugepage allocations, to prevent that from happening. This way a THP defrag setting of "never" or "defer+madvise" will result in quick allocation failures without direct reclaim when no 2MB free pages are available. With this patch applied, THP allocations for tmpfs will be a little more aggressive than today for files mmapped with MADV_HUGEPAGE, and a little less aggressive for files that are not mmapped or mapped without that flag. Signed-off-by: Rik van Riel --- include/linux/gfp.h | 2 ++ mm/huge_memory.c | 6 +++--- mm/shmem.c | 8 +++++--- 3 files changed, 10 insertions(+), 6 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index c603237e006c..c7615c9ba03c 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -614,6 +614,8 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask); extern void pm_restrict_gfp_mask(void); extern void pm_restore_gfp_mask(void); +extern gfp_t vma_thp_gfp_mask(struct vm_area_struct *vma); + #ifdef CONFIG_PM_SLEEP extern bool pm_suspended_storage(void); #else diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9474dbc150ed..c5d03b2f2f2f 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -649,9 +649,9 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, * available * never: never stall for any thp allocation */ -static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct *vma) +gfp_t vma_thp_gfp_mask(struct vm_area_struct *vma) { - const bool vma_madvised = !!(vma->vm_flags & VM_HUGEPAGE); + const bool vma_madvised = vma && (vma->vm_flags & VM_HUGEPAGE); /* Always do synchronous compaction */ if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags)) @@ -744,7 +744,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf) pte_free(vma->vm_mm, pgtable); return ret; } - gfp = alloc_hugepage_direct_gfpmask(vma); + gfp = vma_thp_gfp_mask(vma); page = alloc_hugepage_vma(gfp, vma, haddr, HPAGE_PMD_ORDER); if (unlikely(!page)) { count_vm_event(THP_FAULT_FALLBACK); diff --git a/mm/shmem.c b/mm/shmem.c index 537c137698f8..6c3cb192a88d 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1545,8 +1545,8 @@ static struct page *shmem_alloc_hugepage(gfp_t gfp, return NULL; shmem_pseudo_vma_init(&pvma, info, hindex); - page = alloc_pages_vma(gfp | __GFP_COMP | __GFP_NORETRY | __GFP_NOWARN, - HPAGE_PMD_ORDER, &pvma, 0, numa_node_id(), true); + page = alloc_pages_vma(gfp, HPAGE_PMD_ORDER, &pvma, 0, numa_node_id(), + true); shmem_pseudo_vma_destroy(&pvma); if (page) prep_transhuge_page(page); @@ -1802,6 +1802,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index, struct page *page; enum sgp_type sgp_huge = sgp; pgoff_t hindex = index; + gfp_t huge_gfp; int error; int once = 0; int alloced = 0; @@ -1887,7 +1888,8 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index, } alloc_huge: - page = shmem_alloc_and_acct_page(gfp, inode, index, true); + huge_gfp = vma_thp_gfp_mask(vma); + page = shmem_alloc_and_acct_page(huge_gfp, inode, index, true); if (IS_ERR(page)) { alloc_nohuge: page = shmem_alloc_and_acct_page(gfp, inode, From patchwork Thu Nov 5 19:15:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 11884985 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4466A15E6 for ; Thu, 5 Nov 2020 19:15:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F33E621D46 for ; Thu, 5 Nov 2020 19:15:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F33E621D46 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 761EA6B0183; Thu, 5 Nov 2020 14:15:28 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 733B76B0186; Thu, 5 Nov 2020 14:15:28 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6722A6B0187; Thu, 5 Nov 2020 14:15:28 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0087.hostedemail.com [216.40.44.87]) by kanga.kvack.org (Postfix) with ESMTP id 38FDC6B0183 for ; Thu, 5 Nov 2020 14:15:28 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id D1675824999B for ; Thu, 5 Nov 2020 19:15:27 +0000 (UTC) X-FDA: 77451318294.18.tray25_2606681272cc Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin18.hostedemail.com (Postfix) with ESMTP id 84740100ED0E4 for ; Thu, 5 Nov 2020 19:15:27 +0000 (UTC) X-Spam-Summary: 1,0,0,651940eebba44021,d41d8cd98f00b204,riel@shelob.surriel.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1311:1314:1345:1359:1437:1515:1534:1541:1711:1730:1747:1777:1792:2393:2559:2562:2693:3138:3139:3140:3141:3142:3353:3865:3866:3867:3868:3871:3874:4250:4321:4423:5007:6119:6120:6261:6742:7901:7903:10004:11026:11658:11914:12043:12291:12296:12297:12438:12517:12519:12555:12895:13069:13311:13357:13894:14181:14384:14394:14721:14819:21080:21451:21627:21990:30054:30075,0,RBL:96.67.55.147:@shelob.surriel.com:.lbl8.mailshell.net-62.14.0.100 64.201.201.201;04y8bfq9szxktoh54xpo7mnpeprj7ocfjhkeigoqs1gyec6ynkydniaec5ct71u.61pfgka7dpyg9ch58ti5jffgun5ctw69qetkbbg8a8kqpazfrctbycxq1ncy15s.w-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fn,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:69,LUA_SUMMARY:none X-HE-Tag: tray25_2606681272cc X-Filterd-Recvd-Size: 2905 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf18.hostedemail.com (Postfix) with ESMTP for ; Thu, 5 Nov 2020 19:15:26 +0000 (UTC) Received: from imladris.surriel.com ([96.67.55.152]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94) (envelope-from ) id 1kakj1-00048T-FN; Thu, 05 Nov 2020 14:15:11 -0500 From: Rik van Riel To: hughd@google.com Cc: xuyu@linux.alibaba.com, akpm@linux-foundation.org, mgorman@suse.de, aarcange@redhat.com, willy@infradead.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-mm@kvack.org, vbabka@suse.cz, mhocko@suse.com, Rik van Riel Subject: [PATCH 2/2] mm,thp,shm: limit gfp mask to no more than specified Date: Thu, 5 Nov 2020 14:15:08 -0500 Message-Id: <20201105191508.1961686-3-riel@surriel.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20201105191508.1961686-1-riel@surriel.com> References: <20201105191508.1961686-1-riel@surriel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Matthew Wilcox pointed out that the i915 driver opportunistically allocates tmpfs memory, but will happily reclaim some of its pool if no memory is available. Make sure the gfp mask used to opportunistically allocate a THP is always at least as restrictive as the original gfp mask. Signed-off-by: Rik van Riel Suggested-by: Matthew Wilcox --- mm/shmem.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/mm/shmem.c b/mm/shmem.c index 6c3cb192a88d..ee3cea10c2a4 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1531,6 +1531,26 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp, return page; } +/* + * Make sure huge_gfp is always more limited than limit_gfp. + * Some of the flags set permissions, while others set limitations. + */ +static gfp_t limit_gfp_mask(gfp_t huge_gfp, gfp_t limit_gfp) +{ + gfp_t allowflags = __GFP_IO | __GFP_FS | __GFP_RECLAIM; + gfp_t denyflags = __GFP_NOWARN | __GFP_NORETRY; + gfp_t result = huge_gfp & ~allowflags; + + /* + * Minimize the result gfp by taking the union with the deny flags, + * and the intersection of the allow flags. + */ + result |= (limit_gfp & denyflags); + result |= (huge_gfp & limit_gfp) & allowflags; + + return result; +} + static struct page *shmem_alloc_hugepage(gfp_t gfp, struct shmem_inode_info *info, pgoff_t index) { @@ -1889,6 +1909,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index, alloc_huge: huge_gfp = vma_thp_gfp_mask(vma); + huge_gfp = limit_gfp_mask(huge_gfp, gfp); page = shmem_alloc_and_acct_page(huge_gfp, inode, index, true); if (IS_ERR(page)) { alloc_nohuge: