From patchwork Wed Nov 6 05:16:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229295 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B7C7815AB for ; Wed, 6 Nov 2019 05:16:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6B00E217F4 for ; Wed, 6 Nov 2019 05:16:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="upSax/QB" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6B00E217F4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C45056B0269; Wed, 6 Nov 2019 00:16:23 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id BF6EE6B026A; Wed, 6 Nov 2019 00:16:23 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B0B5E6B026B; Wed, 6 Nov 2019 00:16:23 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0084.hostedemail.com [216.40.44.84]) by kanga.kvack.org (Postfix) with ESMTP id 99ECD6B0269 for ; Wed, 6 Nov 2019 00:16:23 -0500 (EST) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 3E62C2C9D for ; Wed, 6 Nov 2019 05:16:23 +0000 (UTC)
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:16:22 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 551BF206A3; Wed, 6 Nov 2019 05:16:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017381; bh=aH/jj0H4fMOvSzV5fvt6swlyJYSWxEBSqHHIoXLKLFM=; h=Date:From:To:Subject:From; b=upSax/QBfbsLXteQ4sH5d5fNopRj7PpMHPYSQdUnaDVhAquYUbMoTfYzrg4qvA+sX Co472BfVNqF+BGfFTQdGVII7xUq5il9up9IUriWDlEA9RLbPottL9FNY2j4wzBvTGF 4nxRkBEQam8R9opCKbgPn8QMPnSisbo5JspcjX5w= Date: Tue, 05 Nov 2019 21:16:21 -0800 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, 
mhocko@suse.com, mm-commits@vger.kernel.org, shakeelb@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org, vdavydov.dev@gmail.com
Subject: [patch 01/17] mm: memcontrol: fix NULL-ptr deref in percpu stats flush
Message-ID: <20191106051621.ptBmJsVW2%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Shakeel Butt
Subject: mm: memcontrol: fix NULL-ptr deref in percpu stats flush

__mem_cgroup_free() can be called on the failure path in
mem_cgroup_alloc(). However, memcg_flush_percpu_vmstats() and
memcg_flush_percpu_vmevents(), which are called from __mem_cgroup_free(),
access fields of memcg which can potentially be NULL when called from the
failure path of mem_cgroup_alloc(). Indeed, syzbot has reported the
following crash:

R13: 00000000004bf27d R14: 00000000004db028 R15: 0000000000000003
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 30393 Comm: syz-executor.1 Not tainted 5.4.0-rc2+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:memcg_flush_percpu_vmstats+0x4ae/0x930 mm/memcontrol.c:3436
Code: 05 41 89 c0 41 0f b6 04 24 41 38 c7 7c 08 84 c0 0f 85 5d 03 00 00 44 3b 05 33 d5 12 08 0f 83 e2 00 00 00 4c 89 f0 48 c1 e8 03 <42> 80 3c 28 00 0f 85 91 03 00 00 48 8b 85 10 fe ff ff 48 8b b0 90
RSP: 0018:ffff888095c27980 EFLAGS: 00010206
RAX: 0000000000000012 RBX: ffff888095c27b28 RCX: ffffc90008192000
RDX: 0000000000040000 RSI: ffffffff8340fae7 RDI: 0000000000000007
RBP: ffff888095c27be0 R08: 0000000000000000 R09: ffffed1013f0da33
R10: ffffed1013f0da32 R11: ffff88809f86d197 R12: fffffbfff138b760
R13: dffffc0000000000 R14: 0000000000000090 R15: 0000000000000007
FS: 00007f5027170700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000710158 CR3: 00000000a7b18000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __mem_cgroup_free+0x1a/0x190 mm/memcontrol.c:5021
 mem_cgroup_free mm/memcontrol.c:5033 [inline]
 mem_cgroup_css_alloc+0x3a1/0x1ae0 mm/memcontrol.c:5160
 css_create kernel/cgroup/cgroup.c:5156 [inline]
 cgroup_apply_control_enable+0x44d/0xc40 kernel/cgroup/cgroup.c:3119
 cgroup_mkdir+0x899/0x11b0 kernel/cgroup/cgroup.c:5401
 kernfs_iop_mkdir+0x14d/0x1d0 fs/kernfs/dir.c:1124
 vfs_mkdir+0x42e/0x670 fs/namei.c:3807
 do_mkdirat+0x234/0x2a0 fs/namei.c:3830
 __do_sys_mkdir fs/namei.c:3846 [inline]
 __se_sys_mkdir fs/namei.c:3844 [inline]
 __x64_sys_mkdir+0x5c/0x80 fs/namei.c:3844
 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x459a59
Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f502716fc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
RAX: ffffffffffffffda RBX: 00007f502716fc90 RCX: 0000000000459a59
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000180
RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f50271706d4
R13: 00000000004bf27d R14: 00000000004db028 R15: 0000000000000003

Fix this by moving the flush to mem_cgroup_free(), as there is no need to
flush anything if we see a failure in mem_cgroup_alloc().
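
The ordering problem described above can be sketched in user-space C. This is a minimal illustration only, not kernel code: struct toy_group, toy_flush(), toy_free_raw(), toy_free() and toy_alloc() are hypothetical stand-ins for struct mem_cgroup, the percpu flush helpers, __mem_cgroup_free(), mem_cgroup_free() and mem_cgroup_alloc(). The point it tries to show is that the flush belongs in the teardown used for fully built objects, while the allocation failure path only releases memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel type. */
struct toy_group {
	long *stats;	/* still NULL if allocation failed part-way */
};

static long parent_total;	/* where flushed statistics end up */
static int flush_calls;		/* counts flushes, for the demo below */

/* Flush dereferences g->stats, so it is only safe on a fully built object. */
static void toy_flush(struct toy_group *g)
{
	flush_calls++;
	parent_total += *g->stats;
}

/* Raw teardown: tolerates partially initialised objects (free(NULL) is a no-op). */
static void toy_free_raw(struct toy_group *g)
{
	free(g->stats);
	free(g);
}

/* Full teardown for live objects: flush first, then release. */
static void toy_free(struct toy_group *g)
{
	toy_flush(g);
	toy_free_raw(g);
}

/* Allocation with an injectable failure after the outer object exists. */
static struct toy_group *toy_alloc(int fail_stats)
{
	struct toy_group *g = calloc(1, sizeof(*g));

	if (!g)
		return NULL;
	g->stats = fail_stats ? NULL : calloc(1, sizeof(*g->stats));
	if (!g->stats) {
		toy_free_raw(g);	/* failure path: release only, never flush */
		return NULL;
	}
	return g;
}

/* Usage sketch: a failed allocation never flushes; a normal teardown flushes once. */
static void toy_demo(void)
{
	struct toy_group *g;

	assert(toy_alloc(1) == NULL);
	assert(flush_calls == 0);

	g = toy_alloc(0);
	assert(g != NULL);
	toy_free(g);
	assert(flush_calls == 1);
}
```

Had toy_flush() lived in toy_free_raw(), the injected failure in toy_alloc() would dereference a NULL stats pointer, which is the shape of the crash reported above.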
Link: http://lkml.kernel.org/r/20191018165231.249872-1-shakeelb@google.com Fixes: bb65f89b7d3d ("mm: memcontrol: flush percpu vmevents before releasing memcg") Fixes: c350a99ea2b1 ("mm: memcontrol: flush percpu vmstats before releasing memcg") Signed-off-by: Shakeel Butt Reported-by: syzbot+515d5bcfe179cdf049b2@syzkaller.appspotmail.com Reviewed-by: Roman Gushchin Cc: Michal Hocko Cc: Johannes Weiner Cc: Vladimir Davydov Cc: Signed-off-by: Andrew Morton --- mm/memcontrol.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) --- a/mm/memcontrol.c~mm-memcontrol-fix-null-ptr-deref-in-percpu-stats-flush +++ a/mm/memcontrol.c @@ -5014,12 +5014,6 @@ static void __mem_cgroup_free(struct mem { int node; - /* - * Flush percpu vmstats and vmevents to guarantee the value correctness - * on parent's and all ancestor levels. - */ - memcg_flush_percpu_vmstats(memcg, false); - memcg_flush_percpu_vmevents(memcg); for_each_node(node) free_mem_cgroup_per_node_info(memcg, node); free_percpu(memcg->vmstats_percpu); @@ -5030,6 +5024,12 @@ static void __mem_cgroup_free(struct mem static void mem_cgroup_free(struct mem_cgroup *memcg) { memcg_wb_domain_exit(memcg); + /* + * Flush percpu vmstats and vmevents to guarantee the value correctness + * on parent's and all ancestor levels. 
+ */ + memcg_flush_percpu_vmstats(memcg, false); + memcg_flush_percpu_vmevents(memcg); __mem_cgroup_free(memcg); } From patchwork Wed Nov 6 05:16:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229297 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 19F131515 for ; Wed, 6 Nov 2019 05:16:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C4D60217F5 for ; Wed, 6 Nov 2019 05:16:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="xm73wNtX" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C4D60217F5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0FA7D6B026A; Wed, 6 Nov 2019 00:16:27 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 0AB7A6B026B; Wed, 6 Nov 2019 00:16:27 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F035E6B026C; Wed, 6 Nov 2019 00:16:26 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0081.hostedemail.com [216.40.44.81]) by kanga.kvack.org (Postfix) with ESMTP id DB1906B026A for ; Wed, 6 Nov 2019 00:16:26 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with SMTP id 38534180AD817 for ; Wed, 6 Nov 2019 05:16:26 +0000 (UTC) X-FDA: 76124691972.07.river78_3828dc440b60a 
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:16:25 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 78D4721882; Wed, 6 Nov 2019 05:16:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017384; bh=zAERiXKmMRx/TRsIN7Jcr7ADe37v/DvY6Bw5+uHaPfg=; h=Date:From:To:Subject:From; b=xm73wNtX1k9mKKXl3snRl2/h0vrMr/9zRXXDH5wKBcEFlmUOF7Re53nds+PqIk9ID bERqUW95z6GFsE5pez7mFXYx84MMgBQ3xk9YIl0um1HztwWV771kIwyapvtv/BvktP +LlDkTMefoenfkWwSn/Rw/ovPDMeN9fgRGZQ8D7k= Date: Tue, 05 Nov 2019 21:16:24 -0800 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, jglisse@redhat.com, jhubbard@nvidia.com, 
keith.busch@intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 02/17] mm/gup_benchmark: fix MAP_HUGETLB case Message-ID: <20191106051624.K0OGr88gF%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: John Hubbard Subject: mm/gup_benchmark: fix MAP_HUGETLB case The MAP_HUGETLB ("-H" option) of gup_benchmark fails: $ sudo ./gup_benchmark -H mmap: Invalid argument This is because gup_benchmark.c is passing in a file descriptor to mmap(), but the fd came from opening up the /dev/zero file. This confuses the mmap syscall implementation, which thinks that, if the caller did not specify MAP_ANONYMOUS, then the file must be a huge page file. So it attempts to verify that the file really is a huge page file, as you can see here: ksys_mmap_pgoff() { if (!(flags & MAP_ANONYMOUS)) { retval = -EINVAL; if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file))) goto out_fput; /* THIS IS WHERE WE END UP */ else if (flags & MAP_HUGETLB) { ...proceed normally, /dev/zero is ok here... ...and of course is_file_hugepages() returns "false" for the /dev/zero file. The problem is that the user space program, gup_benchmark.c, really just wants anonymous memory here. The simplest way to get that is to pass MAP_ANONYMOUS whenever MAP_HUGETLB is specified, so that's what this patch does. 
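
The flag handling described above can be sketched as a small stand-alone program fragment. This is a hedged illustration, not the benchmark's code: bench_mmap_flags() and try_hugetlb_map() are hypothetical helper names, and passing fd = -1 is a choice made for this sketch (the fd is ignored for MAP_ANONYMOUS mappings on Linux). Whether the mapping itself succeeds depends on huge pages being available on the machine, so the helper reports the errno value rather than assuming success.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: build mmap flags the way the patched 'H' case does. */
static int bench_mmap_flags(int base_flags, int want_hugetlb)
{
	if (want_hugetlb)
		base_flags |= MAP_HUGETLB | MAP_ANONYMOUS;
	return base_flags;
}

/*
 * Try to map len bytes of anonymous huge-page memory.
 * len should be a multiple of the huge page size so munmap() succeeds.
 * Returns 0 on success, otherwise the errno value from mmap().
 */
static int try_hugetlb_map(size_t len)
{
	int flags = bench_mmap_flags(MAP_PRIVATE, 1);
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);

	if (p == MAP_FAILED)
		return errno;
	munmap(p, len);
	return 0;
}
```

A caller could use try_hugetlb_map(2UL << 20) to probe for a single 2 MiB huge page; a non-zero result such as ENOMEM typically means no huge pages are reserved rather than a flag problem.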
Link: http://lkml.kernel.org/r/20191021212435.398153-2-jhubbard@nvidia.com Signed-off-by: John Hubbard Reviewed-by: Andrew Morton Reviewed-by: Jérôme Glisse Cc: Keith Busch Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/gup_benchmark.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/tools/testing/selftests/vm/gup_benchmark.c~mm-gup_benchmark-fix-map_hugetlb-case +++ a/tools/testing/selftests/vm/gup_benchmark.c @@ -71,7 +71,7 @@ int main(int argc, char **argv) flags |= MAP_SHARED; break; case 'H': - flags |= MAP_HUGETLB; + flags |= (MAP_HUGETLB | MAP_ANONYMOUS); break; default: return -1; From patchwork Wed Nov 6 05:16:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229299 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 539C215AB for ; Wed, 6 Nov 2019 05:16:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F0EE1217F5 for ; Wed, 6 Nov 2019 05:16:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="OrmJK3s1" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F0EE1217F5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 234736B026B; Wed, 6 Nov 2019 00:16:30 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 1E6536B026C; Wed, 6 Nov 2019 00:16:30 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0D28F6B026D; Wed, 6 Nov 2019 
00:16:30 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0160.hostedemail.com [216.40.44.160]) by kanga.kvack.org (Postfix) with ESMTP id ECC516B026B for ; Wed, 6 Nov 2019 00:16:29 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with SMTP id A5AD9180AD817 for ; Wed, 6 Nov 2019 05:16:29 +0000 (UTC) Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:16:29 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B574A206A3; Wed, 6 Nov 2019 05:16:27 
+0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017388; bh=n7FB6KyjIXvLNXEJ+mQXziurJpOoivmMjY7dh/PdhGk=; h=Date:From:To:Subject:From; b=OrmJK3s1zbBdF48SvVYgRckR7MDcwR/yZwqJjTBnm5arZs3p1dSiVXrV9SO7cHXla 2C+dQJFjzsfCRKA1661dQjuOcpxOXCJUTOSNkBAM8lDvIdgW4wNoZM7sDI+I77Sh7C U4CYXf9EPs/x7G/Uoh1nOXrEgBUaGb+6iusV5UDw= Date: Tue, 05 Nov 2019 21:16:27 -0800 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, bp@alien8.de, cai@lca.pw, david@redhat.com, linux-mm@kvack.org, matt@codeblueprint.co.uk, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, stable@vger.kernel.org, tglx@linutronix.de, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 03/17] mm, meminit: recalculate pcpu batch and high limits after init completes Message-ID: <20191106051627.Ny4e_A2F5%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mel Gorman Subject: mm, meminit: recalculate pcpu batch and high limits after init completes Deferred memory initialisation updates zone->managed_pages during the initialisation phase but before that finishes, the per-cpu page allocator (pcpu) calculates the number of pages allocated/freed in batches as well as the maximum number of pages allowed on a per-cpu list. As zone->managed_pages is not up to date yet, the pcpu initialisation calculates inappropriately low batch and high values. This increases zone lock contention quite severely in some cases with the degree of severity depending on how many CPUs share a local zone and the size of the zone. A private report indicated that kernel build times were excessive with extremely high system CPU usage. A perf profile indicated that a large chunk of time was lost on zone->lock contention. 
This patch recalculates the pcpu batch and high values after deferred
initialisation completes for every populated zone in the system. It was
tested on a 2-socket AMD EPYC 2 machine using a kernel compilation workload
-- allmodconfig and all available CPUs.

mmtests configuration: config-workload-kernbench-max
Configuration was modified to build on a fresh XFS partition.

kernbench
                                5.4.0-rc3              5.4.0-rc3
                                  vanilla           resetpcpu-v2
Amean     user-256    13249.50 (   0.00%)    16401.31 * -23.79%*
Amean     syst-256    14760.30 (   0.00%)     4448.39 *  69.86%*
Amean     elsp-256      162.42 (   0.00%)      119.13 *  26.65%*
Stddev    user-256       42.97 (   0.00%)       19.15 (  55.43%)
Stddev    syst-256      336.87 (   0.00%)        6.71 (  98.01%)
Stddev    elsp-256        2.46 (   0.00%)        0.39 (  84.03%)

                   5.4.0-rc3    5.4.0-rc3
                     vanilla resetpcpu-v2
Duration User       39766.24     49221.79
Duration System     44298.10     13361.67
Duration Elapsed      519.11       388.87

The patch reduces system CPU usage by 69.86% and total build time by 26.65%.
The variance of system CPU usage is also much reduced.

Before the patch, this was the breakdown of batch and high values over all
zones:

    256 batch: 1
    256 batch: 63
    512 batch: 7
    256 high:  0
    256 high:  378
    512 high:  42

512 pcpu pagesets had a batch limit of 7 and a high limit of 42.
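
For context, the value pairs in the breakdown (1/0, 7/42 and 63/378) are consistent with how mainline appears to derive these limits from zone->managed_pages around v5.4. The sketch below is a user-space model written from memory of zone_batchsize() and the "high = 6 * batch" rule; it is not code from this patch. model_pcp(), struct pcp_limits and rounddown_pow2() are hypothetical names, and the 4 KiB page size used in the examples is an assumption.

```c
#include <assert.h>

struct pcp_limits {
	unsigned long batch;
	unsigned long high;
};

/* Largest power of two <= n, for n >= 1 (stand-in for the kernel helper). */
static unsigned long rounddown_pow2(unsigned long n)
{
	unsigned long p = 1;

	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Model of the batch/high calculation as a function of managed pages. */
static struct pcp_limits model_pcp(unsigned long managed_pages,
				   unsigned long page_size)
{
	struct pcp_limits l;
	unsigned long raw = managed_pages / 1024;	/* ~1/1000th of the zone */

	if (raw * page_size > 1024UL * 1024UL)		/* but no more than 1 MiB */
		raw = (1024UL * 1024UL) / page_size;
	raw /= 4;
	if (raw < 1)
		raw = 1;
	raw = rounddown_pow2(raw + raw / 2) - 1;	/* clamp to 2^n - 1 */

	l.high = 6 * raw;
	l.batch = raw > 1 ? raw : 1;			/* batch is never below 1 */
	return l;
}

/* Usage sketch: small managed_pages gives small limits, large gives 63/378. */
static void pcp_demo(void)
{
	struct pcp_limits early = model_pcp(32768, 4096);	/* 128 MiB managed */
	struct pcp_limits late = model_pcp(1UL << 20, 4096);	/* 4 GiB managed */

	assert(early.batch == 7 && early.high == 42);
	assert(late.batch == 63 && late.high == 378);
}
```

Under this model, a zone whose managed_pages count is still small when the limits are first computed ends up with the 7/42 pair, and recomputing once the full zone is accounted for yields 63/378.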
After the patch:

    256 batch: 1
    768 batch: 63
    256 high:  0
    768 high:  378

[mgorman@techsingularity.net: fix merge/linkage snafu]
  Link: http://lkml.kernel.org/r/20191023084705.GD3016@techsingularity.net
Link: http://lkml.kernel.org/r/20191021094808.28824-2-mgorman@techsingularity.net
Signed-off-by: Mel Gorman
Acked-by: Michal Hocko
Acked-by: Vlastimil Babka
Acked-by: David Hildenbrand
Cc: Matt Fleming
Cc: Thomas Gleixner
Cc: Borislav Petkov
Cc: Qian Cai
Cc: [4.1+]
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/mm/page_alloc.c~mm-meminit-recalculate-pcpu-batch-and-high-limits-after-init-completes
+++ a/mm/page_alloc.c
@@ -1948,6 +1948,14 @@ void __init page_alloc_init_late(void)
 	wait_for_completion(&pgdat_init_all_done_comp);
 
 	/*
+	 * The number of managed pages has changed due to the initialisation
+	 * so the pcpu batch and high limits needs to be updated or the limits
+	 * will be artificially small.
+	 */
+	for_each_populated_zone(zone)
+		zone_pcp_update(zone);
+
+	/*
 	 * We initialized the rest of the deferred pages. Permanently disable
 	 * on-demand struct page initialization.
 	 */
@@ -8514,7 +8522,6 @@ void free_contig_range(unsigned long pfn
 	WARN(count != 0, "%d pages are still in use!\n", count);
 }
 
-#ifdef CONFIG_MEMORY_HOTPLUG
 /*
  * The zone indicated has a new number of managed_pages; batch sizes and percpu
  * page high values need to be recalulated.
@@ -8528,7 +8535,6 @@ void __meminit zone_pcp_update(struct zo per_cpu_ptr(zone->pageset, cpu)); mutex_unlock(&pcp_batch_high_lock); } -#endif void zone_pcp_reset(struct zone *zone) { From patchwork Wed Nov 6 05:16:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229301 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 07D3115AB for ; Wed, 6 Nov 2019 05:16:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A5BB42187F for ; Wed, 6 Nov 2019 05:16:34 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="FpKemzW6" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A5BB42187F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D57946B026C; Wed, 6 Nov 2019 00:16:33 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id D2E906B026D; Wed, 6 Nov 2019 00:16:33 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C6C726B026E; Wed, 6 Nov 2019 00:16:33 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0083.hostedemail.com [216.40.44.83]) by kanga.kvack.org (Postfix) with ESMTP id B2B8C6B026C for ; Wed, 6 Nov 2019 00:16:33 -0500 (EST) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with SMTP id 375238249980 for ; Wed, 6 Nov 
2019 05:16:33 +0000 (UTC) Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf21.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:16:32 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 4AD42217F4; Wed, 6 Nov 2019 05:16:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017391; bh=R3g8c6kIW5WzWb9NlCfXi7C9BnPG7b7quIZ/FZu3a0c=; h=Date:From:To:Subject:From; b=FpKemzW6NYL8uDjDS5uB10wj/coKp1ztckKTymmJSoUxBuUUqW0FCoR4UYor9PW38 b4PueJAfvzTThuMu/iB1DnwJ45yraUE9Ui5viQSeteCc+x3LA5nGM1AVuwHCp0mLB9 BfiT8WfRe2U9aMmmQ9nE+Xj7HavfG0agowRGyyLs= Date: Tue, 05 Nov 2019 21:16:30 -0800 From: akpm@linux-foundation.org To: 
aarcange@redhat.com, akpm@linux-foundation.org, gavin.dg@linux.alibaba.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org, willy@infradead.org, yang.shi@linux.alibaba.com
Subject: [patch 04/17] mm: thp: handle page cache THP correctly in PageTransCompoundMap
Message-ID: <20191106051630.mVkfeUoJk%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Yang Shi
Subject: mm: thp: handle page cache THP correctly in PageTransCompoundMap

We have a use case that uses tmpfs as the QEMU memory backend, and we would
like to take advantage of THP as well. But our test shows that the EPT is
not PMD mapped even though the underlying THP is PMD mapped on the host.
The number shown by /sys/kernel/debug/kvm/largepages is much less than the
number of PMD mapped shmem pages, as below:

7f2778200000-7f2878200000 rw-s 00000000 00:14 262232 /dev/shm/qemu_back_mem.mem.Hz2hSf (deleted)
Size:            4194304 kB
[snip]
AnonHugePages:         0 kB
ShmemPmdMapped:   579584 kB
[snip]
Locked:                0 kB

cat /sys/kernel/debug/kvm/largepages
12

And some benchmarks do worse than with anonymous THPs.

By digging into the code we figured out that commit 127393fbe597 ("mm: thp:
kvm: fix memory corruption in KVM with THP enabled") checks if there is a
single PTE mapping on the page for anonymous THP when setting up the EPT
map. But the _mapcount < 0 check doesn't fit page cache THP, since every
subpage of a page cache THP gets its _mapcount inc'ed once it is PMD
mapped, so PageTransCompoundMap() always returns false for page cache THP.
This prevents KVM from setting up PMD mapped EPT entries.

So we need to handle page cache THP correctly.
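
The counting argument above can be modelled in a few lines of user-space C. This is only a sketch under simplifying assumptions: struct toy_thp and its helpers are hypothetical, the counters are plain ints that start at -1 (mirroring the kernel's convention for an unmapped page), and a 4-subpage "THP" stands in for a real one. old_check() models the _mapcount < 0 test, and new_check() models the comparison against the compound mapcount that this patch introduces.

```c
#include <assert.h>
#include <stdbool.h>

#define SUBPAGES 4	/* toy value; a real THP has many more subpages */

struct toy_thp {
	bool anon;			/* anonymous vs page cache THP */
	int compound_mapcount;		/* -1 means "not PMD mapped" */
	int mapcount[SUBPAGES];		/* per-subpage _mapcount, -1 means unmapped */
};

static void toy_init(struct toy_thp *t, bool anon)
{
	int i;

	t->anon = anon;
	t->compound_mapcount = -1;
	for (i = 0; i < SUBPAGES; i++)
		t->mapcount[i] = -1;
}

/* PMD map: page cache THP also bumps every subpage's _mapcount. */
static void toy_map_pmd(struct toy_thp *t)
{
	int i;

	t->compound_mapcount++;
	if (!t->anon)
		for (i = 0; i < SUBPAGES; i++)
			t->mapcount[i]++;
}

/* PTE map of a single subpage, e.g. by another process. */
static void toy_map_pte(struct toy_thp *t, int idx)
{
	t->mapcount[idx]++;
}

/* Old test: only correct for anonymous THP. */
static bool old_check(const struct toy_thp *t, int idx)
{
	return t->mapcount[idx] < 0;
}

/* New test: for file THP, compare against the compound mapcount. */
static bool new_check(const struct toy_thp *t, int idx)
{
	if (t->anon)
		return t->mapcount[idx] < 0;
	return t->mapcount[idx] == t->compound_mapcount;
}

/* Usage sketch: a PMD mapped file THP fails the old test but passes the new one. */
static void thp_demo(void)
{
	struct toy_thp f;

	toy_init(&f, false);
	toy_map_pmd(&f);
	assert(!old_check(&f, 0));
	assert(new_check(&f, 0));
}
```

In this model a later toy_map_pte() on one subpage makes new_check() fail for that subpage only, which corresponds to the PTE mapped case the new test is meant to filter out.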
However, when a page cache THP's PMD gets split, the kernel just removes the
mapping instead of setting up a PTE map as it does for anonymous THP. Before
KVM calls get_user_pages(), the subpages may get PTE mapped even though the
page is still a THP, since the page cache THP may be mapped by other
processes in the meantime.

So check the subpage's _mapcount against the compound mapcount to tell
whether the THP is PTE mapped or not. Although this may report some false
negative cases (PTE mapped by other processes), it does not look trivial to
make this accurate.

With this fix, /sys/kernel/debug/kvm/largepages shows a reasonable number of
pages PMD mapped by EPT, as below:

7fbeaee00000-7fbfaee00000 rw-s 00000000 00:14 275464 /dev/shm/qemu_back_mem.mem.SKUvat (deleted)
Size:            4194304 kB
[snip]
AnonHugePages:         0 kB
ShmemPmdMapped:   557056 kB
[snip]
Locked:                0 kB

cat /sys/kernel/debug/kvm/largepages
271

And the benchmarks perform the same as with anonymous THPs.

[yang.shi@linux.alibaba.com: v4]
  Link: http://lkml.kernel.org/r/1571865575-42913-1-git-send-email-yang.shi@linux.alibaba.com
Link: http://lkml.kernel.org/r/1571769577-89735-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: dd78fedde4b9 ("rmap: support file thp")
Signed-off-by: Yang Shi
Reported-by: Gang Deng
Tested-by: Gang Deng
Suggested-by: Hugh Dickins
Acked-by: Kirill A.
Shutemov Cc: Andrea Arcangeli Cc: Matthew Wilcox Cc: [4.8+] Signed-off-by: Andrew Morton --- include/linux/mm.h | 5 ----- include/linux/mm_types.h | 5 +++++ include/linux/page-flags.h | 20 ++++++++++++++++++-- 3 files changed, 23 insertions(+), 7 deletions(-) --- a/include/linux/mm.h~mm-thp-handle-page-cache-thp-correctly-in-pagetranscompoundmap +++ a/include/linux/mm.h @@ -695,11 +695,6 @@ static inline void *kvcalloc(size_t n, s extern void kvfree(const void *addr); -static inline atomic_t *compound_mapcount_ptr(struct page *page) -{ - return &page[1].compound_mapcount; -} - static inline int compound_mapcount(struct page *page) { VM_BUG_ON_PAGE(!PageCompound(page), page); --- a/include/linux/mm_types.h~mm-thp-handle-page-cache-thp-correctly-in-pagetranscompoundmap +++ a/include/linux/mm_types.h @@ -221,6 +221,11 @@ struct page { #endif } _struct_page_alignment; +static inline atomic_t *compound_mapcount_ptr(struct page *page) +{ + return &page[1].compound_mapcount; +} + /* * Used for sizing the vmemmap region on some architectures */ --- a/include/linux/page-flags.h~mm-thp-handle-page-cache-thp-correctly-in-pagetranscompoundmap +++ a/include/linux/page-flags.h @@ -622,12 +622,28 @@ static inline int PageTransCompound(stru * * Unlike PageTransCompound, this is safe to be called only while * split_huge_pmd() cannot run from under us, like if protected by the - * MMU notifier, otherwise it may result in page->_mapcount < 0 false + * MMU notifier, otherwise it may result in page->_mapcount check false * positives. + * + * We have to treat page cache THP differently since every subpage of it + * would get _mapcount inc'ed once it is PMD mapped. But, it may be PTE + * mapped in the current process so comparing subpage's _mapcount to + * compound_mapcount to filter out PTE mapped case. 
  */
 static inline int PageTransCompoundMap(struct page *page)
 {
-	return PageTransCompound(page) && atomic_read(&page->_mapcount) < 0;
+	struct page *head;
+
+	if (!PageTransCompound(page))
+		return 0;
+
+	if (PageAnon(page))
+		return atomic_read(&page->_mapcount) < 0;
+
+	head = compound_head(page);
+	/* File THP is PMD mapped and not PTE mapped */
+	return atomic_read(&page->_mapcount) ==
+	       atomic_read(compound_mapcount_ptr(head));
 }
 
 /*

From patchwork Wed Nov 6 05:16:34 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11229303
Date: Tue, 05 Nov 2019 21:16:34 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, gechangwei@live.cn, ghe@suse.com,
 jiangqi903@gmail.com, jlbec@evilplan.org, junxiao.bi@oracle.com,
 linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org,
 piaojun@huawei.com, stable@vger.kernel.org, sunny.s.zhang@oracle.com,
 torvalds@linux-foundation.org
Subject: [patch 05/17] ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()
Message-ID: <20191106051634.IwGqLbBvh%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Shuning Zhang
Subject: ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()

When the extent tree is modified, it should be protected by inode cluster
lock and ip_alloc_sem.

The extent tree is accessed and modified in the
ocfs2_prepare_inode_for_write, but isn't protected by ip_alloc_sem.

The following is a case. The function ocfs2_fiemap is accessing the
extent tree, which is modified at the same time.

[47145.974472] kernel BUG at fs/ocfs2/extent_map.c:475!
[47145.974480] invalid opcode: 0000 [#1] SMP
[47145.974489] Modules linked in: tun ocfs2 ocfs2_nodemanager configfs ocfs2_stackglue xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn xenfs xen_privcmd vfat fat bnx2fc fcoe libfcoe libfc scsi_transport_fc sunrpc bridge 8021q mrp garp stp llc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr dm_round_robin dm_multipath sg pcspkr raid1 shpchp ipmi_devintf ipmi_msghandler ext4 jbd2 mbcache2 sd_mod nvme nvme_core bnxt_en xhci_pci xhci_hcd crc32c_intel be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio qla4xxx wmi dm_mirror dm_region_hash dm_log dm_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iscsi_ibft iscsi_boot_sysfs
[47145.974636] CPU: 16 PID: 14047 Comm: o2info Not tainted 4.1.12-124.23.1.el6uek.x86_64 #2
[47145.974646] Hardware name: Oracle Corporation ORACLE SERVER X7-2L/ASM, MB MECH, X7-2L, BIOS 42040600 10/19/2018
[47145.974658] task: ffff88019487e200 ti: ffff88003daa4000 task.ti: ffff88003daa4000
[47145.974667] RIP: e030:[] [] ocfs2_get_clusters_nocache.isra.11+0x390/0x550 [ocfs2]
[47145.974708] RSP: e02b:ffff88003daa7d88 EFLAGS: 00010287
[47145.974713] RAX: 00000000000000de RBX: ffff8801d1104030 RCX: ffff8801d1104e10
[47145.974719] RDX: 00000000000000de RSI: 000000000009ec40 RDI: ffff8801d1104e24
[47145.974725] RBP: ffff88003daa7df8 R08: ffff88003daa7e38 R09: 0000000000000000
[47145.974732] R10: 000000000009ec3f R11: 0000000000000246 R12: 000000000009ec3f
[47145.974739] R13: ffff88004c419000 R14: 0000000000000002 R15: ffff88003daa7e28
[47145.974754] FS: 00007fdbccc92720(0000) GS:ffff880358800000(0000) knlGS:ffff880358800000
[47145.974764] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[47145.974772] CR2: 00007fd5dfcd8350 CR3: 0000000208677000 CR4: 0000000000042660
[47145.974785] Stack:
[47145.974790] ffff88003daa7df8 00002000a05e249b ffff8801d1104000 ffff88003daa7e2c
[47145.974802] ffff88003daa7e38 ffff88000cc484c0 ffff880145f5b478 0000000000000000
[47145.974811] 0000000000002000 ffff88000cc484c0 ffff88003daa7ea0 0000000000000000
[47145.974820] Call Trace:
[47145.974837] [] ocfs2_fiemap+0x1e3/0x430 [ocfs2]
[47145.974848] [] ? xen_hypervisor_callback+0x7f/0x120
[47145.974855] [] ? xen_hypervisor_callback+0x78/0x120
[47145.974861] [] ? xen_hypervisor_callback+0xd3/0x120
[47145.974872] [] do_vfs_ioctl+0x155/0x510
[47145.974878] [] SyS_ioctl+0x81/0xa0
[47145.974885] [] ? system_call_after_swapgs+0xe9/0x190
[47145.974891] [] ? system_call_after_swapgs+0xe2/0x190
[47145.974899] [] ? system_call_after_swapgs+0xdb/0x190
[47145.974905] [] system_call_fastpath+0x18/0xd8
[47145.974910] Code: 18 48 c7 c6 60 7f 65 a0 31 c0 bb e2 ff ff ff 48 8b 4a 40 48 8b 7a 28 48 c7 c2 78 2d 66 a0 e8 38 4f 05 00 e9 28 fe ff ff 0f 1f 00 <0f> 0b 66 0f 1f 44 00 00 bb 86 ff ff ff e9 13 fe ff ff 66 0f 1f
[47145.975000] RIP [] ocfs2_get_clusters_nocache.isra.11+0x390/0x550 [ocfs2]
[47145.975018] RSP
[47145.989999] ---[ end trace c8aa0c8180e869dc ]---
[47146.087579] Kernel panic - not syncing: Fatal exception
[47146.087691] Kernel Offset: disabled

This issue can be reproduced every week in a production environment.

This issue is related to the usage mode. If others use ocfs2 in this
mode, the kernel will panic frequently.
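
The locking rule stated at the top of this message (extent tree access needs both the inode cluster lock and ip_alloc_sem) can be sketched in userspace as below. This is only an illustration of the non-blocking "take both or take neither" pattern, not the ocfs2 code: `toy_lock`, `toy_inode` and every function here are hypothetical stand-ins, and the toy locks are plain flags with no real concurrency semantics.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock; not an ocfs2 or kernel type. */
struct toy_lock {
	bool held;
};

/* Try to take the lock without waiting; true on success. */
static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);	/* unlocking a free lock is a caller bug */
	l->held = false;
}

struct toy_inode {
	struct toy_lock cluster_lock;	/* plays the role of the inode cluster lock */
	struct toy_lock alloc_sem;	/* plays the role of ip_alloc_sem */
};

/*
 * Non-blocking acquisition of both locks, outer lock first.
 * Returns 0 with both locks held, or -EAGAIN with neither held.
 */
static int toy_lock_for_extent_tree(struct toy_inode *inode)
{
	if (!toy_trylock(&inode->cluster_lock))
		return -EAGAIN;

	if (!toy_trylock(&inode->alloc_sem)) {
		/* Unwind the outer lock so the caller holds nothing. */
		toy_unlock(&inode->cluster_lock);
		return -EAGAIN;
	}

	return 0;
}

/* Release in the reverse order of acquisition. */
static void toy_unlock_for_extent_tree(struct toy_inode *inode)
{
	toy_unlock(&inode->alloc_sem);
	toy_unlock(&inode->cluster_lock);
}
```

The point of pairing the two helpers is that a caller never has to remember which of the two locks it currently holds: after a failed attempt it holds none, after a successful one it holds both.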

[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/1568772175-2906-2-git-send-email-sunny.s.zhang@oracle.com
Signed-off-by: Shuning Zhang
Reviewed-by: Junxiao Bi
Reviewed-by: Gang He
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Joseph Qi
Cc: Changwei Ge
Cc: Jun Piao
Cc:
Signed-off-by: Andrew Morton
---

 fs/ocfs2/file.c |  125 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 99 insertions(+), 26 deletions(-)

--- a/fs/ocfs2/file.c~ocfs2-protect-extent-tree-in-the-ocfs2_prepare_inode_for_write
+++ a/fs/ocfs2/file.c
@@ -2125,26 +2125,89 @@ out:
 	return ret;
 }
 
+static int ocfs2_inode_lock_for_extent_tree(struct inode *inode,
+					    struct buffer_head **di_bh,
+					    int meta_level,
+					    int overwrite_io,
+					    int write_sem,
+					    int wait)
+{
+	int ret = 0;
+
+	if (wait)
+		ret = ocfs2_inode_lock(inode, NULL, meta_level);
+	else
+		ret = ocfs2_try_inode_lock(inode,
+			overwrite_io ? NULL : di_bh, meta_level);
+	if (ret < 0)
+		goto out;
+
+	if (wait) {
+		if (write_sem)
+			down_write(&OCFS2_I(inode)->ip_alloc_sem);
+		else
+			down_read(&OCFS2_I(inode)->ip_alloc_sem);
+	} else {
+		if (write_sem)
+			ret = down_write_trylock(&OCFS2_I(inode)->ip_alloc_sem);
+		else
+			ret = down_read_trylock(&OCFS2_I(inode)->ip_alloc_sem);
+
+		if (!ret) {
+			ret = -EAGAIN;
+			goto out_unlock;
+		}
+	}
+
+	return ret;
+
+out_unlock:
+	brelse(*di_bh);
+	ocfs2_inode_unlock(inode, meta_level);
+out:
+	return ret;
+}
+
+static void ocfs2_inode_unlock_for_extent_tree(struct inode *inode,
+					       struct buffer_head **di_bh,
+					       int meta_level,
+					       int write_sem)
+{
+	if (write_sem)
+		up_write(&OCFS2_I(inode)->ip_alloc_sem);
+	else
+		up_read(&OCFS2_I(inode)->ip_alloc_sem);
+
+	brelse(*di_bh);
+	*di_bh = NULL;
+
+	if (meta_level >= 0)
+		ocfs2_inode_unlock(inode, meta_level);
+}
+
 static int ocfs2_prepare_inode_for_write(struct file *file,
 					 loff_t pos, size_t count, int wait)
 {
 	int ret = 0, meta_level = 0, overwrite_io = 0;
+	int write_sem = 0;
 	struct dentry *dentry = file->f_path.dentry;
 	struct inode *inode = d_inode(dentry);
 	struct buffer_head *di_bh = NULL;
+	u32 cpos;
+	u32 clusters;
 
 	/*
 	 * We start with a read level meta lock and only jump to an ex
 	 * if we need to make modifications here.
 	 */
 	for(;;) {
-		if (wait)
-			ret = ocfs2_inode_lock(inode, NULL, meta_level);
-		else
-			ret = ocfs2_try_inode_lock(inode,
-				overwrite_io ? NULL : &di_bh, meta_level);
+		ret = ocfs2_inode_lock_for_extent_tree(inode,
+						       &di_bh,
+						       meta_level,
+						       overwrite_io,
+						       write_sem,
+						       wait);
 		if (ret < 0) {
-			meta_level = -1;
 			if (ret != -EAGAIN)
 				mlog_errno(ret);
 			goto out;
@@ -2156,15 +2219,8 @@ static int ocfs2_prepare_inode_for_write
 		 */
 		if (!wait && !overwrite_io) {
 			overwrite_io = 1;
-			if (!down_read_trylock(&OCFS2_I(inode)->ip_alloc_sem)) {
-				ret = -EAGAIN;
-				goto out_unlock;
-			}
 
 			ret = ocfs2_overwrite_io(inode, di_bh, pos, count);
-			brelse(di_bh);
-			di_bh = NULL;
-			up_read(&OCFS2_I(inode)->ip_alloc_sem);
 			if (ret < 0) {
 				if (ret != -EAGAIN)
 					mlog_errno(ret);
@@ -2183,7 +2239,10 @@ static int ocfs2_prepare_inode_for_write
 		 * set inode->i_size at the end of a write. */
 		if (should_remove_suid(dentry)) {
 			if (meta_level == 0) {
-				ocfs2_inode_unlock(inode, meta_level);
+				ocfs2_inode_unlock_for_extent_tree(inode,
+								   &di_bh,
+								   meta_level,
+								   write_sem);
 				meta_level = 1;
 				continue;
 			}
@@ -2197,18 +2256,32 @@ static int ocfs2_prepare_inode_for_write
 
 		ret = ocfs2_check_range_for_refcount(inode, pos, count);
 		if (ret == 1) {
-			ocfs2_inode_unlock(inode, meta_level);
-			meta_level = -1;
+			ocfs2_inode_unlock_for_extent_tree(inode,
+							   &di_bh,
+							   meta_level,
+							   write_sem);
+			ret = ocfs2_inode_lock_for_extent_tree(inode,
+							       &di_bh,
+							       meta_level,
+							       overwrite_io,
+							       1,
+							       wait);
+			write_sem = 1;
+			if (ret < 0) {
+				if (ret != -EAGAIN)
+					mlog_errno(ret);
+				goto out;
+			}
 
-			ret = ocfs2_prepare_inode_for_refcount(inode,
-							       file,
-							       pos,
-							       count,
-							       &meta_level);
+			cpos = pos >> OCFS2_SB(inode->i_sb)->s_clustersize_bits;
+			clusters =
+				ocfs2_clusters_for_bytes(inode->i_sb, pos + count) - cpos;
+			ret = ocfs2_refcount_cow(inode, di_bh, cpos, clusters, UINT_MAX);
 		}
 
 		if (ret < 0) {
-			mlog_errno(ret);
+			if (ret != -EAGAIN)
+				mlog_errno(ret);
 			goto out_unlock;
 		}
 
@@ -2219,10 +2292,10 @@ out_unlock:
 	trace_ocfs2_prepare_inode_for_write(OCFS2_I(inode)->ip_blkno,
 					    pos, count, wait);
 
-	brelse(di_bh);
-
-	if (meta_level >= 0)
-		ocfs2_inode_unlock(inode, meta_level);
+	ocfs2_inode_unlock_for_extent_tree(inode,
+					   &di_bh,
+					   meta_level,
+					   write_sem);
 
 out:
 	return ret;

From patchwork Wed Nov 6 05:16:37 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11229305
Date: Tue, 05 Nov 2019 21:16:37 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, daniel.vetter@ffwll.ch, jgg@mellanox.com,
 linux-mm@kvack.org, mm-commits@vger.kernel.org,
 torvalds@linux-foundation.org
Subject: [patch 06/17] mm/mmu_notifiers: use the right return code for WARN_ON
Message-ID: <20191106051637.lVtnEjC3S%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Jason Gunthorpe
Subject: mm/mmu_notifiers: use the right return code for WARN_ON

The return code from the op callback is actually in _ret, while the
WARN_ON was checking ret which causes it to misfire.

Link: http://lkml.kernel.org/r/20191025175502.GA31127@ziepe.ca
Fixes: 8402ce61bec2 ("mm/mmu_notifiers: check if mmu notifier callbacks are allowed to fail")
Signed-off-by: Jason Gunthorpe
Reviewed-by: Andrew Morton
Cc: Daniel Vetter
Signed-off-by: Andrew Morton
---

 mm/mmu_notifier.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/mmu_notifier.c~mm-mmu_notifiers-use-the-right-return-code-for-warn_on
+++ a/mm/mmu_notifier.c
@@ -180,7 +180,7 @@ int __mmu_notifier_invalidate_range_star
 					mn->ops->invalidate_range_start, _ret,
 					!mmu_notifier_range_blockable(range) ? "non-" : "");
 				WARN_ON(mmu_notifier_range_blockable(range) ||
-					ret != -EAGAIN);
+					_ret != -EAGAIN);
 				ret = _ret;
 			}
 		}

From patchwork Wed Nov 6 05:16:40 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11229307
Date: Tue, 05 Nov 2019 21:16:40 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, aquini@redhat.com, david@redhat.com,
 gregkh@linuxfoundation.org, guro@fb.com, hannes@cmpxchg.org,
 jannh@google.com, khlebnikov@yandex-team.ru, linux-mm@kvack.org,
 longman@redhat.com, mgorman@suse.de, mhocko@suse.com,
 mm-commits@vger.kernel.org, rientjes@google.com, songliubraving@fb.com,
 stable@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 07/17] mm, vmstat: hide /proc/pagetypeinfo from normal users
Message-ID: <20191106051640.jFTxrBEBb%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Michal Hocko
Subject: mm, vmstat: hide /proc/pagetypeinfo from normal users

/proc/pagetypeinfo is a debugging tool to examine internal page allocator
state wrt fragmentation. It is not very useful for any other use so
normal users really do not need to read this file.

Waiman Long has noticed that reading this file can have negative side
effects because zone->lock is necessary for gathering data and that a)
interferes with the page allocator and its users and b) can lead to hard
lockups on large machines which have very long free_list.

Reduce both issues by simply not exporting the file to regular users.
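
For readers less used to octal modes, the effect of going from 0444 to 0400 on the file can be checked with the standard permission-bit macros. The helpers below are a hypothetical userspace sketch (the names are not kernel API), and it assumes the proc entry is owned by root, so "owner" here means root.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* True if the file owner has read permission in this mode. */
static bool owner_can_read(mode_t mode)
{
	assert((mode & ~07777) == 0);	/* permission bits only */
	return (mode & S_IRUSR) != 0;
}

/* True if group members or everyone else have read permission. */
static bool others_can_read(mode_t mode)
{
	assert((mode & ~07777) == 0);	/* permission bits only */
	return (mode & (S_IRGRP | S_IROTH)) != 0;
}
```

With these, 0444 is readable by owner, group and others, while 0400 stays readable by the owner only.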

Link: http://lkml.kernel.org/r/20191025072610.18526-2-mhocko@kernel.org
Fixes: 467c996c1e19 ("Print out statistics in relation to fragmentation avoidance to /proc/pagetypeinfo")
Signed-off-by: Michal Hocko
Reported-by: Waiman Long
Acked-by: Mel Gorman
Acked-by: Vlastimil Babka
Acked-by: Waiman Long
Acked-by: Rafael Aquini
Acked-by: David Rientjes
Reviewed-by: Andrew Morton
Cc: David Hildenbrand
Cc: Johannes Weiner
Cc: Roman Gushchin
Cc: Konstantin Khlebnikov
Cc: Jann Horn
Cc: Song Liu
Cc: Greg Kroah-Hartman
Cc:
Signed-off-by: Andrew Morton
---

 mm/vmstat.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/vmstat.c~mm-vmstat-hide-proc-pagetypeinfo-from-normal-users
+++ a/mm/vmstat.c
@@ -1972,7 +1972,7 @@ void __init init_mm_internals(void)
 #endif
 #ifdef CONFIG_PROC_FS
 	proc_create_seq("buddyinfo", 0444, NULL, &fragmentation_op);
-	proc_create_seq("pagetypeinfo", 0444, NULL, &pagetypeinfo_op);
+	proc_create_seq("pagetypeinfo", 0400, NULL, &pagetypeinfo_op);
 	proc_create_seq("vmstat", 0444, NULL, &vmstat_op);
 	proc_create_seq("zoneinfo", 0444, NULL, &zoneinfo_op);
 #endif

From patchwork Wed Nov 6 05:16:44 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11229309
Date: Tue, 05 Nov 2019 21:16:44 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, aquini@redhat.com, david@redhat.com,
 gregkh@linuxfoundation.org, guro@fb.com, hannes@cmpxchg.org,
 jannh@google.com, khlebnikov@yandex-team.ru, linux-mm@kvack.org,
 longman@redhat.com, mgorman@suse.de, mhocko@suse.com,
 mm-commits@vger.kernel.org, rientjes@google.com, songliubraving@fb.com,
 torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 08/17] mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo
Message-ID: <20191106051644.jR9CWg6LN%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Michal Hocko
Subject: mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo

pagetypeinfo_showfree_print is called with zone->lock held in irq mode.
This is not really nice because it blocks both any interrupts on that cpu
and the page allocator. On large machines this might even trigger the
hard lockup detector.
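
One way to bound the time spent walking a list under such a lock is to stop after a fixed number of entries and report that the count overflowed. The following is a minimal userspace sketch of that idea only; `struct node`, `count_capped` and `link_nodes` are hypothetical names, there is no real lock involved, and the cap is a parameter rather than any value taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical singly linked list node standing in for a free_list entry. */
struct node {
	struct node *next;
};

struct capped_count {
	unsigned long count;	/* entries seen, at most cap */
	bool overflow;		/* true if the walk stopped at the cap */
};

/*
 * Count list entries but give up once cap entries have been seen, so the
 * walk takes bounded time no matter how long the list is.
 */
static struct capped_count count_capped(const struct node *head,
					unsigned long cap)
{
	struct capped_count r = { 0, false };

	assert(cap > 0);
	for (const struct node *n = head; n; n = n->next) {
		if (++r.count >= cap) {
			r.overflow = true;
			break;
		}
	}
	return r;
}

/* Link an array of n nodes into a list and return its head (NULL if n == 0). */
static struct node *link_nodes(struct node *nodes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
	return n ? &nodes[0] : NULL;
}
```

A caller would print the count with a ">" prefix when `overflow` is set, so the reader knows the number is a lower bound rather than an exact value.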
Considering the pagetypeinfo is a debugging tool we do not really need exact numbers here. The primary reason to look at the outuput is to see how pageblocks are spread among different migratetypes and low number of pages is much more interesting therefore putting a bound on the number of pages on the free_list sounds like a reasonable tradeoff. The new output will simply tell [...] Node 6, zone Normal, type Movable >100000 >100000 >100000 >100000 41019 31560 23996 10054 3229 983 648 instead of Node 6, zone Normal, type Movable 399568 294127 221558 102119 41019 31560 23996 10054 3229 983 648 The limit has been chosen arbitrary and it is a subject of a future change should there be a need for that. While we are at it, also drop the zone lock after each free_list iteration which will help with the IRQ and page allocator responsiveness even further as the IRQ lock held time is always bound to those 100k pages. [akpm@linux-foundation.org: tweak comment text, per David Hildenbrand] Link: http://lkml.kernel.org/r/20191025072610.18526-3-mhocko@kernel.org Signed-off-by: Michal Hocko Suggested-by: Andrew Morton Reviewed-by: Waiman Long Acked-by: Vlastimil Babka Acked-by: David Hildenbrand Acked-by: Rafael Aquini Acked-by: David Rientjes Reviewed-by: Andrew Morton Cc: Greg Kroah-Hartman Cc: Jann Horn Cc: Johannes Weiner Cc: Konstantin Khlebnikov Cc: Mel Gorman Cc: Roman Gushchin Cc: Song Liu Signed-off-by: Andrew Morton --- mm/vmstat.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) --- a/mm/vmstat.c~mm-vmstat-reduce-zone-lock-holding-time-by-proc-pagetypeinfo +++ a/mm/vmstat.c @@ -1383,12 +1383,29 @@ static void pagetypeinfo_showfree_print( unsigned long freecount = 0; struct free_area *area; struct list_head *curr; + bool overflow = false; area = &(zone->free_area[order]); - list_for_each(curr, &area->free_list[mtype]) - freecount++; - seq_printf(m, "%6lu ", freecount); + list_for_each(curr, &area->free_list[mtype]) { + /* + * Cap the 
free_list iteration because it might + * be really large and we are under a spinlock + * so a long time spent here could trigger a + * hard lockup detector. Anyway this is a + * debugging tool so knowing there is a handful + * of pages of this order should be more than + * sufficient. + */ + if (++freecount >= 100000) { + overflow = true; + break; + } + } + seq_printf(m, "%s%6lu ", overflow ? ">" : "", freecount); + spin_unlock_irq(&zone->lock); + cond_resched(); + spin_lock_irq(&zone->lock); } seq_putc(m, '\n'); } From patchwork Wed Nov 6 05:16:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229311 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 748601515 for ; Wed, 6 Nov 2019 05:16:52 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 26F8A21D7E for ; Wed, 6 Nov 2019 05:16:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="qu4EO8ty" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 26F8A21D7E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1C0316B0271; Wed, 6 Nov 2019 00:16:51 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 16EDF6B0272; Wed, 6 Nov 2019 00:16:51 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0ABAA6B0273; Wed, 6 Nov 2019 00:16:51 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: 
from forelay.hostedemail.com (smtprelay0029.hostedemail.com [216.40.44.29]) by kanga.kvack.org (Postfix) with ESMTP id E6A236B0271 for ; Wed, 6 Nov 2019 00:16:50 -0500 (EST) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with SMTP id 67E33180AD815 for ; Wed, 6 Nov 2019 05:16:50 +0000 (UTC) X-FDA: 76124692980.29.bath13_3baa18f18c638 X-Spam-Summary: 2,0,0,be578525afeac8bc,d41d8cd98f00b204,akpm@linux-foundation.org,:aarcange@redhat.com:akpm@linux-foundation.org:bp@alien8.de:daniel.vetter@intel.com:hpa@zytor.com:ira.weiny@intel.com:jgg@mellanox.com:jglisse@redhat.com:kirill.shutemov@linux.intel.com::mingo@redhat.com:mm-commits@vger.kernel.org:rcampbell@nvidia.com:stable@vger.kernel.org:tglx@linutronix.de:torvalds@linux-foundation.org:ville.syrjala@linux.intel.com,RULES_HIT:41:152:355:379:800:960:967:968:973:988:989:1260:1263:1277:1311:1313:1314:1345:1381:1431:1437:1513:1515:1516:1518:1521:1534:1543:1593:1594:1711:1730:1747:1777:1792:2198:2199:2393:2525:2559:2563:2682:2685:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3167:3353:3865:3867:3868:3871:3872:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:5007:6261:6653:6737:7576:8599:8660:9025:9163:9545:9592:10004:10400:10913:11026:11473:11658:11914:12043:12048:12296:12297:12438:12517:12519:12555:12679:12740:12783:12895:12 986:1314 X-HE-Tag: bath13_3baa18f18c638 X-Filterd-Recvd-Size: 4712 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf18.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:16:49 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 5E07521929; Wed, 6 Nov 2019 05:16:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017409; 
bh=2192o0EalrK7xnxZaWr2apSoUfecLosbzkrrzZmsLuo=; h=Date:From:To:Subject:From; b=qu4EO8tysU1sVCgnyQobZXDSGbf5Kbhb1WXByJqVNJv99l7UdPAJwkBb0jEQgdOpN WiTiN0nCewepzq3swOOToFuFPikNwEEGyNtIP90wjptko1s3wKwnnsl1qeeHir2RX0 pKBwAk/HVz0BKoAXPjWjWBuzobdho0+ekqG/Ja7k= Date: Tue, 05 Nov 2019 21:16:48 -0800 From: akpm@linux-foundation.org To: aarcange@redhat.com, akpm@linux-foundation.org, bp@alien8.de, daniel.vetter@intel.com, hpa@zytor.com, ira.weiny@intel.com, jgg@mellanox.com, jglisse@redhat.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mingo@redhat.com, mm-commits@vger.kernel.org, rcampbell@nvidia.com, stable@vger.kernel.org, tglx@linutronix.de, torvalds@linux-foundation.org, ville.syrjala@linux.intel.com Subject: [patch 09/17] mm/khugepaged: fix might_sleep() warn with CONFIG_HIGHPTE=y Message-ID: <20191106051648.GjLXleKr-%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Ville Syrjälä
Subject: mm/khugepaged: fix might_sleep() warn with CONFIG_HIGHPTE=y

I got some khugepaged spew on a 32bit x86:

[ 217.490026] BUG: sleeping function called from invalid context at include/linux/mmu_notifier.h:346
[ 217.492826] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 25, name: khugepaged
[ 217.495589] INFO: lockdep is turned off.
[ 217.498371] CPU: 1 PID: 25 Comm: khugepaged Not tainted 5.4.0-rc5-elk+ #206
[ 217.501233] Hardware name: System manufacturer P5Q-EM/P5Q-EM, BIOS 2203 07/08/2009
[ 217.501697] Call Trace:
[ 217.501697] dump_stack+0x66/0x8e
[ 217.501697] ___might_sleep.cold.96+0x95/0xa6
[ 217.501697] __might_sleep+0x2e/0x80
[ 217.501697] collapse_huge_page.isra.51+0x5ac/0x1360
[ 217.501697] ? __alloc_pages_nodemask+0xec/0xf80
[ 217.501697] ? __alloc_pages_nodemask+0x191/0xf80
[ 217.501697] ? trace_hardirqs_on+0x4a/0xf0
[ 217.501697] khugepaged+0x9a9/0x20f0
[ 217.501697] ? _raw_spin_unlock+0x21/0x30
[ 217.501697] ? trace_hardirqs_on+0x4a/0xf0
[ 217.501697] ? wait_woken+0xa0/0xa0
[ 217.501697] kthread+0xf5/0x110
[ 217.501697] ? collapse_pte_mapped_thp+0x3b0/0x3b0
[ 217.501697] ? kthread_create_worker_on_cpu+0x20/0x20
[ 217.501697] ret_from_fork+0x2e/0x38

Looks like it's due to CONFIG_HIGHPTE=y pte_offset_map()->kmap_atomic()
vs. mmu_notifier_invalidate_range_start(). Let's do the naive approach
and just reorder the two operations.

Link: http://lkml.kernel.org/r/20191029201513.GG1208@intel.com
Fixes: 810e24e009cf71 ("mm/mmu_notifiers: annotate with might_sleep()")
Signed-off-by: Ville Syrjälä
Reviewed-by: Andrew Morton
Acked-by: Kirill A. Shutemov
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Jérôme Glisse
Cc: Ralph Campbell
Cc: Ira Weiny
Cc: Jason Gunthorpe
Cc: Daniel Vetter
Cc: Andrea Arcangeli
Cc:
Signed-off-by: Andrew Morton
---

 mm/khugepaged.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/mm/khugepaged.c~khugepaged-might_sleep-warn-due-to-config_highpte=y
+++ a/mm/khugepaged.c
@@ -1028,12 +1028,13 @@ static void collapse_huge_page(struct mm
 
 	anon_vma_lock_write(vma->anon_vma);
 
-	pte = pte_offset_map(pmd, address);
-	pte_ptl = pte_lockptr(mm, pmd);
-
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
 				address, address + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
+
+	pte = pte_offset_map(pmd, address);
+	pte_ptl = pte_lockptr(mm, pmd);
+
 	pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */
 	/*
 	 * After this gup_fast can't run anymore.
This also removes From patchwork Wed Nov 6 05:16:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229313 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 82D561515 for ; Wed, 6 Nov 2019 05:16:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 39E97206A3 for ; Wed, 6 Nov 2019 05:16:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="tW/rhAWT" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 39E97206A3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 05ED66B0272; Wed, 6 Nov 2019 00:16:54 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id F29C16B0273; Wed, 6 Nov 2019 00:16:53 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DF2EF6B0274; Wed, 6 Nov 2019 00:16:53 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0043.hostedemail.com [216.40.44.43]) by kanga.kvack.org (Postfix) with ESMTP id C87416B0272 for ; Wed, 6 Nov 2019 00:16:53 -0500 (EST) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with SMTP id 5DB76180AD815 for ; Wed, 6 Nov 2019 05:16:53 +0000 (UTC) X-FDA: 76124693106.08.list37_3c1bfb0b46e25 X-Spam-Summary: 
2,0,0,f5839ff6717ed1bd,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:hannes@cmpxchg.org::mm-commits@vger.kernel.org:rientjes@google.com:torvalds@linux-foundation.org,RULES_HIT:41:355:379:800:960:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1542:1711:1730:1747:1777:1792:2393:2525:2559:2563:2682:2685:2859:2892:2899:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3353:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4250:4321:4605:5007:6261:6653:7576:8599:9025:9545:9592:10004:10913:11026:11233:11473:11658:11914:12043:12048:12296:12297:12438:12517:12519:12555:12679:12783:12986:13161:13191:13192:13208:13229:13255:13846:14096:14181:14721:14849:21067:21080:21451:21627:21939:30054:30075,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: list37_3c1bfb0b46e25 X-Filterd-Recvd-Size: 3443 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf24.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:16:52 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id C3480217F4; Wed, 6 Nov 2019 05:16:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017412; bh=tJgsivYSCfoAHzi6vjmLxrRm7ADXpw5N7uKhdHCbwcI=; h=Date:From:To:Subject:From; b=tW/rhAWTOB81wSL//5+7LlgQeLHtALutNrxWMy/GkSzaAf/xmyixIFi5J1amuKDEs fRrAB6N6TPqIqcOgwYwMZZ7zlrSqqG11vFZCGhUKxOMsfFwui4Yg/X3ZNx8k0Z/BK+ wPPGdRWyCuXAarh1c1LtKyXVVdnWrKiQ7N0Pxqbs= Date: Tue, 05 Nov 2019 21:16:51 -0800 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, rientjes@google.com, 
torvalds@linux-foundation.org Subject: [patch 10/17] mm/page_alloc.c: ratelimit allocation failure warnings more aggressively Message-ID: <20191106051651.PJvLJuvA-%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Johannes Weiner
Subject: mm/page_alloc.c: ratelimit allocation failure warnings more aggressively

While investigating a bug related to higher atomic allocation failures,
we noticed the failure warnings positively drowning the console, and in
our case triggering lockup warnings because of a serial console too slow
to handle all that output.

But even if we had a faster console, it's unclear what additional
information the current level of repetition provides.

Allocation failures happen for three reasons: The machine is OOM, the VM
is failing to handle reasonable requests, or somebody is making
unreasonable requests (and didn't acknowledge their opportunism with
__GFP_NOWARN). Having the memory dump, a callstack, and the ratelimit
stats on skipped failure warnings should provide enough information to
let users/admins/developers know whether something is wrong and point
them in the right direction for debugging, bpftracing etc.

Limit allocation failure warnings to 1 spew every ten seconds.
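
For readers unfamiliar with interval/burst limiting, the effect of "1 spew every ten seconds" can be sketched in userspace as below. This is only a simplified model and not the kernel's ___ratelimit() implementation: struct rl_state, rl_allow() and the explicit tick argument are hypothetical names made up for this illustration, and the real code also takes a lock and reports the suppressed count when a new window begins.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of an interval/burst rate limiter.
 * The field names loosely echo the kernel's ratelimit state, but this
 * is an illustration only, not the kernel data structure.
 */
struct rl_state {
	unsigned long interval;	/* window length in ticks (e.g. 10 * HZ) */
	int burst;		/* messages allowed per window */
	int printed;		/* messages allowed so far in this window */
	int missed;		/* messages suppressed so far in this window */
	unsigned long begin;	/* tick at which the current window started */
	bool started;		/* false until the first call */
};

/* Return true if a message may be emitted at time `now` (in ticks). */
bool rl_allow(struct rl_state *rs, unsigned long now)
{
	/* Open a new window on first use or once the interval has elapsed. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	/* Over budget for this window: count the suppressed message. */
	rs->missed++;
	return false;
}
```

With interval set to ten seconds' worth of ticks and burst set to 1, the first warning in each window passes and every later one in that window is only counted.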
Link: http://lkml.kernel.org/r/20191028194906.26899-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner
Acked-by: David Rientjes
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

--- a/mm/page_alloc.c~mm-rate-limit-allocation-failure-warnings-more-aggressively
+++ a/mm/page_alloc.c
@@ -3728,10 +3728,6 @@ try_this_zone:
 static void warn_alloc_show_mem(gfp_t gfp_mask, nodemask_t *nodemask)
 {
 	unsigned int filter = SHOW_MEM_FILTER_NODES;
-	static DEFINE_RATELIMIT_STATE(show_mem_rs, HZ, 1);
-
-	if (!__ratelimit(&show_mem_rs))
-		return;
 
 	/*
 	 * This documents exceptions given to allocations in certain
@@ -3752,8 +3748,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask
 {
 	struct va_format vaf;
 	va_list args;
-	static DEFINE_RATELIMIT_STATE(nopage_rs, DEFAULT_RATELIMIT_INTERVAL,
-				      DEFAULT_RATELIMIT_BURST);
+	static DEFINE_RATELIMIT_STATE(nopage_rs, 10*HZ, 1);
 
 	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs))
 		return;

From patchwork Wed Nov 6 05:16:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229315 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4CE6316B1 for ; Wed, 6 Nov 2019 05:16:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0F00D217F5 for ; Wed, 6 Nov 2019 05:16:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="xC3b6gVM" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0F00D217F5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A972A6B0273; Wed, 6
Nov 2019 00:16:56 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id A47196B0274; Wed, 6 Nov 2019 00:16:56 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 93C0E6B0275; Wed, 6 Nov 2019 00:16:56 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0166.hostedemail.com [216.40.44.166]) by kanga.kvack.org (Postfix) with ESMTP id 7BF296B0273 for ; Wed, 6 Nov 2019 00:16:56 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with SMTP id 494F145B3 for ; Wed, 6 Nov 2019 05:16:56 +0000 (UTC) X-FDA: 76124693232.15.front90_3c893e6acc447 X-Spam-Summary: 2,0,0,52d2f258556c709c,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:ddstreet@ieee.org::mm-commits@vger.kernel.org:torvalds@linux-foundation.org:vitaly.wool@konsulko.com,RULES_HIT:41:355:379:800:960:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1540:1568:1711:1714:1730:1747:1777:1792:2393:2525:2559:2563:2682:2685:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3865:3867:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:5007:6119:6261:6653:7576:7809:8599:9010:9012:9025:9545:10004:10913:11658:11914:12043:12048:12297:12517:12519:12555:12679:12783:12986:13069:13311:13357:13846:14094:14181:14384:14721:14849:21080:21433:21451:21627:21939:30054,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:2,LUA_SUMMARY:none X-HE-Tag: front90_3c893e6acc447 X-Filterd-Recvd-Size: 2117 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf32.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:16:55 +0000 (UTC) Received: from localhost.localdomain 
(c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id BC29E217F4; Wed, 6 Nov 2019 05:16:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017415; bh=uxmBhGRnwk2VqtjiFFY+gqKrFKtfPjb8v8L6W34jA2E=; h=Date:From:To:Subject:From; b=xC3b6gVMkYu0a/i7NV77A2cCqCQ02ix6fsiHrwe4YeBYLG6PZoJowO8YMlrBAdMa+ SnpML2Us3xJDfUo7s5SYYcar4csQD5SwejgSEPI3qZ0ir3UthBuH6p3gA6vOQZQFRI 1n6gV9xQU4v89q2LTIuU3lC3PSjtDvkrYZAOWWi8= Date: Tue, 05 Nov 2019 21:16:54 -0800 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, ddstreet@ieee.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, vitaly.wool@konsulko.com Subject: [patch 11/17] zswap: add Vitaly to the maintainers list Message-ID: <20191106051654.ZE31IeF95%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Vitaly Wool Subject: zswap: add Vitaly to the maintainers list Per conversation with Dan, add myself to the zswap MAINTAINERS list. 
Link: http://lkml.kernel.org/r/20191028143154.31304-1-vitaly.wool@konsulko.com Signed-off-by: Vitaly Wool Acked-by: Dan Streetman Acked-by: Andrew Morton Signed-off-by: Andrew Morton --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) --- a/MAINTAINERS~zswap-add-myself-to-the-maintainers-list +++ a/MAINTAINERS @@ -18034,6 +18034,7 @@ F: Documentation/vm/zsmalloc.rst ZSWAP COMPRESSED SWAP CACHING M: Seth Jennings M: Dan Streetman +M: Vitaly Wool L: linux-mm@kvack.org S: Maintained F: mm/zswap.c From patchwork Wed Nov 6 05:16:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229317 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 87F1015AB for ; Wed, 6 Nov 2019 05:17:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 495E2206A3 for ; Wed, 6 Nov 2019 05:17:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="qFyuIoms" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 495E2206A3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1492B6B0274; Wed, 6 Nov 2019 00:17:00 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 0F9DA6B0275; Wed, 6 Nov 2019 00:17:00 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0105C6B0276; Wed, 6 Nov 2019 00:16:59 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from 
forelay.hostedemail.com (smtprelay0008.hostedemail.com [216.40.44.8]) by kanga.kvack.org (Postfix) with ESMTP id E0D3D6B0274 for ; Wed, 6 Nov 2019 00:16:59 -0500 (EST) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with SMTP id 5F7928249980 for ; Wed, 6 Nov 2019 05:16:59 +0000 (UTC) X-FDA: 76124693358.19.light51_3cfc575666358 X-Spam-Summary: 2,0,0,cdaaa0825d87fbcc,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:haokexin@gmail.com::mm-commits@vger.kernel.org:stable@vger.kernel.org:torvalds@linux-foundation.org,RULES_HIT:41:355:379:800:960:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1541:1711:1730:1747:1777:1792:2198:2199:2393:2525:2559:2563:2682:2685:2693:2731:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3352:3865:3866:3867:3870:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:5007:6261:6653:7514:7576:8599:9025:9545:10004:10913:11026:11658:11914:12043:12048:12296:12297:12517:12519:12555:12679:12783:12986:13069:13311:13357:13846:14181:14384:14721:14849:21080:21324:21451:21627:21939:30012:30054,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: light51_3cfc575666358 X-Filterd-Recvd-Size: 2766 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf06.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:16:58 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id CEB86206A3; Wed, 6 Nov 2019 05:16:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017418; bh=6SoXQmpFL1pLJ30tuGrZg1GHHjTEWHo92t9q3pvJXso=; h=Date:From:To:Subject:From; 
b=qFyuIomsckNtmr2V/sFtUsF2WgW8/6DJAnwvA7qMTvar9GKsrLHaXBo1ifEDMnYWa CB16Y8WUsxV3shC3g0JDuhy5ZXb+tA7EmZ/CJvJ/u8Fu9fRzpnq47CZ/ZLJ0KwJPej gmD8/cjcCyY3uZjHAJ3z8jvCbMlemBbM/rlPEL1I= Date: Tue, 05 Nov 2019 21:16:57 -0800 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, haokexin@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 12/17] dump_stack: avoid the livelock of the dump_lock Message-ID: <20191106051657.fRGxFzH1B%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Kevin Hao
Subject: dump_stack: avoid the livelock of the dump_lock

In the current code, we use atomic_cmpxchg() to serialize the output of
dump_stack(), but this implementation suffers from the thundering herd
problem. We have observed this kind of livelock on a Marvell cn96xx
board (24 CPUs) when heavily using dump_stack() in a kprobe handler.

Instead, we can let the competitors wait for the lock to be released
before jumping to atomic_cmpxchg(). This mitigates the thundering herd
problem. Thanks to Linus for the suggestion.

[akpm@linux-foundation.org: fix comment]
Link: http://lkml.kernel.org/r/20191030031637.6025-1-haokexin@gmail.com
Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()")
Signed-off-by: Kevin Hao
Suggested-by: Linus Torvalds
Cc:
Signed-off-by: Andrew Morton
---

 lib/dump_stack.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/lib/dump_stack.c~dump_stack-avoid-the-livelock-of-the-dump_lock
+++ a/lib/dump_stack.c
@@ -106,7 +106,12 @@ retry:
 		was_locked = 1;
 	} else {
 		local_irq_restore(flags);
-		cpu_relax();
+		/*
+		 * Wait for the lock to release before jumping to
+		 * atomic_cmpxchg() in order to mitigate the thundering herd
+		 * problem.
+		 */
+		do { cpu_relax(); } while (atomic_read(&dump_lock) != -1);
 		goto retry;
 	}
 

From patchwork Wed Nov 6 05:17:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229319 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A902F1515 for ; Wed, 6 Nov 2019 05:17:04 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6B172206A3 for ; Wed, 6 Nov 2019 05:17:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="D+u0L/TY" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6B172206A3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 05CC96B0275; Wed, 6 Nov 2019 00:17:03 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 00DA46B0276; Wed, 6 Nov 2019 00:17:02 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E65CC6B0277; Wed, 6 Nov 2019 00:17:02 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0058.hostedemail.com [216.40.44.58]) by kanga.kvack.org (Postfix) with ESMTP id CF1E46B0275 for ; Wed, 6 Nov 2019 00:17:02 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 85B0140C2 for ; Wed, 6 Nov 2019 05:17:02 +0000 (UTC) X-FDA: 76124693484.24.oven41_3d6e21cc46f61 X-Spam-Summary:
50,0,0,edf8502a04513a27,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:hannes@cmpxchg.org::mm-commits@vger.kernel.org:songliubraving@fb.com:torvalds@linux-foundation.org,RULES_HIT:41:355:379:800:960:967:973:981:988:989:1260:1263:1345:1381:1431:1437:1534:1541:1711:1730:1747:1777:1792:2393:2525:2566:2682:2685:2859:2892:2895:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3352:3865:3867:3868:3870:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:4362:5007:6119:6261:6653:7576:7903:8599:9025:9545:10004:10913:11658:11914:12043:12048:12297:12517:12519:12555:12679:12783:12986:13069:13311:13357:13846:14093:14181:14384:14721:14764:14849:21080:21324:21451:21627:21939:30029:30054:30064:30070:30075,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:25,LUA_SUMMARY:none X-HE-Tag: oven41_3d6e21cc46f61 X-Filterd-Recvd-Size: 2684 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf21.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:17:01 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id D43AC217F4; Wed, 6 Nov 2019 05:17:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017421; bh=DGdH11QtFAcnUCSS9W9KBKLZc9Cw3FK6o6w3l4k5jmI=; h=Date:From:To:Subject:From; b=D+u0L/TYm/mut2FYNl7m41ntYt9NK9dYcO251YsYBDlMcEMRBjAVkSuXr0u6gW8iC F2p3COeYRCx7sXTHhfvPGeya8xROA/pdniuglPutW5Uibwl2yQNBWgHgVSyXonBj6F OIzCDhXE43Kk3r05cl+OOmIyQ2mUagIINoex40fA= Date: Tue, 05 Nov 2019 21:17:00 -0800 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, songliubraving@fb.com, torvalds@linux-foundation.org Subject: [patch 
13/17] MAINTAINERS: update information for "MEMORY MANAGEMENT" Message-ID: <20191106051700.3aS7_uGu4%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Song Liu
Subject: MAINTAINERS: update information for "MEMORY MANAGEMENT"

I was trying to find the mm tree in MAINTAINERS by searching "Morton".
Unfortunately, I didn't find one. And I didn't even locate the MEMORY
MANAGEMENT section quickly, because Andrew's name was not listed there.

Thanks to Johannes who helped me find the mm tree.

Let's save others' time searching around by adding:

M: Andrew Morton
T: git git://github.com/hnaz/linux-mm.git

[akpm@linux-foundation.org: add ozlabs.org quilt trees]
Link: http://lkml.kernel.org/r/20191030202217.3498133-1-songliubraving@fb.com
Signed-off-by: Song Liu
Acked-by: Andrew Morton
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
---

 MAINTAINERS | 4 ++++
 1 file changed, 4 insertions(+)

--- a/MAINTAINERS~maintainers-update-information-for-memory-management
+++ a/MAINTAINERS
@@ -10519,8 +10519,12 @@ F: mm/memblock.c
 F:	Documentation/core-api/boot-time-mm.rst
 
 MEMORY MANAGEMENT
+M:	Andrew Morton
 L:	linux-mm@kvack.org
 W:	http://www.linux-mm.org
+T:	quilt https://ozlabs.org/~akpm/mmotm/
+T:	quilt https://ozlabs.org/~akpm/mmots/
+T:	git git://github.com/hnaz/linux-mm.git
 S:	Maintained
 F:	include/linux/mm.h
 F:	include/linux/gfp.h

From patchwork Wed Nov 6 05:17:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229321 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C62B815AB for ; Wed, 6 Nov 2019 05:17:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org
(Postfix) with ESMTP id 86C26206A3 for ; Wed, 6 Nov 2019 05:17:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="CrzJpAaI" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 86C26206A3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 5B7206B0276; Wed, 6 Nov 2019 00:17:06 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 540806B0277; Wed, 6 Nov 2019 00:17:06 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 454EB6B0278; Wed, 6 Nov 2019 00:17:06 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 305D16B0276 for ; Wed, 6 Nov 2019 00:17:06 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with SMTP id D997B181AEF0B for ; Wed, 6 Nov 2019 05:17:05 +0000 (UTC) X-FDA: 76124693610.03.test78_3de8f68c19d5f X-Spam-Summary: 
2,0,0,f89a76b30958b549,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:daniel.m.jordan@oracle.com:guro@fb.com::mm-commits@vger.kernel.org:n-horiguchi@ah.jp.nec.com:rientjes@google.com:shakeelb@google.com:stable@vger.kernel.org:torvalds@linux-foundation.org:vdavydov.dev@gmail.com,RULES_HIT:41:355:379:800:960:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1543:1711:1730:1747:1777:1792:1981:2194:2198:2199:2200:2393:2525:2559:2563:2682:2685:2689:2731:2859:2900:2902:2903:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3354:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:4605:4860:5007:6261:6653:6737:7514:7576:7903:8599:8957:9025:9545:10004:10913:11026:11658:11914:12043:12048:12291:12294:12296:12297:12438:12517:12519:12555:12679:12683:12783:12986:13255:14181:14721:14849:21060:21080:21451:21627:21795:21796:21939:30036:30051:30054:30056:30064:30070,0,RBL:error,CacheIP:none,B ayesian: X-HE-Tag: test78_3de8f68c19d5f X-Filterd-Recvd-Size: 4499 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf45.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:17:05 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id EE5F4217F4; Wed, 6 Nov 2019 05:17:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017424; bh=KUmijE95Xvc0HyyoAL7SGK4urv4VVLqUFykkOHkDeN4=; h=Date:From:To:Subject:From; b=CrzJpAaINSLmpt7y2iMrrbVFTsdlRY8/4lhKyO8a7tBXi5fAEVM6qBFnM7O9recn8 43wLxeqldQMRsyn0MXb+E0jtkn0kWvkQNIc0k7YpOHjAGxPNj6ejLCGQ6Ga/cUaSSS Xq24zWCNA9ezeqFu2oYygITP/piZVU8wfms8hhBk= Date: Tue, 05 Nov 2019 21:17:03 -0800 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, daniel.m.jordan@oracle.com, guro@fb.com, linux-mm@kvack.org, 
mm-commits@vger.kernel.org, n-horiguchi@ah.jp.nec.com, rientjes@google.com, shakeelb@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org, vdavydov.dev@gmail.com Subject: [patch 14/17] mm: slab: make page_cgroup_ino() to recognize non-compound slab pages properly Message-ID: <20191106051703.QdQBp46KO%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Roman Gushchin
Subject: mm: slab: make page_cgroup_ino() to recognize non-compound slab pages properly

page_cgroup_ino() doesn't return a valid memcg pointer for non-compound
slab pages, because it depends on both the PgHead AND PgSlab flags being
set to determine the memory cgroup from the kmem_cache. That is correct
for compound pages, but not for generic small pages. Those don't have
PgHead set, so it ends up returning zero.

Fix this by changing the condition to PageSlab() && !PageTail().

Before this patch:
[root@localhost ~]# ./page-types -c /sys/fs/cgroup/user.slice/user-0.slice/user@0.service/ | grep slab
0x0000000000000080 38 0 _______S___________________________________ slab

After this patch:
[root@localhost ~]# ./page-types -c /sys/fs/cgroup/user.slice/user-0.slice/user@0.service/ | grep slab
0x0000000000000080 147 0 _______S___________________________________ slab

Also, hwpoison_filter_task() uses the output of page_cgroup_ino() in
order to filter error injection events based on memcg. So if
page_cgroup_ino() fails to return the memcg pointer, we just fail to
inject the memory error. Considering that the hwpoison filter is for
testing, affected users are limited and the impact should be marginal.
[n-horiguchi@ah.jp.nec.com: changelog additions]
Link: http://lkml.kernel.org/r/20191031012151.2722280-1-guro@fb.com
Fixes: 4d96ba353075 ("mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages")
Signed-off-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Acked-by: David Rientjes
Cc: Vladimir Davydov
Cc: Daniel Jordan
Cc: Naoya Horiguchi
Cc:
Signed-off-by: Andrew Morton
---

 mm/memcontrol.c | 2 +-
 mm/slab.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

--- a/mm/memcontrol.c~mm-slab-make-page_cgroup_ino-to-recognize-non-compound-slab-pages-properly
+++ a/mm/memcontrol.c
@@ -484,7 +484,7 @@ ino_t page_cgroup_ino(struct page *page)
 	unsigned long ino = 0;
 
 	rcu_read_lock();
-	if (PageHead(page) && PageSlab(page))
+	if (PageSlab(page) && !PageTail(page))
 		memcg = memcg_from_slab_page(page);
 	else
 		memcg = READ_ONCE(page->mem_cgroup);
--- a/mm/slab.h~mm-slab-make-page_cgroup_ino-to-recognize-non-compound-slab-pages-properly
+++ a/mm/slab.h
@@ -323,8 +323,8 @@ static inline struct kmem_cache *memcg_r
  * Expects a pointer to a slab page. Please note, that PageSlab() check
  * isn't sufficient, as it returns true also for tail compound slab pages,
  * which do not have slab_cache pointer set.
- * So this function assumes that the page can pass PageHead() and PageSlab()
- * checks.
+ * So this function assumes that the page can pass PageSlab() && !PageTail()
+ * check.
  *
  * The kmem_cache can be reparented asynchronously. The caller must ensure
  * the memcg lifetime, e.g. by taking rcu_read_lock() or cgroup_mutex.
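
The difference between the old and new conditions can be seen in a small userspace model. This is a sketch for illustration only: struct model_page, old_check() and new_check() are hypothetical names, and the three booleans merely stand in for what PageSlab(), PageHead() and PageTail() would report for a real struct page.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant page flags; not struct page. */
struct model_page {
	bool slab;	/* what PageSlab() would return */
	bool head;	/* what PageHead() would return (compound head) */
	bool tail;	/* what PageTail() would return (compound tail) */
};

/* Old condition: only compound head pages of a slab qualify. */
bool old_check(const struct model_page *p)
{
	return p->head && p->slab;
}

/* New condition: any slab page that is not a compound tail qualifies. */
bool new_check(const struct model_page *p)
{
	return p->slab && !p->tail;
}
```

In this model an order-0 slab page (slab set, neither head nor tail) fails the old check and passes the new one, while compound head pages pass both and compound tail pages pass neither.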
From patchwork Wed Nov 6 05:17:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11229323 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A336015AB for ; Wed, 6 Nov 2019 05:17:10 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 635DD206A3 for ; Wed, 6 Nov 2019 05:17:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="y7+Vvw2P" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 635DD206A3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 4C97C6B0277; Wed, 6 Nov 2019 00:17:09 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 47CA76B0278; Wed, 6 Nov 2019 00:17:09 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3BD196B0279; Wed, 6 Nov 2019 00:17:09 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0076.hostedemail.com [216.40.44.76]) by kanga.kvack.org (Postfix) with ESMTP id 21AA16B0277 for ; Wed, 6 Nov 2019 00:17:09 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with SMTP id CF4F78249980 for ; Wed, 6 Nov 2019 05:17:08 +0000 (UTC) X-FDA: 76124693736.23.start60_3e5c793de7e20 X-Spam-Summary: 
2,0,0,d72fcd85f000bbb8,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:gor@linux.ibm.com:heiko.carstens@de.ibm.com:iii@linux.ibm.com:jan.kiszka@siemens.com:kbingham@kernel.org::mm-commits@vger.kernel.org:torvalds@linux-foundation.org,RULES_HIT:41:355:379:800:960:967:973:982:988:989:1260:1263:1345:1381:1431:1437:1534:1542:1711:1730:1747:1777:1792:1801:2198:2199:2393:2525:2553:2559:2563:2682:2685:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3353:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4250:4321:4605:5007:6119:6261:6653:7576:7903:7904:8599:9010:9025:9165:9545:10004:10913:11026:11658:11914:12043:12048:12296:12297:12438:12517:12519:12555:12679:12783:12986:13255:14181:14721:14849:21063:21080:21433:21451:21627:21939:30003:30012:30029:30054:30090,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Cus tom_rule X-HE-Tag: start60_3e5c793de7e20 X-Filterd-Recvd-Size: 3600 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf41.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:17:08 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 251DA217F5; Wed, 6 Nov 2019 05:17:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017427; bh=8SvU4u+s/tnq1OYQoJXJmctCA/m45Lj3AfUNfrRsPAc=; h=Date:From:To:Subject:From; b=y7+Vvw2P3txoI8ky7Sqw1NCSTbS6UukePPWwdlL3ryIxH5H6/ccvq30GcTye8/IhW obrBbqBUMs2B+Gq5mTSsZYRyJJ9di/qOjzQj0kIRPG3w4xEph6UU6fW64uY/ow3a5N qn1p+rGqa7focscb42NSnHD6Wp7WUBZN9/kzXXvc= Date: Tue, 05 Nov 2019 21:17:06 -0800 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, gor@linux.ibm.com, heiko.carstens@de.ibm.com, iii@linux.ibm.com, 
jan.kiszka@siemens.com, kbingham@kernel.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 15/17] scripts/gdb: fix debugging modules compiled with hot/cold partitioning
Message-ID: <20191106051706.tvt1FBphW%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Ilya Leoshkevich
Subject: scripts/gdb: fix debugging modules compiled with hot/cold partitioning

gcc's -freorder-blocks-and-partition option makes it group frequently and
infrequently used code in .text.hot and .text.unlikely sections
respectively. At least when building modules on s390, this option is used
by default.

gdb assumes that all code is located in .text section, and that .text
section is located at module load address. With such modules this is no
longer the case: there is code in .text.hot and .text.unlikely, and either
of them might precede .text.

Fix by explicitly telling gdb the addresses of code sections.

It might be tempting to do this for all sections, not only the ones in the
white list. Unfortunately, gdb appears to have an issue, when telling it
about e.g. loadable .note.gnu.build-id section causes it to think that
non-loadable .note.Linux section is loaded at address 0, which in turn
causes NULL pointers to be resolved to bogus symbols. So keep using the
white list approach for the time being.
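
The white-list approach can be sketched outside of gdb as follows. This is a hedged illustration, not the script itself: `section_arguments`, `OLD_LIST`, `NEW_LIST` and the addresses are made up for the example, plain integers replace the gdb values that the real script reads from the module's section attributes, and the keyword arguments passed to format() are an assumption.

```python
def section_arguments(section_name_to_address, wanted):
    """Build the ' -s NAME ADDR' suffix for each wanted section that exists."""
    args = []
    for name in wanted:
        address = section_name_to_address.get(name)
        if address:
            # Assumption: addresses are plain ints, printed in hex.
            args.append(" -s {name} {addr}".format(name=name,
                                                    addr=hex(address)))
    return "".join(args)


# Section lists before and after the change described above.
OLD_LIST = [".data", ".data..read_mostly", ".rodata", ".bss"]
NEW_LIST = OLD_LIST + [".text", ".text.hot", ".text.unlikely"]

# Hypothetical module layout in which .text.unlikely precedes .text.
sections = {
    ".text.unlikely": 0x3FF80000000,
    ".text": 0x3FF80000400,
    ".data": 0x3FF80002000,
}

print(section_arguments(sections, OLD_LIST))
print(section_arguments(sections, NEW_LIST))
```

With the old list, no code section address is passed explicitly, so the debugger has to assume where code lives. With the new list, every code section that is present is named with its own address, and absent ones (here .text.hot) are simply skipped.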

Link: http://lkml.kernel.org/r/20191028152734.13065-1-iii@linux.ibm.com
Signed-off-by: Ilya Leoshkevich
Reviewed-by: Jan Kiszka
Cc: Kieran Bingham
Cc: Heiko Carstens
Cc: Vasily Gorbik
Signed-off-by: Andrew Morton
---

 scripts/gdb/linux/symbols.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/scripts/gdb/linux/symbols.py~scripts-gdb-fix-debugging-modules-compiled-with-hot-cold-partitioning
+++ a/scripts/gdb/linux/symbols.py
@@ -99,7 +99,8 @@ lx-symbols command."""
             attrs[n]['name'].string(): attrs[n]['address']
             for n in range(int(sect_attrs['nsections']))}
         args = []
-        for section_name in [".data", ".data..read_mostly", ".rodata", ".bss"]:
+        for section_name in [".data", ".data..read_mostly", ".rodata", ".bss",
+                             ".text", ".text.hot", ".text.unlikely"]:
             address = section_name_to_address.get(section_name)
             if address:
                 args.append(" -s {name} {addr}".format(

From patchwork Wed Nov 6 05:17:10 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11229325
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 398DC1515 for ; Wed, 6 Nov 2019 05:17:14 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EF27D217F4 for ; Wed, 6 Nov 2019 05:17:13 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="0Sa++pPY"
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EF27D217F4
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix) id E41206B0279; Wed, 6 Nov 2019 00:17:12 -0500 (EST)
Delivered-To: linux-mm-outgoing@kvack.org
Received: by kanga.kvack.org
(Postfix, from userid 40) id DCA0C6B027A; Wed, 6 Nov 2019 00:17:12 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CE05A6B027B; Wed, 6 Nov 2019 00:17:12 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0152.hostedemail.com [216.40.44.152]) by kanga.kvack.org (Postfix) with ESMTP id B833E6B0279 for ; Wed, 6 Nov 2019 00:17:12 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 406DA2C9D for ; Wed, 6 Nov 2019 05:17:12 +0000 (UTC) X-FDA: 76124693904.18.mark23_3ed9e80e16542 X-Spam-Summary: 2,0,0,5b229f3871aa5ca6,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:dan.j.williams@intel.com:david@redhat.com:gregkh@linuxfoundation.org::mhocko@suse.com:mm-commits@vger.kernel.org:osalvador@suse.de:pasha.tatashin@soleen.com:sfr@canb.auug.org.au:torvalds@linux-foundation.org,RULES_HIT:41:355:379:800:960:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1542:1711:1730:1747:1777:1792:2393:2525:2559:2563:2682:2685:2693:2859:2898:2901:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3353:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:5007:6119:6261:6653:6737:7576:8599:9025:9121:9545:10004:10913:11026:11658:11914:12043:12048:12297:12517:12519:12555:12679:12783:12986:13161:13229:13846:14181:14721:14849:21080:21451:21627:21740:21939:30054:30064:30070,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custo m_rules: X-HE-Tag: mark23_3ed9e80e16542 X-Filterd-Recvd-Size: 3553 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf22.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:17:11 +0000 (UTC) Received: from 
localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 63734218AE; Wed, 6 Nov 2019 05:17:10 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017430; bh=N+h2hKlaBEcnv05qpO0BORpjth17rFxRoqlqXF3K8mc=; h=Date:From:To:Subject:From; b=0Sa++pPYGIN4LC2BzAhjAKEk9nQFhJXXkM2pbqedNTLn8Xw0kwXFdSpFy0k3P7J6u WE+2FHzbjl2VaK8WRwEHmct21OL9Emg6xm8VpR7LciSpGcb4nddqG2qzvPBu6k4ROE l7AOvkD+IBdqIV11YsKxWelRT0ZahSD1W0+JIDuY=
Date: Tue, 05 Nov 2019 21:17:10 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, dan.j.williams@intel.com, david@redhat.com, gregkh@linuxfoundation.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, pasha.tatashin@soleen.com, sfr@canb.auug.org.au, torvalds@linux-foundation.org
Subject: [patch 16/17] mm/memory_hotplug: fix updating the node span
Message-ID: <20191106051710.8xneT0YIC%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: David Hildenbrand
Subject: mm/memory_hotplug: fix updating the node span

We recently started updating the node span based on the zone span to avoid
touching uninitialized memmaps.

Currently, we will always detect the node span to start at 0, meaning a
node can easily span too many pages. pgdat_is_empty() will still work
correctly if all zones span no pages. We should skip over all zones
without spanned pages and properly handle the first detected zone that
spans pages.

Unfortunately, in contrast to the zone span (/proc/zoneinfo), the node
span cannot easily be inspected and tested. The node span gives no real
guarantees when an architecture supports memory hotplug, meaning it can
easily contain holes or span pages of different nodes.

The node span is not really used after init on architectures that support
memory hotplug. E.g., we use it in mm/memory_hotplug.c:try_offline_node()
and in mm/kmemleak.c:kmemleak_scan(). These users seem to be fine.

Link: http://lkml.kernel.org/r/20191027222714.5313-1-david@redhat.com
Fixes: 00d6c019b5bc ("mm/memory_hotplug: don't access uninitialized memmaps in shrink_pgdat_span()")
Signed-off-by: David Hildenbrand
Cc: Michal Hocko
Cc: Oscar Salvador
Cc: Stephen Rothwell
Cc: Dan Williams
Cc: Pavel Tatashin
Cc: Greg Kroah-Hartman
Signed-off-by: Andrew Morton
---

 mm/memory_hotplug.c | 8 ++++++++
 1 file changed, 8 insertions(+)

--- a/mm/memory_hotplug.c~mm-memory_hotplug-fix-updating-the-node-span
+++ a/mm/memory_hotplug.c
@@ -447,6 +447,14 @@ static void update_pgdat_span(struct pgl
 					     zone->spanned_pages;
 
 		/* No need to lock the zones, they can't change. */
+		if (!zone->spanned_pages)
+			continue;
+		if (!node_end_pfn) {
+			node_start_pfn = zone->zone_start_pfn;
+			node_end_pfn = zone_end_pfn;
+			continue;
+		}
+
 		if (zone_end_pfn > node_end_pfn)
 			node_end_pfn = zone_end_pfn;
 		if (zone->zone_start_pfn < node_start_pfn)

From patchwork Wed Nov 6 05:17:13 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11229327
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4703F1515 for ; Wed, 6 Nov 2019 05:17:18 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F272A217F4 for ; Wed, 6 Nov 2019 05:17:17 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="Mmnz9Xgc"
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F272A217F4
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix) id 05D1A6B027B; Wed, 6 Nov 2019 00:17:17 -0500 (EST)
Delivered-To: linux-mm-outgoing@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 40) id 00F266B027C; Wed, 6 Nov 2019 00:17:16 -0500 (EST)
X-Original-To: int-list-linux-mm@kvack.org
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id E8E4D6B027D; Wed, 6 Nov 2019 00:17:16 -0500 (EST)
X-Original-To: linux-mm@kvack.org
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0159.hostedemail.com [216.40.44.159]) by kanga.kvack.org (Postfix) with ESMTP id D276E6B027B for ; Wed, 6 Nov 2019 00:17:16 -0500 (EST)
Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by
forelay05.hostedemail.com (Postfix) with SMTP id 896E6181AEF0B for ; Wed, 6 Nov 2019 05:17:16 +0000 (UTC) X-FDA: 76124694072.22.fear56_3f76301a9cd11 X-Spam-Summary: 20,1.5,0,f2a3086f6f2f9146,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:hannes@cmpxchg.org::mhocko@kernel.org:mm-commits@vger.kernel.org:shakeelb@google.com:stable@vger.kernel.org:suleiman@google.com:torvalds@linux-foundation.org,RULES_HIT:41:355:365:379:800:960:966:967:973:988:989:1260:1263:1345:1381:1431:1437:1535:1543:1711:1730:1747:1777:1792:2194:2196:2198:2199:2200:2201:2393:2525:2559:2563:2682:2685:2729:2731:2859:2890:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3355:3834:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4042:4225:4250:4321:4385:4560:4605:5007:6261:6653:7576:7875:7903:8599:8603:9025:9121:9545:9707:10004:10913:11026:11233:11473:11658:11914:12043:12048:12297:12438:12517:12519:12555:12660:12679:12783:12986:13146:13156:13161:13227:13228:13229:13230:13255:13846:13869:14040:14093:14181:14721:14849:21067:21080:21450:21451:21627:21740:21939:30001:30005: 30054,0, X-HE-Tag: fear56_3f76301a9cd11 X-Filterd-Recvd-Size: 5078 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf21.hostedemail.com (Postfix) with ESMTP for ; Wed, 6 Nov 2019 05:17:15 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B74FC206A3; Wed, 6 Nov 2019 05:17:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573017435; bh=3fn7iwFlRDaj6a2XmXRqZp0udWJS2IAh+6Gxuqt1l3s=; h=Date:From:To:Subject:From; b=Mmnz9XgcMTybxX1XXtXvnmEA08ibgCXriOFcVOUM0XYgKeI/PvHidXHdNDFuMBBCr Nm++DTAGhjMiHh6J4+uakarNhQEhdviFHZDE0G+cBPa5fU58lRf/M4FVV4bkYZCFgq ACbiH4a06BKyLkKGkAMxX4nsvvww4u3nBGVhQu08= 
Date: Tue, 05 Nov 2019 21:17:13 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, shakeelb@google.com, stable@vger.kernel.org, suleiman@google.com, torvalds@linux-foundation.org
Subject: [patch 17/17] mm: memcontrol: fix network errors from failing __GFP_ATOMIC charges
Message-ID: <20191106051713.wpz4nLMGp%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Johannes Weiner
Subject: mm: memcontrol: fix network errors from failing __GFP_ATOMIC charges

While upgrading from 4.16 to 5.2, we noticed these allocation errors in
the log of the new kernel:

[ 8642.253395] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
[ 8642.269170] cache: tw_sock_TCPv6(960:helper-logs), object size: 232, buffer size: 240, default order: 1, min order: 0
[ 8642.293009] node 0: slabs: 5, objs: 170, free: 0

        slab_out_of_memory+1
        ___slab_alloc+969
        __slab_alloc+14
        kmem_cache_alloc+346
        inet_twsk_alloc+60
        tcp_time_wait+46
        tcp_fin+206
        tcp_data_queue+2034
        tcp_rcv_state_process+784
        tcp_v6_do_rcv+405
        __release_sock+118
        tcp_close+385
        inet_release+46
        __sock_release+55
        sock_close+17
        __fput+170
        task_work_run+127
        exit_to_usermode_loop+191
        do_syscall_64+212
        entry_SYSCALL_64_after_hwframe+68

accompanied by an increase in machines going completely radio silent under
memory pressure.

One thing that changed since 4.16 is e699e2c6a654 ("net, mm: account sock
objects to kmemcg"), which made these slab caches subject to cgroup memory
accounting and control.

The problem with that is that cgroups, unlike the page allocator, do not
maintain dedicated atomic reserves. As a cgroup's usage hovers at its
limit, atomic allocations - such as done during network rx - can fail
consistently for extended periods of time. The kernel is not able to
operate under these conditions.

We don't want to revert the culprit patch, because it indeed tracks a
potentially substantial amount of memory used by a cgroup.

We also don't want to implement dedicated atomic reserves for cgroups.
There is no point in keeping a fixed margin of unused bytes in the
cgroup's memory budget to accommodate a consumer that is impossible to
predict - we'd be wasting memory and get into configuration headaches, not
unlike what we have going with min_free_kbytes. We do this for physical
mem because we have to, but cgroups are an accounting game.

Instead, account these privileged allocations to the cgroup, but let them
bypass the configured limit if they have to. This way, we get the
benefits of accounting the consumed memory and have it exert pressure on
the rest of the cgroup, but like with the page allocator, we shift the
burden of reclaiming on behalf of atomic allocations onto the regular
allocations that can block.

Link: http://lkml.kernel.org/r/20191022233708.365764-1-hannes@cmpxchg.org
Fixes: e699e2c6a654 ("net, mm: account sock objects to kmemcg")
Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Cc: Suleiman Souhlal
Cc: Michal Hocko
Cc: [4.18+]
Signed-off-by: Andrew Morton
---

 mm/memcontrol.c | 9 +++++++++
 1 file changed, 9 insertions(+)

--- a/mm/memcontrol.c~mm-memcontrol-fix-network-errors-from-failing-__gfp_atomic-charges
+++ a/mm/memcontrol.c
@@ -2535,6 +2535,15 @@ retry:
 	}
 
 	/*
+	 * Memcg doesn't have a dedicated reserve for atomic
+	 * allocations. But like the global atomic pool, we need to
+	 * put the burden of reclaim on regular allocation requests
+	 * and let these go through as privileged allocations.
+	 */
+	if (gfp_mask & __GFP_ATOMIC)
+		goto force;
+
+	/*
 	 * Unlike in global OOM situations, memcg is not in a physical
 	 * memory shortage. Allow dying and OOM-killed tasks to
 	 * bypass the last charges so that they can exit quickly and