From patchwork Wed Oct 10 15:11:35 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michal Hocko
X-Patchwork-Id: 10634745
From: Michal Hocko
To:
Cc: syzkaller-bugs@googlegroups.com, Michal Hocko, guro@fb.com,
 hannes@cmpxchg.org, kirill.shutemov@linux.intel.com,
 linux-kernel@vger.kernel.org, penguin-kernel@i-love.sakura.ne.jp,
 rientjes@google.com, yang.s@alibaba-inc.com
Subject: [RFC PATCH] memcg, oom: throttle dump_header for memcg ooms without
 eligible tasks
Date: Wed, 10 Oct 2018 17:11:35 +0200
Message-Id: <20181010151135.25766-1-mhocko@kernel.org>
In-Reply-To: <000000000000dc48d40577d4a587@google.com>
References: <000000000000dc48d40577d4a587@google.com>

From: Michal Hocko

syzbot has noticed that it can trigger RCU stalls from the memcg oom
path:
RIP: 0010:dump_stack+0x358/0x3ab lib/dump_stack.c:118
Code: 74 0c 48 c7 c7 f0 f5 31 89 e8 9f 0e 0e fa 48 83 3d 07 15 7d 01 00 0f 84 63 fe ff ff e8 1c 89 c9 f9 48 8b bd 70 ff ff ff 57 9d <0f> 1f 44 00 00 e8 09 89 c9 f9 48 8b 8d 68 ff ff ff b8 ff ff 37 00
RSP: 0018:ffff88017d3a5c70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000040000 RBX: 1ffffffff1263ebe RCX: ffffc90001e5a000
RDX: 0000000000040000 RSI: ffffffff87b4e0f4 RDI: 0000000000000246
RBP: ffff88017d3a5d18 R08: ffff8801d7e02480 R09: fffffbfff13da030
R10: fffffbfff13da030 R11: 0000000000000003 R12: 1ffff1002fa74b96
R13: 00000000ffffffff R14: 0000000000000200 R15: 0000000000000000
 dump_header+0x27b/0xf72 mm/oom_kill.c:441
 out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
 mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
 mem_cgroup_oom mm/memcontrol.c:1701 [inline]
 try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
 mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
 mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
 shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
 shmem_fault+0x25f/0x960 mm/shmem.c:1982
 __do_fault+0x100/0x6b0 mm/memory.c:2996
 do_read_fault mm/memory.c:3408 [inline]
 do_fault mm/memory.c:3531 [inline]

The primary reason for the stall is the expensive printk handling of a
flood of oom reports. A misconfiguration on the syzbot side left the
memcg without any eligible task because all of them have
OOM_SCORE_ADJ_MIN set. This generates an oom report for each allocation
from the memcg context.

While normal workloads should be much more careful about potential
heavy memory consumers that are OOM disabled, it makes some sense to
rate limit potentially expensive oom reports for cases where no
eligible victim is found. Do that by moving the rate limit logic inside
dump_header. We no longer rely on the caller to do that. It was only
oom_kill_process which has been throttling. The other two call sites
simply didn't have to care because one just panicked on the OOM when
configured that way, and no eligible task would panic for the global
case as well. Memcg changed the picture because we do not panic and we
might have multiple sources of the same event.

While we are here, make sure that the reason to trigger the OOM is
printed without ratelimiting because this is really valuable to debug
what happened.

Reported-by: syzbot+77e6b28a7a7106ad0def@syzkaller.appspotmail.com
Cc: guro@fb.com
Cc: hannes@cmpxchg.org
Cc: kirill.shutemov@linux.intel.com
Cc: linux-kernel@vger.kernel.org
Cc: penguin-kernel@i-love.sakura.ne.jp
Cc: rientjes@google.com
Cc: yang.s@alibaba-inc.com
Signed-off-by: Michal Hocko
Signed-off-by: Tetsuo Handa
Reported-by: syzbot
Signed-off-by: Tetsuo Handa
Reported-by: syzbot
Acked-by: Johannes Weiner
Nacked-by: Tetsuo Handa
---
 mm/oom_kill.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f10aa5360616..4ee393c85e27 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -430,6 +430,9 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
 
 static void dump_header(struct oom_control *oc, struct task_struct *p)
 {
+	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
+					      DEFAULT_RATELIMIT_BURST);
+
 	pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n",
 		current->comm, oc->gfp_mask, &oc->gfp_mask,
 		nodemask_pr_args(oc->nodemask), oc->order,
@@ -437,6 +440,9 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
 	if (!IS_ENABLED(CONFIG_COMPACTION) && oc->order)
 		pr_warn("COMPACTION is disabled!!!\n");
 
+	if (!__ratelimit(&oom_rs))
+		return;
+
 	cpuset_print_current_mems_allowed();
 	dump_stack();
 	if (is_memcg_oom(oc))
@@ -931,8 +937,6 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 	struct task_struct *t;
 	struct mem_cgroup *oom_group;
 	unsigned int victim_points = 0;
-	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
-					      DEFAULT_RATELIMIT_BURST);
 
 	/*
 	 * If the task is already exiting, don't alarm the sysadmin or kill
@@ -949,8 +953,7 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 	}
 	task_unlock(p);
 
-	if (__ratelimit(&oom_rs))
-		dump_header(oc, p);
+	dump_header(oc, p);
 
 	pr_err("%s: Kill process %d (%s) score %u or sacrifice child\n",
 		message, task_pid_nr(p), p->comm, points);
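
For readers unfamiliar with interval/burst throttling, the behaviour the
patch moves into dump_header can be modelled in userspace roughly as
below. This is only a sketch under stated assumptions, not the kernel's
__ratelimit implementation: struct rl_state, rl_allow, report and
count_full_reports are hypothetical names, time is an abstract tick
counter, and locking and the "callbacks suppressed" accounting are left
out. It shows the shape of the change: the cheap "invoked oom-killer"
line is always printed, while the expensive part of the report is
emitted at most `burst` times per `interval`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a ratelimit state (not kernel API). */
struct rl_state {
	long interval;	/* window length, in abstract ticks */
	int burst;	/* full reports allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* full reports emitted in the current window */
};

/* Return true if the caller may emit the expensive part of the report. */
bool rl_allow(struct rl_state *rs, long now)
{
	assert(rs->interval > 0);

	/* Start a fresh window once the previous one has expired. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	return false;
}

/*
 * Same shape as the patched dump_header(): the one-line reason is
 * always printed, the heavy part is throttled.
 * Returns 1 if the full report was emitted, 0 otherwise.
 */
int report(struct rl_state *rs, long now)
{
	printf("invoked oom-killer at tick %ld\n", now);
	if (!rl_allow(rs, now))
		return 0;
	printf("  (stack dump, memory info, task list)\n");
	return 1;
}

/* Simulate one OOM event per tick and count the full reports. */
int count_full_reports(struct rl_state *rs, long ticks)
{
	int full = 0;

	for (long t = 0; t < ticks; t++)
		full += report(rs, t);
	return full;
}
```

With interval = 5 and burst = 2, ten consecutive ticks produce ten
one-line messages but only four full reports (ticks 0, 1, 5 and 6).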