From patchwork Wed Aug 8 07:13:00 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michal Hocko
X-Patchwork-Id: 10559575
From: Michal Hocko
To: Johannes Weiner
Cc: Andrew Morton, Vladimir Davydov, Greg Thelen, Tetsuo Handa, Dmitry Vyukov, LKML, Michal Hocko
Subject: [PATCH 1/2] memcg, oom: be careful about races when warning about no reclaimable task
Date: Wed, 8 Aug 2018 09:13:00 +0200
Message-Id: <20180808071301.12478-2-mhocko@kernel.org>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20180808071301.12478-1-mhocko@kernel.org>
References: <20180808064414.GA27972@dhcp22.suse.cz> <20180808071301.12478-1-mhocko@kernel.org>

From: Michal Hocko

"memcg, oom: move out_of_memory back to the charge path" added a
warning that triggers when the oom killer cannot find any eligible task,
so there is no way to reclaim the oom memcg under its hard limit.
Further charges for such a memcg are forced, and therefore the hard limit
isolation is weakened.

The current warning is, however, too eager: it triggers even when we are
not really hitting the above condition. Syzbot[1] and Greg Thelen have
noticed that we can hit this condition even when there is still an oom
victim pending. E.g. the following race is possible:

memcg has two tasks taskA, taskB.

CPU1 (taskA)                    CPU2                    CPU3 (taskB)
try_charge
  mem_cgroup_out_of_memory                              try_charge
      select_bad_process(taskB)
      oom_kill_process          oom_reap_task
                                # No real memory reaped
                                                          mem_cgroup_out_of_memory
                                # set taskB -> MMF_OOM_SKIP
  # retry charge
  mem_cgroup_out_of_memory
    oom_lock                                                oom_lock
    select_bad_process(self)
    oom_kill_process(self)
    oom_unlock
                                                            # no eligible task

In fact, the syzbot test triggered this situation by placing multiple
tasks into a memcg with the hard limit set to 0.
So no task really had any memory charged to the memcg:

: Memory cgroup stats for /ile0: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
: Tasks state (memory values in pages):
: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
: [   6569]     0  6562     9427        1          53248        0             0 syz-executor0
: [   6576]     0  6576     9426        0          61440        0             0 syz-executor6
: [   6578]     0  6578     9426      534          61440        0             0 syz-executor4
: [   6579]     0  6579     9426        0          57344        0             0 syz-executor5
: [   6582]     0  6582     9426        0          61440        0             0 syz-executor7
: [   6584]     0  6584     9426        0          57344        0             0 syz-executor1

So in principle there is indeed nothing reclaimable in this memcg, and
this looks like a misconfiguration. On the other hand, we can clearly
kill all those tasks, so it is a bit early to warn and scare users. Avoid
that by checking whether current is the oom victim and bypassing the
warning in that case. The victim is allowed to force charge and
terminate, releasing its temporary charge along the way.

[1] http://lkml.kernel.org/r/0000000000005e979605729c1564@google.com

Fixes: "memcg, oom: move out_of_memory back to the charge path"
Noticed-by: Greg Thelen
Reported-and-tested-by: syzbot+bab151e82a4e973fa325@syzkaller.appspotmail.com
Signed-off-by: Michal Hocko
---
 mm/memcontrol.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4603ad75c9a9..c80e5b6a8e9f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1705,6 +1705,15 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
 
 	if (mem_cgroup_out_of_memory(memcg, mask, order))
 		return OOM_SUCCESS;
+
+	/*
+	 * under rare race the current task might have been selected while
+	 * reaching mem_cgroup_out_of_memory and there is no other oom victim
+	 * left. There is still no reason to warn because this task will
+	 * die and release its bypassed charge eventually.
+	 */
+	if (tsk_is_oom_victim(current))
+		return OOM_SUCCESS;
 
 	WARN(1,"Memory cgroup charge failed because of no reclaimable memory! "
 		"This looks like a misconfiguration or a kernel bug.");
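
For readers who want to reason about the patched control flow outside the kernel, here is a minimal userspace sketch of the decision order described in this patch. It is only a model under stated assumptions, not the kernel code: struct oom_ctx, model_mem_cgroup_oom() and check_race_scenario() are hypothetical names, the two booleans stand in for the results of mem_cgroup_out_of_memory() and tsk_is_oom_victim(current), a counter stands in for WARN(), and the OOM_FAILED value returned after the warning is a placeholder because the hunk does not show that part of the function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; only OOM_SUCCESS appears in the hunk itself. */
enum oom_status { OOM_SUCCESS, OOM_FAILED };

struct oom_ctx {
	bool killed_something;   /* models mem_cgroup_out_of_memory() != 0 */
	bool current_is_victim;  /* models tsk_is_oom_victim(current) */
	int warnings;            /* models how often WARN() would fire */
};

static enum oom_status model_mem_cgroup_oom(struct oom_ctx *ctx)
{
	/* An eligible victim was found and killed. */
	if (ctx->killed_something)
		return OOM_SUCCESS;

	/*
	 * The check added by this patch: nothing else is killable, but
	 * the caller is already an oom victim and will exit, so there
	 * is no reason to warn.
	 */
	if (ctx->current_is_victim)
		return OOM_SUCCESS;

	/* Nothing killable and the caller is not dying: warn. */
	ctx->warnings++;
	return OOM_FAILED;
}

/* taskB's path in the race: no eligible task left, but taskB is a victim. */
static void check_race_scenario(void)
{
	struct oom_ctx taskB = {
		.killed_something = false,
		.current_is_victim = true,
		.warnings = 0,
	};

	assert(model_mem_cgroup_oom(&taskB) == OOM_SUCCESS);
	assert(taskB.warnings == 0);
}
```

Calling check_race_scenario() from a small main() exercises the taskB path of the race. Setting both booleans to false models the genuine "no reclaimable memory" case, which still warns.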