From patchwork Fri Jul 30 16:20:02 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aaron Tomlin
X-Patchwork-Id: 12411665
From: Aaron Tomlin
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, mhocko@suse.com,
 penguin-kernel@i-love.sakura.ne.jp, rientjes@google.com,
 llong@redhat.com, neelx@redhat.com, linux-kernel@vger.kernel.org
Subject: [PATCH v3] mm/oom_kill: show oom eligibility when displaying the
 current memory state of all tasks
Date: Fri, 30 Jul 2021 17:20:02 +0100
Message-Id: <20210730162002.279678-1-atomlin@redhat.com>
X-Mailer: git-send-email 2.31.1

Changes since v2:
 - Use single character (e.g. 'R' for MMF_OOM_SKIP) as suggested by
   Tetsuo Handa
 - Add new header to oom_dump_tasks documentation
 - Provide further justification


The output generated by dump_tasks() can be helpful to determine why
there was an OOM condition and which rogue task potentially caused it.
Please note that this is only provided when sysctl oom_dump_tasks is
enabled.

At the present time, when showing potential OOM victims, we do not
exclude any tasks that are not OOM eligible, e.g. those that have
MMF_OOM_SKIP set; it is possible that the last OOM killable victim was
already OOM killed, yet the OOM reaper failed to reclaim memory and set
MMF_OOM_SKIP. This can be confusing (or perhaps even misleading) to the
viewer. Now, we already unconditionally display a task's oom_score_adj
value, which can be set to OOM_SCORE_ADJ_MIN and is then indicative of
an "unkillable" task.

This patch provides a clear indication of the OOM ineligibility (and
why) of each displayed task with the addition of a new column, namely
"oom_skipped". An example is provided below:

[ 5084.524970] [  pid  ]        uid   tgid total_vm      rss pgtables_bytes swapents oom_score_adj oom_skipped name
[ 5084.526397] [ 660417]          0 660417    35869      683         167936        0         -1000           M conmon
[ 5084.526400] [ 660452]          0 660452   175834      472          86016        0          -998             pod
[ 5084.527460] [ 752415]          0 752415    35869      650         172032        0         -1000           M conmon
[ 5084.527462] [ 752575] 1001050000 752575   184205    11158         700416        0           999             npm
[ 5084.527467] [ 753606] 1001050000 753606   183380    46843        2134016        0           999             node
[ 5084.527581] Memory cgroup out of memory: Killed process 753606 (node) total-vm:733520kB, anon-rss:161228kB, file-rss:26144kB, shmem-rss:0kB, UID:1001050000

So, a single character 'M' is for OOM_SCORE_ADJ_MIN, 'R' for
MMF_OOM_SKIP and 'V' for in_vfork().
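For anyone reading such a dump, the legend above can be written down as
a small lookup. The following is only a user-space sketch, not part of
this patch: the helper name oom_skipped_reason() and the wording of the
returned strings are assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical user-space helper (not kernel code): translate the single
 * character shown in the "oom_skipped" column into the reason given in
 * the changelog. An empty column means no ineligibility was reported.
 */
static const char *oom_skipped_reason(const char *flag)
{
	if (flag == NULL || flag[0] == '\0')
		return "OOM eligible";

	switch (flag[0]) {
	case 'M':
		return "oom_score_adj is OOM_SCORE_ADJ_MIN";
	case 'R':
		return "MMF_OOM_SKIP is set";
	case 'V':
		return "task is in vfork()";
	default:
		return "unknown flag";
	}
}
```

In the example output, both conmon tasks would decode to the
OOM_SCORE_ADJ_MIN reason, while pod, npm and node have an empty column
and so remain eligible.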

Signed-off-by: Aaron Tomlin
---
 Documentation/admin-guide/sysctl/vm.rst |  5 ++--
 mm/oom_kill.c                           | 31 +++++++++++++++++++++----
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 003d5cc3751b..4c79fa00ddb3 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -650,8 +650,9 @@ oom_dump_tasks
 Enables a system-wide task dump (excluding kernel threads) to be produced
 when the kernel performs an OOM-killing and includes such information as
 pid, uid, tgid, vm size, rss, pgtables_bytes, swapents, oom_score_adj
-score, and name. This is helpful to determine why the OOM killer was
-invoked, to identify the rogue task that caused it, and to determine why
+score, oom eligibility status and name. This is helpful to determine why
+the OOM killer was invoked, to identify the rogue task that caused it, and
+to determine why
 the OOM killer chose the task it did to kill.
 
 If this is set to zero, this information is suppressed. On very
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index c729a4c4a1ac..36daa6917b62 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -160,6 +160,27 @@ static inline bool is_sysrq_oom(struct oom_control *oc)
 	return oc->order == -1;
 }
 
+/**
+ * is_task_oom_eligible - determine if and why a task cannot be OOM killed
+ * @p: task to check
+ *
+ * Needs to be called with task_lock().
+ */
+static const char *is_task_oom_eligible(struct task_struct *p)
+{
+	long adj;
+
+	adj = (long)p->signal->oom_score_adj;
+	if (adj == OOM_SCORE_ADJ_MIN)
+		return "M";
+	else if (test_bit(MMF_OOM_SKIP, &p->mm->flags))
+		return "R";
+	else if (in_vfork(p))
+		return "V";
+	else
+		return "";
+}
+
 /* return true if the task is not adequate as candidate victim task. */
 static bool oom_unkillable_task(struct task_struct *p)
 {
@@ -401,12 +422,13 @@ static int dump_task(struct task_struct *p, void *arg)
 		return 0;
 	}
 
-	pr_info("[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %s\n",
+	pr_info("[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %1s %s\n",
 		task->pid, from_kuid(&init_user_ns, task_uid(task)),
 		task->tgid, task->mm->total_vm, get_mm_rss(task->mm),
 		mm_pgtables_bytes(task->mm),
 		get_mm_counter(task->mm, MM_SWAPENTS),
-		task->signal->oom_score_adj, task->comm);
+		task->signal->oom_score_adj, is_task_oom_eligible(task),
+		task->comm);
 	task_unlock(task);
 
 	return 0;
@@ -420,12 +442,13 @@ static int dump_task(struct task_struct *p, void *arg)
  * memcg, not in the same cpuset, or bound to a disjoint set of mempolicy nodes
  * are not shown.
  * State information includes task's pid, uid, tgid, vm size, rss,
- * pgtables_bytes, swapents, oom_score_adj value, and name.
+ * pgtables_bytes, swapents, oom_score_adj value, oom eligibility status
+ * and name.
  */
 static void dump_tasks(struct oom_control *oc)
 {
 	pr_info("Tasks state (memory values in pages):\n");
-	pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
+	pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj oom_skipped name\n");
 
 	if (is_memcg_oom(oc))
 		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
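
Since is_task_oom_eligible() returns on the first matching check, only
one reason is ever reported per task. A minimal user-space model of
that ordering is sketched below. It is not kernel code: struct
task_model, model_oom_skipped() and MODEL_OOM_SCORE_ADJ_MIN are
hypothetical names, and the two booleans merely stand in for
test_bit(MMF_OOM_SKIP, ...) and in_vfork().

```c
#include <stdbool.h>

/* Mirrors OOM_SCORE_ADJ_MIN (-1000) from include/uapi/linux/oom.h. */
#define MODEL_OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical stand-in for the task state the helper inspects. */
struct task_model {
	short oom_score_adj;	/* models p->signal->oom_score_adj */
	bool mmf_oom_skip;	/* models test_bit(MMF_OOM_SKIP, &p->mm->flags) */
	bool in_vfork;		/* models in_vfork(p) */
};

/* Same check order as the patch: 'M' first, then 'R', then 'V'. */
static const char *model_oom_skipped(const struct task_model *t)
{
	if (t->oom_score_adj == MODEL_OOM_SCORE_ADJ_MIN)
		return "M";
	if (t->mmf_oom_skip)
		return "R";
	if (t->in_vfork)
		return "V";
	return "";
}
```

With this ordering, a task that has oom_score_adj set to -1000 and also
has MMF_OOM_SKIP set is shown as 'M' only, so an 'M' in the dump does
not rule out the other two conditions.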