From patchwork Sat Apr 11 09:36:14 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 11484059
From: Yafang Shao
To: akpm@linux-foundation.org, mhocko@kernel.org
Cc: linux-mm@kvack.org, Yafang Shao
Subject: [RFC PATCH] mm, oom: oom ratelimit auto tuning
Date: Sat, 11 Apr 2020 05:36:14 -0400
Message-Id: <1586597774-6831-1-git-send-email-laoar.shao@gmail.com>
X-Mailer: git-send-email 1.8.3.1

Recently we found an issue where the server becomes almost unresponsive
for several minutes when an OOM happens. It is caused by a slow serial
console set with "console=ttyS1,19200". Because this serial line is so
slow, printing a full OOM message to it takes almost 10 seconds.
Meanwhile all tasks allocating pages are blocked, as almost no pages can
be reclaimed. During that time the memory pressure stays around 90 for a
long time. If we don't print the OOM messages to this serial console, a
full OOM message takes less than 1ms and the memory pressure stays below
40.
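
As a rough sanity check of these numbers, the time spent on the wire can
be estimated from the baud rate. This is a minimal sketch and not part of
the patch: it assumes 8N1 framing (10 bits on the wire per byte), and the
helper name serial_tx_ms() and the byte counts are hypothetical.

```c
#include <assert.h>

/* Assumption: 8N1 framing, i.e. 1 start + 8 data + 1 stop bit per byte. */
#define WIRE_BITS_PER_BYTE 10UL

/*
 * Milliseconds needed to push `bytes` through a UART running at `baud`
 * bits per second. Hypothetical helper, for illustration only.
 */
unsigned long serial_tx_ms(unsigned long bytes, unsigned long baud)
{
	assert(baud > 0);
	return bytes * WIRE_BITS_PER_BYTE * 1000UL / baud;
}
```

Under that assumption a 19200-baud console moves about 1920 bytes per
second, so serial_tx_ms(19200, 19200) gives 10000 ms: a 10-second stall
corresponds to roughly 19 KB of console output.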

We could avoid printing OOM messages to the slow serial console by
adjusting /proc/sys/kernel/printk, but then no other message at
KERN_WARNING level could be printed to it either, which may lose useful
messages when we want to collect console output for debugging. So it is
better to lower the OOM ratelimit. We could introduce sysctl knobs
similar to printk_ratelimit and printk_ratelimit_burst, but that would
burden the admin. Letting the kernel adjust the ratelimit automatically
is a better choice.

The OOM ratelimit starts at a low rate (one report per 20 * HZ). The
console counts as fast when dump_header() takes less than a tenth of the
current interval. The rate rises slowly while the console is fast and
drops rapidly once the console is slow: burst is reset to 1 and the
interval is set to four times the measured print time. oom_rs.burst stays
in [1, 10], and the fast path never lowers oom_rs.interval below 5 * HZ.

Signed-off-by: Yafang Shao
---
 mm/oom_kill.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 48 insertions(+), 3 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index dfc357614e56..23dba8ccf313 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -954,8 +954,10 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 {
 	struct task_struct *victim = oc->chosen;
 	struct mem_cgroup *oom_group;
-	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
-					      DEFAULT_RATELIMIT_BURST);
+	static DEFINE_RATELIMIT_STATE(oom_rs, 20 * HZ, 1);
+	int delta;
+	unsigned long start;
+	unsigned long end;
 
 	/*
 	 * If the task is already exiting, don't alarm the sysadmin or kill
@@ -972,8 +974,51 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 	}
 	task_unlock(victim);
 
-	if (__ratelimit(&oom_rs))
+	if (__ratelimit(&oom_rs)) {
+		start = jiffies;
 		dump_header(oc, victim);
+		end = jiffies;
+		delta = end - start;
+
+		/*
+		 * The OOM messages may be printed to a very slow serial
+		 * console, e.g. console=ttyS1,19200. Printing a full OOM
+		 * report to such a console takes a long time, and meanwhile
+		 * all processes allocating pages are blocked because
+		 * almost no pages can be reclaimed. That causes high
+		 * memory pressure and the system may be unresponsive for a
+		 * long time.
+		 * In this case, we should either lower the OOM ratelimit or
+		 * avoid printing OOM messages to the slow serial. But if
+		 * we keep OOM messages off the slow serial, no other
+		 * message at KERN_WARNING level can be printed to it
+		 * either, which may lose useful messages when we
+		 * want to collect messages from the console for debugging
+		 * purposes. So it is better to lower the ratelimit. We
+		 * could introduce sysctl knobs similar to
+		 * printk_ratelimit and printk_ratelimit_burst, but that would
+		 * burden the admin. Letting the kernel adjust the ratelimit
+		 * automatically is a better choice.
+		 * The algorithm below cuts the allowed OOM message rate
+		 * rapidly if the console is slow and raises it slowly
+		 * if the console is fast. oom_rs.burst stays in [1, 10]
+		 * and the fast path never lowers oom_rs.interval
+		 * below 5 * HZ.
+		 */
+		if (delta < oom_rs.interval / 10) {
+			if (oom_rs.interval >= 10 * HZ)
+				oom_rs.interval /= 2;
+			else if (oom_rs.interval > 6 * HZ)
+				oom_rs.interval -= HZ;
+
+			if (oom_rs.burst < 10)
+				oom_rs.burst += 1;
+		} else if (oom_rs.burst > 1) {
+			oom_rs.burst = 1;
+			oom_rs.interval = 4 * delta;
+		}
+
+	}
 
 	/*
 	 * Do we need to kill the entire memory cgroup?
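
To see how the tuning in the hunk above behaves over several OOM reports,
the branch logic can be replayed in userspace. This is a minimal sketch,
not part of the patch: struct oom_rs_sim, oom_rs_tune() and the HZ value
of 100 are assumptions made for illustration, and only the branch
conditions are copied from the diff.

```c
#include <assert.h>

/* Assumption: tick rate used only for this simulation. */
#define HZ 100

/* Hypothetical stand-in for the two ratelimit fields the patch tunes. */
struct oom_rs_sim {
	int interval;	/* in jiffies */
	int burst;
};

/*
 * Apply one tuning step. `delta` is the time dump_header() took, in
 * jiffies. The branches mirror the ones added to oom_kill_process().
 */
void oom_rs_tune(struct oom_rs_sim *rs, int delta)
{
	assert(delta >= 0);

	if (delta < rs->interval / 10) {
		/* Fast console: shorten the interval, allow a larger burst. */
		if (rs->interval >= 10 * HZ)
			rs->interval /= 2;
		else if (rs->interval > 6 * HZ)
			rs->interval -= HZ;

		if (rs->burst < 10)
			rs->burst += 1;
	} else if (rs->burst > 1) {
		/* Slow console: fall back to one report per 4 * delta. */
		rs->burst = 1;
		rs->interval = 4 * delta;
	}
}
```

Starting from { 20 * HZ, 1 }, nine calls with delta = 0 bring the state
to interval = 5 * HZ and burst = 10. A following call with
delta = 10 * HZ resets it to interval = 40 * HZ and burst = 1, and further
slow calls leave it unchanged because burst is already 1. The slow branch
derives the interval from delta alone, so a call with delta = HZ at
interval = 5 * HZ and burst > 1 yields interval = 4 * HZ.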