From patchwork Tue Sep 15 11:40:01 2020
X-Patchwork-Submitter: Chunxin Zang
X-Patchwork-Id: 11776245
From: zangchunxin@bytedance.com
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Chunxin Zang, Muchun Song
Subject: [PATCH v3] mm/vmscan: add a fatal signals check in drop_slab_node
Date: Tue, 15 Sep 2020 19:40:01 +0800
Message-Id: <20200915114001.79950-1-zangchunxin@bytedance.com>

From: Chunxin Zang

On our server, there are about 10k memcgs on one machine. They use memory
very frequently.
We have observed that drop_caches can take a considerable amount of time
and cannot be stopped.

There are two reasons:
1. Something is constantly generating more objects to reclaim while
   drop_caches runs, so 'freed' is always greater than 10.
2. The process has no chance to process signals.

We can get the following info through 'ps':

  root:~# ps -aux | grep drop
  root  357956 ... R    Aug25 21119854:55 echo 3 > /proc/sys/vm/drop_caches
  root 1771385 ... R    Aug16 21146421:17 echo 3 > /proc/sys/vm/drop_caches
  root 1986319 ... R    18:56 117:27 echo 3 > /proc/sys/vm/drop_caches
  root 2002148 ... R    Aug24 5720:39 echo 3 > /proc/sys/vm/drop_caches
  root 2564666 ... R    18:59 113:58 echo 3 > /proc/sys/vm/drop_caches
  root 2639347 ... R    Sep03 2383:39 echo 3 > /proc/sys/vm/drop_caches
  root 3904747 ... R    03:35 993:31 echo 3 > /proc/sys/vm/drop_caches
  root 4016780 ... R    Aug21 7882:18 echo 3 > /proc/sys/vm/drop_caches

Using bpftrace to follow the 'freed' value in drop_slab_node:

  root:~# bpftrace -e 'kprobe:drop_slab_node+70 {@ret=hist(reg("bp")); }'
  Attaching 1 probe...
  ^B^C

  @ret:
  [64, 128)              1 |                                                    |
  [128, 256)            28 |                                                    |
  [256, 512)           107 |@                                                   |
  [512, 1K)            298 |@@@                                                 |
  [1K, 2K)             613 |@@@@@@@                                             |
  [2K, 4K)            4435 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
  [4K, 8K)             442 |@@@@@                                               |
  [8K, 16K)            299 |@@@                                                 |
  [16K, 32K)           100 |@                                                   |
  [32K, 64K)           139 |@                                                   |
  [64K, 128K)           56 |                                                    |
  [128K, 256K)          26 |                                                    |
  [256K, 512K)           2 |                                                    |

We need one path to stop the process.

Signed-off-by: Chunxin Zang
Signed-off-by: Muchun Song
Acked-by: Michal Hocko
---
	changelogs in v3:
	1) update the description of the patch.
	   v2 named: mm/vmscan: fix infinite loop in drop_slab_node

	changelogs in v2:
	1) break the loop by checking for a fatal signal.
 mm/vmscan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b6d84326bdf2..6b2b5d420510 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -704,6 +704,9 @@ void drop_slab_node(int nid)
 	do {
 		struct mem_cgroup *memcg = NULL;
 
+		if (signal_pending(current))
+			return;
+
 		freed = 0;
 		memcg = mem_cgroup_iter(NULL, NULL, NULL);
 		do {