From patchwork Thu Apr 23 06:16:29 2020
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 11505015
From: Yafang Shao
To: akpm@linux-foundation.org, mhocko@kernel.org, vdavydov.dev@gmail.com
Cc: linux-mm@kvack.org, Yafang Shao, Chris Down, Roman Gushchin, stable@vger.kernel.org
Subject: [PATCH] mm, memcg: fix wrong mem cgroup protection
Date: Thu, 23 Apr 2020 02:16:29 -0400
Message-Id: <20200423061629.24185-1-laoar.shao@gmail.com>

This patch is an improvement of a previous version[1], as the previous
version was not easy to understand.
This issue persists in the newest kernel, so I am resending the fix. As
the implementation has changed, I have dropped Roman's ack from the
previous version.

Here's the explanation of this issue.
memory.{low,min} won't take effect if the to-be-reclaimed memcg is the
sc->target_mem_cgroup. This can also be seen from the implementation of
mem_cgroup_protected(), see below:

	mem_cgroup_protected
		if (memcg == root) [2]
			return MEMCG_PROT_NONE;

But this rule is ignored in mem_cgroup_protection(), which reads
memory.{emin, elow} as the protection whatever the memcg is.

How does this issue happen?
Because in mem_cgroup_protected() we forget to clear
memory.{emin, elow} if the memcg is the target_mem_cgroup [2].

An example to illustrate this issue.

   root_mem_cgroup
         /
        A   memory.max: 1024M
            memory.min: 512M
            memory.current: 800M ('current' must be greater than 'min')

Once kswapd starts to reclaim memcg A, it assigns 512M to memory.emin of A.
Then kswapd stops.
As a result, the memory values of A will be:

   root_mem_cgroup
         /
        A   memory.max: 1024M
            memory.min: 512M
            memory.current: 512M (approximately)
            memory.emin: 512M

Then a new workload starts to run in memcg A, and it soon triggers memcg
reclaim in A. As memcg A is the target_mem_cgroup of this reclaimer,
mem_cgroup_protected() returns directly without touching
memory.{emin, elow} [2].
The memory values of A will be:

   root_mem_cgroup
         /
        A   memory.max: 1024M
            memory.min: 512M
            memory.current: 1024M (approximately)
            memory.emin: 512M

This stale memory.emin is then used in mem_cgroup_protection() to compute
the scan count, which is obviously wrong.

[1].
https://lore.kernel.org/linux-mm/20200216145249.6900-1-laoar.shao@gmail.com/

Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Cc: Chris Down
Cc: Roman Gushchin
Cc: stable@vger.kernel.org
Signed-off-by: Yafang Shao
---
 include/linux/memcontrol.h | 13 +++++++++++--
 mm/vmscan.c                |  4 ++--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d275c72c4f8e..114cfe06bf60 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -344,12 +344,20 @@ static inline bool mem_cgroup_disabled(void)
 	return !cgroup_subsys_enabled(memory_cgrp_subsys);
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
+						  struct mem_cgroup *memcg,
 						  bool in_low_reclaim)
 {
 	if (mem_cgroup_disabled())
 		return 0;
 
+	/*
+	 * Memcg protection won't take effect if the memcg is the target
+	 * root memcg.
+	 */
+	if (root == memcg)
+		return 0;
+
 	if (in_low_reclaim)
 		return READ_ONCE(memcg->memory.emin);
 
@@ -835,7 +843,8 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
 {
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
+						  struct mem_cgroup *memcg,
 						  bool in_low_reclaim)
 {
 	return 0;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b06868fc4926..ad2782f754ab 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2346,9 +2346,9 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 		unsigned long protection;
 
 		lruvec_size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
-		protection = mem_cgroup_protection(memcg,
+		protection = mem_cgroup_protection(sc->target_mem_cgroup,
+						   memcg,
 						   sc->memcg_low_reclaim);
-
 		if (protection) {
 			/*
 			 * Scale a cgroup's reclaim pressure by proportioning