From patchwork Sat Apr 25 15:24:18 2020
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 11509931
From: Yafang Shao <laoar.shao@gmail.com>
To: akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org,
    guro@fb.com, chris@chrisdown.name
Cc: linux-mm@kvack.org, Yafang Shao <laoar.shao@gmail.com>
Subject: [PATCH 3/3] mm: improvements on memcg protection functions
Date: Sat, 25 Apr 2020 11:24:18 -0400
Message-Id: <20200425152418.28388-4-laoar.shao@gmail.com>
X-Mailer: git-send-email 2.18.1
In-Reply-To: <20200425152418.28388-1-laoar.shao@gmail.com>
References: <20200425152418.28388-1-laoar.shao@gmail.com>

Since proportional memory.{min, low} reclaim was introduced in commit
9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim"), it
has proved hard to understand, and the issues it causes are even
harder to track down [1]. The root of this dilemma is that
proportional reclaim mixes up the memcg and the reclaim context: the
whole reclaim context - both the memcg to be reclaimed and the
reclaimer - should be considered, rather than the memcg alone.

To make this clear, a new member 'protection' is introduced into the
reclaim context (struct scan_control) to replace
mem_cgroup_protection(). It is set when we check whether the memcg is
protected or not. After this change, the issue I pointed out in [1] -
a stale left-over protection value can slow down target reclaim - is
fixed, and I think some potential races are avoided as well.

[1] https://lore.kernel.org/linux-mm/20200423061629.24185-1-laoar.shao@gmail.com
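To illustrate the resulting flow, below is a minimal, self-contained
userspace sketch of the pattern (hypothetical stand-in types and
names, not the kernel API; the MEMCG_LOW event and memcg_low_skipped
bookkeeping are left out). The point is that the protection value
lives in the per-reclaim-context structure and is recomputed on every
check, so nothing stale can carry over between reclaim passes:

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's structures. */
struct memcg_sketch {
	unsigned long emin;	/* effective memory.min */
	unsigned long elow;	/* effective memory.low */
	unsigned long usage;	/* current memory usage */
};

struct scan_control_sketch {
	unsigned long protection;	/* per-reclaim-context protection */
	bool memcg_low_reclaim;		/* already ignoring memory.low? */
};

/*
 * Simplified version of the patched check: return true when the memcg
 * should be skipped entirely, otherwise record in the reclaim context
 * how much of its usage the scan-pressure code should treat as
 * protected.
 */
static bool memcg_protected(const struct memcg_sketch *m,
			    struct scan_control_sketch *sc)
{
	if (m->usage <= m->emin)
		return true;	/* hard protection: do not reclaim */

	if (sc->memcg_low_reclaim)
		sc->protection = m->emin;
	else
		sc->protection = m->emin > m->elow ? m->emin : m->elow;

	if (m->usage <= m->elow && !sc->memcg_low_reclaim)
		return true;	/* soft protection: skip for now */

	return false;		/* reclaim, scaled by sc->protection */
}

int main(void)
{
	struct memcg_sketch m = { .emin = 100, .elow = 300, .usage = 500 };
	struct scan_control_sketch sc = { .memcg_low_reclaim = false };

	sc.protection = 0;	/* reset per memcg, as the patched
				 * shrink_node_memcgs() does */
	if (!memcg_protected(&m, &sc))
		printf("reclaim, honoring protection=%lu\n", sc.protection);

	return 0;
}

With usage above both thresholds, the sketch prints protection=300:
reclaim proceeds, but the scan-pressure code can still scale the scan
by how far usage exceeds the protected amount (see get_scan_count).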
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Chris Down <chris@chrisdown.name>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h | 25 ----------------
 mm/internal.h              | 17 +++++++++++
 mm/memcontrol.c            | 58 +++++++++++++++++++++++++++-----------
 mm/vmscan.c                | 35 +++--------------------
 4 files changed, 63 insertions(+), 72 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index b327857a1e7e..9d5ceeba3b31 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -50,12 +50,6 @@ enum memcg_memory_event {
 	MEMCG_NR_MEMORY_EVENTS,
 };
 
-enum mem_cgroup_protection {
-	MEMCG_PROT_NONE,
-	MEMCG_PROT_LOW,
-	MEMCG_PROT_MIN,
-};
-
 struct mem_cgroup_reclaim_cookie {
 	pg_data_t *pgdat;
 	unsigned int generation;
@@ -344,19 +338,6 @@ static inline bool mem_cgroup_disabled(void)
 	return !cgroup_subsys_enabled(memory_cgrp_subsys);
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
-						  bool in_low_reclaim)
-{
-	if (mem_cgroup_disabled())
-		return 0;
-
-	if (in_low_reclaim)
-		return READ_ONCE(memcg->memory.emin);
-
-	return max(READ_ONCE(memcg->memory.emin),
-		   READ_ONCE(memcg->memory.elow));
-}
-
 int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
 			  gfp_t gfp_mask, struct mem_cgroup **memcgp,
 			  bool compound);
@@ -832,12 +813,6 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
 {
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
-						  bool in_low_reclaim)
-{
-	return 0;
-}
-
 static inline int mem_cgroup_try_charge(struct page *page,
 					struct mm_struct *mm,
 					gfp_t gfp_mask,
 					struct mem_cgroup **memcgp,
diff --git a/mm/internal.h b/mm/internal.h
index a0b3bdd933b9..10c762a79c0c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -271,6 +271,9 @@ struct scan_control {
 	 */
 	struct mem_cgroup *target_mem_cgroup;
 
+	/* Memcg protection in this reclaim context */
+	unsigned long protection;
+
 	/* Can active pages be deactivated as part of reclaim? */
 #define DEACTIVATE_ANON 1
 #define DEACTIVATE_FILE 2
@@ -338,6 +341,20 @@ struct scan_control {
 	struct reclaim_state reclaim_state;
 };
 
+#ifdef CONFIG_MEMCG
+bool mem_cgroup_protected(struct mem_cgroup *target,
+			  struct mem_cgroup *memcg,
+			  struct scan_control *sc);
+
+#else
+static inline bool mem_cgroup_protected(struct mem_cgroup *target,
+					struct mem_cgroup *memcg,
+					struct scan_control *sc)
+{
+	return false;
+}
+#endif
+
 /*
  * This function returns the order of a free page in the buddy system. In
  * general, page_zone(page)->lock must be held by the caller to prevent the
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 51dab7f2e714..f2f191898f2b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6372,35 +6372,30 @@ static unsigned long effective_protection(unsigned long usage,
  * WARNING: This function is not stateless! It can only be used as part
  * of a top-down tree iteration, not for isolated queries.
  *
- * Returns one of the following:
- *   MEMCG_PROT_NONE: cgroup memory is not protected
- *   MEMCG_PROT_LOW: cgroup memory is protected as long there is
- *     an unprotected supply of reclaimable memory from other cgroups.
- *   MEMCG_PROT_MIN: cgroup memory is protected
  */
-enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *target,
-						struct mem_cgroup *memcg,
-						struct scan_control *sc)
+bool mem_cgroup_protected(struct mem_cgroup *target,
+			  struct mem_cgroup *memcg,
+			  struct scan_control *sc)
 {
 	unsigned long usage, parent_usage;
 	struct mem_cgroup *parent;
 
 	if (mem_cgroup_disabled())
-		return MEMCG_PROT_NONE;
+		return false;
 
 	if (!target)
 		target = root_mem_cgroup;
 	if (memcg == target)
-		return MEMCG_PROT_NONE;
+		return false;
 
 	usage = page_counter_read(&memcg->memory);
 	if (!usage)
-		return MEMCG_PROT_NONE;
+		return false;
 
 	parent = parent_mem_cgroup(memcg);
 	/* No parent means a non-hierarchical mode on v1 memcg */
 	if (!parent)
-		return MEMCG_PROT_NONE;
+		return false;
 
 	if (parent == target) {
 		memcg->memory.emin = READ_ONCE(memcg->memory.min);
@@ -6420,12 +6415,43 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *target,
 			atomic_long_read(&parent->memory.children_low_usage)));
 
 out:
+	/*
+	 * Hard protection.
+	 * If there is no reclaimable memory, OOM.
+	 */
 	if (usage <= memcg->memory.emin)
-		return MEMCG_PROT_MIN;
-	else if (usage <= memcg->memory.elow)
-		return MEMCG_PROT_LOW;
+		return true;
+
+	/* The protection takes effect when false is returned. */
+	if (sc->memcg_low_reclaim)
+		sc->protection = memcg->memory.emin;
 	else
-		return MEMCG_PROT_NONE;
+		sc->protection = max(memcg->memory.emin, memcg->memory.elow);
+
+	if (usage <= memcg->memory.elow) {
+		/*
+		 * Soft protection.
+		 * Respect the protection only as long as there is an
+		 * unprotected supply of reclaimable memory from other
+		 * cgroups.
+		 */
+		if (!sc->memcg_low_reclaim) {
+			sc->memcg_low_skipped = 1;
+			return true;
+		}
+
+		memcg_memory_event(memcg, MEMCG_LOW);
+
+		return false;
+	}
+
+	/*
+	 * All protection thresholds breached. We may still choose to vary
+	 * the scan pressure applied based on by how much the cgroup in
+	 * question has exceeded its protection thresholds
+	 * (see get_scan_count).
+	 */
+	return false;
 }
 
 /**
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 61c944e7f587..a81bf736ac11 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2263,8 +2263,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 			unsigned long protection;
 
 			lruvec_size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
-			protection = mem_cgroup_protection(memcg,
-							   sc->memcg_low_reclaim);
+			protection = sc->protection;
 
 			if (protection) {
 				/*
@@ -2551,36 +2550,10 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 		unsigned long reclaimed;
 		unsigned long scanned;
 
-		switch (mem_cgroup_protected(target_memcg, memcg, sc)) {
-		case MEMCG_PROT_MIN:
-			/*
-			 * Hard protection.
-			 * If there is no reclaimable memory, OOM.
-			 */
+		sc->protection = 0;
+
+		if (mem_cgroup_protected(target_memcg, memcg, sc))
 			continue;
-		case MEMCG_PROT_LOW:
-			/*
-			 * Soft protection.
-			 * Respect the protection only as long as
-			 * there is an unprotected supply
-			 * of reclaimable memory from other cgroups.
-			 */
-			if (!sc->memcg_low_reclaim) {
-				sc->memcg_low_skipped = 1;
-				continue;
-			}
-			memcg_memory_event(memcg, MEMCG_LOW);
-			break;
-		case MEMCG_PROT_NONE:
-			/*
-			 * All protection thresholds breached. We may
-			 * still choose to vary the scan pressure
-			 * applied based on by how much the cgroup in
-			 * question has exceeded its protection
-			 * thresholds (see get_scan_count).
-			 */
-			break;
-		}
 
 		reclaimed = sc->nr_reclaimed;
 		scanned = sc->nr_scanned;