From patchwork Tue Feb 2 18:47:40 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 12062637
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 1/7] mm: memcontrol: fix cpuhotplug statistics flushing
Date: Tue, 2 Feb 2021 13:47:40 -0500
Message-Id: <20210202184746.119084-2-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210202184746.119084-1-hannes@cmpxchg.org>
References: <20210202184746.119084-1-hannes@cmpxchg.org>

The memcg hotunplug callback erroneously flushes counts on the local
CPU, not the counts of the CPU going away; those counts will be lost.

Flush the CPU that is actually going away.

Also simplify the code a bit by using mod_memcg_state() and
count_memcg_events() instead of open-coding the upward flush - this is
comparable to how vmstat.c handles hotunplug flushing.
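
As an illustration only (not part of the patch): a minimal userspace
sketch of the distinction described above, assuming a simplified model
in which each group keeps batched per-CPU deltas plus hierarchical
totals. struct group, propagate(), flush_dead_cpu(), SKETCH_NR_CPUS and
SKETCH_NR_STATS are hypothetical names for this sketch and do not
correspond to kernel interfaces.

```c
#define SKETCH_NR_CPUS  4	/* hypothetical CPU count for the sketch */
#define SKETCH_NR_STATS 3	/* hypothetical number of counters */

/* Hypothetical stand-in for a memcg with batched per-CPU deltas. */
struct group {
	struct group *parent;
	long percpu[SKETCH_NR_CPUS][SKETCH_NR_STATS];	/* unflushed deltas */
	long total[SKETCH_NR_STATS];			/* hierarchical totals */
};

/* Add a delta to the group and to every ancestor above it. */
static void propagate(struct group *g, int idx, long delta)
{
	for (; g; g = g->parent)
		g->total[idx] += delta;
}

/*
 * Flush the deltas batched on @cpu, the CPU that went away.
 * The bug described above corresponds to reading the slot of the CPU
 * that runs this callback instead of the slot selected by @cpu.
 */
static void flush_dead_cpu(struct group *g, int cpu)
{
	int i;

	for (i = 0; i < SKETCH_NR_STATS; i++) {
		long x = g->percpu[cpu][i];

		if (x) {
			g->percpu[cpu][i] = 0;
			propagate(g, i, x);
		}
	}
}
```

In this model, flush_dead_cpu(&child, 2) moves only the deltas pending
on CPU 2 into child and its ancestors; deltas batched on other CPUs
stay where they are.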

Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 mm/memcontrol.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ed5cc78a8dbf..8120d565dd79 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2411,45 +2411,52 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
 static int memcg_hotplug_cpu_dead(unsigned int cpu)
 {
 	struct memcg_stock_pcp *stock;
-	struct mem_cgroup *memcg, *mi;
+	struct mem_cgroup *memcg;
 
 	stock = &per_cpu(memcg_stock, cpu);
 	drain_stock(stock);
 
 	for_each_mem_cgroup(memcg) {
+		struct memcg_vmstats_percpu *statc;
 		int i;
 
+		statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
+
 		for (i = 0; i < MEMCG_NR_STAT; i++) {
 			int nid;
-			long x;
 
-			x = this_cpu_xchg(memcg->vmstats_percpu->stat[i], 0);
-			if (x)
-				for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-					atomic_long_add(x, &memcg->vmstats[i]);
+			if (statc->stat[i]) {
+				mod_memcg_state(memcg, i, statc->stat[i]);
+				statc->stat[i] = 0;
+			}
 
 			if (i >= NR_VM_NODE_STAT_ITEMS)
 				continue;
 
 			for_each_node(nid) {
+				struct batched_lruvec_stat *lstatc;
 				struct mem_cgroup_per_node *pn;
+				long x;
 
 				pn = mem_cgroup_nodeinfo(memcg, nid);
-				x = this_cpu_xchg(pn->lruvec_stat_cpu->count[i], 0);
-				if (x)
+				lstatc = per_cpu_ptr(pn->lruvec_stat_cpu, cpu);
+
+				x = lstatc->count[i];
+				lstatc->count[i] = 0;
+
+				if (x) {
 					do {
 						atomic_long_add(x, &pn->lruvec_stat[i]);
 					} while ((pn = parent_nodeinfo(pn, nid)));
+				}
 			}
 		}
 
 		for (i = 0; i < NR_VM_EVENT_ITEMS; i++) {
-			long x;
-
-			x = this_cpu_xchg(memcg->vmstats_percpu->events[i], 0);
-			if (x)
-				for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-					atomic_long_add(x, &memcg->vmevents[i]);
+			if (statc->events[i]) {
+				count_memcg_events(memcg, i, statc->events[i]);
+				statc->events[i] = 0;
+			}
 		}
 	}
 

From patchwork Tue Feb 2 18:47:41 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 12062639
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 2/7] mm: memcontrol: kill mem_cgroup_nodeinfo()
Date: Tue, 2 Feb 2021 13:47:41 -0500
Message-Id: <20210202184746.119084-3-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210202184746.119084-1-hannes@cmpxchg.org>
References: <20210202184746.119084-1-hannes@cmpxchg.org>

No need to encapsulate a simple struct member access.

Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 include/linux/memcontrol.h |  8 +-------
 mm/memcontrol.c            | 21 +++++++++++----------
 2 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 7a38a1517a05..c7f387a6233e 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -602,12 +602,6 @@ void mem_cgroup_uncharge_list(struct list_head *page_list);
 
 void mem_cgroup_migrate(struct page *oldpage, struct page *newpage);
 
-static struct mem_cgroup_per_node *
-mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
-{
-	return memcg->nodeinfo[nid];
-}
-
 /**
  * mem_cgroup_lruvec - get the lru list vector for a memcg & node
  * @memcg: memcg of the wanted lruvec
@@ -631,7 +625,7 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 	if (!memcg)
 		memcg = root_mem_cgroup;
 
-	mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
+	mz = memcg->nodeinfo[pgdat->node_id];
 	lruvec = &mz->lruvec;
 out:
 	/*
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8120d565dd79..7e05a4ebf80f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -414,13 +414,14 @@ static int memcg_expand_one_shrinker_map(struct mem_cgroup *memcg,
 					 int size, int old_size)
 {
 	struct memcg_shrinker_map *new, *old;
+	struct mem_cgroup_per_node *pn;
 	int nid;
 
 	lockdep_assert_held(&memcg_shrinker_map_mutex);
 
 	for_each_node(nid) {
-		old = rcu_dereference_protected(
-			mem_cgroup_nodeinfo(memcg, nid)->shrinker_map, true);
+		pn = memcg->nodeinfo[nid];
+		old = rcu_dereference_protected(pn->shrinker_map, true);
 		/* Not yet online memcg */
 		if (!old)
 			return 0;
@@ -433,7 +434,7 @@ static int memcg_expand_one_shrinker_map(struct mem_cgroup *memcg,
 		memset(new->map, (int)0xff, old_size);
 		memset((void *)new->map + old_size, 0, size - old_size);
 
-		rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_map, new);
+		rcu_assign_pointer(pn->shrinker_map, new);
 		call_rcu(&old->rcu, memcg_free_shrinker_map_rcu);
 	}
 
@@ -450,7 +451,7 @@ static void memcg_free_shrinker_maps(struct mem_cgroup *memcg)
 		return;
 
 	for_each_node(nid) {
-		pn = mem_cgroup_nodeinfo(memcg, nid);
+		pn = memcg->nodeinfo[nid];
 		map = rcu_dereference_protected(pn->shrinker_map, true);
 		kvfree(map);
 		rcu_assign_pointer(pn->shrinker_map, NULL);
@@ -713,7 +714,7 @@ static void mem_cgroup_remove_from_trees(struct mem_cgroup *memcg)
 	int nid;
 
 	for_each_node(nid) {
-		mz = mem_cgroup_nodeinfo(memcg, nid);
+		mz = memcg->nodeinfo[nid];
 		mctz = soft_limit_tree_node(nid);
 		if (mctz)
 			mem_cgroup_remove_exceeded(mz, mctz);
@@ -796,7 +797,7 @@ parent_nodeinfo(struct mem_cgroup_per_node *pn, int nid)
 	parent = parent_mem_cgroup(pn->memcg);
 	if (!parent)
 		return NULL;
-	return mem_cgroup_nodeinfo(parent, nid);
+	return parent->nodeinfo[nid];
 }
 
 void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
@@ -1163,7 +1164,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 	if (reclaim) {
 		struct mem_cgroup_per_node *mz;
 
-		mz = mem_cgroup_nodeinfo(root, reclaim->pgdat->node_id);
+		mz = root->nodeinfo[reclaim->pgdat->node_id];
 		iter = &mz->iter;
 
 		if (prev && reclaim->generation != iter->generation)
@@ -1265,7 +1266,7 @@ static void __invalidate_reclaim_iterators(struct mem_cgroup *from,
 	int nid;
 
 	for_each_node(nid) {
-		mz = mem_cgroup_nodeinfo(from, nid);
+		mz = from->nodeinfo[nid];
 		iter = &mz->iter;
 		cmpxchg(&iter->position, dead_memcg, NULL);
 	}
@@ -2438,7 +2439,7 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
 				struct mem_cgroup_per_node *pn;
 				long x;
 
-				pn = mem_cgroup_nodeinfo(memcg, nid);
+				pn = memcg->nodeinfo[nid];
 				lstatc = per_cpu_ptr(pn->lruvec_stat_cpu, cpu);
 
 				x = lstatc->count[i];
@@ -4145,7 +4146,7 @@ static int memcg_stat_show(struct seq_file *m, void *v)
 		unsigned long file_cost = 0;
 
 		for_each_online_pgdat(pgdat) {
-			mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
+			mz = memcg->nodeinfo[pgdat->node_id];
 
 			anon_cost += mz->lruvec.anon_cost;
 			file_cost += mz->lruvec.file_cost;

From patchwork Tue Feb 2 18:47:42 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 12062641
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 3/7] mm: memcontrol: privatize memcg_page_state query functions
Date: Tue, 2 Feb 2021 13:47:42 -0500
Message-Id: <20210202184746.119084-4-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210202184746.119084-1-hannes@cmpxchg.org>
References: <20210202184746.119084-1-hannes@cmpxchg.org>

There are no users outside of the memory controller itself. The rest
of the kernel cares either about node or lruvec stats.

Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 include/linux/memcontrol.h | 44 --------------------------------------
 mm/memcontrol.c            | 32 +++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 44 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c7f387a6233e..20ecdfae3289 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -867,39 +867,6 @@ struct mem_cgroup *lock_page_memcg(struct page *page);
 void __unlock_page_memcg(struct mem_cgroup *memcg);
 void unlock_page_memcg(struct page *page);
 
-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page_state().
- */
-static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
-{
-	long x = atomic_long_read(&memcg->vmstats[idx]);
-#ifdef CONFIG_SMP
-	if (x < 0)
-		x = 0;
-#endif
-	return x;
-}
-
-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page_state().
- */
-static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
-						   int idx)
-{
-	long x = 0;
-	int cpu;
-
-	for_each_possible_cpu(cpu)
-		x += per_cpu(memcg->vmstats_local->stat[idx], cpu);
-#ifdef CONFIG_SMP
-	if (x < 0)
-		x = 0;
-#endif
-	return x;
-}
-
 void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val);
 
 /* idx can be of type enum memcg_stat_item or node_stat_item */
@@ -1337,17 +1304,6 @@ static inline void mem_cgroup_print_oom_group(struct mem_cgroup *memcg)
 {
 }
 
-static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
-{
-	return 0;
-}
-
-static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
-						   int idx)
-{
-	return 0;
-}
-
 static inline void __mod_memcg_state(struct mem_cgroup *memcg,
 				     int idx,
 				     int nr)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7e05a4ebf80f..2f97cb4cef6d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -789,6 +789,38 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 	__this_cpu_write(memcg->vmstats_percpu->stat[idx], x);
 }
 
+/*
+ * idx can be of type enum memcg_stat_item or node_stat_item.
+ * Keep in sync with memcg_exact_page_state().
+ */
+static unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
+{
+	long x = atomic_long_read(&memcg->vmstats[idx]);
+#ifdef CONFIG_SMP
+	if (x < 0)
+		x = 0;
+#endif
+	return x;
+}
+
+/*
+ * idx can be of type enum memcg_stat_item or node_stat_item.
+ * Keep in sync with memcg_exact_page_state().
+ */
+static unsigned long memcg_page_state_local(struct mem_cgroup *memcg, int idx)
+{
+	long x = 0;
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		x += per_cpu(memcg->vmstats_local->stat[idx], cpu);
+#ifdef CONFIG_SMP
+	if (x < 0)
+		x = 0;
+#endif
+	return x;
+}
+
 static struct mem_cgroup_per_node *
 parent_nodeinfo(struct mem_cgroup_per_node *pn, int nid)
 {

From patchwork Tue Feb 2 18:47:43 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 12062643
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 4/7] cgroup: rstat: support cgroup1
Date: Tue, 2 Feb 2021 13:47:43 -0500
Message-Id: <20210202184746.119084-5-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210202184746.119084-1-hannes@cmpxchg.org>
References: <20210202184746.119084-1-hannes@cmpxchg.org>

Rstat currently only supports the default hierarchy in cgroup2. In
order to replace memcg's private stats infrastructure - used in both
cgroup1 and cgroup2 - with rstat, the latter needs to support cgroup1.

The initialization and destruction callbacks for regular cgroups are
already in place. Remove the cgroup_on_dfl() guards to handle cgroup1.

The initialization of the root cgroup is currently hardcoded to only
handle cgrp_dfl_root.cgrp. Move those callbacks to cgroup_setup_root()
and cgroup_destroy_root() to handle the default root as well as the
various cgroup1 roots we may set up during mounting.
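
For illustration only (not part of the patch): a minimal userspace
sketch of the setup and unwind ordering that the diff in this patch
uses in cgroup_setup_root(), where rstat is initialized before the
subsystems are rebound and a later failure unwinds through an extra
label. All types and functions below are hypothetical stand-ins, not
kernel interfaces; the fail_bind parameter exists only to simulate a
failure.

```c
/* Hypothetical stand-in for a hierarchy root; not the kernel's type. */
struct root {
	int kernfs_ready;
	int rstat_ready;
	int bound;
};

static int setup_kernfs(struct root *r) { r->kernfs_ready = 1; return 0; }
static void destroy_kernfs(struct root *r) { r->kernfs_ready = 0; }
static int rstat_init(struct root *r) { r->rstat_ready = 1; return 0; }
static void rstat_exit(struct root *r) { r->rstat_ready = 0; }

/* Binding can fail; @fail lets a caller simulate that. */
static int bind_subsystems(struct root *r, int fail)
{
	if (fail)
		return -1;
	r->bound = 1;
	return 0;
}

/* Set up in order; on failure, undo the completed steps in reverse. */
static int setup_root(struct root *r, int fail_bind)
{
	int ret;

	ret = setup_kernfs(r);
	if (ret)
		return ret;

	ret = rstat_init(r);
	if (ret)
		goto destroy_root;

	ret = bind_subsystems(r, fail_bind);
	if (ret)
		goto exit_stats;

	return 0;

exit_stats:
	rstat_exit(r);
destroy_root:
	destroy_kernfs(r);
	return ret;
}
```

With fail_bind set, setup_root() returns the error and leaves neither
the rstat state nor the kernfs state behind, which is the property the
additional unwind label is meant to preserve in this sketch.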
The linking of css to cgroups happens in code shared between cgroup1 and cgroup2 as well. Simply remove the cgroup_on_dfl() guard. Linkage of the root css to the root cgroup is a bit trickier: per default, the root css of a subsystem controller belongs to the default hierarchy (i.e. the cgroup2 root). When a controller is mounted in its cgroup1 version, the root css is stolen and moved to the cgroup1 root; on unmount, the css moves back to the default hierarchy. Annotate rebind_subsystems() to move the root css linkage along between roots. Signed-off-by: Johannes Weiner Reviewed-by: Roman Gushchin --- kernel/cgroup/cgroup.c | 34 +++++++++++++++++++++------------- kernel/cgroup/rstat.c | 2 -- 2 files changed, 21 insertions(+), 15 deletions(-) diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 9153b20e5cc6..e049edd66776 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -1339,6 +1339,7 @@ static void cgroup_destroy_root(struct cgroup_root *root) mutex_unlock(&cgroup_mutex); + cgroup_rstat_exit(cgrp); kernfs_destroy_root(root->kf_root); cgroup_free_root(root); } @@ -1751,6 +1752,12 @@ int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask) &dcgrp->e_csets[ss->id]); spin_unlock_irq(&css_set_lock); + if (ss->css_rstat_flush) { + list_del_rcu(&css->rstat_css_node); + list_add_rcu(&css->rstat_css_node, + &dcgrp->rstat_css_list); + } + /* default hierarchy doesn't enable controllers by default */ dst_root->subsys_mask |= 1 << ssid; if (dst_root == &cgrp_dfl_root) { @@ -1971,10 +1978,14 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask) if (ret) goto destroy_root; - ret = rebind_subsystems(root, ss_mask); + ret = cgroup_rstat_init(root_cgrp); if (ret) goto destroy_root; + ret = rebind_subsystems(root, ss_mask); + if (ret) + goto exit_stats; + ret = cgroup_bpf_inherit(root_cgrp); WARN_ON_ONCE(ret); @@ -2006,6 +2017,8 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask) ret = 0; goto out; +exit_stats: + 
cgroup_rstat_exit(root_cgrp); destroy_root: kernfs_destroy_root(root->kf_root); root->kf_root = NULL; @@ -4934,8 +4947,7 @@ static void css_free_rwork_fn(struct work_struct *work) cgroup_put(cgroup_parent(cgrp)); kernfs_put(cgrp->kn); psi_cgroup_free(cgrp); - if (cgroup_on_dfl(cgrp)) - cgroup_rstat_exit(cgrp); + cgroup_rstat_exit(cgrp); kfree(cgrp); } else { /* @@ -4976,8 +4988,7 @@ static void css_release_work_fn(struct work_struct *work) /* cgroup release path */ TRACE_CGROUP_PATH(release, cgrp); - if (cgroup_on_dfl(cgrp)) - cgroup_rstat_flush(cgrp); + cgroup_rstat_flush(cgrp); spin_lock_irq(&css_set_lock); for (tcgrp = cgroup_parent(cgrp); tcgrp; @@ -5034,7 +5045,7 @@ static void init_and_link_css(struct cgroup_subsys_state *css, css_get(css->parent); } - if (cgroup_on_dfl(cgrp) && ss->css_rstat_flush) + if (ss->css_rstat_flush) list_add_rcu(&css->rstat_css_node, &cgrp->rstat_css_list); BUG_ON(cgroup_css(cgrp, ss)); @@ -5159,11 +5170,9 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name, if (ret) goto out_free_cgrp; - if (cgroup_on_dfl(parent)) { - ret = cgroup_rstat_init(cgrp); - if (ret) - goto out_cancel_ref; - } + ret = cgroup_rstat_init(cgrp); + if (ret) + goto out_cancel_ref; /* create the directory */ kn = kernfs_create_dir(parent->kn, name, mode, cgrp); @@ -5250,8 +5259,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name, out_kernfs_remove: kernfs_remove(cgrp->kn); out_stat_exit: - if (cgroup_on_dfl(parent)) - cgroup_rstat_exit(cgrp); + cgroup_rstat_exit(cgrp); out_cancel_ref: percpu_ref_exit(&cgrp->self.refcnt); out_free_cgrp: diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c index d51175cedfca..faa767a870ba 100644 --- a/kernel/cgroup/rstat.c +++ b/kernel/cgroup/rstat.c @@ -285,8 +285,6 @@ void __init cgroup_rstat_boot(void) for_each_possible_cpu(cpu) raw_spin_lock_init(per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)); - - BUG_ON(cgroup_rstat_init(&cgrp_dfl_root.cgrp)); } /* From patchwork Tue 
Feb 2 18:47:44 2021
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 12062645
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 5/7] cgroup: rstat: punt root-level optimization to individual controllers
Date: Tue, 2 Feb 2021 13:47:44 -0500
Message-Id: <20210202184746.119084-6-hannes@cmpxchg.org>
In-Reply-To: <20210202184746.119084-1-hannes@cmpxchg.org>
References: <20210202184746.119084-1-hannes@cmpxchg.org>

Current users of the rstat code can source root-level statistics from
the native counters of their respective subsystem, allowing them to
forego aggregation at the root level. This optimization is currently
implemented inside the generic rstat code, which doesn't track the
root cgroup and doesn't invoke the subsystem flush callbacks on it.

However, the memory controller cannot do this optimization, because
cgroup1 breaks out memory specifically for the local level, including
at the root level. In preparation for the memory controller switching
to rstat, move the optimization from rstat core to the controllers.

Afterwards, rstat will always track the root cgroup for changes and
invoke the subsystem callbacks on it; and it's up to the subsystem to
special-case and skip aggregation of the root cgroup if it can source
this information through other, cheaper means.

The extra cost of tracking the root cgroup is negligible: on stat
changes, we actually remove a branch that checks for the root.
The queueing for a flush touches only per-cpu data, and only the first
stat change since a flush requires a (per-cpu) lock.

Signed-off-by: Johannes Weiner
---
 block/blk-cgroup.c    | 14 +++++++---
 kernel/cgroup/rstat.c | 60 +++++++++++++++++++++++++------------------
 2 files changed, 45 insertions(+), 29 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 02ce2058c14b..76725e1cad7f 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -766,6 +766,10 @@ static void blkcg_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 	struct blkcg *blkcg = css_to_blkcg(css);
 	struct blkcg_gq *blkg;

+	/* Root-level stats are sourced from system-wide IO stats */
+	if (!cgroup_parent(css->cgroup))
+		return;
+
 	rcu_read_lock();

 	hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) {
@@ -789,6 +793,7 @@ static void blkcg_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 		u64_stats_update_end(&blkg->iostat.sync);

 		/* propagate global delta to parent */
+		/* XXX: could skip this if parent is root */
 		if (parent) {
 			u64_stats_update_begin(&parent->iostat.sync);
 			blkg_iostat_set(&delta, &blkg->iostat.cur);
@@ -803,10 +808,11 @@ static void blkcg_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 }

 /*
- * The rstat algorithms intentionally don't handle the root cgroup to avoid
- * incurring overhead when no cgroups are defined. For that reason,
- * cgroup_rstat_flush in blkcg_print_stat does not actually fill out the
- * iostat in the root cgroup's blkcg_gq.
+ * We source root cgroup stats from the system-wide stats to avoid
+ * tracking the same information twice and incurring overhead when no
+ * cgroups are defined. For that reason, cgroup_rstat_flush in
+ * blkcg_print_stat does not actually fill out the iostat in the root
+ * cgroup's blkcg_gq.
  *
  * However, we would like to re-use the printing code between the root and
  * non-root cgroups to the extent possible. For that reason, we simulate
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index faa767a870ba..6f50c199bf2a 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -25,13 +25,8 @@ static struct cgroup_rstat_cpu *cgroup_rstat_cpu(struct cgroup *cgrp, int cpu)
 void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
 {
 	raw_spinlock_t *cpu_lock = per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu);
-	struct cgroup *parent;
 	unsigned long flags;

-	/* nothing to do for root */
-	if (!cgroup_parent(cgrp))
-		return;
-
 	/*
 	 * Speculative already-on-list test. This may race leading to
 	 * temporary inaccuracies, which is fine.
@@ -46,10 +41,10 @@ void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
 	raw_spin_lock_irqsave(cpu_lock, flags);

 	/* put @cgrp and all ancestors on the corresponding updated lists */
-	for (parent = cgroup_parent(cgrp); parent;
-	     cgrp = parent, parent = cgroup_parent(cgrp)) {
+	while (true) {
 		struct cgroup_rstat_cpu *rstatc = cgroup_rstat_cpu(cgrp, cpu);
-		struct cgroup_rstat_cpu *prstatc = cgroup_rstat_cpu(parent, cpu);
+		struct cgroup *parent = cgroup_parent(cgrp);
+		struct cgroup_rstat_cpu *prstatc;

 		/*
 		 * Both additions and removals are bottom-up. If a cgroup
@@ -58,8 +53,16 @@ void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
 		if (rstatc->updated_next)
 			break;

+		if (!parent) {
+			rstatc->updated_next = cgrp;
+			break;
+		}
+
+		prstatc = cgroup_rstat_cpu(parent, cpu);
 		rstatc->updated_next = prstatc->updated_children;
 		prstatc->updated_children = cgrp;
+
+		cgrp = parent;
 	}

 	raw_spin_unlock_irqrestore(cpu_lock, flags);
@@ -113,23 +116,26 @@ static struct cgroup *cgroup_rstat_cpu_pop_updated(struct cgroup *pos,
 	 */
 	if (rstatc->updated_next) {
 		struct cgroup *parent = cgroup_parent(pos);
-		struct cgroup_rstat_cpu *prstatc = cgroup_rstat_cpu(parent, cpu);
-		struct cgroup_rstat_cpu *nrstatc;
-		struct cgroup **nextp;
-
-		nextp = &prstatc->updated_children;
-		while (true) {
-			nrstatc = cgroup_rstat_cpu(*nextp, cpu);
-			if (*nextp == pos)
-				break;
-
-			WARN_ON_ONCE(*nextp == parent);
-			nextp = &nrstatc->updated_next;
+
+		if (parent) {
+			struct cgroup_rstat_cpu *prstatc;
+			struct cgroup **nextp;
+
+			prstatc = cgroup_rstat_cpu(parent, cpu);
+			nextp = &prstatc->updated_children;
+			while (true) {
+				struct cgroup_rstat_cpu *nrstatc;
+
+				nrstatc = cgroup_rstat_cpu(*nextp, cpu);
+				if (*nextp == pos)
+					break;
+				WARN_ON_ONCE(*nextp == parent);
+				nextp = &nrstatc->updated_next;
+			}
+			*nextp = rstatc->updated_next;
 		}

-		*nextp = rstatc->updated_next;
 		rstatc->updated_next = NULL;
-
 		return pos;
 	}

@@ -309,11 +315,15 @@ static void cgroup_base_stat_sub(struct cgroup_base_stat *dst_bstat,

 static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu)
 {
-	struct cgroup *parent = cgroup_parent(cgrp);
 	struct cgroup_rstat_cpu *rstatc = cgroup_rstat_cpu(cgrp, cpu);
+	struct cgroup *parent = cgroup_parent(cgrp);
 	struct cgroup_base_stat cur, delta;
 	unsigned seq;

+	/* Root-level stats are sourced from system-wide CPU stats */
+	if (!parent)
+		return;
+
 	/* fetch the current per-cpu values */
 	do {
 		seq = __u64_stats_fetch_begin(&rstatc->bsync);
@@ -326,8 +336,8 @@ static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu)
 	cgroup_base_stat_add(&cgrp->bstat, &delta);
 	cgroup_base_stat_add(&rstatc->last_bstat, &delta);

-	/* propagate global delta to parent */
-	if (parent) {
+	/* propagate global delta to parent (unless that's root) */
+	if (cgroup_parent(parent)) {
 		delta = cgrp->bstat;
 		cgroup_base_stat_sub(&delta, &cgrp->last_bstat);
 		cgroup_base_stat_add(&parent->bstat, &delta);

From patchwork Tue Feb 2 18:47:45 2021
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 12062649
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 6/7] mm: memcontrol: switch to rstat
Date: Tue, 2 Feb 2021 13:47:45 -0500
Message-Id: <20210202184746.119084-7-hannes@cmpxchg.org>
In-Reply-To: <20210202184746.119084-1-hannes@cmpxchg.org>
References: <20210202184746.119084-1-hannes@cmpxchg.org>

Replace the memory controller's custom hierarchical stats code with
the generic rstat infrastructure provided by the cgroup core.

The current implementation does batched upward propagation from the
write side (i.e. as stats change). The per-cpu batches introduce an
error, which is multiplied by the number of subgroups in a tree. In
systems with many CPUs and sizable cgroup trees, the error can be
large enough to confuse users (e.g. 32 batch pages * 32 CPUs * 32
subgroups results in an error of up to 128M per stat item).
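The 128M figure can be checked with a small calculation. The following is only an illustrative sketch, not kernel code: the helper name memcg_batch_error_bytes is hypothetical, and the 4096-byte page size is an assumption (the commit message does not state one).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel: worst-case amount by which
 * one stat item can lag under the old batching scheme, where each CPU of
 * each subgroup may hold back up to batch_pages before propagating upward.
 * page_size is an assumption supplied by the caller (e.g. 4096 bytes).
 */
static uint64_t memcg_batch_error_bytes(uint64_t batch_pages, uint64_t nr_cpus,
					uint64_t nr_subgroups, uint64_t page_size)
{
	return batch_pages * nr_cpus * nr_subgroups * page_size;
}
```

With 32 batch pages, 32 CPUs, 32 subgroups and 4096-byte pages, this evaluates to 134217728 bytes, which is the 128M quoted in the example.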
This can entirely swallow allocation bursts inside a workload that the
user is expecting to see reflected in the statistics.

In the past, we've done read-side aggregation, where a memory.stat
read would have to walk the entire subtree and add up per-cpu
counts. This became problematic with lazily-freed cgroups: we could
have large subtrees where most cgroups were entirely idle. Hence the
switch to change-driven upward propagation. Unfortunately, it needed
to trade accuracy for speed due to the write side being so hot.

Rstat combines the best of both worlds: from the write side, it
cheaply maintains a queue of cgroups that have pending changes, so
that the read side can do selective tree aggregation. This way the
reported stats will always be as precise and recent as can be, while
the aggregation can skip over potentially large numbers of idle
cgroups.

This adds a second vmstats to struct mem_cgroup (MEMCG_NR_STAT +
NR_VM_EVENT_ITEMS) to track pending subtree deltas during upward
aggregation. It removes 3 words from the per-cpu data. It eliminates
memcg_exact_page_state(), since memcg_page_state() is now exact.

Signed-off-by: Johannes Weiner
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 include/linux/memcontrol.h |  67 ++++++-----
 mm/memcontrol.c            | 224 +++++++++++++++----------------------
 2 files changed, 133 insertions(+), 158 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 20ecdfae3289..a8c7a0ccc759 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -76,10 +76,27 @@ enum mem_cgroup_events_target {
 };

 struct memcg_vmstats_percpu {
-	long stat[MEMCG_NR_STAT];
-	unsigned long events[NR_VM_EVENT_ITEMS];
-	unsigned long nr_page_events;
-	unsigned long targets[MEM_CGROUP_NTARGETS];
+	/* Local (CPU and cgroup) page state & events */
+	long state[MEMCG_NR_STAT];
+	unsigned long events[NR_VM_EVENT_ITEMS];
+
+	/* Delta calculation for lockless upward propagation */
+	long state_prev[MEMCG_NR_STAT];
+	unsigned long events_prev[NR_VM_EVENT_ITEMS];
+
+	/* Cgroup1: threshold notifications & softlimit tree updates */
+	unsigned long nr_page_events;
+	unsigned long targets[MEM_CGROUP_NTARGETS];
+};
+
+struct memcg_vmstats {
+	/* Aggregated (CPU and subtree) page state & events */
+	long state[MEMCG_NR_STAT];
+	unsigned long events[NR_VM_EVENT_ITEMS];
+
+	/* Pending child counts during tree propagation */
+	long state_pending[MEMCG_NR_STAT];
+	unsigned long events_pending[NR_VM_EVENT_ITEMS];
 };

 struct mem_cgroup_reclaim_iter {
@@ -287,8 +304,8 @@ struct mem_cgroup {

 	MEMCG_PADDING(_pad1_);

-	atomic_long_t vmstats[MEMCG_NR_STAT];
-	atomic_long_t vmevents[NR_VM_EVENT_ITEMS];
+	/* memory.stat */
+	struct memcg_vmstats vmstats;

 	/* memory.events */
 	atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
@@ -315,10 +332,6 @@ struct mem_cgroup {
 	atomic_t moving_account;
 	struct task_struct *move_lock_task;

-	/* Legacy local VM stats and events */
-	struct memcg_vmstats_percpu __percpu *vmstats_local;
-
-	/* Subtree VM stats and events (batched updates) */
 	struct memcg_vmstats_percpu __percpu *vmstats_percpu;

 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -942,10 +955,6 @@ static inline void mod_memcg_lruvec_state(struct lruvec *lruvec,
 	local_irq_restore(flags);
 }

-unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
-						gfp_t gfp_mask,
-						unsigned long *total_scanned);
-
 void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 			  unsigned long count);

@@ -1028,6 +1037,10 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
 void mem_cgroup_split_huge_fixup(struct page *head);
 #endif

+unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
+						gfp_t gfp_mask,
+						unsigned long *total_scanned);
+
 #else /* CONFIG_MEMCG */

 #define MEM_CGROUP_ID_SHIFT 0
@@ -1136,6 +1149,10 @@ static inline bool lruvec_holds_page_lru_lock(struct page *page,
 	return lruvec == &pgdat->__lruvec;
 }

+static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page)
+{
+}
+
 static inline struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg)
 {
 	return NULL;
@@ -1349,18 +1366,6 @@ static inline void mod_lruvec_kmem_state(void *p, enum node_stat_item idx,
 	mod_node_page_state(page_pgdat(page), idx, val);
 }

-static inline
-unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
-					    gfp_t gfp_mask,
-					    unsigned long *total_scanned)
-{
-	return 0;
-}
-
-static inline void mem_cgroup_split_huge_fixup(struct page *head)
-{
-}
-
 static inline void count_memcg_events(struct mem_cgroup *memcg,
 				      enum vm_event_item idx,
 				      unsigned long count)
@@ -1383,8 +1388,16 @@ void count_memcg_event_mm(struct mm_struct *mm, enum vm_event_item idx)
 {
 }

-static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page)
+static inline void mem_cgroup_split_huge_fixup(struct page *head)
+{
+}
+
+static inline
+unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
+					    gfp_t gfp_mask,
+					    unsigned long *total_scanned)
 {
+	return 0;
 }

 #endif /* CONFIG_MEMCG */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2f97cb4cef6d..b205b2413186 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -757,6 +757,11 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
 	return mz;
 }

+static void memcg_flush_vmstats(struct mem_cgroup *memcg)
+{
+	cgroup_rstat_flush(memcg->css.cgroup);
+}
+
 /**
  * __mod_memcg_state - update cgroup memory statistics
  * @memcg: the memory cgroup
@@ -765,37 +770,17 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
  */
 void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 {
-	long x, threshold = MEMCG_CHARGE_BATCH;
-
 	if (mem_cgroup_disabled())
 		return;

-	if (memcg_stat_item_in_bytes(idx))
-		threshold <<= PAGE_SHIFT;
-
-	x = val + __this_cpu_read(memcg->vmstats_percpu->stat[idx]);
-	if (unlikely(abs(x) > threshold)) {
-		struct mem_cgroup *mi;
-
-		/*
-		 * Batch local counters to keep them in sync with
-		 * the hierarchical ones.
-		 */
-		__this_cpu_add(memcg->vmstats_local->stat[idx], x);
-		for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-			atomic_long_add(x, &mi->vmstats[idx]);
-		x = 0;
-	}
-	__this_cpu_write(memcg->vmstats_percpu->stat[idx], x);
+	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
+	cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
 }

-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page_state().
- */
+/* idx can be of type enum memcg_stat_item or node_stat_item. */
 static unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 {
-	long x = atomic_long_read(&memcg->vmstats[idx]);
+	long x = READ_ONCE(memcg->vmstats.state[idx]);
 #ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
@@ -803,17 +788,14 @@ static unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 	return x;
 }

-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page_state().
- */
+/* idx can be of type enum memcg_stat_item or node_stat_item. */
 static unsigned long memcg_page_state_local(struct mem_cgroup *memcg, int idx)
 {
 	long x = 0;
 	int cpu;

 	for_each_possible_cpu(cpu)
-		x += per_cpu(memcg->vmstats_local->stat[idx], cpu);
+		x += per_cpu(memcg->vmstats_percpu->state[idx], cpu);
 #ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
@@ -936,30 +918,16 @@ void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
 void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 			  unsigned long count)
 {
-	unsigned long x;
-
 	if (mem_cgroup_disabled())
 		return;

-	x = count + __this_cpu_read(memcg->vmstats_percpu->events[idx]);
-	if (unlikely(x > MEMCG_CHARGE_BATCH)) {
-		struct mem_cgroup *mi;
-
-		/*
-		 * Batch local counters to keep them in sync with
-		 * the hierarchical ones.
-		 */
-		__this_cpu_add(memcg->vmstats_local->events[idx], x);
-		for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-			atomic_long_add(x, &mi->vmevents[idx]);
-		x = 0;
-	}
-	__this_cpu_write(memcg->vmstats_percpu->events[idx], x);
+	__this_cpu_add(memcg->vmstats_percpu->events[idx], count);
+	cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
 }

 static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
 {
-	return atomic_long_read(&memcg->vmevents[event]);
+	return READ_ONCE(memcg->vmstats.events[event]);
 }

 static unsigned long memcg_events_local(struct mem_cgroup *memcg, int event)
@@ -968,7 +936,7 @@ static unsigned long memcg_events_local(struct mem_cgroup *memcg, int event)
 	int cpu;

 	for_each_possible_cpu(cpu)
-		x += per_cpu(memcg->vmstats_local->events[event], cpu);
+		x += per_cpu(memcg->vmstats_percpu->events[event], cpu);
 	return x;
 }

@@ -1631,6 +1599,7 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
 	 *
 	 * Current memory state:
 	 */
+	memcg_flush_vmstats(memcg);

 	for (i = 0; i < ARRAY_SIZE(memory_stats); i++) {
 		u64 size;
@@ -2450,22 +2419,11 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
 	drain_stock(stock);

 	for_each_mem_cgroup(memcg) {
-		struct memcg_vmstats_percpu *statc;
 		int i;

-		statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
-
-		for (i = 0; i < MEMCG_NR_STAT; i++) {
+		for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
 			int nid;

-			if (statc->stat[i]) {
-				mod_memcg_state(memcg, i, statc->stat[i]);
-				statc->stat[i] = 0;
-			}
-
-			if (i >= NR_VM_NODE_STAT_ITEMS)
-				continue;
-
 			for_each_node(nid) {
 				struct batched_lruvec_stat *lstatc;
 				struct mem_cgroup_per_node *pn;
@@ -2484,13 +2442,6 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
 				}
 			}
 		}
-
-		for (i = 0; i < NR_VM_EVENT_ITEMS; i++) {
-			if (statc->events[i]) {
-				count_memcg_events(memcg, i, statc->events[i]);
-				statc->events[i] = 0;
-			}
-		}
 	}

 	return 0;
@@ -3618,6 +3569,8 @@ static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
 {
 	unsigned long val;

+	memcg_flush_vmstats(memcg);
+
 	if (mem_cgroup_is_root(memcg)) {
 		val = memcg_page_state(memcg, NR_FILE_PAGES) +
 			memcg_page_state(memcg, NR_ANON_MAPPED);
@@ -3683,26 +3636,15 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 	}
 }

-static void memcg_flush_percpu_vmstats(struct mem_cgroup *memcg)
+static void memcg_flush_lruvec_page_state(struct mem_cgroup *memcg)
 {
-	unsigned long stat[MEMCG_NR_STAT] = {0};
-	struct mem_cgroup *mi;
-	int node, cpu, i;
-
-	for_each_online_cpu(cpu)
-		for (i = 0; i < MEMCG_NR_STAT; i++)
-			stat[i] += per_cpu(memcg->vmstats_percpu->stat[i], cpu);
-
-	for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-		for (i = 0; i < MEMCG_NR_STAT; i++)
-			atomic_long_add(stat[i], &mi->vmstats[i]);
+	int node;

 	for_each_node(node) {
 		struct mem_cgroup_per_node *pn = memcg->nodeinfo[node];
+		unsigned long stat[NR_VM_NODE_STAT_ITEMS] = {0, };
 		struct mem_cgroup_per_node *pi;
-
-		for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
-			stat[i] = 0;
+		int cpu, i;

 		for_each_online_cpu(cpu)
 			for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
@@ -3715,25 +3657,6 @@ static void memcg_flush_percpu_vmstats(struct mem_cgroup *memcg)
 	}
 }

-static void memcg_flush_percpu_vmevents(struct mem_cgroup *memcg)
-{
-	unsigned long events[NR_VM_EVENT_ITEMS];
-	struct mem_cgroup *mi;
-	int cpu, i;
-
-	for (i = 0; i < NR_VM_EVENT_ITEMS; i++)
-		events[i] = 0;
-
-	for_each_online_cpu(cpu)
-		for (i = 0; i < NR_VM_EVENT_ITEMS; i++)
-			events[i] += per_cpu(memcg->vmstats_percpu->events[i],
-					     cpu);
-
-	for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-		for (i = 0; i < NR_VM_EVENT_ITEMS; i++)
-			atomic_long_add(events[i], &mi->vmevents[i]);
-}
-
 #ifdef CONFIG_MEMCG_KMEM
 static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
@@ -4050,6 +3973,8 @@ static int memcg_numa_stat_show(struct seq_file *m, void *v)
 	int nid;
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);

+	memcg_flush_vmstats(memcg);
+
 	for (stat = stats; stat < stats + ARRAY_SIZE(stats); stat++) {
 		seq_printf(m, "%s=%lu", stat->name,
 			   mem_cgroup_nr_lru_pages(memcg, stat->lru_mask,
@@ -4120,6 +4045,8 @@ static int memcg_stat_show(struct seq_file *m, void *v)

 	BUILD_BUG_ON(ARRAY_SIZE(memcg1_stat_names) != ARRAY_SIZE(memcg1_stats));

+	memcg_flush_vmstats(memcg);
+
 	for (i = 0; i < ARRAY_SIZE(memcg1_stats); i++) {
 		unsigned long nr;

@@ -4596,22 +4523,6 @@ struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
 	return &memcg->cgwb_domain;
 }

-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page().
- */
-static unsigned long memcg_exact_page_state(struct mem_cgroup *memcg, int idx)
-{
-	long x = atomic_long_read(&memcg->vmstats[idx]);
-	int cpu;
-
-	for_each_online_cpu(cpu)
-		x += per_cpu_ptr(memcg->vmstats_percpu, cpu)->stat[idx];
-	if (x < 0)
-		x = 0;
-	return x;
-}
-
 /**
  * mem_cgroup_wb_stats - retrieve writeback related stats from its memcg
  * @wb: bdi_writeback in question
@@ -4637,13 +4548,14 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
 	struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
 	struct mem_cgroup *parent;

-	*pdirty = memcg_exact_page_state(memcg, NR_FILE_DIRTY);
+	memcg_flush_vmstats(memcg);

-	*pwriteback = memcg_exact_page_state(memcg, NR_WRITEBACK);
-	*pfilepages = memcg_exact_page_state(memcg, NR_INACTIVE_FILE) +
-			memcg_exact_page_state(memcg, NR_ACTIVE_FILE);
-	*pheadroom = PAGE_COUNTER_MAX;
+	*pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
+	*pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
+	*pfilepages = memcg_page_state(memcg, NR_INACTIVE_FILE) +
+			memcg_page_state(memcg, NR_ACTIVE_FILE);

+	*pheadroom = PAGE_COUNTER_MAX;
 	while ((parent = parent_mem_cgroup(memcg))) {
 		unsigned long ceiling = min(READ_ONCE(memcg->memory.max),
 					    READ_ONCE(memcg->memory.high));
@@ -5275,7 +5187,6 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg)
 	for_each_node(node)
 		free_mem_cgroup_per_node_info(memcg, node);
 	free_percpu(memcg->vmstats_percpu);
-	free_percpu(memcg->vmstats_local);
 	kfree(memcg);
 }

@@ -5283,11 +5194,10 @@ static void mem_cgroup_free(struct mem_cgroup *memcg)
 {
 	memcg_wb_domain_exit(memcg);
 	/*
-	 * Flush percpu vmstats and vmevents to guarantee the value correctness
-	 * on parent's and all ancestor levels.
+	 * Flush percpu lruvec stats to guarantee the value
+	 * correctness on parent's and all ancestor levels.
 	 */
-	memcg_flush_percpu_vmstats(memcg);
-	memcg_flush_percpu_vmevents(memcg);
+	memcg_flush_lruvec_page_state(memcg);
 	__mem_cgroup_free(memcg);
 }

@@ -5314,11 +5224,6 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
 		goto fail;
 	}

-	memcg->vmstats_local = alloc_percpu_gfp(struct memcg_vmstats_percpu,
-						GFP_KERNEL_ACCOUNT);
-	if (!memcg->vmstats_local)
-		goto fail;
-
 	memcg->vmstats_percpu = alloc_percpu_gfp(struct memcg_vmstats_percpu,
 						 GFP_KERNEL_ACCOUNT);
 	if (!memcg->vmstats_percpu)
@@ -5518,6 +5423,62 @@ static void mem_cgroup_css_reset(struct cgroup_subsys_state *css)
 	memcg_wb_domain_size_changed(memcg);
 }

+static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+	struct mem_cgroup *parent = parent_mem_cgroup(memcg);
+	struct memcg_vmstats_percpu *statc;
+	long delta, v;
+	int i;
+
+	statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
+
+	for (i = 0; i < MEMCG_NR_STAT; i++) {
+		/*
+		 * Collect the aggregated propagation counts of groups
+		 * below us. We're in a per-cpu loop here and this is
+		 * a global counter, so the first cycle will get them.
+		 */
+		delta = memcg->vmstats.state_pending[i];
+		if (delta)
+			memcg->vmstats.state_pending[i] = 0;
+
+		/* Add CPU changes on this level since the last flush */
+		v = READ_ONCE(statc->state[i]);
+		if (v != statc->state_prev[i]) {
+			delta += v - statc->state_prev[i];
+			statc->state_prev[i] = v;
+		}
+
+		if (!delta)
+			continue;
+
+		/* Aggregate counts on this level and propagate upwards */
+		memcg->vmstats.state[i] += delta;
+		if (parent)
+			parent->vmstats.state_pending[i] += delta;
+	}
+
+	for (i = 0; i < NR_VM_EVENT_ITEMS; i++) {
+		delta = memcg->vmstats.events_pending[i];
+		if (delta)
+			memcg->vmstats.events_pending[i] = 0;
+
+		v = READ_ONCE(statc->events[i]);
+		if (v != statc->events_prev[i]) {
+			delta += v - statc->events_prev[i];
+			statc->events_prev[i] = v;
+		}
+
+		if (!delta)
+			continue;
+
+		memcg->vmstats.events[i] += delta;
+		if (parent)
+			parent->vmstats.events_pending[i] += delta;
+	}
+}
+
 #ifdef CONFIG_MMU
 /* Handlers for move charge at task migration. */
 static int mem_cgroup_do_precharge(unsigned long count)
@@ -6571,6 +6532,7 @@ struct cgroup_subsys memory_cgrp_subsys = {
 	.css_released = mem_cgroup_css_released,
 	.css_free = mem_cgroup_css_free,
 	.css_reset = mem_cgroup_css_reset,
+	.css_rstat_flush = mem_cgroup_css_rstat_flush,
 	.can_attach = mem_cgroup_can_attach,
 	.cancel_attach = mem_cgroup_cancel_attach,
 	.post_attach = mem_cgroup_move_task,

From patchwork Tue Feb 2 18:47:46 2021
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 12062647
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 7/7] mm: memcontrol: consolidate lruvec stat flushing
Date: Tue, 2 Feb 2021 13:47:46 -0500
Message-Id: <20210202184746.119084-8-hannes@cmpxchg.org>
In-Reply-To: <20210202184746.119084-1-hannes@cmpxchg.org>
References: <20210202184746.119084-1-hannes@cmpxchg.org>

There are two functions that flush the per-cpu data of an lruvec into
the rest of the cgroup tree: one runs when the cgroup is being freed,
the other when a CPU disappears during hotplug. The only difference is
whether all CPUs or just one is being collected; the rest of the
flushing code is the same. Merge them into one function and share the
common code.

Signed-off-by: Johannes Weiner
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 mm/memcontrol.c | 88 +++++++++++++++++++++++--------------------------
 1 file changed, 42 insertions(+), 46 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b205b2413186..88e8afc49a46 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2410,39 +2410,56 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
 	mutex_unlock(&percpu_charge_mutex);
 }

-static int memcg_hotplug_cpu_dead(unsigned int cpu)
+static void memcg_flush_lruvec_page_state(struct mem_cgroup *memcg, int cpu)
 {
-	struct memcg_stock_pcp *stock;
-	struct mem_cgroup *memcg;
-
-	stock = &per_cpu(memcg_stock, cpu);
-	drain_stock(stock);
+	int nid;

-	for_each_mem_cgroup(memcg) {
+	for_each_node(nid) {
+		struct mem_cgroup_per_node *pn = memcg->nodeinfo[nid];
+		unsigned long stat[NR_VM_NODE_STAT_ITEMS] = { 0, };
+		struct batched_lruvec_stat *lstatc;
 		int i;

-		for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
-			int nid;
-
-			for_each_node(nid) {
-				struct batched_lruvec_stat *lstatc;
-				struct mem_cgroup_per_node *pn;
-				long x;
-
-				pn = memcg->nodeinfo[nid];
+		if (cpu == -1) {
+			int cpui;
+			/*
+			 * The memcg is about to be freed, collect all
+			 * CPUs, no need to zero anything out.
+			 */
+			for_each_online_cpu(cpui) {
+				lstatc = per_cpu_ptr(pn->lruvec_stat_cpu, cpui);
+				for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
+					stat[i] += lstatc->count[i];
+			}
+		} else {
+			/*
+			 * The CPU has gone away, collect and zero out
+			 * its stats, it may come back later.
+			 */
+			for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
 				lstatc = per_cpu_ptr(pn->lruvec_stat_cpu, cpu);
-
-				x = lstatc->count[i];
+				stat[i] = lstatc->count[i];
 				lstatc->count[i] = 0;
-
-				if (x) {
-					do {
-						atomic_long_add(x, &pn->lruvec_stat[i]);
-					} while ((pn = parent_nodeinfo(pn, nid)));
-				}
 			}
 		}
+
+		do {
+			for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
+				atomic_long_add(stat[i], &pn->lruvec_stat[i]);
+		} while ((pn = parent_nodeinfo(pn, nid)));
 	}
+}
+
+static int memcg_hotplug_cpu_dead(unsigned int cpu)
+{
+	struct memcg_stock_pcp *stock;
+	struct mem_cgroup *memcg;
+
+	stock = &per_cpu(memcg_stock, cpu);
+	drain_stock(stock);
+
+	for_each_mem_cgroup(memcg)
+		memcg_flush_lruvec_page_state(memcg, cpu);

 	return 0;
 }
@@ -3636,27 +3653,6 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 	}
 }

-static void memcg_flush_lruvec_page_state(struct mem_cgroup *memcg)
-{
-	int node;
-
-	for_each_node(node) {
-		struct mem_cgroup_per_node *pn = memcg->nodeinfo[node];
-		unsigned long stat[NR_VM_NODE_STAT_ITEMS] = {0, };
-		struct mem_cgroup_per_node *pi;
-		int cpu, i;
-
-		for_each_online_cpu(cpu)
-			for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
-				stat[i] += per_cpu(
-					pn->lruvec_stat_cpu->count[i], cpu);
-
-		for (pi = pn; pi; pi = parent_nodeinfo(pi, node))
-			for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
-				atomic_long_add(stat[i], &pi->lruvec_stat[i]);
-	}
-}
-
 #ifdef CONFIG_MEMCG_KMEM
 static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
@@ -5197,7 +5193,7 @@ static void mem_cgroup_free(struct mem_cgroup *memcg)
 	 * Flush percpu lruvec stats to guarantee the value
 	 * correctness on parent's and all ancestor levels.
 	 */
-	memcg_flush_lruvec_page_state(memcg);
+	memcg_flush_lruvec_page_state(memcg, -1);
 	__mem_cgroup_free(memcg);
 }
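
For readers who want to try the control flow outside the kernel, here is
a minimal userspace sketch of the consolidated flush in this patch. It is
not the kernel code: struct group, flush_group(), NR_CPUS, NR_ITEMS and
ALL_CPUS are hypothetical stand-ins for mem_cgroup_per_node,
memcg_flush_lruvec_page_state(), the online CPU set,
NR_VM_NODE_STAT_ITEMS and the -1 argument. Plain long additions replace
atomic_long_add(), so the sketch assumes a single thread.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS  4	/* hypothetical fixed CPU count for the sketch */
#define NR_ITEMS 2	/* stands in for NR_VM_NODE_STAT_ITEMS */
#define ALL_CPUS (-1)	/* same sentinel value the patch passes on free */

/* Simplified stand-in for mem_cgroup_per_node; not the kernel structure. */
struct group {
	struct group *parent;
	long percpu[NR_CPUS][NR_ITEMS];	/* batched per-CPU deltas */
	long total[NR_ITEMS];		/* aggregated counters */
};

/*
 * Fold per-CPU deltas of @g into @g and all of its ancestors.
 *
 * cpu == ALL_CPUS: the group is going away, so sum every CPU and leave
 * the per-CPU slots untouched.
 * Otherwise: take only that CPU's deltas and zero them, because the CPU
 * may come back later and must not be counted twice.
 */
static void flush_group(struct group *g, int cpu)
{
	long stat[NR_ITEMS] = { 0 };
	int i, c;

	assert(cpu >= ALL_CPUS && cpu < NR_CPUS);

	if (cpu == ALL_CPUS) {
		for (c = 0; c < NR_CPUS; c++)
			for (i = 0; i < NR_ITEMS; i++)
				stat[i] += g->percpu[c][i];
	} else {
		for (i = 0; i < NR_ITEMS; i++) {
			stat[i] = g->percpu[cpu][i];
			g->percpu[cpu][i] = 0;
		}
	}

	/* Shared tail: propagate the collected deltas up the tree. */
	for (; g; g = g->parent)
		for (i = 0; i < NR_ITEMS; i++)
			g->total[i] += stat[i];
}
```

Calling flush_group(&child, 1) models the hotplug path for CPU 1, and
flush_group(&child, ALL_CPUS) models the free path. The two callers
differ only in the collection step and share the propagation loop, which
is the point of the consolidation.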