From patchwork Fri Feb 5 18:27:59 2021
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, Shakeel Butt, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 1/8] mm: memcontrol: fix cpuhotplug statistics flushing
Date: Fri, 5 Feb 2021 13:27:59 -0500
Message-Id: <20210205182806.17220-2-hannes@cmpxchg.org>
In-Reply-To: <20210205182806.17220-1-hannes@cmpxchg.org>
References: <20210205182806.17220-1-hannes@cmpxchg.org>

The memcg hotunplug callback erroneously flushes counts on the local
CPU, not the counts of the CPU going away; those counts will be lost.

Flush the CPU that is actually going away.

Also simplify the code a bit by using mod_memcg_state() and
count_memcg_events() instead of open-coding the upward flush - this is
comparable to how vmstat.c handles hotunplug flushing.
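The gist, as a rough standalone sketch (illustrative only, not the patch
itself; dummy_counters and drain_dead_cpu are made-up names, while
DEFINE_PER_CPU, per_cpu_ptr() and this_cpu_xchg() are the regular per-cpu
primitives): a CPUHP_*_DEAD callback runs on a surviving CPU, so this_cpu
accessors touch that CPU's data and the dying CPU's pending counts are
silently dropped; the dead CPU has to be addressed explicitly.

#include <linux/percpu.h>
#include <linux/atomic.h>

struct dummy_counters {
    long stat;
};

static DEFINE_PER_CPU(struct dummy_counters, dummy_counters);
static atomic_long_t global_stat;

/* Called for the CPU that is going away, but running elsewhere. */
static int drain_dead_cpu(unsigned int cpu)
{
    /*
     * Buggy pattern: this_cpu_xchg(dummy_counters.stat, 0) reads the
     * CPU executing this callback, not @cpu, so the dead CPU's delta
     * is lost.
     */
    struct dummy_counters *c = per_cpu_ptr(&dummy_counters, cpu);
    long x = c->stat;    /* correct: the dead CPU's data */

    c->stat = 0;
    if (x)
        atomic_long_add(x, &global_stat);
    return 0;
}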
Fixes: a983b5ebee572 ("mm: memcontrol: fix excessive complexity in memory.stat reporting")
Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 mm/memcontrol.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ed5cc78a8dbf..8120d565dd79 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2411,45 +2411,52 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
 static int memcg_hotplug_cpu_dead(unsigned int cpu)
 {
     struct memcg_stock_pcp *stock;
-    struct mem_cgroup *memcg, *mi;
+    struct mem_cgroup *memcg;
 
     stock = &per_cpu(memcg_stock, cpu);
     drain_stock(stock);
 
     for_each_mem_cgroup(memcg) {
+        struct memcg_vmstats_percpu *statc;
         int i;
 
+        statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
+
         for (i = 0; i < MEMCG_NR_STAT; i++) {
             int nid;
-            long x;
 
-            x = this_cpu_xchg(memcg->vmstats_percpu->stat[i], 0);
-            if (x)
-                for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-                    atomic_long_add(x, &memcg->vmstats[i]);
+            if (statc->stat[i]) {
+                mod_memcg_state(memcg, i, statc->stat[i]);
+                statc->stat[i] = 0;
+            }
 
             if (i >= NR_VM_NODE_STAT_ITEMS)
                 continue;
 
             for_each_node(nid) {
+                struct batched_lruvec_stat *lstatc;
                 struct mem_cgroup_per_node *pn;
+                long x;
 
                 pn = mem_cgroup_nodeinfo(memcg, nid);
-                x = this_cpu_xchg(pn->lruvec_stat_cpu->count[i], 0);
-                if (x)
+
+                lstatc = per_cpu_ptr(pn->lruvec_stat_cpu, cpu);
+
+                x = lstatc->count[i];
+                lstatc->count[i] = 0;
+
+                if (x) {
                     do {
                         atomic_long_add(x, &pn->lruvec_stat[i]);
                     } while ((pn = parent_nodeinfo(pn, nid)));
+                }
             }
         }
 
         for (i = 0; i < NR_VM_EVENT_ITEMS; i++) {
-            long x;
-
-            x = this_cpu_xchg(memcg->vmstats_percpu->events[i], 0);
-            if (x)
-                for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-                    atomic_long_add(x, &memcg->vmevents[i]);
+            if (statc->events[i]) {
+                count_memcg_events(memcg, i, statc->events[i]);
+                statc->events[i] = 0;
+            }
         }
     }

From patchwork Fri Feb 5 18:28:00 2021
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, Shakeel Butt, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 2/8] mm: memcontrol: kill mem_cgroup_nodeinfo()
Date: Fri, 5 Feb 2021 13:28:00 -0500
Message-Id: <20210205182806.17220-3-hannes@cmpxchg.org>
In-Reply-To: <20210205182806.17220-1-hannes@cmpxchg.org>
References: <20210205182806.17220-1-hannes@cmpxchg.org>

No need to encapsulate a simple struct member access.
Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 include/linux/memcontrol.h |  8 +-------
 mm/memcontrol.c            | 21 +++++++++++----------
 2 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 7a38a1517a05..c7f387a6233e 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -602,12 +602,6 @@ void mem_cgroup_uncharge_list(struct list_head *page_list);
 
 void mem_cgroup_migrate(struct page *oldpage, struct page *newpage);
 
-static struct mem_cgroup_per_node *
-mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
-{
-    return memcg->nodeinfo[nid];
-}
-
 /**
  * mem_cgroup_lruvec - get the lru list vector for a memcg & node
  * @memcg: memcg of the wanted lruvec
@@ -631,7 +625,7 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
     if (!memcg)
         memcg = root_mem_cgroup;
 
-    mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
+    mz = memcg->nodeinfo[pgdat->node_id];
     lruvec = &mz->lruvec;
 out:
     /*
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8120d565dd79..7e05a4ebf80f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -414,13 +414,14 @@ static int memcg_expand_one_shrinker_map(struct mem_cgroup *memcg,
                                          int size, int old_size)
 {
     struct memcg_shrinker_map *new, *old;
+    struct mem_cgroup_per_node *pn;
     int nid;
 
     lockdep_assert_held(&memcg_shrinker_map_mutex);
 
     for_each_node(nid) {
-        old = rcu_dereference_protected(
-            mem_cgroup_nodeinfo(memcg, nid)->shrinker_map, true);
+        pn = memcg->nodeinfo[nid];
+        old = rcu_dereference_protected(pn->shrinker_map, true);
         /* Not yet online memcg */
         if (!old)
             return 0;
@@ -433,7 +434,7 @@ static int memcg_expand_one_shrinker_map(struct mem_cgroup *memcg,
         memset(new->map, (int)0xff, old_size);
         memset((void *)new->map + old_size, 0, size - old_size);
 
-        rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_map, new);
+        rcu_assign_pointer(pn->shrinker_map, new);
         call_rcu(&old->rcu, memcg_free_shrinker_map_rcu);
     }
 
@@ -450,7 +451,7 @@ static void memcg_free_shrinker_maps(struct mem_cgroup *memcg)
         return;
 
     for_each_node(nid) {
-        pn = mem_cgroup_nodeinfo(memcg, nid);
+        pn = memcg->nodeinfo[nid];
         map = rcu_dereference_protected(pn->shrinker_map, true);
         kvfree(map);
         rcu_assign_pointer(pn->shrinker_map, NULL);
@@ -713,7 +714,7 @@ static void mem_cgroup_remove_from_trees(struct mem_cgroup *memcg)
     int nid;
 
     for_each_node(nid) {
-        mz = mem_cgroup_nodeinfo(memcg, nid);
+        mz = memcg->nodeinfo[nid];
         mctz = soft_limit_tree_node(nid);
         if (mctz)
             mem_cgroup_remove_exceeded(mz, mctz);
@@ -796,7 +797,7 @@ parent_nodeinfo(struct mem_cgroup_per_node *pn, int nid)
     parent = parent_mem_cgroup(pn->memcg);
     if (!parent)
         return NULL;
-    return mem_cgroup_nodeinfo(parent, nid);
+    return parent->nodeinfo[nid];
 }
 
 void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
@@ -1163,7 +1164,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
     if (reclaim) {
         struct mem_cgroup_per_node *mz;
 
-        mz = mem_cgroup_nodeinfo(root, reclaim->pgdat->node_id);
+        mz = root->nodeinfo[reclaim->pgdat->node_id];
         iter = &mz->iter;
 
         if (prev && reclaim->generation != iter->generation)
@@ -1265,7 +1266,7 @@ static void __invalidate_reclaim_iterators(struct mem_cgroup *from,
     int nid;
 
     for_each_node(nid) {
-        mz = mem_cgroup_nodeinfo(from, nid);
+        mz = from->nodeinfo[nid];
         iter = &mz->iter;
         cmpxchg(&iter->position, dead_memcg, NULL);
     }
@@ -2438,7 +2439,7 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
                 struct mem_cgroup_per_node *pn;
                 long x;
 
-                pn = mem_cgroup_nodeinfo(memcg, nid);
+                pn = memcg->nodeinfo[nid];
                 lstatc = per_cpu_ptr(pn->lruvec_stat_cpu, cpu);
 
                 x = lstatc->count[i];
@@ -4145,7 +4146,7 @@ static int memcg_stat_show(struct seq_file *m, void *v)
         unsigned long file_cost = 0;
 
         for_each_online_pgdat(pgdat) {
-            mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
+            mz = memcg->nodeinfo[pgdat->node_id];
             anon_cost += mz->lruvec.anon_cost;
             file_cost += mz->lruvec.file_cost;

From patchwork Fri Feb 5 18:28:01 2021
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, Shakeel Butt, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 3/8] mm: memcontrol: privatize memcg_page_state query functions
Date: Fri, 5 Feb 2021 13:28:01 -0500
Message-Id: <20210205182806.17220-4-hannes@cmpxchg.org>
In-Reply-To: <20210205182806.17220-1-hannes@cmpxchg.org>
References: <20210205182806.17220-1-hannes@cmpxchg.org>

There are no users outside of the memory controller itself. The rest
of the kernel cares either about node or lruvec stats.

Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
---
 include/linux/memcontrol.h | 44 --------------------------------------
 mm/memcontrol.c            | 32 +++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 44 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c7f387a6233e..20ecdfae3289 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -867,39 +867,6 @@ struct mem_cgroup *lock_page_memcg(struct page *page);
 void __unlock_page_memcg(struct mem_cgroup *memcg);
 void unlock_page_memcg(struct page *page);
 
-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page_state().
- */
-static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
-{
-    long x = atomic_long_read(&memcg->vmstats[idx]);
-#ifdef CONFIG_SMP
-    if (x < 0)
-        x = 0;
-#endif
-    return x;
-}
-
-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page_state().
- */
-static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
-                                                   int idx)
-{
-    long x = 0;
-    int cpu;
-
-    for_each_possible_cpu(cpu)
-        x += per_cpu(memcg->vmstats_local->stat[idx], cpu);
-#ifdef CONFIG_SMP
-    if (x < 0)
-        x = 0;
-#endif
-    return x;
-}
-
 void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val);
 
 /* idx can be of type enum memcg_stat_item or node_stat_item */
@@ -1337,17 +1304,6 @@ static inline void mem_cgroup_print_oom_group(struct mem_cgroup *memcg)
 {
 }
 
-static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
-{
-    return 0;
-}
-
-static inline unsigned long memcg_page_state_local(struct mem_cgroup *memcg,
-                                                   int idx)
-{
-    return 0;
-}
-
 static inline void __mod_memcg_state(struct mem_cgroup *memcg,
                                      int idx,
                                      int nr)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7e05a4ebf80f..2f97cb4cef6d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -789,6 +789,38 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
     __this_cpu_write(memcg->vmstats_percpu->stat[idx], x);
 }
 
+/*
+ * idx can be of type enum memcg_stat_item or node_stat_item.
+ * Keep in sync with memcg_exact_page_state().
+ */
+static unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
+{
+    long x = atomic_long_read(&memcg->vmstats[idx]);
+#ifdef CONFIG_SMP
+    if (x < 0)
+        x = 0;
+#endif
+    return x;
+}
+
+/*
+ * idx can be of type enum memcg_stat_item or node_stat_item.
+ * Keep in sync with memcg_exact_page_state().
+ */
+static unsigned long memcg_page_state_local(struct mem_cgroup *memcg, int idx)
+{
+    long x = 0;
+    int cpu;
+
+    for_each_possible_cpu(cpu)
+        x += per_cpu(memcg->vmstats_local->stat[idx], cpu);
+#ifdef CONFIG_SMP
+    if (x < 0)
+        x = 0;
+#endif
+    return x;
+}
+
 static struct mem_cgroup_per_node *
 parent_nodeinfo(struct mem_cgroup_per_node *pn, int nid)
 {

From patchwork Fri Feb 5 18:28:02 2021
CD3AB6B0075 for ; Fri, 5 Feb 2021 13:28:32 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 950AB180AD817 for ; Fri, 5 Feb 2021 18:28:32 +0000 (UTC) X-FDA: 77785049664.22.base04_0f18171275e6 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin22.hostedemail.com (Postfix) with ESMTP id 74E4D18038E67 for ; Fri, 5 Feb 2021 18:28:32 +0000 (UTC) X-HE-Tag: base04_0f18171275e6 X-Filterd-Recvd-Size: 8050 Received: from mail-qv1-f51.google.com (mail-qv1-f51.google.com [209.85.219.51]) by imf15.hostedemail.com (Postfix) with ESMTP for ; Fri, 5 Feb 2021 18:28:31 +0000 (UTC) Received: by mail-qv1-f51.google.com with SMTP id l14so3929550qvp.2 for ; Fri, 05 Feb 2021 10:28:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=BaUO0pgli+w3WAaSzoSou9pv5Ds3nAQBJUbmAgL0bVs=; b=AM3ZrZz8cvpBGn6MlO823+OEBTcXj/+p9LQp8MY8YAsGWuGslA5P7ox7fpa8rJoWfz md7ddBktXxCCmBlfPQRL549Cd+kHwYzfDMlBuA1Kis8uOitRMBu+JIr0TBY1JkhLaqEf U6c2a+wxWq9lQCLJwx5tuF3oJ+o5JNlwpkQa+paSMBwZ1veYAxAWfl0/TuCpDj4fsHzU mgGcUaIJfJbIt2Gl73PUll0IqbMyV3Cv4UHN3wInAJwUu3LsbL5p2rCnT+RMn6fSGVVV m4YfPHNDXLPlX5akurOGBWHjkX/aOxnlghimaHbpxRs9Q/5ddg7sX2By63vXOjhvn27H M5qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=BaUO0pgli+w3WAaSzoSou9pv5Ds3nAQBJUbmAgL0bVs=; b=HuX1BZhbvBhXsf/OeHH2YYqNAtsbr12Ooz+rvaos4iQbksXw9eMSlJb49jTd6rleBU iUH8JmYJwXmYrpXdJ+tpTU5O3jQtt0G0IEjclQyjYpG3ppRGTPZLMpPy+9R/dER+sEF7 Mz+fVsh0BrofBnRtLRixcV1fyk3Xma+1MtG3pKv1n1lqlKyUX9gle+Nu2X1LCIj4xXXI Hwy6+JiyXUh6aeiw8zCPRjMM+Cn7NrxMGbB+hVWD1QSlnveuAwp/iUS5o65BctF25jmw Ybn2uTvK6m/KHnIlXHhee5Wow+GeJT3yWbTJpJyufqpX/tFPttNoA3FmSyEivQwLU3Qg gDng== X-Gm-Message-State: AOAM530BkpXvGDSpYsn+KzY3bjq7vOBDvpE/5Dcv8HyYyHjUahZIfyWM CmypREO8xijfaZ9Z/u979P0L+g== X-Google-Smtp-Source: ABdhPJyz81NdWPmY29u6GqqNKhYbdUYvhmYNTCR0U6pkERftLYkKVVBInVR/qTkf3K/FfYJzX0/b0w== X-Received: by 2002:a05:6214:a4f:: with SMTP id ee15mr5598175qvb.10.1612549711214; Fri, 05 Feb 2021 10:28:31 -0800 (PST) Received: from localhost (70.44.39.90.res-cmts.bus.ptd.net. [70.44.39.90]) by smtp.gmail.com with ESMTPSA id 5sm9582767qkk.109.2021.02.05.10.28.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 05 Feb 2021 10:28:30 -0800 (PST) From: Johannes Weiner To: Andrew Morton , Tejun Heo Cc: Michal Hocko , Roman Gushchin , Shakeel Butt , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 4/8] cgroup: rstat: support cgroup1 Date: Fri, 5 Feb 2021 13:28:02 -0500 Message-Id: <20210205182806.17220-5-hannes@cmpxchg.org> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210205182806.17220-1-hannes@cmpxchg.org> References: <20210205182806.17220-1-hannes@cmpxchg.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Rstat currently only supports the default hierarchy in cgroup2. In order to replace memcg's private stats infrastructure - used in both cgroup1 and cgroup2 - with rstat, the latter needs to support cgroup1. 
The initialization and destruction callbacks for regular cgroups are
already in place. Remove the cgroup_on_dfl() guards to handle cgroup1.

The initialization of the root cgroup is currently hardcoded to only
handle cgrp_dfl_root.cgrp. Move those callbacks to cgroup_setup_root()
and cgroup_destroy_root() to handle the default root as well as the
various cgroup1 roots we may set up during mounting.

The linking of css to cgroups happens in code shared between cgroup1
and cgroup2 as well. Simply remove the cgroup_on_dfl() guard.

Linkage of the root css to the root cgroup is a bit trickier: per
default, the root css of a subsystem controller belongs to the default
hierarchy (i.e. the cgroup2 root). When a controller is mounted in its
cgroup1 version, the root css is stolen and moved to the cgroup1 root;
on unmount, the css moves back to the default hierarchy. Annotate
rebind_subsystems() to move the root css linkage along between roots.

Signed-off-by: Johannes Weiner
Reviewed-by: Roman Gushchin
Reviewed-by: Shakeel Butt
Acked-by: Tejun Heo
---
 kernel/cgroup/cgroup.c | 34 +++++++++++++++++++++-------------
 kernel/cgroup/rstat.c  |  2 --
 2 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 9153b20e5cc6..e049edd66776 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1339,6 +1339,7 @@ static void cgroup_destroy_root(struct cgroup_root *root)
 
     mutex_unlock(&cgroup_mutex);
 
+    cgroup_rstat_exit(cgrp);
     kernfs_destroy_root(root->kf_root);
     cgroup_free_root(root);
 }
@@ -1751,6 +1752,12 @@ int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask)
                        &dcgrp->e_csets[ss->id]);
         spin_unlock_irq(&css_set_lock);
 
+        if (ss->css_rstat_flush) {
+            list_del_rcu(&css->rstat_css_node);
+            list_add_rcu(&css->rstat_css_node,
+                         &dcgrp->rstat_css_list);
+        }
+
         /* default hierarchy doesn't enable controllers by default */
         dst_root->subsys_mask |= 1 << ssid;
         if (dst_root == &cgrp_dfl_root) {
@@ -1971,10 +1978,14 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
     if (ret)
         goto destroy_root;
 
-    ret = rebind_subsystems(root, ss_mask);
+    ret = cgroup_rstat_init(root_cgrp);
     if (ret)
         goto destroy_root;
 
+    ret = rebind_subsystems(root, ss_mask);
+    if (ret)
+        goto exit_stats;
+
     ret = cgroup_bpf_inherit(root_cgrp);
     WARN_ON_ONCE(ret);
 
@@ -2006,6 +2017,8 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
     ret = 0;
     goto out;
 
+exit_stats:
+    cgroup_rstat_exit(root_cgrp);
 destroy_root:
     kernfs_destroy_root(root->kf_root);
     root->kf_root = NULL;
@@ -4934,8 +4947,7 @@ static void css_free_rwork_fn(struct work_struct *work)
         cgroup_put(cgroup_parent(cgrp));
         kernfs_put(cgrp->kn);
         psi_cgroup_free(cgrp);
-        if (cgroup_on_dfl(cgrp))
-            cgroup_rstat_exit(cgrp);
+        cgroup_rstat_exit(cgrp);
         kfree(cgrp);
     } else {
         /*
@@ -4976,8 +4988,7 @@ static void css_release_work_fn(struct work_struct *work)
         /* cgroup release path */
         TRACE_CGROUP_PATH(release, cgrp);
 
-        if (cgroup_on_dfl(cgrp))
-            cgroup_rstat_flush(cgrp);
+        cgroup_rstat_flush(cgrp);
 
         spin_lock_irq(&css_set_lock);
         for (tcgrp = cgroup_parent(cgrp); tcgrp;
@@ -5034,7 +5045,7 @@ static void init_and_link_css(struct cgroup_subsys_state *css,
         css_get(css->parent);
     }
 
-    if (cgroup_on_dfl(cgrp) && ss->css_rstat_flush)
+    if (ss->css_rstat_flush)
         list_add_rcu(&css->rstat_css_node, &cgrp->rstat_css_list);
 
     BUG_ON(cgroup_css(cgrp, ss));
@@ -5159,11 +5170,9 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
     if (ret)
         goto out_free_cgrp;
 
-    if (cgroup_on_dfl(parent)) {
-        ret = cgroup_rstat_init(cgrp);
-        if (ret)
-            goto out_cancel_ref;
-    }
+    ret = cgroup_rstat_init(cgrp);
+    if (ret)
+        goto out_cancel_ref;
 
     /* create the directory */
     kn = kernfs_create_dir(parent->kn, name, mode, cgrp);
@@ -5250,8 +5259,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
 out_kernfs_remove:
     kernfs_remove(cgrp->kn);
 out_stat_exit:
-    if (cgroup_on_dfl(parent))
-        cgroup_rstat_exit(cgrp);
+    cgroup_rstat_exit(cgrp);
 out_cancel_ref:
     percpu_ref_exit(&cgrp->self.refcnt);
 out_free_cgrp:
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index d51175cedfca..faa767a870ba 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -285,8 +285,6 @@ void __init cgroup_rstat_boot(void)
 
     for_each_possible_cpu(cpu)
         raw_spin_lock_init(per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu));
-
-    BUG_ON(cgroup_rstat_init(&cgrp_dfl_root.cgrp));
 }
 
 /*

From patchwork Fri Feb 5 18:28:03 2021
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, Shakeel Butt, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 5/8] cgroup: rstat: punt root-level optimization to individual controllers
Date: Fri, 5 Feb 2021 13:28:03 -0500
Message-Id: <20210205182806.17220-6-hannes@cmpxchg.org>
In-Reply-To: <20210205182806.17220-1-hannes@cmpxchg.org>
References: <20210205182806.17220-1-hannes@cmpxchg.org>

Current users of the rstat code can source root-level statistics from
the native counters of their respective subsystem, allowing them to
forego aggregation at the root level. This optimization is currently
implemented inside the generic rstat code, which doesn't track the
root cgroup and doesn't invoke the subsystem flush callbacks on it.

However, the memory controller cannot do this optimization, because
cgroup1 breaks out memory specifically for the local level, including
at the root level. In preparation for the memory controller switching
to rstat, move the optimization from rstat core to the controllers.

Afterwards, rstat will always track the root cgroup for changes and
invoke the subsystem callbacks on it; and it's up to the subsystem to
special-case and skip aggregation of the root cgroup if it can source
this information through other, cheaper means.

The extra cost of tracking the root cgroup is negligible: on stat
changes, we actually remove a branch that checks for the root. The
queueing for a flush touches only per-cpu data, and only the first
stat change since a flush requires a (per-cpu) lock.
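Concretely, the pattern each controller adopts looks roughly like the
sketch below (illustrative only; my_css_rstat_flush() and
my_fold_percpu_delta() are invented names, cgroup_parent() is the real
helper used for the root check):

static void my_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
{
    /*
     * Root-level stats are sourced from the subsystem's own
     * system-wide counters, so skip aggregation for the root.
     */
    if (!cgroup_parent(css->cgroup))
        return;

    /* Non-root: fold this CPU's pending delta into the hierarchy. */
    my_fold_percpu_delta(css, cpu);
}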
Signed-off-by: Johannes Weiner
Acked-by: Tejun Heo
---
 block/blk-cgroup.c    | 14 +++++++---
 kernel/cgroup/rstat.c | 60 +++++++++++++++++++++++++------------------
 2 files changed, 45 insertions(+), 29 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 02ce2058c14b..76725e1cad7f 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -766,6 +766,10 @@ static void blkcg_rstat_flush(struct cgroup_subsys_state *css, int cpu)
     struct blkcg *blkcg = css_to_blkcg(css);
     struct blkcg_gq *blkg;
 
+    /* Root-level stats are sourced from system-wide IO stats */
+    if (!cgroup_parent(css->cgroup))
+        return;
+
     rcu_read_lock();
 
     hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) {
@@ -789,6 +793,7 @@ static void blkcg_rstat_flush(struct cgroup_subsys_state *css, int cpu)
         u64_stats_update_end(&blkg->iostat.sync);
 
         /* propagate global delta to parent */
+        /* XXX: could skip this if parent is root */
         if (parent) {
             u64_stats_update_begin(&parent->iostat.sync);
             blkg_iostat_set(&delta, &blkg->iostat.cur);
@@ -803,10 +808,11 @@ static void blkcg_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 }
 
 /*
- * The rstat algorithms intentionally don't handle the root cgroup to avoid
- * incurring overhead when no cgroups are defined. For that reason,
- * cgroup_rstat_flush in blkcg_print_stat does not actually fill out the
- * iostat in the root cgroup's blkcg_gq.
+ * We source root cgroup stats from the system-wide stats to avoid
+ * tracking the same information twice and incurring overhead when no
+ * cgroups are defined. For that reason, cgroup_rstat_flush in
+ * blkcg_print_stat does not actually fill out the iostat in the root
+ * cgroup's blkcg_gq.
  *
  * However, we would like to re-use the printing code between the root and
  * non-root cgroups to the extent possible. For that reason, we simulate
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index faa767a870ba..6f50c199bf2a 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -25,13 +25,8 @@ static struct cgroup_rstat_cpu *cgroup_rstat_cpu(struct cgroup *cgrp, int cpu)
 void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
 {
     raw_spinlock_t *cpu_lock = per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu);
-    struct cgroup *parent;
     unsigned long flags;
 
-    /* nothing to do for root */
-    if (!cgroup_parent(cgrp))
-        return;
-
     /*
      * Speculative already-on-list test. This may race leading to
      * temporary inaccuracies, which is fine.
@@ -46,10 +41,10 @@ void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
     raw_spin_lock_irqsave(cpu_lock, flags);
 
     /* put @cgrp and all ancestors on the corresponding updated lists */
-    for (parent = cgroup_parent(cgrp); parent;
-         cgrp = parent, parent = cgroup_parent(cgrp)) {
+    while (true) {
         struct cgroup_rstat_cpu *rstatc = cgroup_rstat_cpu(cgrp, cpu);
-        struct cgroup_rstat_cpu *prstatc = cgroup_rstat_cpu(parent, cpu);
+        struct cgroup *parent = cgroup_parent(cgrp);
+        struct cgroup_rstat_cpu *prstatc;
 
         /*
          * Both additions and removals are bottom-up.  If a cgroup
@@ -58,8 +53,16 @@ void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
         if (rstatc->updated_next)
             break;
 
+        if (!parent) {
+            rstatc->updated_next = cgrp;
+            break;
+        }
+
+        prstatc = cgroup_rstat_cpu(parent, cpu);
         rstatc->updated_next = prstatc->updated_children;
         prstatc->updated_children = cgrp;
+
+        cgrp = parent;
     }
 
     raw_spin_unlock_irqrestore(cpu_lock, flags);
@@ -113,23 +116,26 @@ static struct cgroup *cgroup_rstat_cpu_pop_updated(struct cgroup *pos,
      */
     if (rstatc->updated_next) {
         struct cgroup *parent = cgroup_parent(pos);
-        struct cgroup_rstat_cpu *prstatc = cgroup_rstat_cpu(parent, cpu);
-        struct cgroup_rstat_cpu *nrstatc;
-        struct cgroup **nextp;
-
-        nextp = &prstatc->updated_children;
-        while (true) {
-            nrstatc = cgroup_rstat_cpu(*nextp, cpu);
-            if (*nextp == pos)
-                break;
-
-            WARN_ON_ONCE(*nextp == parent);
-            nextp = &nrstatc->updated_next;
+
+        if (parent) {
+            struct cgroup_rstat_cpu *prstatc;
+            struct cgroup **nextp;
+
+            prstatc = cgroup_rstat_cpu(parent, cpu);
+            nextp = &prstatc->updated_children;
+            while (true) {
+                struct cgroup_rstat_cpu *nrstatc;
+
+                nrstatc = cgroup_rstat_cpu(*nextp, cpu);
+                if (*nextp == pos)
+                    break;
+                WARN_ON_ONCE(*nextp == parent);
+                nextp = &nrstatc->updated_next;
+            }
+            *nextp = rstatc->updated_next;
         }
-        *nextp = rstatc->updated_next;
         rstatc->updated_next = NULL;
-
         return pos;
     }
@@ -309,11 +315,15 @@ static void cgroup_base_stat_sub(struct cgroup_base_stat *dst_bstat,
 
 static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu)
 {
-    struct cgroup *parent = cgroup_parent(cgrp);
     struct cgroup_rstat_cpu *rstatc = cgroup_rstat_cpu(cgrp, cpu);
+    struct cgroup *parent = cgroup_parent(cgrp);
     struct cgroup_base_stat cur, delta;
     unsigned seq;
 
+    /* Root-level stats are sourced from system-wide CPU stats */
+    if (!parent)
+        return;
+
     /* fetch the current per-cpu values */
     do {
         seq = __u64_stats_fetch_begin(&rstatc->bsync);
@@ -326,8 +336,8 @@ static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu)
     cgroup_base_stat_add(&cgrp->bstat, &delta);
     cgroup_base_stat_add(&rstatc->last_bstat, &delta);
 
-    /* propagate global delta to parent */
-    if (parent) {
+    /* propagate global delta to parent (unless that's root) */
+    if (cgroup_parent(parent)) {
         delta = cgrp->bstat;
         cgroup_base_stat_sub(&delta, &cgrp->last_bstat);
         cgroup_base_stat_add(&parent->bstat, &delta);

From patchwork Fri Feb 5 18:28:04 2021
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, Shakeel Butt, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 6/8] mm: memcontrol: switch to rstat
Date: Fri, 5 Feb 2021 13:28:04 -0500
Message-Id: <20210205182806.17220-7-hannes@cmpxchg.org>
In-Reply-To: <20210205182806.17220-1-hannes@cmpxchg.org>
References: <20210205182806.17220-1-hannes@cmpxchg.org>

Replace the memory controller's custom hierarchical stats code with
the generic rstat infrastructure provided by the cgroup core.

The current implementation does batched upward propagation from the
write side (i.e. as stats change). The per-cpu batches introduce an
error, which is multiplied by the number of subgroups in a tree. In
systems with many CPUs and sizable cgroup trees, the error can be
large enough to confuse users (e.g. 32 batch pages * 32 CPUs * 32
subgroups results in an error of up to 128M per stat item). This can
entirely swallow allocation bursts inside a workload that the user is
expecting to see reflected in the statistics.

In the past, we've done read-side aggregation, where a memory.stat
read would have to walk the entire subtree and add up per-cpu counts.
This became problematic with lazily-freed cgroups: we could have large
subtrees where most cgroups were entirely idle. Hence the switch to
change-driven upward propagation. Unfortunately, it needed to trade
accuracy for speed due to the write side being so hot.

Rstat combines the best of both worlds: from the write side, it
cheaply maintains a queue of cgroups that have pending changes, so
that the read side can do selective tree aggregation. This way the
reported stats will always be precise and recent as can be, while the
aggregation can skip over potentially large numbers of idle cgroups.

The way rstat works is that it implements a tree for tracking cgroups
with pending local changes, as well as a flush function that walks the
tree upwards. The controller then drives this by 1) telling rstat when
a local cgroup stat changes (e.g. mod_memcg_state) and 2) when a flush
is required to get uptodate hierarchy stats for a given subtree
(e.g. when memory.stat is read). The controller also provides a flush
callback that is called during the rstat flush walk for each cgroup
and aggregates its local per-cpu counters and propagates them upwards.

This adds a second vmstats to struct mem_cgroup (MEMCG_NR_STAT +
NR_VM_EVENT_ITEMS) to track pending subtree deltas during upward
aggregation. It removes 3 words from the per-cpu data. It eliminates
memcg_exact_page_state(), since memcg_page_state() is now exact.
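To make the division of labor concrete, a rough sketch of the three
hooks a controller wires up around rstat (illustrative only:
my_percpu_counters, my_mod_state(), my_css_rstat_flush(), my_read_stat()
and the other my_*() helpers are invented names; cgroup_rstat_updated()
and cgroup_rstat_flush() are the rstat entry points this series uses):

#include <linux/cgroup.h>
#include <linux/percpu.h>

/* Per-cpu counter plus a snapshot used to compute flush deltas. */
struct my_percpu_counters {
    long state;        /* local (CPU + cgroup) value */
    long state_prev;   /* value as of the last flush */
};

/* 1) Write side: update the local counter, mark the cgroup updated. */
static void my_mod_state(struct cgroup_subsys_state *css, long val)
{
    struct my_percpu_counters __percpu *pcpu = my_css_counters(css);

    __this_cpu_add(pcpu->state, val);
    cgroup_rstat_updated(css->cgroup, smp_processor_id());
}

/* 2) Flush callback: invoked per cpu for each cgroup with pending changes. */
static void my_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
{
    struct my_percpu_counters *c = per_cpu_ptr(my_css_counters(css), cpu);
    long delta = c->state - c->state_prev;

    c->state_prev = c->state;
    /* add delta to this cgroup's aggregate and hand it to the parent */
    my_propagate(css, delta);
}

/* 3) Read side: flush the subtree, then report the precise aggregate. */
static long my_read_stat(struct cgroup_subsys_state *css)
{
    cgroup_rstat_flush(css->cgroup);
    return my_aggregated_state(css);
}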
Signed-off-by: Johannes Weiner
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
Reviewed-by: Shakeel Butt
---
 include/linux/memcontrol.h |  67 ++++++-----
 mm/memcontrol.c            | 224 +++++++++++++++----------------------
 2 files changed, 133 insertions(+), 158 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 20ecdfae3289..a8c7a0ccc759 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -76,10 +76,27 @@ enum mem_cgroup_events_target {
 };
 
 struct memcg_vmstats_percpu {
-    long stat[MEMCG_NR_STAT];
-    unsigned long events[NR_VM_EVENT_ITEMS];
-    unsigned long nr_page_events;
-    unsigned long targets[MEM_CGROUP_NTARGETS];
+    /* Local (CPU and cgroup) page state & events */
+    long state[MEMCG_NR_STAT];
+    unsigned long events[NR_VM_EVENT_ITEMS];
+
+    /* Delta calculation for lockless upward propagation */
+    long state_prev[MEMCG_NR_STAT];
+    unsigned long events_prev[NR_VM_EVENT_ITEMS];
+
+    /* Cgroup1: threshold notifications & softlimit tree updates */
+    unsigned long nr_page_events;
+    unsigned long targets[MEM_CGROUP_NTARGETS];
+};
+
+struct memcg_vmstats {
+    /* Aggregated (CPU and subtree) page state & events */
+    long state[MEMCG_NR_STAT];
+    unsigned long events[NR_VM_EVENT_ITEMS];
+
+    /* Pending child counts during tree propagation */
+    long state_pending[MEMCG_NR_STAT];
+    unsigned long events_pending[NR_VM_EVENT_ITEMS];
 };
 
 struct mem_cgroup_reclaim_iter {
@@ -287,8 +304,8 @@ struct mem_cgroup {
 
     MEMCG_PADDING(_pad1_);
 
-    atomic_long_t vmstats[MEMCG_NR_STAT];
-    atomic_long_t vmevents[NR_VM_EVENT_ITEMS];
+    /* memory.stat */
+    struct memcg_vmstats vmstats;
 
     /* memory.events */
     atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
@@ -315,10 +332,6 @@ struct mem_cgroup {
     atomic_t moving_account;
     struct task_struct *move_lock_task;
 
-    /* Legacy local VM stats and events */
-    struct memcg_vmstats_percpu __percpu *vmstats_local;
-
-    /* Subtree VM stats and events (batched updates) */
     struct memcg_vmstats_percpu __percpu *vmstats_percpu;
 
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -942,10 +955,6 @@ static inline void mod_memcg_lruvec_state(struct lruvec *lruvec,
     local_irq_restore(flags);
 }
 
-unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
-                                            gfp_t gfp_mask,
-                                            unsigned long *total_scanned);
-
 void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
                           unsigned long count);
 
@@ -1028,6 +1037,10 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
 void mem_cgroup_split_huge_fixup(struct page *head);
 #endif
 
+unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
+                                            gfp_t gfp_mask,
+                                            unsigned long *total_scanned);
+
 #else /* CONFIG_MEMCG */
 
 #define MEM_CGROUP_ID_SHIFT 0
@@ -1136,6 +1149,10 @@ static inline bool lruvec_holds_page_lru_lock(struct page *page,
     return lruvec == &pgdat->__lruvec;
 }
 
+static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page)
+{
+}
+
 static inline struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg)
 {
     return NULL;
 }
@@ -1349,18 +1366,6 @@ static inline void mod_lruvec_kmem_state(void *p, enum node_stat_item idx,
     mod_node_page_state(page_pgdat(page), idx, val);
 }
 
-static inline
-unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
-                                            gfp_t gfp_mask,
-                                            unsigned long *total_scanned)
-{
-    return 0;
-}
-
-static inline void mem_cgroup_split_huge_fixup(struct page *head)
-{
-}
-
 static inline void count_memcg_events(struct mem_cgroup *memcg,
                                       enum vm_event_item idx,
                                       unsigned long count)
@@ -1383,8 +1388,16 @@ void count_memcg_event_mm(struct mm_struct *mm, enum vm_event_item idx)
 {
 }
 
-static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page)
+static inline void mem_cgroup_split_huge_fixup(struct page *head)
+{
+}
+
+static inline
+unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
+                                            gfp_t gfp_mask,
+                                            unsigned long *total_scanned)
 {
+    return 0;
 }
 
 #endif /* CONFIG_MEMCG */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2f97cb4cef6d..5dc0bd53b64a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -757,6 +757,11 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
     return mz;
 }
 
+static void memcg_flush_vmstats(struct mem_cgroup *memcg)
+{
+    cgroup_rstat_flush(memcg->css.cgroup);
+}
+
 /**
  * __mod_memcg_state - update cgroup memory statistics
  * @memcg: the memory cgroup
@@ -765,37 +770,17 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
  */
 void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 {
-    long x, threshold = MEMCG_CHARGE_BATCH;
-
     if (mem_cgroup_disabled())
         return;
 
-    if (memcg_stat_item_in_bytes(idx))
-        threshold <<= PAGE_SHIFT;
-
-    x = val + __this_cpu_read(memcg->vmstats_percpu->stat[idx]);
-    if (unlikely(abs(x) > threshold)) {
-        struct mem_cgroup *mi;
-
-        /*
-         * Batch local counters to keep them in sync with
-         * the hierarchical ones.
-         */
-        __this_cpu_add(memcg->vmstats_local->stat[idx], x);
-        for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-            atomic_long_add(x, &mi->vmstats[idx]);
-        x = 0;
-    }
-    __this_cpu_write(memcg->vmstats_percpu->stat[idx], x);
+    __this_cpu_add(memcg->vmstats_percpu->state[idx], val);
+    cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
 }
 
-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page_state().
- */
+/* idx can be of type enum memcg_stat_item or node_stat_item. */
 static unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 {
-    long x = atomic_long_read(&memcg->vmstats[idx]);
+    long x = READ_ONCE(memcg->vmstats.state[idx]);
 #ifdef CONFIG_SMP
     if (x < 0)
         x = 0;
@@ -803,17 +788,14 @@ static unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
     return x;
 }
 
-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page_state().
- */
+/* idx can be of type enum memcg_stat_item or node_stat_item. */
 static unsigned long memcg_page_state_local(struct mem_cgroup *memcg, int idx)
 {
     long x = 0;
     int cpu;
 
     for_each_possible_cpu(cpu)
-        x += per_cpu(memcg->vmstats_local->stat[idx], cpu);
+        x += per_cpu(memcg->vmstats_percpu->state[idx], cpu);
 #ifdef CONFIG_SMP
     if (x < 0)
         x = 0;
@@ -936,30 +918,16 @@ void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
 void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
                           unsigned long count)
 {
-    unsigned long x;
-
     if (mem_cgroup_disabled())
         return;
 
-    x = count + __this_cpu_read(memcg->vmstats_percpu->events[idx]);
-    if (unlikely(x > MEMCG_CHARGE_BATCH)) {
-        struct mem_cgroup *mi;
-
-        /*
-         * Batch local counters to keep them in sync with
-         * the hierarchical ones.
-         */
-        __this_cpu_add(memcg->vmstats_local->events[idx], x);
-        for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-            atomic_long_add(x, &mi->vmevents[idx]);
-        x = 0;
-    }
-    __this_cpu_write(memcg->vmstats_percpu->events[idx], x);
+    __this_cpu_add(memcg->vmstats_percpu->events[idx], count);
+    cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
 }
 
 static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
 {
-    return atomic_long_read(&memcg->vmevents[event]);
+    return READ_ONCE(memcg->vmstats.events[event]);
 }
 
 static unsigned long memcg_events_local(struct mem_cgroup *memcg, int event)
@@ -968,7 +936,7 @@ static unsigned long memcg_events_local(struct mem_cgroup *memcg, int event)
     int cpu;
 
     for_each_possible_cpu(cpu)
-        x += per_cpu(memcg->vmstats_local->events[event], cpu);
+        x += per_cpu(memcg->vmstats_percpu->events[event], cpu);
 
     return x;
 }
@@ -1631,6 +1599,7 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
      *
      * Current memory state:
      */
+    memcg_flush_vmstats(memcg);
 
     for (i = 0; i < ARRAY_SIZE(memory_stats); i++) {
         u64 size;
@@ -2450,22 +2419,11 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
     drain_stock(stock);
 
     for_each_mem_cgroup(memcg) {
-        struct memcg_vmstats_percpu *statc;
         int i;
 
-        statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
-
-        for (i = 0; i < MEMCG_NR_STAT; i++) {
+        for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
             int nid;
 
-            if (statc->stat[i]) {
-                mod_memcg_state(memcg, i, statc->stat[i]);
-                statc->stat[i] = 0;
-            }
-
-            if (i >= NR_VM_NODE_STAT_ITEMS)
-                continue;
-
             for_each_node(nid) {
                 struct batched_lruvec_stat *lstatc;
                 struct mem_cgroup_per_node *pn;
@@ -2484,13 +2442,6 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
                 }
             }
         }
-
-        for (i = 0; i < NR_VM_EVENT_ITEMS; i++) {
-            if (statc->events[i]) {
-                count_memcg_events(memcg, i, statc->events[i]);
-                statc->events[i] = 0;
-            }
-        }
     }
 
     return 0;
@@ -3618,6 +3569,8 @@ static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
 {
     unsigned long val;
 
+    memcg_flush_vmstats(memcg);
+
     if (mem_cgroup_is_root(memcg)) {
         val = memcg_page_state(memcg, NR_FILE_PAGES) +
             memcg_page_state(memcg, NR_ANON_MAPPED);
@@ -3683,26 +3636,15 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
     }
 }
 
-static void memcg_flush_percpu_vmstats(struct mem_cgroup *memcg)
+static void memcg_flush_lruvec_page_state(struct mem_cgroup *memcg)
 {
-    unsigned long stat[MEMCG_NR_STAT] = {0};
-    struct mem_cgroup *mi;
-    int node, cpu, i;
-
-    for_each_online_cpu(cpu)
-        for (i = 0; i < MEMCG_NR_STAT; i++)
-            stat[i] += per_cpu(memcg->vmstats_percpu->stat[i], cpu);
-
-    for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-        for (i = 0; i < MEMCG_NR_STAT; i++)
-            atomic_long_add(stat[i], &mi->vmstats[i]);
+    int node;
 
     for_each_node(node) {
         struct mem_cgroup_per_node *pn = memcg->nodeinfo[node];
+        unsigned long stat[NR_VM_NODE_STAT_ITEMS] = { 0 };
         struct mem_cgroup_per_node *pi;
-
-        for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
-            stat[i] = 0;
+        int cpu, i;
 
         for_each_online_cpu(cpu)
             for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
@@ -3715,25 +3657,6 @@ static void memcg_flush_percpu_vmstats(struct mem_cgroup *memcg)
     }
 }
 
-static void memcg_flush_percpu_vmevents(struct mem_cgroup *memcg)
-{
-    unsigned long events[NR_VM_EVENT_ITEMS];
-    struct mem_cgroup *mi;
-    int cpu, i;
-
-    for (i = 0; i < NR_VM_EVENT_ITEMS; i++)
-        events[i] = 0;
-
-    for_each_online_cpu(cpu)
-        for (i = 0; i < NR_VM_EVENT_ITEMS; i++)
-            events[i] += per_cpu(memcg->vmstats_percpu->events[i],
-                                 cpu);
-
-    for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
-        for (i = 0; i < NR_VM_EVENT_ITEMS; i++)
-            atomic_long_add(events[i], &mi->vmevents[i]);
-}
-
 #ifdef CONFIG_MEMCG_KMEM
 static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
@@ -4050,6 +3973,8 @@ static int memcg_numa_stat_show(struct seq_file *m, void *v)
     int nid;
     struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 
+    memcg_flush_vmstats(memcg);
+
     for (stat = stats; stat < stats + ARRAY_SIZE(stats); stat++) {
         seq_printf(m, "%s=%lu", stat->name,
                    mem_cgroup_nr_lru_pages(memcg, stat->lru_mask,
@@ -4120,6 +4045,8 @@ static int memcg_stat_show(struct seq_file *m, void *v)
 
     BUILD_BUG_ON(ARRAY_SIZE(memcg1_stat_names) != ARRAY_SIZE(memcg1_stats));
 
+    memcg_flush_vmstats(memcg);
+
     for (i = 0; i < ARRAY_SIZE(memcg1_stats); i++) {
         unsigned long nr;
 
@@ -4596,22 +4523,6 @@ struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
     return &memcg->cgwb_domain;
 }
 
-/*
- * idx can be of type enum memcg_stat_item or node_stat_item.
- * Keep in sync with memcg_exact_page().
- */
-static unsigned long memcg_exact_page_state(struct mem_cgroup *memcg, int idx)
-{
-    long x = atomic_long_read(&memcg->vmstats[idx]);
-    int cpu;
-
-    for_each_online_cpu(cpu)
-        x += per_cpu_ptr(memcg->vmstats_percpu, cpu)->stat[idx];
-    if (x < 0)
-        x = 0;
-    return x;
-}
-
 /**
  * mem_cgroup_wb_stats - retrieve writeback related stats from its memcg
  * @wb: bdi_writeback in question
@@ -4637,13 +4548,14 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
     struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
     struct mem_cgroup *parent;
 
-    *pdirty = memcg_exact_page_state(memcg, NR_FILE_DIRTY);
+    memcg_flush_vmstats(memcg);
 
-    *pwriteback = memcg_exact_page_state(memcg, NR_WRITEBACK);
-    *pfilepages = memcg_exact_page_state(memcg, NR_INACTIVE_FILE) +
-            memcg_exact_page_state(memcg, NR_ACTIVE_FILE);
-    *pheadroom = PAGE_COUNTER_MAX;
+    *pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
+    *pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
+    *pfilepages = memcg_page_state(memcg, NR_INACTIVE_FILE) +
+            memcg_page_state(memcg, NR_ACTIVE_FILE);
+    *pheadroom = PAGE_COUNTER_MAX;
 
     while ((parent = parent_mem_cgroup(memcg))) {
         unsigned long ceiling = min(READ_ONCE(memcg->memory.max),
                                     READ_ONCE(memcg->memory.high));
@@ -5275,7 +5187,6 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg)
     for_each_node(node)
         free_mem_cgroup_per_node_info(memcg, node);
     free_percpu(memcg->vmstats_percpu);
-    free_percpu(memcg->vmstats_local);
     kfree(memcg);
 }
 
@@ -5283,11 +5194,10 @@ static void mem_cgroup_free(struct mem_cgroup *memcg)
 {
     memcg_wb_domain_exit(memcg);
     /*
-     * Flush percpu vmstats and vmevents to guarantee the value correctness
-     * on parent's and all ancestor levels.
+     * Flush percpu lruvec stats to guarantee the value
+     * correctness on parent's and all ancestor levels.
*/ - memcg_flush_percpu_vmstats(memcg); - memcg_flush_percpu_vmevents(memcg); + memcg_flush_lruvec_page_state(memcg); __mem_cgroup_free(memcg); } @@ -5314,11 +5224,6 @@ static struct mem_cgroup *mem_cgroup_alloc(void) goto fail; } - memcg->vmstats_local = alloc_percpu_gfp(struct memcg_vmstats_percpu, - GFP_KERNEL_ACCOUNT); - if (!memcg->vmstats_local) - goto fail; - memcg->vmstats_percpu = alloc_percpu_gfp(struct memcg_vmstats_percpu, GFP_KERNEL_ACCOUNT); if (!memcg->vmstats_percpu) @@ -5518,6 +5423,62 @@ static void mem_cgroup_css_reset(struct cgroup_subsys_state *css) memcg_wb_domain_size_changed(memcg); } +static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu) +{ + struct mem_cgroup *memcg = mem_cgroup_from_css(css); + struct mem_cgroup *parent = parent_mem_cgroup(memcg); + struct memcg_vmstats_percpu *statc; + long delta, v; + int i; + + statc = per_cpu_ptr(memcg->vmstats_percpu, cpu); + + for (i = 0; i < MEMCG_NR_STAT; i++) { + /* + * Collect the aggregated propagation counts of groups + * below us. We're in a per-cpu loop here and this is + * a global counter, so the first cycle will get them. + */ + delta = memcg->vmstats.state_pending[i]; + if (delta) + memcg->vmstats.state_pending[i] = 0; + + /* Add CPU changes on this level since the last flush */ + v = READ_ONCE(statc->state[i]); + if (v != statc->state_prev[i]) { + delta += v - statc->state_prev[i]; + statc->state_prev[i] = v; + } + + if (!delta) + continue; + + /* Aggregate counts on this level and propagate upwards */ + memcg->vmstats.state[i] += delta; + if (parent) + parent->vmstats.state_pending[i] += delta; + } + + for (i = 0; i < NR_VM_EVENT_ITEMS; i++) { + delta = memcg->vmstats.events_pending[i]; + if (delta) + memcg->vmstats.events_pending[i] = 0; + + v = READ_ONCE(statc->events[i]); + if (v != statc->events_prev[i]) { + delta += v - statc->events_prev[i]; + statc->events_prev[i] = v; + } + + if (!delta) + continue; + + memcg->vmstats.events[i] += delta; + if (parent) + parent->vmstats.events_pending[i] += delta; + } +} + #ifdef CONFIG_MMU /* Handlers for move charge at task migration. 
 */
 static int mem_cgroup_do_precharge(unsigned long count)
@@ -6571,6 +6532,7 @@ struct cgroup_subsys memory_cgrp_subsys = {
 	.css_released = mem_cgroup_css_released,
 	.css_free = mem_cgroup_css_free,
 	.css_reset = mem_cgroup_css_reset,
+	.css_rstat_flush = mem_cgroup_css_rstat_flush,
 	.can_attach = mem_cgroup_can_attach,
 	.cancel_attach = mem_cgroup_cancel_attach,
 	.post_attach = mem_cgroup_move_task,

From patchwork Fri Feb 5 18:28:05 2021
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 12070719
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, Shakeel Butt, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 7/8] mm: memcontrol: consolidate lruvec stat flushing
Date: Fri, 5 Feb 2021 13:28:05 -0500
Message-Id: <20210205182806.17220-8-hannes@cmpxchg.org>
In-Reply-To: <20210205182806.17220-1-hannes@cmpxchg.org>
References: <20210205182806.17220-1-hannes@cmpxchg.org>

There are two functions to flush the per-cpu data of an lruvec into the
rest of the cgroup tree: when the cgroup is being freed, and when a CPU
disappears during hotplug. The difference is whether all CPUs or just
one is being collected, but the rest of the flushing code is the same.
Merge them into one function and share the common code.
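In rough outline, the shared helper and its two callers behave like the
following standalone C model. This is only an illustrative sketch with
made-up type and function names (not the kernel code in the diff below):
one routine flushes a single CPU's per-node deltas up the parent chain,
the hotunplug path calls it for the departing CPU, and the teardown path
loops it over every online CPU.

/* Simplified, hypothetical model of the consolidation - not kernel code. */
#include <stdio.h>

#define NR_CPUS   4
#define NR_NODES  2
#define NR_STATS  3

struct group {
	struct group *parent;
	long percpu[NR_CPUS][NR_NODES][NR_STATS];	/* batched per-CPU deltas */
	long stat[NR_NODES][NR_STATS];			/* aggregated hierarchy counts */
};

/* Flush one CPU's deltas of @g into @g and all of its ancestors. */
static void flush_cpu(struct group *g, int cpu)
{
	for (int nid = 0; nid < NR_NODES; nid++) {
		long stat[NR_STATS];

		for (int i = 0; i < NR_STATS; i++) {
			stat[i] = g->percpu[cpu][nid][i];
			g->percpu[cpu][nid][i] = 0;
		}
		for (struct group *p = g; p; p = p->parent)
			for (int i = 0; i < NR_STATS; i++)
				p->stat[nid][i] += stat[i];
	}
}

/* Hotunplug path: flush only the CPU that is going away. */
static void cpu_dead(struct group *g, int cpu)
{
	flush_cpu(g, cpu);
}

/* Teardown path: flush every online CPU before the group is freed. */
static void group_free(struct group *g)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		flush_cpu(g, cpu);
}

int main(void)
{
	struct group parent = { 0 }, child = { .parent = &parent };

	child.percpu[1][0][2] = 5;	/* pretend CPU 1 batched five pages on node 0 */
	cpu_dead(&child, 1);		/* CPU 1 goes away */
	group_free(&child);		/* later, the child group is deleted */
	printf("parent stat = %ld\n", parent.stat[0][2]);	/* prints 5 */
	return 0;
}

The actual patch differs in the details (per-node lruvec_stat_cpu arrays,
atomic_long_add for the upward propagation), but the caller structure is
the same as in this sketch.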
Signed-off-by: Johannes Weiner Reviewed-by: Shakeel Butt Acked-by: Michal Hocko --- mm/memcontrol.c | 74 +++++++++++++++++++------------------------------ 1 file changed, 28 insertions(+), 46 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 5dc0bd53b64a..490357945f2c 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2410,39 +2410,39 @@ static void drain_all_stock(struct mem_cgroup *root_memcg) mutex_unlock(&percpu_charge_mutex); } -static int memcg_hotplug_cpu_dead(unsigned int cpu) +static void memcg_flush_lruvec_page_state(struct mem_cgroup *memcg, int cpu) { - struct memcg_stock_pcp *stock; - struct mem_cgroup *memcg; - - stock = &per_cpu(memcg_stock, cpu); - drain_stock(stock); + int nid; - for_each_mem_cgroup(memcg) { + for_each_node(nid) { + struct mem_cgroup_per_node *pn = memcg->nodeinfo[nid]; + unsigned long stat[NR_VM_NODE_STAT_ITEMS]; + struct batched_lruvec_stat *lstatc; int i; + lstatc = per_cpu_ptr(pn->lruvec_stat_cpu, cpu); for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) { - int nid; + stat[i] = lstatc->count[i]; + lstatc->count[i] = 0; + } - for_each_node(nid) { - struct batched_lruvec_stat *lstatc; - struct mem_cgroup_per_node *pn; - long x; + do { + for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) + atomic_long_add(stat[i], &pn->lruvec_stat[i]); + } while ((pn = parent_nodeinfo(pn, nid))); + } +} - pn = memcg->nodeinfo[nid]; - lstatc = per_cpu_ptr(pn->lruvec_stat_cpu, cpu); +static int memcg_hotplug_cpu_dead(unsigned int cpu) +{ + struct memcg_stock_pcp *stock; + struct mem_cgroup *memcg; - x = lstatc->count[i]; - lstatc->count[i] = 0; + stock = &per_cpu(memcg_stock, cpu); + drain_stock(stock); - if (x) { - do { - atomic_long_add(x, &pn->lruvec_stat[i]); - } while ((pn = parent_nodeinfo(pn, nid))); - } - } - } - } + for_each_mem_cgroup(memcg) + memcg_flush_lruvec_page_state(memcg, cpu); return 0; } @@ -3636,27 +3636,6 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css, } } -static void memcg_flush_lruvec_page_state(struct mem_cgroup *memcg) -{ - int node; - - for_each_node(node) { - struct mem_cgroup_per_node *pn = memcg->nodeinfo[node]; - unsigned long stat[NR_VM_NODE_STAT_ITEMS] = { 0 }; - struct mem_cgroup_per_node *pi; - int cpu, i; - - for_each_online_cpu(cpu) - for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) - stat[i] += per_cpu( - pn->lruvec_stat_cpu->count[i], cpu); - - for (pi = pn; pi; pi = parent_nodeinfo(pi, node)) - for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) - atomic_long_add(stat[i], &pi->lruvec_stat[i]); - } -} - #ifdef CONFIG_MEMCG_KMEM static int memcg_online_kmem(struct mem_cgroup *memcg) { @@ -5192,12 +5171,15 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg) static void mem_cgroup_free(struct mem_cgroup *memcg) { + int cpu; + memcg_wb_domain_exit(memcg); /* * Flush percpu lruvec stats to guarantee the value * correctness on parent's and all ancestor levels. 
 */
-	memcg_flush_lruvec_page_state(memcg);
+	for_each_online_cpu(cpu)
+		memcg_flush_lruvec_page_state(memcg, cpu);
 	__mem_cgroup_free(memcg);
 }

From patchwork Fri Feb 5 18:28:06 2021
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 12070721
From: Johannes Weiner
To: Andrew Morton, Tejun Heo
Cc: Michal Hocko, Roman Gushchin, Shakeel Butt, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 8/8] kselftests: cgroup: update kmem test for new vmstat implementation
Date: Fri, 5 Feb 2021 13:28:06 -0500
Message-Id: <20210205182806.17220-9-hannes@cmpxchg.org>
In-Reply-To: <20210205182806.17220-1-hannes@cmpxchg.org>
References: <20210205182806.17220-1-hannes@cmpxchg.org>

With memcg having switched to rstat, memory.stat output is precise.
Update the cgroup selftest to reflect the expectations and error
tolerances of the new implementation. Also add newly tracked types of
memory to the memory.stat side of the equation, since they're included
in memory.current and could throw false positives.

Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
---
 tools/testing/selftests/cgroup/test_kmem.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
index 0941aa16157e..22b31ebb3513 100644
--- a/tools/testing/selftests/cgroup/test_kmem.c
+++ b/tools/testing/selftests/cgroup/test_kmem.c
@@ -19,12 +19,12 @@
 /*
- * Memory cgroup charging and vmstat data aggregation is performed using
- * percpu batches 32 pages big (look at MEMCG_CHARGE_BATCH). So the maximum
- * discrepancy between charge and vmstat entries is number of cpus multiplied
- * by 32 pages multiplied by 2.
+ * Memory cgroup charging is performed using percpu batches 32 pages
+ * big (look at MEMCG_CHARGE_BATCH), whereas memory.stat is exact. So
+ * the maximum discrepancy between charge and vmstat entries is number
+ * of cpus multiplied by 32 pages.
 */
-#define MAX_VMSTAT_ERROR (4096 * 32 * 2 * get_nprocs())
+#define MAX_VMSTAT_ERROR (4096 * 32 * get_nprocs())
 static int alloc_dcache(const char *cgroup, void *arg)
@@ -162,7 +162,7 @@ static int cg_run_in_subcgroups(const char *parent,
  */
 static int test_kmem_memcg_deletion(const char *root)
 {
-	long current, slab, anon, file, kernel_stack, sum;
+	long current, slab, anon, file, kernel_stack, pagetables, percpu, sock, sum;
 	int ret = KSFT_FAIL;
 	char *parent;
@@ -184,11 +184,14 @@ static int test_kmem_memcg_deletion(const char *root)
 	anon = cg_read_key_long(parent, "memory.stat", "anon ");
 	file = cg_read_key_long(parent, "memory.stat", "file ");
 	kernel_stack = cg_read_key_long(parent, "memory.stat", "kernel_stack ");
+	pagetables = cg_read_key_long(parent, "memory.stat", "pagetables ");
+	percpu = cg_read_key_long(parent, "memory.stat", "percpu ");
+	sock = cg_read_key_long(parent, "memory.stat", "sock ");
 	if (current < 0 || slab < 0 || anon < 0 || file < 0 ||
-	    kernel_stack < 0)
+	    kernel_stack < 0 || pagetables < 0 || percpu < 0 || sock < 0)
 		goto cleanup;
-	sum = slab + anon + file + kernel_stack;
+	sum = slab + anon + file + kernel_stack + pagetables + percpu + sock;
 	if (abs(sum - current) < MAX_VMSTAT_ERROR) {
 		ret = KSFT_PASS;
 	} else {
@@ -198,6 +201,9 @@ static int test_kmem_memcg_deletion(const char *root)
 		printf("anon = %ld\n", anon);
 		printf("file = %ld\n", file);
 		printf("kernel_stack = %ld\n", kernel_stack);
+		printf("pagetables = %ld\n", pagetables);
+		printf("percpu = %ld\n", percpu);
+		printf("sock = %ld\n", sock);
 	}

 cleanup: