From patchwork Thu Aug 27 22:58:42 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 11741969
From: Roman Gushchin
To: Andrew Morton
Cc: Shakeel Butt, Johannes Weiner, Michal Hocko, Roman Gushchin
Subject: [PATCH v1 3/4] mm: kmem: prepare remote memcg charging infra for interrupt contexts
Date: Thu, 27 Aug 2020 15:58:42 -0700
Message-ID: <20200827225843.1270629-4-guro@fb.com>
In-Reply-To: <20200827225843.1270629-1-guro@fb.com>
References: <20200827225843.1270629-1-guro@fb.com>
Sender: owner-linux-mm@kvack.org

The remote memcg charging API uses current->active_memcg to store the
currently active memory cgroup, which overrides the memory cgroup of
the current process. This works well in normal contexts, but not in
interrupt contexts: if an interrupt occurs while a section with an
active memcg set is executing, all allocations made inside the
interrupt will be charged to that active memcg (once accounting is
enabled for allocations from an interrupt context). Because the
interrupt might have no relation to the active memcg set outside, this
is wrong from the accounting perspective.

To resolve this problem, let's add a global percpu int_active_memcg
variable, which will store the active memory cgroup to be used from
interrupt contexts. set_active_memcg() will transparently use
current->active_memcg or int_active_memcg depending on the context.

To make the read part simple and transparent for the caller, let's
introduce two new functions:
  - struct mem_cgroup *active_memcg(void),
  - struct mem_cgroup *get_active_memcg(void).

They return the active memcg if it's set, hiding all implementation
details, such as where to read it from in the current context.

Signed-off-by: Roman Gushchin
Reviewed-by: Shakeel Butt
---
 include/linux/sched/mm.h | 13 +++++++++--
 mm/memcontrol.c          | 48 ++++++++++++++++++++++++++++------------
 2 files changed, 45 insertions(+), 16 deletions(-)

diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 4c69a4349ac1..030a1cf77b8a 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -304,6 +304,7 @@ static inline void memalloc_nocma_restore(unsigned int flags)
 #endif
 
 #ifdef CONFIG_MEMCG
+DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg);
 /**
  * set_active_memcg - Starts the remote memcg charging scope.
  * @memcg: memcg to charge.
@@ -318,8 +319,16 @@ static inline void memalloc_nocma_restore(unsigned int flags)
 static inline struct mem_cgroup *
 set_active_memcg(struct mem_cgroup *memcg)
 {
-	struct mem_cgroup *old = current->active_memcg;
-	current->active_memcg = memcg;
+	struct mem_cgroup *old;
+
+	if (in_interrupt()) {
+		old = this_cpu_read(int_active_memcg);
+		this_cpu_write(int_active_memcg, memcg);
+	} else {
+		old = current->active_memcg;
+		current->active_memcg = memcg;
+	}
+
 	return old;
 }
 #else
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5d847257a639..a51a6066079e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -73,6 +73,9 @@ EXPORT_SYMBOL(memory_cgrp_subsys);
 
 struct mem_cgroup *root_mem_cgroup __read_mostly;
 
+/* Active memory cgroup to use from an interrupt context */
+DEFINE_PER_CPU(struct mem_cgroup *, int_active_memcg);
+
 /* Socket memory accounting disabled? */
 static bool cgroup_memory_nosocket;
 
@@ -1069,26 +1072,43 @@ struct mem_cgroup *get_mem_cgroup_from_page(struct page *page)
 }
 EXPORT_SYMBOL(get_mem_cgroup_from_page);
 
-/**
- * If current->active_memcg is non-NULL, do not fallback to current->mm->memcg.
- */
-static __always_inline struct mem_cgroup *get_mem_cgroup_from_current(void)
+static __always_inline struct mem_cgroup *active_memcg(void)
 {
-	if (memcg_kmem_bypass())
-		return NULL;
+	if (in_interrupt())
+		return this_cpu_read(int_active_memcg);
+	else
+		return current->active_memcg;
+}
 
-	if (unlikely(current->active_memcg)) {
-		struct mem_cgroup *memcg;
+static __always_inline struct mem_cgroup *get_active_memcg(void)
+{
+	struct mem_cgroup *memcg;
 
-		rcu_read_lock();
+	rcu_read_lock();
+	memcg = active_memcg();
+	if (memcg) {
 		/* current->active_memcg must hold a ref. */
-		if (WARN_ON_ONCE(!css_tryget(&current->active_memcg->css)))
+		if (WARN_ON_ONCE(!css_tryget(&memcg->css)))
 			memcg = root_mem_cgroup;
 		else
 			memcg = current->active_memcg;
-		rcu_read_unlock();
-		return memcg;
 	}
+	rcu_read_unlock();
+
+	return memcg;
+}
+
+/**
+ * If active memcg is set, do not fallback to current->mm->memcg.
+ */
+static __always_inline struct mem_cgroup *get_mem_cgroup_from_current(void)
+{
+	if (memcg_kmem_bypass())
+		return NULL;
+
+	if (unlikely(active_memcg()))
+		return get_active_memcg();
+
 	return get_mem_cgroup_from_mm(current->mm);
 }
 
@@ -2920,8 +2940,8 @@ __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 		return NULL;
 
 	rcu_read_lock();
-	if (unlikely(current->active_memcg))
-		memcg = rcu_dereference(current->active_memcg);
+	if (unlikely(active_memcg()))
+		memcg = active_memcg();
 	else
 		memcg = mem_cgroup_from_task(current);
 
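
The context-dependent slot selection in set_active_memcg() and
active_memcg() can be tried outside the kernel with a small user-space
model. The sketch below is only an illustration under loud assumptions:
every model_* name is hypothetical, a plain global stands in for the
percpu int_active_memcg on a single CPU, a boolean flag stands in for
in_interrupt(), and reference counting and RCU are left out entirely.
It is not kernel code and is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel types and state; not kernel code. */
struct mem_cgroup {
	const char *name;
};

struct model_task {
	struct mem_cgroup *active_memcg;
};

static struct model_task model_current;           /* models `current` */
static struct mem_cgroup *model_int_active_memcg; /* models the percpu slot (one CPU) */
static bool model_in_interrupt;                   /* models in_interrupt() */

/* Same shape as the patched set_active_memcg(): pick the slot by context. */
static struct mem_cgroup *model_set_active_memcg(struct mem_cgroup *memcg)
{
	struct mem_cgroup *old;

	if (model_in_interrupt) {
		old = model_int_active_memcg;
		model_int_active_memcg = memcg;
	} else {
		old = model_current.active_memcg;
		model_current.active_memcg = memcg;
	}

	return old;
}

/* Same shape as the new active_memcg(): read the slot for this context. */
static struct mem_cgroup *model_active_memcg(void)
{
	if (model_in_interrupt)
		return model_int_active_memcg;
	return model_current.active_memcg;
}

static const char *model_name(const struct mem_cgroup *memcg)
{
	return memcg ? memcg->name : "(none)";
}

/* Walks through a task-context scope that is interrupted once. */
static void model_demo(void)
{
	struct mem_cgroup task_cg = { .name = "task_cg" };
	struct mem_cgroup irq_cg = { .name = "irq_cg" };
	struct mem_cgroup *old, *irq_old;

	old = model_set_active_memcg(&task_cg);    /* task opens a scope */
	printf("task: %s\n", model_name(model_active_memcg()));

	model_in_interrupt = true;                 /* interrupt arrives */
	printf("irq, no scope: %s\n", model_name(model_active_memcg()));

	irq_old = model_set_active_memcg(&irq_cg); /* handler opens its own scope */
	printf("irq, own scope: %s\n", model_name(model_active_memcg()));
	model_set_active_memcg(irq_old);           /* handler closes its scope */
	model_in_interrupt = false;                /* interrupt returns */

	assert(model_active_memcg() == &task_cg);  /* task scope is untouched */
	model_set_active_memcg(old);               /* task closes its scope */
}
```

Calling model_demo() from a main() prints "task: task_cg", then
"irq, no scope: (none)", then "irq, own scope: irq_cg". The second line
is the point of the change: in this model an interrupt that sets no
scope of its own no longer sees the scope of the task it interrupted.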