From patchwork Mon Feb 21 18:25:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 12754043 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B56CC433EF for ; Mon, 21 Feb 2022 18:26:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 172C18D0007; Mon, 21 Feb 2022 13:26:12 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F30268D000A; Mon, 21 Feb 2022 13:26:11 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C6B1C8D0007; Mon, 21 Feb 2022 13:26:11 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0187.hostedemail.com [216.40.44.187]) by kanga.kvack.org (Postfix) with ESMTP id 90C528D0009 for ; Mon, 21 Feb 2022 13:26:11 -0500 (EST) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 536229F87D for ; Mon, 21 Feb 2022 18:26:11 +0000 (UTC) X-FDA: 79167616542.10.D47E7C3 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by imf30.hostedemail.com (Postfix) with ESMTP id BDEE580007 for ; Mon, 21 Feb 2022 18:26:10 +0000 (UTC) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1645467968; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=afPsn9rdRXuPIOkd2LrdE/DTtbz9dmfBZBL5Zj4hTBY=; b=bAMIvvaS8izzIXLN/yKmx/SZzf4XD0A9mjhtYK+4wQumis5YbRtI0v9S5LCDGA8BNHa6ZR UJ9dgdR9iF6/R/ycwurKikTQNRky6WKB6qABns12M2VBcmZWFE9kChEUzYZiF69FmHYfVQ 07EhbQUqvu2CKdjsyBHWGWuUniPdN9/3uqbNJII4gCpy5Z+0CPMwwz8MWKKz2xdxITvlMb cq1fwR4FIid/XUOGfqSg5L+UZ9jCEkzW4Uh8Xz9/+8BnPmNApYDF2F8xVQfP8zpC7DTzq/ 91dz55kWZgS1t8+FXU81ODR23AvRcZKVSBO2GlMnfUhmWd58KzYZ0P0Bdgzkgg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1645467968; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=afPsn9rdRXuPIOkd2LrdE/DTtbz9dmfBZBL5Zj4hTBY=; b=LEsm8zvj0QFiLV4gnO0Wd/aFp1laeYbe6IOV5T48bfuHjKhQ+uOzREcdnLWdbZTPBQEMT2 dLcDKKTTIFfoOhAg== To: cgroups@vger.kernel.org, linux-mm@kvack.org Cc: Andrew Morton , Johannes Weiner , Michal Hocko , =?utf-8?q?Michal_Koutn=C3=BD?= , Peter Zijlstra , Thomas Gleixner , Vladimir Davydov , Waiman Long , Sebastian Andrzej Siewior , kernel test robot Subject: [PATCH v4 5/6] mm/memcg: Protect memcg_stock with a local_lock_t Date: Mon, 21 Feb 2022 19:25:39 +0100 Message-Id: <20220221182540.380526-6-bigeasy@linutronix.de> In-Reply-To: <20220221182540.380526-1-bigeasy@linutronix.de> References: <20220221182540.380526-1-bigeasy@linutronix.de> MIME-Version: 1.0 X-Stat-Signature: hnw8x4eispoex97jdory1dzsze3mwnd4 X-Rspam-User: Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linutronix.de header.s=2020 header.b=bAMIvvaS; dkim=pass header.d=linutronix.de header.s=2020e header.b=LEsm8zvj; spf=pass (imf30.hostedemail.com: domain of bigeasy@linutronix.de designates 193.142.43.55 as permitted sender) smtp.mailfrom=bigeasy@linutronix.de; dmarc=pass (policy=none) header.from=linutronix.de X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: BDEE580007 X-HE-Tag: 1645467970-989827 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The members of the per-CPU structure memcg_stock_pcp are protected by disabling interrupts. This is not working on PREEMPT_RT because it creates atomic context in which actions are performed which require preemptible context. One example is obj_cgroup_release(). The IRQ-disable sections can be replaced with local_lock_t which preserves the explicit disabling of interrupts while keeps the code preemptible on PREEMPT_RT. drain_obj_stock() drops a reference on obj_cgroup which leads to an invocation of obj_cgroup_release() if it is the last object. This in turn leads to recursive locking of the local_lock_t. To avoid this, obj_cgroup_release() is invoked outside of the locked section. obj_cgroup_uncharge_pages() can be invoked with the local_lock_t acquired and without it. This will lead later to a recursion in refill_stock(). To avoid the locking recursion provide obj_cgroup_uncharge_pages_locked() which uses the locked version of refill_stock(). - Replace disabling interrupts for memcg_stock with a local_lock_t. - Let drain_obj_stock() return the old struct obj_cgroup which is passed to obj_cgroup_put() outside of the locked section. - Provide obj_cgroup_uncharge_pages_locked() which uses the locked version of refill_stock() to avoid recursive locking in drain_obj_stock(). Link: https://lkml.kernel.org/r/20220209014709.GA26885@xsang-OptiPlex-9020 Reported-by: kernel test robot Signed-off-by: Sebastian Andrzej Siewior Acked-by: Michal Hocko --- mm/memcontrol.c | 55 ++++++++++++++++++++++++++++++------------------- 1 file changed, 34 insertions(+), 21 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index cd3950256519d..3d7ccb104374c 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2128,6 +2128,7 @@ void unlock_page_memcg(struct page *page) } struct memcg_stock_pcp { + local_lock_t stock_lock; struct mem_cgroup *cached; /* this never be root cgroup */ unsigned int nr_pages; @@ -2143,18 +2144,21 @@ struct memcg_stock_pcp { unsigned long flags; #define FLUSHING_CACHED_CHARGE 0 }; -static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock); +static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = { + .stock_lock = INIT_LOCAL_LOCK(stock_lock), +}; static DEFINE_MUTEX(percpu_charge_mutex); #ifdef CONFIG_MEMCG_KMEM -static void drain_obj_stock(struct memcg_stock_pcp *stock); +static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock); static bool obj_stock_flush_required(struct memcg_stock_pcp *stock, struct mem_cgroup *root_memcg); static void memcg_account_kmem(struct mem_cgroup *memcg, int nr_pages); #else -static inline void drain_obj_stock(struct memcg_stock_pcp *stock) +static inline struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock) { + return NULL; } static bool obj_stock_flush_required(struct memcg_stock_pcp *stock, struct mem_cgroup *root_memcg) @@ -2186,7 +2190,7 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages) if (nr_pages > MEMCG_CHARGE_BATCH) return ret; - local_irq_save(flags); + local_lock_irqsave(&memcg_stock.stock_lock, flags); stock = this_cpu_ptr(&memcg_stock); if (memcg == stock->cached && stock->nr_pages >= nr_pages) { @@ -2194,7 +2198,7 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages) ret = true; } - local_irq_restore(flags); + local_unlock_irqrestore(&memcg_stock.stock_lock, flags); return ret; } @@ -2223,6 +2227,7 @@ static void drain_stock(struct memcg_stock_pcp *stock) static void drain_local_stock(struct work_struct *dummy) { struct memcg_stock_pcp *stock; + struct obj_cgroup *old = NULL; unsigned long flags; /* @@ -2230,14 +2235,16 @@ static void drain_local_stock(struct work_struct *dummy) * drain_stock races is that we always operate on local CPU stock * here with IRQ disabled */ - local_irq_save(flags); + local_lock_irqsave(&memcg_stock.stock_lock, flags); stock = this_cpu_ptr(&memcg_stock); - drain_obj_stock(stock); + old = drain_obj_stock(stock); drain_stock(stock); clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags); - local_irq_restore(flags); + local_unlock_irqrestore(&memcg_stock.stock_lock, flags); + if (old) + obj_cgroup_put(old); } /* @@ -2264,9 +2271,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) { unsigned long flags; - local_irq_save(flags); + local_lock_irqsave(&memcg_stock.stock_lock, flags); __refill_stock(memcg, nr_pages); - local_irq_restore(flags); + local_unlock_irqrestore(&memcg_stock.stock_lock, flags); } /* @@ -3093,10 +3100,11 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat, enum node_stat_item idx, int nr) { struct memcg_stock_pcp *stock; + struct obj_cgroup *old = NULL; unsigned long flags; int *bytes; - local_irq_save(flags); + local_lock_irqsave(&memcg_stock.stock_lock, flags); stock = this_cpu_ptr(&memcg_stock); /* @@ -3105,7 +3113,7 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat, * changes. */ if (stock->cached_objcg != objcg) { - drain_obj_stock(stock); + old = drain_obj_stock(stock); obj_cgroup_get(objcg); stock->nr_bytes = atomic_read(&objcg->nr_charged_bytes) ? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0; @@ -3149,7 +3157,9 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat, if (nr) mod_objcg_mlstate(objcg, pgdat, idx, nr); - local_irq_restore(flags); + local_unlock_irqrestore(&memcg_stock.stock_lock, flags); + if (old) + obj_cgroup_put(old); } static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes) @@ -3158,7 +3168,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes) unsigned long flags; bool ret = false; - local_irq_save(flags); + local_lock_irqsave(&memcg_stock.stock_lock, flags); stock = this_cpu_ptr(&memcg_stock); if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) { @@ -3166,17 +3176,17 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes) ret = true; } - local_irq_restore(flags); + local_unlock_irqrestore(&memcg_stock.stock_lock, flags); return ret; } -static void drain_obj_stock(struct memcg_stock_pcp *stock) +static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock) { struct obj_cgroup *old = stock->cached_objcg; if (!old) - return; + return NULL; if (stock->nr_bytes) { unsigned int nr_pages = stock->nr_bytes >> PAGE_SHIFT; @@ -3226,8 +3236,8 @@ static void drain_obj_stock(struct memcg_stock_pcp *stock) stock->cached_pgdat = NULL; } - obj_cgroup_put(old); stock->cached_objcg = NULL; + return old; } static bool obj_stock_flush_required(struct memcg_stock_pcp *stock, @@ -3248,14 +3258,15 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes, bool allow_uncharge) { struct memcg_stock_pcp *stock; + struct obj_cgroup *old = NULL; unsigned long flags; unsigned int nr_pages = 0; - local_irq_save(flags); + local_lock_irqsave(&memcg_stock.stock_lock, flags); stock = this_cpu_ptr(&memcg_stock); if (stock->cached_objcg != objcg) { /* reset if necessary */ - drain_obj_stock(stock); + old = drain_obj_stock(stock); obj_cgroup_get(objcg); stock->cached_objcg = objcg; stock->nr_bytes = atomic_read(&objcg->nr_charged_bytes) @@ -3269,7 +3280,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes, stock->nr_bytes &= (PAGE_SIZE - 1); } - local_irq_restore(flags); + local_unlock_irqrestore(&memcg_stock.stock_lock, flags); + if (old) + obj_cgroup_put(old); if (nr_pages) obj_cgroup_uncharge_pages(objcg, nr_pages);