From: Shakeel Butt
To: Andrew Morton, Johannes Weiner
Cc: Michal Hocko, Roman Gushchin, Yosry Ahmed, Jesper Dangaard Brouer,
 Yu Zhao, Muchun Song, Facebook Kernel Team, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH] memcg: use ratelimited stats flush in the reclaim
Date: Sat, 15 Jun 2024 01:12:57 -0700
Message-ID: <20240615081257.3945587-1-shakeel.butt@linux.dev>

Meta production is seeing a large number of stalls in the memcg stats
flush from the memcg reclaim code path. At the moment, this specific
callsite does a synchronous memcg stats flush. The rstat flush is an
expensive and time-consuming operation, so concurrent reclaimers will
busy-wait on the lock, potentially for a long time. This issue is not
unique to Meta and has been observed by Cloudflare [1] as well. In the
Cloudflare case, the stalls were due to contention between kswapd
threads running on their 8 NUMA node machines, which does not make
sense: the rstat flush is global, and a flush from one kswapd thread
should be sufficient for all.

Simply replace the synchronous flush with the ratelimited one. One may
raise a concern about potentially using stats that are up to 2 seconds
stale (at worst) for heuristics like the desirable inactive:active
ratio and preferring inactive file pages over anon pages. However,
these specific heuristics do not require very precise stats and are
also ignored under severe memory pressure.

This patch has been running on the Meta fleet for more than a month
and we have not observed any issues. Please note that MGLRU is not
impacted by this issue at all, as it avoids rstat flushing completely.

Link: https://lore.kernel.org/all/6ee2518b-81dd-4082-bdf5-322883895ffc@kernel.org [1]
Signed-off-by: Shakeel Butt
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c0429fd6c573..bda4f92eba71 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2263,7 +2263,7 @@ static void prepare_scan_control(pg_data_t *pgdat, struct scan_control *sc)
 	 * Flush the memory cgroup stats, so that we read accurate per-memcg
 	 * lruvec stats for heuristics.
 	 */
-	mem_cgroup_flush_stats(sc->target_mem_cgroup);
+	mem_cgroup_flush_stats_ratelimited(sc->target_mem_cgroup);
 
 	/*
 	 * Determine the scan balance between anon and file LRUs.
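
Illustration only, not part of the patch: the ratelimiting idea in the
commit message can be sketched in userspace C. This is not the kernel's
implementation. The names stats_flusher, flush_sync, flush_ratelimited
and simulate_reclaimers are hypothetical, and the 2000 ms threshold is
an assumption taken from the "2 sec stale (at worst)" figure above. The
sketch only shows why a burst of concurrent callers ends up doing one
expensive flush instead of one each.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed staleness bound in milliseconds (hypothetical value). */
#define MAX_STALENESS_MS 2000u

/* Hypothetical bookkeeping for a shared, expensive stats flush. */
struct stats_flusher {
	uint64_t last_flush_ms;	/* time of the most recent flush */
	unsigned int flush_count;	/* expensive flushes actually run */
};

/* Stand-in for the expensive synchronous flush. */
static void flush_sync(struct stats_flusher *f, uint64_t now_ms)
{
	assert(f != NULL);
	f->flush_count++;
	f->last_flush_ms = now_ms;
}

/*
 * Flush only if the previous flush is older than MAX_STALENESS_MS.
 * Assumes now_ms comes from a monotonic clock (never goes backwards).
 * Returns true if a flush was performed.
 */
static bool flush_ratelimited(struct stats_flusher *f, uint64_t now_ms)
{
	assert(f != NULL);
	if (now_ms - f->last_flush_ms <= MAX_STALENESS_MS)
		return false;	/* recent enough, reuse the existing stats */
	flush_sync(f, now_ms);
	return true;
}

/*
 * Simulate two bursts of 8 reclaimers, 3 seconds apart, and return
 * how many expensive flushes ran in total.
 */
static unsigned int simulate_reclaimers(void)
{
	struct stats_flusher f = { .last_flush_ms = 0, .flush_count = 0 };
	const uint64_t bursts[] = { 5000, 8000 };

	for (size_t b = 0; b < sizeof(bursts) / sizeof(bursts[0]); b++)
		for (uint64_t i = 0; i < 8; i++)
			flush_ratelimited(&f, bursts[b] + i);

	return f.flush_count;
}
```

In this sketch only the first caller of each burst pays for the flush;
the other seven return immediately and read stats that are at most
MAX_STALENESS_MS old.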