From patchwork Tue Aug 20 09:23:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kairui Song
X-Patchwork-Id: 13769653
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin,
 Shakeel Butt, Nhat Pham, Michal Hocko, Chengming Zhou, Muchun Song,
 Chris Li, Yosry Ahmed, "Huang, Ying", linux-kernel@vger.kernel.org,
 Kairui Song
Subject: [PATCH] mm/swap, workingset: make anon shadow nodes memcg aware
Date: Tue, 20 Aug 2024 17:23:59 +0800
Message-ID: <20240820092359.97782-1-ryncsn@gmail.com>
X-Mailer: git-send-email 2.45.2
Reply-To: Kairui Song

From: Kairui Song

Currently, the workingset (shadow) nodes of the swap cache are not
accounted to their corresponding memory cgroup; instead, they are all
accounted to the root
cgroup. This leads to inaccurate accounting and ineffective reclaim:
one cgroup can swap out a large amount of memory and consume a large
amount of memory in shadow nodes without being charged for it.

This issue is similar to the one fixed by commit 7b785645e8f1 ("mm: fix
page cache convergence regression"), where page cache shadow nodes were
incorrectly accounted. That was due to the accidental dropping of the
accounting flag during the XArray conversion in commit a28334862993
("page cache: Finish XArray conversion").

However, the problem fixed here has a different cause. Swap cache shadow
nodes were never accounted, even before the XArray conversion, since
they did not exist until commit 3852f6768ede ("mm/swapcache: support to
handle the shadow entries"), which came years after the XArray
conversion.

It's worth noting that one anon shadow XArray node may contain entries
from different cgroups, and it gets accounted at reclaim time, so it's
arguable which cgroup it should be accounted to (as Shakeel Butt pointed
out [1]). File pages may suffer from a similar issue, though less
commonly. Things like proactive memory reclaim could make things more
complex.

So this commit still can't provide 100% accurate accounting of anon
shadows, but it covers the cases where one memory cgroup uses a
significant amount of swap, and in most cases memory pressure in one
cgroup is only supposed to reclaim from that cgroup and its children.
Besides, this fix is clean and simple.

Link: https://lore.kernel.org/all/7gzevefivueqtebzvikzbucnrnpurmh3scmfuiuo2tnrs37xso@haj7gzepjur2/ [1]
Signed-off-by: Kairui Song
---

This patch was part of a previous series:
https://lore.kernel.org/all/20240624175313.47329-1-ryncsn@gmail.com/

Split out as a fix as suggested by Muchun and Shakeel.
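
For illustration only, the accounting difference described in the commit
message can be sketched as a toy userspace model. This is not kernel
code: toy_xarray, toy_alloc_node, toy_charged and the cgroup ids are all
hypothetical names, and the sketch assumes that a node is charged once,
to whichever cgroup the allocation is made for, when the accounting flag
is set, and to the root cgroup otherwise.

```c
#include <assert.h>

/*
 * Toy userspace model, not kernel code. Every identifier here is
 * hypothetical and exists only to illustrate the accounting change.
 */
enum { TOY_ROOT_CG = 0, TOY_NR_CG = 4 };

struct toy_xarray {
	int account;	/* stands in for XA_FLAGS_ACCOUNT being set */
};

/* Number of shadow nodes charged to each cgroup id. */
static long toy_charged[TOY_NR_CG];

/* Forget all charges so a scenario starts from a clean state. */
static void toy_reset(void)
{
	for (int i = 0; i < TOY_NR_CG; i++)
		toy_charged[i] = 0;
}

/*
 * Allocate one node on behalf of cgroup @cg. The whole node is charged
 * once, at allocation time, even if entries belonging to other cgroups
 * are stored in it later (assumption of this model).
 */
static void toy_alloc_node(const struct toy_xarray *xa, int cg)
{
	assert(cg >= 0 && cg < TOY_NR_CG);
	toy_charged[xa->account ? cg : TOY_ROOT_CG]++;
}
```

In this model, a node allocated for cgroup 2 through an array without
the flag lands in toy_charged[TOY_ROOT_CG], while the same allocation
through an array with the flag lands in toy_charged[2]. The model also
shows why the result is only approximate: a node shared by several
cgroups is still charged to just one of them.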

 mm/swap_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 4669f29cf555..b4ed2c664c67 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -97,6 +97,7 @@ int add_to_swap_cache(struct folio *folio, swp_entry_t entry,
 	void *old;
 
 	xas_set_update(&xas, workingset_update_node);
+	xas_set_lru(&xas, &shadow_nodes);
 
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
@@ -718,7 +719,7 @@ int init_swap_address_space(unsigned int type, unsigned long nr_pages)
 		return -ENOMEM;
 	for (i = 0; i < nr; i++) {
 		space = spaces + i;
-		xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ);
+		xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ | XA_FLAGS_ACCOUNT);
 		atomic_set(&space->i_mmap_writable, 0);
 		space->a_ops = &swap_aops;
 		/* swap cache doesn't use writeback related tags */