From patchwork Wed Mar 19 06:41:47 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: jingxiang zeng
X-Patchwork-Id: 14022138
From: Jingxiang Zeng
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev, shakeel.butt@linux.dev, muchun.song@linux.dev, kasong@tencent.com, Zeng Jingxiang
Subject: [RFC 4/5] mm/memcontrol: allow memsw account in cgroup v2
Date: Wed, 19 Mar 2025 14:41:47 +0800
Message-ID: <20250319064148.774406-5-jingxiangzeng.cas@gmail.com>
X-Mailer: git-send-email 2.41.1
In-Reply-To: <20250319064148.774406-1-jingxiangzeng.cas@gmail.com>
References: <20250319064148.774406-1-jingxiangzeng.cas@gmail.com>
Reply-To: Jingxiang Zeng

From: Zeng Jingxiang

memsw accounting is a very useful knob for overcommitting container
memory. It is a good abstraction of a container's "expected total
memory usage": a container cannot allocate extra memory by pushing
pages to swap, yet its pages can still be swapped out.

As a simple example, with memsw.limit == memory.limit a container
cannot exceed its original memory limit even with swap enabled. It is
OOM killed just as before, but the host can now offload its cold pages.

A similar ability seems to be absent in v2. With memory.swap.max == 0,
the host cannot use swap to reclaim container memory at all. With any
larger value, containers can overuse memory, which delays OOM kills,
causes thrashing, and can push the CPU/memory usage ratio heavily out
of balance, especially with compressed swap backends.

When the MEMSW_ACCOUNT_ON_DFL config option or the
cgroup.memsw_account_on_dfl boot parameter is enabled, this patch makes
memory.swap.max behave like memory.memsw.limit_in_bytes and
memory.swap.current behave like memory.memsw.usage_in_bytes.
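To make the difference between the two accounting modes concrete, here
is a toy userspace model. It is only a sketch, not kernel code: the
struct toy_cgroup type and every toy_* function are hypothetical names
made up for illustration, and reclaim is reduced to "swap out one page
when RAM is full". In memsw mode swap_max caps RAM plus swap together;
in the existing v2 mode it caps swap alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of one cgroup; all sizes are in pages. */
struct toy_cgroup {
	unsigned long mem;	/* pages resident in RAM */
	unsigned long swap;	/* pages currently swapped out */
	unsigned long mem_max;	/* RAM limit (memory.max) */
	unsigned long swap_max;	/* memsw: RAM + swap limit; v2: swap-only limit */
	bool memsw;		/* true selects memsw-style accounting */
};

/* Move one page from RAM to swap; false if nothing can be swapped out. */
static bool toy_swap_out(struct toy_cgroup *cg)
{
	if (cg->mem == 0)
		return false;
	/* v2 mode: the swap-only counter must stay below its limit. */
	if (!cg->memsw && cg->swap >= cg->swap_max)
		return false;
	/* memsw mode: mem + swap is unchanged, so the limit never blocks this. */
	cg->mem--;
	cg->swap++;
	return true;
}

/* Charge one new page; false marks the point where the model would OOM. */
static bool toy_charge(struct toy_cgroup *cg)
{
	/* memsw mode: the combined limit is hit and swapping cannot help. */
	if (cg->memsw && cg->mem + cg->swap >= cg->swap_max)
		return false;
	/* RAM is full: make room by swapping one page out, if allowed. */
	if (cg->mem >= cg->mem_max && !toy_swap_out(cg))
		return false;
	cg->mem++;
	assert(cg->mem <= cg->mem_max);
	return true;
}

/* Charge up to 'tries' pages; returns how many charges succeeded. */
static unsigned long toy_fill(struct toy_cgroup *cg, unsigned long tries)
{
	unsigned long done = 0;

	while (done < tries && toy_charge(cg))
		done++;
	return done;
}

/* Host-driven reclaim: swap out up to 'nr' pages; returns pages moved. */
static unsigned long toy_host_reclaim(struct toy_cgroup *cg, unsigned long nr)
{
	unsigned long done = 0;

	while (done < nr && toy_swap_out(cg))
		done++;
	return done;
}
```

With mem_max = 100, the memsw variant with swap_max = 100 stops at 100
charged pages, and the host can still swap 30 of them out without the
container gaining any room. The v2 variant with swap_max = 0 also stops
at 100 pages but the host cannot swap anything out, and with
swap_max = 50 the container reaches 150 pages before the model gives up.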

Signed-off-by: Zeng Jingxiang
---
 mm/memcontrol-v1.c |  2 +-
 mm/memcontrol-v1.h |  4 +++-
 mm/memcontrol.c    | 29 +++++++++++++++++++----------
 3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/mm/memcontrol-v1.c b/mm/memcontrol-v1.c
index c1feb3945350..3344d5e25822 100644
--- a/mm/memcontrol-v1.c
+++ b/mm/memcontrol-v1.c
@@ -1436,7 +1436,7 @@ void memcg1_oom_finish(struct mem_cgroup *memcg, bool locked)
 
 static DEFINE_MUTEX(memcg_max_mutex);
 
-static int mem_cgroup_resize_max(struct mem_cgroup *memcg,
+int mem_cgroup_resize_max(struct mem_cgroup *memcg,
 				 unsigned long max, bool memsw)
 {
 	bool enlarge = false;
diff --git a/mm/memcontrol-v1.h b/mm/memcontrol-v1.h
index 6358464bb416..7f7ef9f6d03e 100644
--- a/mm/memcontrol-v1.h
+++ b/mm/memcontrol-v1.h
@@ -36,10 +36,12 @@ struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg);
 /* Cgroup v1-specific declarations */
 #ifdef CONFIG_MEMCG_V1
 
+int mem_cgroup_resize_max(struct mem_cgroup *memcg,
+			  unsigned long max, bool memsw);
 /* Whether legacy memory+swap accounting is active */
 static inline bool do_memsw_account(void)
 {
-	return !cgroup_subsys_on_dfl(memory_cgrp_subsys);
+	return !cgroup_subsys_on_dfl(memory_cgrp_subsys) || do_memsw_account_on_dfl();
 }
 
 unsigned long memcg_events_local(struct mem_cgroup *memcg, int event);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 623ebf610946..d85699fa8a90 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5205,9 +5205,12 @@ static ssize_t swap_max_write(struct kernfs_open_file *of,
 	if (err)
 		return err;
 
-	xchg(&memcg->swap.max, max);
+	if (do_memsw_account_on_dfl())
+		err = mem_cgroup_resize_max(memcg, max, true);
+	else
+		xchg(&memcg->swap.max, max);
 
-	return nbytes;
+	return err ?: nbytes;
 }
 
 static int swap_events_show(struct seq_file *m, void *v)
@@ -5224,24 +5227,28 @@ static int swap_events_show(struct seq_file *m, void *v)
 	return 0;
 }
 
-static struct cftype swap_files[] = {
+static struct cftype swap_files_v1[] = {
 	{
 		.name = "swap.current",
 		.flags = CFTYPE_NOT_ON_ROOT,
 		.read_u64 = swap_current_read,
 	},
-	{
-		.name = "swap.high",
-		.flags = CFTYPE_NOT_ON_ROOT,
-		.seq_show = swap_high_show,
-		.write = swap_high_write,
-	},
 	{
 		.name = "swap.max",
 		.flags = CFTYPE_NOT_ON_ROOT,
 		.seq_show = swap_max_show,
 		.write = swap_max_write,
 	},
+	{ }	/* terminate */
+};
+
+static struct cftype swap_files[] = {
+	{
+		.name = "swap.high",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = swap_high_show,
+		.write = swap_high_write,
+	},
 	{
 		.name = "swap.max.effective",
 		.flags = CFTYPE_NOT_ON_ROOT,
@@ -5473,7 +5480,9 @@ static int __init mem_cgroup_swap_init(void)
 	if (mem_cgroup_disabled())
 		return 0;
 
-	WARN_ON(cgroup_add_dfl_cftypes(&memory_cgrp_subsys, swap_files));
+	WARN_ON(cgroup_add_dfl_cftypes(&memory_cgrp_subsys, swap_files_v1));
+	if (!do_memsw_account_on_dfl())
+		WARN_ON(cgroup_add_dfl_cftypes(&memory_cgrp_subsys, swap_files));
 #ifdef CONFIG_MEMCG_V1
 	WARN_ON(cgroup_add_legacy_cftypes(&memory_cgrp_subsys, memsw_files));
 #endif