From patchwork Tue Apr 4 20:50:13 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shaun Tancheff
X-Patchwork-Id: 13200985
From: Shaun Tancheff
To: Johannes Weiner, Michal Hocko, Vladimir Davydov
Cc: Shaun Tancheff, Andrew Morton, cgroups@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH] memcg-v1: Enable setting memory min, low, high
Date: Wed, 5 Apr 2023 03:50:13 +0700
Message-Id: <20230404205013.31520-1-shaun.tancheff@gmail.com>
X-Mailer: git-send-email 2.34.1

From: Shaun Tancheff

For users who are unable to update to memcg-v2, this provides a way for
memcg-v1 to apply enough memory pressure to throttle filesystem I/O, or
otherwise to reduce the risk of being memcg OOM
killed at the expense of reduced performance.

This patch extends the memcg-v1 legacy cgroup control files with:
    limit_in_bytes.min, limit_in_bytes.low and limit_in_bytes.high

Since old software will need to be updated to take advantage of the new
files, a secondary method of setting min, low and high based on a
percentage of the limit is also provided. The percentages are determined
by module parameters.

The available module parameters can be set at kernel boot time. For
example:
    memcontrol.memcg_min=10 memcontrol.memcg_low=30 memcontrol.memcg_high=80

would set min to 10%, low to 30% and high to 80% of the value written to:
    /sys/fs/cgroup/memory//memory.limit_in_bytes

Signed-off-by: Shaun Tancheff
---
 mm/memcontrol.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 83 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5abffe6f8389..eec6e6ed92f8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -73,6 +73,18 @@
 
 #include
 
+static unsigned int memcg_v1_min_default_percent;
+module_param_named(memcg_min, memcg_v1_min_default_percent, uint, 0600);
+MODULE_PARM_DESC(memcg_min, "memcg v1 min default percent");
+
+static unsigned int memcg_v1_low_default_percent;
+module_param_named(memcg_low, memcg_v1_low_default_percent, uint, 0600);
+MODULE_PARM_DESC(memcg_low, "memcg v1 low default percent");
+
+static unsigned int memcg_v1_high_default_percent;
+module_param_named(memcg_high, memcg_v1_high_default_percent, uint, 0600);
+MODULE_PARM_DESC(memcg_high, "memcg v1 high default percent");
+
 struct cgroup_subsys memory_cgrp_subsys __read_mostly;
 EXPORT_SYMBOL(memory_cgrp_subsys);
 
@@ -208,6 +220,7 @@ enum res_type {
 	_MEMSWAP,
 	_KMEM,
 	_TCP,
+	_MEM_V1,
 };
 
 #define MEMFILE_PRIVATE(x, val)	((x) << 16 | (val))
@@ -3689,6 +3702,9 @@ enum {
 	RES_MAX_USAGE,
 	RES_FAILCNT,
 	RES_SOFT_LIMIT,
+	RES_LIMIT_MIN,
+	RES_LIMIT_LOW,
+	RES_LIMIT_HIGH,
 };
 
 static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
@@ -3699,6 +3715,7 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 
 	switch (MEMFILE_TYPE(cft->private)) {
 	case _MEM:
+	case _MEM_V1:
 		counter = &memcg->memory;
 		break;
 	case _MEMSWAP:
@@ -3729,6 +3746,12 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 		return counter->failcnt;
 	case RES_SOFT_LIMIT:
 		return (u64)memcg->soft_limit * PAGE_SIZE;
+	case RES_LIMIT_MIN:
+		return (u64)READ_ONCE(memcg->memory.min);
+	case RES_LIMIT_LOW:
+		return (u64)READ_ONCE(memcg->memory.low);
+	case RES_LIMIT_HIGH:
+		return (u64)READ_ONCE(memcg->memory.high);
 	default:
 		BUG();
 	}
@@ -3828,6 +3851,35 @@ static int memcg_update_tcp_max(struct mem_cgroup *memcg, unsigned long max)
 	return ret;
 }
 
+static inline void mem_cgroup_v1_set_defaults(struct mem_cgroup *memcg,
+					      u64 nr_pages)
+{
+	u64 max = (u64)(PAGE_COUNTER_MAX * PAGE_SIZE) / PAGE_SIZE;
+	u64 min, low, high;
+
+	if (mem_cgroup_is_root(memcg) || max == nr_pages)
+		return;
+
+	min = READ_ONCE(memcg->memory.min);
+	low = READ_ONCE(memcg->memory.low);
+	if (min || low)
+		return;
+
+	if (!min && memcg_v1_min_default_percent) {
+		min = (nr_pages * memcg_v1_min_default_percent) / 100;
+		page_counter_set_min(&memcg->memory, min);
+	}
+	if (!low && memcg_v1_low_default_percent) {
+		low = (nr_pages * memcg_v1_low_default_percent) / 100;
+		page_counter_set_low(&memcg->memory, low);
+	}
+	high = READ_ONCE(memcg->memory.high);
+	if (high == PAGE_COUNTER_MAX && memcg_v1_high_default_percent) {
+		high = (nr_pages * memcg_v1_high_default_percent) / 100;
+		page_counter_set_high(&memcg->memory, high);
+	}
+}
+
 /*
  * The user of this function is...
  * RES_LIMIT.
@@ -3851,6 +3903,11 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 			break;
 		}
 		switch (MEMFILE_TYPE(of_cft(of)->private)) {
+		case _MEM_V1:
+			ret = mem_cgroup_resize_max(memcg, nr_pages, false);
+			if (!ret)
+				mem_cgroup_v1_set_defaults(memcg, nr_pages);
+			break;
 		case _MEM:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, false);
 			break;
@@ -4999,6 +5056,13 @@ static int mem_cgroup_slab_show(struct seq_file *m, void *p)
 }
 #endif
 
+static ssize_t memory_min_write(struct kernfs_open_file *of,
+				char *buf, size_t nbytes, loff_t off);
+static ssize_t memory_low_write(struct kernfs_open_file *of,
+				char *buf, size_t nbytes, loff_t off);
+static ssize_t memory_high_write(struct kernfs_open_file *of,
+				 char *buf, size_t nbytes, loff_t off);
+
 static struct cftype mem_cgroup_legacy_files[] = {
 	{
 		.name = "usage_in_bytes",
@@ -5013,10 +5077,28 @@ static struct cftype mem_cgroup_legacy_files[] = {
 	},
 	{
 		.name = "limit_in_bytes",
-		.private = MEMFILE_PRIVATE(_MEM, RES_LIMIT),
+		.private = MEMFILE_PRIVATE(_MEM_V1, RES_LIMIT),
 		.write = mem_cgroup_write,
 		.read_u64 = mem_cgroup_read_u64,
 	},
+	{
+		.name = "limit_in_bytes.min",
+		.private = MEMFILE_PRIVATE(_MEM_V1, RES_LIMIT_MIN),
+		.write = memory_min_write,
+		.read_u64 = mem_cgroup_read_u64,
+	},
+	{
+		.name = "limit_in_bytes.low",
+		.private = MEMFILE_PRIVATE(_MEM_V1, RES_LIMIT_LOW),
+		.write = memory_low_write,
+		.read_u64 = mem_cgroup_read_u64,
+	},
+	{
+		.name = "limit_in_bytes.high",
+		.private = MEMFILE_PRIVATE(_MEM_V1, RES_LIMIT_HIGH),
+		.write = memory_high_write,
+		.read_u64 = mem_cgroup_read_u64,
+	},
 	{
 		.name = "soft_limit_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_SOFT_LIMIT),
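
As a rough illustration of the percentage arithmetic performed by
mem_cgroup_v1_set_defaults() in the diff above, here is a minimal
userspace sketch. It is not kernel code. The names v1_defaults,
percent_of and v1_defaults_from_limit, and the 4096-byte
SKETCH_PAGE_SIZE, are assumptions made for this example only. The sketch
models just the integer arithmetic; it does not model the root-cgroup
check, the "already configured" early return, or the rule that a zero
percentage leaves the existing value untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical container for the three derived values, in pages. */
struct v1_defaults {
	uint64_t min_pages;
	uint64_t low_pages;
	uint64_t high_pages;
};

/* Integer percentage of a page count, rounding down as the patch does. */
static uint64_t percent_of(uint64_t nr_pages, unsigned int percent)
{
	return (nr_pages * percent) / 100;
}

/*
 * Derive min/low/high page counts from a byte limit and three
 * percentages, mirroring the arithmetic applied when a value is
 * written to memory.limit_in_bytes.
 */
static struct v1_defaults v1_defaults_from_limit(uint64_t limit_bytes,
						 unsigned int min_pct,
						 unsigned int low_pct,
						 unsigned int high_pct)
{
	uint64_t nr_pages = limit_bytes / SKETCH_PAGE_SIZE;
	struct v1_defaults d;

	d.min_pages = percent_of(nr_pages, min_pct);
	d.low_pages = percent_of(nr_pages, low_pct);
	d.high_pages = percent_of(nr_pages, high_pct);
	return d;
}
```

With a 1 GiB limit (262144 pages of 4096 bytes) and the example
percentages 10, 30 and 80, the sketch yields 26214, 78643 and 209715
pages for min, low and high respectively.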