From patchwork Fri Sep 20 22:11:47 2024
X-Patchwork-Submitter: Kaiyang Zhao
X-Patchwork-Id: 13808710
From: kaiyang2@cs.cmu.edu
To: linux-mm@kvack.org, cgroups@vger.kernel.org
Cc: roman.gushchin@linux.dev, shakeel.butt@linux.dev, muchun.song@linux.dev,
 akpm@linux-foundation.org, mhocko@kernel.org, nehagholkar@meta.com,
 abhishekd@meta.com, hannes@cmpxchg.org, weixugc@google.com,
 rientjes@google.com, Kaiyang Zhao
Subject: [RFC PATCH 0/4] memory tiering fairness by per-cgroup control of
 promotion and demotion
Date: Fri, 20 Sep 2024 22:11:47 +0000
Message-ID: <20240920221202.1734227-1-kaiyang2@cs.cmu.edu>

From: Kaiyang Zhao

Currently in Linux, there is no concept of fairness in memory tiering.
Depending on the memory usage and access patterns of other colocated
applications, an application cannot be sure of how much memory in which
tier it will get, and how much its performance will suffer or benefit.

Fairness is, however, important in a multi-tenant system. For example,
an application may need to meet a certain tail latency requirement,
which can be difficult to satisfy without x amount of frequently
accessed pages in top-tier memory. Similarly, an application may want to
declare a minimum throughput when running on a system for capacity
planning purposes, but without fairness controls in memory tiering its
throughput can fluctuate wildly as other applications come and go on the
system.

In this proposal, we amend the memory.low control in memcg to protect a
cgroup’s memory usage in top-tier memory. A low protection for top-tier
memory is scaled proportionally to the ratio of top-tier memory to total
memory on the system. The protection is then applied to reclaim for
top-tier memory. Promotion by NUMA balancing is also throttled through a
reduced scanning window when top-tier memory is contended and the cgroup
is over its protection.

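As a rough illustration of the scaling described above, here is a
standalone user-space sketch. It is not code from the patches: every
function name is hypothetical, and the rule used to shrink the scan
window (scaling it by protection/usage) is an assumption made purely for
illustration, since this cover letter does not specify it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative user-space sketch only; none of these helpers exist in
 * the kernel or in this series. All quantities are in pages.
 */

/* Scale memory.low by the share of system memory that is top-tier. */
static uint64_t toptier_low_pages(uint64_t memory_low_pages,
				  uint64_t toptier_pages,
				  uint64_t total_pages)
{
	assert(toptier_pages <= total_pages);
	if (total_pages == 0)
		return 0;
	/*
	 * Assumes both factors are below 2^32 pages (16 TiB at 4 KiB per
	 * page), so the product cannot overflow 64 bits.
	 */
	return memory_low_pages * toptier_pages / total_pages;
}

/* A cgroup is "over its protection" when it uses more than the scaled low. */
static bool over_toptier_protection(uint64_t toptier_usage_pages,
				    uint64_t protection_pages)
{
	return toptier_usage_pages > protection_pages;
}

/*
 * Hypothetical throttle: leave the scan window alone unless top-tier
 * memory is contended and the cgroup is over its protection; otherwise
 * shrink it in proportion to how far over the cgroup is.
 */
static uint64_t throttled_scan_pages(uint64_t scan_pages,
				     uint64_t toptier_usage_pages,
				     uint64_t protection_pages,
				     bool toptier_contended)
{
	if (!toptier_contended ||
	    !over_toptier_protection(toptier_usage_pages, protection_pages))
		return scan_pages;
	/* usage > protection >= 0 here, so the divisor is non-zero. */
	return scan_pages * protection_pages / toptier_usage_pages;
}
```

For example, with memory.low set to 8 GiB (2097152 pages) on a machine
with 64 GiB of top-tier memory out of 256 GiB total,
toptier_low_pages(2097152, 16777216, 67108864) gives 524288 pages, which
is 2 GiB of top-tier protection.
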
Experiments we did with microbenchmarks exhibiting a range of memory
access patterns and memory sizes confirmed that when top-tier memory is
contended, the system moves towards a stable memory distribution where
each cgroup’s memory usage in local DRAM converges to its protected
amount.

One notable missing part in the patches is determining which NUMA nodes
have top-tier memory; currently they hardcode node 0 as top-tier memory
and node 1 as a CPU-less node backed by CXL memory. We’re working on
removing this artifact and applying the protection correctly to the
top-tier nodes in the system.

Your feedback is greatly appreciated!

Kaiyang Zhao (4):
  Add get_cgroup_local_usage for estimating the top-tier memory usage
  calculate memory.low for the local node and track its usage
  use memory.low local node protection for local node reclaim
  reduce NUMA balancing scan size of cgroups over their local memory.low

 include/linux/memcontrol.h   | 25 ++++++++-----
 include/linux/page_counter.h | 16 ++++++---
 kernel/sched/fair.c          | 54 +++++++++++++++++++++++++---
 mm/hugetlb_cgroup.c          |  4 +--
 mm/memcontrol.c              | 68 ++++++++++++++++++++++++++++++------
 mm/page_counter.c            | 52 +++++++++++++++++++++------
 mm/vmscan.c                  | 19 +++++++---
 7 files changed, 192 insertions(+), 46 deletions(-)