From: SeongJae Park
Cc: SeongJae Park, Andrew Morton, Jonathan Corbet, damon@lists.linux.dev,
    kernel-team@meta.com, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v2 0/8] mm/damon: auto-tune aggregation interval
Date: Fri, 28 Feb 2025 14:03:20 -0800
Message-Id: <20250228220328.49438-1-sj@kernel.org>
X-Patchwork-Submitter: SeongJae Park
X-Patchwork-Id: 13997205

DAMON requires time-consuming and repetitive aggregation interval tuning.
Introduce a feature for automating it using a feedback loop that aims at a
target amount of observed access events, like auto-exposing cameras.

Background: Access Frequency Monitoring and Aggregation Interval
================================================================

DAMON checks if each memory element (damon_region) is accessed or not for
every user-specified time interval called 'sampling interval'.  It
aggregates the check results on a per-element counter called
'nr_accesses'.  DAMON users can read the counters to get the access
temperature of a given element.  The counters are reset once per another
user-specified time interval called 'aggregation interval'.

This can be illustrated as DAMON continuously capturing snapshots of the
access events that happened and were captured within the last aggregation
interval.  This implies the aggregation interval plays a key role for the
quality of the snapshots, like the camera exposure time.  If it is too
short, the amount of access events that happened and were captured for
each snapshot is small, so each snapshot will show not many interesting
things but just a cold and dark world with hopefully one pale blue dot or
two.  If it is too long, too many events are aggregated in a single shot,
so each snapshot will look like a world of flames, or Muspellheim.  It
will be difficult to find practical insights in both cases.

Problem: Time Consuming and Repetitive Tuning
=============================================

The appropriate length of the aggregation interval depends on how
frequently the system and workloads are making access events that DAMON
can observe.  Hence, users have to tune the interval with an excessive
amount of tests on the target system and workloads.  If the system or
workloads change, the tuning should be done again.  If the
characteristics of the workloads are dynamic, it becomes more
challenging.  It is therefore time-consuming and repetitive.

The tuning challenge mainly stems from asking the wrong question.  DAMON
is not asking users what quality of monitoring results they want, but how
DAMON should operate for their hidden goal.
To give the right answer, users need to fully understand DAMON's
mechanisms and the characteristics of their workloads.  Users shouldn't
be asked to understand the underlying mechanism.  Understanding the
characteristics of the workloads shouldn't be the role of users but of
DAMON.

Aim-oriented Feedback-driven Auto-Tuning
========================================

Fortunately, the appropriate length of the aggregation interval can be
inferred using a feedback loop.  If the current snapshots are showing not
much interesting information, in other words, if they show only rare
access events, increasing the aggregation interval helps, and vice versa.
We tested this theory on a few real-world workloads, and documented one
of the experiences in an official DAMON monitoring intervals tuning
guideline.  Since it is a simple theory that requires repeated trials, it
can be a good job for machines.

Based on the guideline's theory, we design an automation of aggregation
interval tuning, in a way similar to that of the camera auto-exposure
feature.  It defines the amount of interesting information as the ratio
of the access events that DAMON actually observed to the theoretical
maximum amount of them within each snapshot.  Events are accounted in
byte and sampling attempts granularity.  For example, let's say there is
a region of 'X' bytes size.  DAMON tried access check sampling for the
region 'Y' times in total for a given aggregation.  Among the 'Y'
attempts, 'Z' times it showed positive results.  Then, the theoretical
maximum number of access events for the region is 'X * Y'.  And the
number of access events that DAMON has observed for the region is
'X * Z'.  The amount of the interesting information is
'(X * Z) / (X * Y)'.  Note that each snapshot would have multiple
regions.

Users can set an arbitrary value of the ratio as their target.
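
As a rough illustration of the accounting above, the following Python
sketch computes the ratio for a snapshot that has several regions.  It
assumes the per-region observed and maximum event counts are summed over
the snapshot before dividing.  The 'Region' class and the
'access_events_ratio' function are hypothetical names made up for this
sketch; they are not DAMON kernel code.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical stand-in for one monitored region in a snapshot."""
    size_bytes: int   # 'X': size of the region in bytes
    nr_samples: int   # 'Y': access check sampling attempts
    nr_positive: int  # 'Z': attempts that showed a positive result


def access_events_ratio(regions):
    """Return observed / theoretical-maximum access events of a snapshot.

    Assumption of this sketch: events of all regions are summed before
    dividing, so larger regions weigh more.
    """
    observed = sum(r.size_bytes * r.nr_positive for r in regions)
    maximum = sum(r.size_bytes * r.nr_samples for r in regions)
    return observed / maximum if maximum else 0.0


snapshot = [
    Region(size_bytes=4096, nr_samples=20, nr_positive=10),
    Region(size_bytes=12288, nr_samples=20, nr_positive=0),
]
ratio = access_events_ratio(snapshot)
print(f"{ratio:.3f}")
```

For a single region the ratio reduces to 'Z / Y'.  With several regions,
a large idle region pulls the ratio down more than a small one does.
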
Once the target is set, the automation periodically measures the current
value of the ratio and increases or decreases the aggregation interval if
the ratio value is lower or higher than the target, respectively.  The
amount of the change is proportional to the distance between the current
and the target values.

To prevent the auto-tuning from going too far, let users set the minimum
and the maximum aggregation interval times.  Changing only the
aggregation interval while keeping the sampling interval fixed makes the
maximum level of access frequency in each snapshot, or the discernment of
regions, inconsistent.  Also, an unnecessarily short sampling interval
causes meaningless monitoring overhead.  The automation therefore adjusts
the sampling interval together with the aggregation interval, while
keeping the ratio between the two intervals.  Users can set the ratio, or
the discernment.

Discussion
==========

The modified question (aimed amount of access events, or lights, in each
snapshot) is easy to answer for both the users and the kernel.  If users
are interested in finding more cold regions, the value should be lower,
and vice versa.  If users have no idea, the kernel can suggest a fair
default value based on some theories and experiments.  For example, based
on the Pareto principle (80/20 rule), we could expect a 20% target ratio
will capture 80% of real access events.  Since 80% might be too high,
applying the rule once again, 4% (20% * 20%) may capture about 64%
(80% * 80%) of real access events.

The sampling to aggregation intervals ratio and the min/max aggregation
intervals are also arguably easy to answer.  What users want is
discernment of regions for efficient system operation, for example, X
amount of colder regions or Y amount of warmer regions, not exactly how
many times each cache line is accessed at nanosecond granularity.  The
appropriate min/max aggregation intervals can be set relatively naively,
and may be better set based on the aimed monitoring overhead.
Since the sampling interval directly decides the overhead, setting the
range based on the sampling interval can be easy.  From my experience,
I'd argue an intervals ratio of 0.05, and a 5 milliseconds to 20 seconds
sampling interval range (100 milliseconds to 400 seconds aggregation
interval) can be a good default suggestion.

Evaluation
==========

We confirmed the tuning works as expected with a few simple workloads
including kernel builds and an in-memory caching representative
benchmark[1].  We will conduct more evaluations with more workloads and
share the results with more details by the time that we drop the RFC tag.

Changelog
=========

Changes from RFC v1
(https://lore.kernel.org/20250213014438.145611-1-sj@kernel.org)
- Replace the target metric from positive samples ratio to DAMON-observed
  access samples ratio
- Fix wrong max events accounting bug
- Fix double-increase of next_aggregation_sis

SeongJae Park (8):
  mm/damon: add data structure for monitoring intervals auto-tuning
  mm/damon/core: implement intervals auto-tuning
  mm/damon/sysfs: implement intervals tuning goal directory
  mm/damon/sysfs: commit intervals tuning goal
  mm/damon/sysfs: implement a command to update auto-tuned monitoring
    intervals
  Docs/mm/damon/design: document for intervals auto-tuning
  Docs/ABI/damon: document intervals auto-tuning ABI
  Docs/admin-guide/mm/damon/usage: add intervals_goal directory on the
    hierarchy

 .../ABI/testing/sysfs-kernel-mm-damon         |  30 +++
 Documentation/admin-guide/mm/damon/usage.rst  |  25 ++
 Documentation/mm/damon/design.rst             |  50 ++++
 include/linux/damon.h                         |  43 ++++
 mm/damon/core.c                               |  98 ++++++++
 mm/damon/sysfs.c                              | 216 ++++++++++++++++++
 6 files changed, 462 insertions(+)

base-commit: 9e7d9145ab8ce407acc540fc29133c471bc29046