[v17,03/15] mm/damon: Implement region based sampling

Message ID	20200706115322.29598-4-sjpark@amazon.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=B4xR=AR=kvack.org=owner-linux-mm@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6476820773 IronPort-SDR: 8HCQVYYZXo+jvqstnOoX5oqGfQUrtp79WEx5JTSKWgNbzwcN4qPQLuLZ+WvVGifwKMcntivz/M sS/c6HjoWRrQ== From: SeongJae Park <sjpark@amazon.com> To: <akpm@linux-foundation.org> CC: SeongJae Park <sjpark@amazon.de>, <Jonathan.Cameron@Huawei.com>, <aarcange@redhat.com>, <acme@kernel.org>, <alexander.shishkin@linux.intel.com>, <amit@kernel.org>, <benh@kernel.crashing.org>, <brendan.d.gregg@gmail.com>, <brendanhiggins@google.com>, <cai@lca.pw>, <colin.king@canonical.com>, <corbet@lwn.net>, <dwmw@amazon.com>, <foersleo@amazon.de>, <irogers@google.com>, <jolsa@redhat.com>, <kirill@shutemov.name>, <mark.rutland@arm.com>, <mgorman@suse.de>, <minchan@kernel.org>, <mingo@redhat.com>, <namhyung@kernel.org>, <peterz@infradead.org>, <rdunlap@infradead.org>, <riel@surriel.com>, <rientjes@google.com>, <rostedt@goodmis.org>, <sblbir@amazon.com>, <shakeelb@google.com>, <shuah@kernel.org>, <sj38.park@gmail.com>, <snu@amazon.de>, <vbabka@suse.cz>, <vdavydov.dev@gmail.com>, <yang.shi@linux.alibaba.com>, <ying.huang@intel.com>, <david@redhat.com>, <linux-damon@amazon.com>, <linux-mm@kvack.org>, <linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org> Subject: [PATCH v17 03/15] mm/damon: Implement region based sampling Date: Mon, 6 Jul 2020 13:53:10 +0200 Message-ID: <20200706115322.29598-4-sjpark@amazon.com> In-Reply-To: <20200706115322.29598-1-sjpark@amazon.com> References: <20200706115322.29598-1-sjpark@amazon.com> MIME-Version: 1.0 Content-Type: text/plain Sender: owner-linux-mm@kvack.org Precedence: bulk
Series	Introduce Data Access MONitor (DAMON) \| expand [v17,00/15] Introduce Data Access MONitor (DAMON) [v17,01/15] mm/page_ext: Export lookup_page_ext() to GPL modules [v17,02/15] mm: Introduce Data Access MONitor (DAMON) [v17,03/15] mm/damon: Implement region based sampling [v17,04/15] mm/damon: Adaptively adjust regions [v17,05/15] mm/damon: Track dynamic monitoring target regions update [v17,06/15] mm/damon: Implement callbacks for the virtual memory address spaces [v17,07/15] mm/damon: Implement access pattern recording [v17,08/15] mm/damon: Add a tracepoint [v17,09/15] mm/damon: Implement a debugfs interface [v17,10/15] tools: Introduce a minimal user-space tool for DAMON [v17,11/15] tools/damon/wss: Implement '--thres' option [v17,12/15] Documentation/admin-guide/mm: Add a document for DAMON [v17,13/15] mm/damon: Add kunit tests [v17,14/15] mm/damon: Add user space selftests [v17,15/15] MAINTAINERS: Update for DAMON

diff --git a/include/linux/damon.h b/include/linux/damon.h index c8f8c1c41a45..7adc7b6b3507 100644 --- a/include/linux/damon.h +++ b/include/linux/damon.h @@ -11,6 +11,8 @@ #define _DAMON_H_ #include <linux/random.h> +#include <linux/mutex.h> +#include <linux/time64.h> #include <linux/types.h> /** @@ -53,11 +55,87 @@ struct damon_task { }; /** - * struct damon_ctx - Represents a context for each monitoring. + * struct damon_ctx - Represents a context for each monitoring. This is the + * main interface that allows users to set the attributes and get the results + * of the monitoring. + * + * @sample_interval: The time between access samplings. + * @aggr_interval: The time between monitor results aggregations. + * @nr_regions: The number of monitoring regions. + * + * For each @sample_interval, DAMON checks whether each region is accessed or + * not. It aggregates and keeps the access information (number of accesses to + * each region) for @aggr_interval time. All time intervals are in + * micro-seconds. + * + * @kdamond: Kernel thread who does the monitoring. + * @kdamond_stop: Notifies whether kdamond should stop. + * @kdamond_lock: Mutex for the synchronizations with @kdamond. + * + * For each monitoring request (damon_start()), a kernel thread for the + * monitoring is created. The pointer to the thread is stored in @kdamond. + * + * The monitoring thread sets @kdamond to NULL when it terminates. Therefore, + * users can know whether the monitoring is ongoing or terminated by reading + * @kdamond. Also, users can ask @kdamond to be terminated by writing non-zero + * to @kdamond_stop. Reads and writes to @kdamond and @kdamond_stop from + * outside of the monitoring thread must be protected by @kdamond_lock. + * + * Note that the monitoring thread protects only @kdamond and @kdamond_stop via + * @kdamond_lock. Accesses to other fields must be protected by themselves. + * * @tasks_list: Head of monitoring target tasks (&damon_task) list. + * + * @init_target_regions: Constructs initial monitoring target regions. + * @prepare_access_checks: Prepares next access check of target regions. + * @check_accesses: Checks the access of target regions. + * @sample_cb: Called for each sampling interval. + * @aggregate_cb: Called for each aggregation interval. + * + * DAMON can be extended for various address spaces by users. For this, users + * can register the target address space dependent low level functions for + * their usecases via the callback pointers of the context. The monitoring + * thread calls @init_target_regions before starting the monitoring, and + * @prepare_access_checks and @check_accesses for each @sample_interval. + * + * @init_target_regions should construct proper monitoring target regions and + * link those to the DAMON context struct. + * @prepare_access_checks should manipulate the monitoring regions to be + * prepare for the next access check. + * @check_accesses should check the accesses to each region that made after the + * last preparation and update the `->nr_accesses` of each region. + * + * @sample_cb and @aggregate_cb are called from @kdamond for each of the + * sampling intervals and aggregation intervals, respectively. Therefore, + * users can safely access to the monitoring results via @tasks_list without + * additional protection of @kdamond_lock. For the reason, users are + * recommended to use these callback for the accesses to the results. */ struct damon_ctx { + unsigned long sample_interval; + unsigned long aggr_interval; + unsigned long nr_regions; + + struct timespec64 last_aggregation; + + struct task_struct *kdamond; + bool kdamond_stop; + struct mutex kdamond_lock; + struct list_head tasks_list; /* 'damon_task' objects */ + + /* callbacks */ + void (*init_target_regions)(struct damon_ctx *context); + void (*prepare_access_checks)(struct damon_ctx *context); + unsigned int (*check_accesses)(struct damon_ctx *context); + void (*sample_cb)(struct damon_ctx *context); + void (*aggregate_cb)(struct damon_ctx *context); }; +int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids); +int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, + unsigned long aggr_int, unsigned long min_nr_reg); +int damon_start(struct damon_ctx *ctx); +int damon_stop(struct damon_ctx *ctx); + #endif diff --git a/mm/damon.c b/mm/damon.c index 5ab13b1c15cf..5d0b3ab684b3 100644 --- a/mm/damon.c +++ b/mm/damon.c @@ -9,18 +9,27 @@ * This file is constructed in below parts. * * - Functions and macros for DAMON data structures + * - Functions for DAMON core logics and features + * - Functions for the DAMON programming interface * - Functions for the module loading/unloading - * - * The core parts are not implemented yet. */ #define pr_fmt(fmt) "damon: " fmt #include <linux/damon.h> +#include <linux/delay.h> +#include <linux/kthread.h> #include <linux/mm.h> #include <linux/module.h> +#include <linux/page_idle.h> +#include <linux/random.h> +#include <linux/sched/mm.h> +#include <linux/sched/task.h> #include <linux/slab.h> +/* Minimal region size. Every damon_region is aligned by this. */ +#define MIN_REGION PAGE_SIZE + /* * Functions and macros for DAMON data structures */ @@ -167,6 +176,251 @@ static unsigned int nr_damon_regions(struct damon_task *t) return nr_regions; } +/* + * Functions for DAMON core logics and features + */ + +/* + * damon_check_reset_time_interval() - Check if a time interval is elapsed. + * @baseline: the time to check whether the interval has elapsed since + * @interval: the time interval (microseconds) + * + * See whether the given time interval has passed since the given baseline + * time. If so, it also updates the baseline to current time for next check. + * + * Return: true if the time interval has passed, or false otherwise. + */ +static bool damon_check_reset_time_interval(struct timespec64 *baseline, + unsigned long interval) +{ + struct timespec64 now; + + ktime_get_coarse_ts64(&now); + if ((timespec64_to_ns(&now) - timespec64_to_ns(baseline)) < + interval * 1000) + return false; + *baseline = now; + return true; +} + +/* + * Check whether it is time to flush the aggregated information + */ +static bool kdamond_aggregate_interval_passed(struct damon_ctx *ctx) +{ + return damon_check_reset_time_interval(&ctx->last_aggregation, + ctx->aggr_interval); +} + +/* + * Reset the aggregated monitoring results + */ +static void kdamond_reset_aggregated(struct damon_ctx *c) +{ + struct damon_task *t; + struct damon_region *r; + + damon_for_each_task(t, c) { + damon_for_each_region(r, t) + r->nr_accesses = 0; + } +} + +/* + * Check whether current monitoring should be stopped + * + * The monitoring is stopped when either the user requested to stop, or all + * monitoring target tasks are dead. + * + * Returns true if need to stop current monitoring. + */ +static bool kdamond_need_stop(struct damon_ctx *ctx) +{ + struct damon_task *t; + struct task_struct *task; + bool stop; + + mutex_lock(&ctx->kdamond_lock); + stop = ctx->kdamond_stop; + mutex_unlock(&ctx->kdamond_lock); + if (stop) + return true; + + damon_for_each_task(t, ctx) { + /* -1 is reserved for non-process bounded monitoring */ + if (t->pid == -1) + return false; + + task = damon_get_task_struct(t); + if (task) { + put_task_struct(task); + return false; + } + } + + return true; +} + +/* + * The monitoring daemon that runs as a kernel thread + */ +static int kdamond_fn(void *data) +{ + struct damon_ctx *ctx = (struct damon_ctx *)data; + struct damon_task *t; + struct damon_region *r, *next; + + pr_info("kdamond (%d) starts\n", ctx->kdamond->pid); + if (ctx->init_target_regions) + ctx->init_target_regions(ctx); + while (!kdamond_need_stop(ctx)) { + if (ctx->prepare_access_checks) + ctx->prepare_access_checks(ctx); + if (ctx->sample_cb) + ctx->sample_cb(ctx); + + usleep_range(ctx->sample_interval, ctx->sample_interval + 1); + + if (ctx->check_accesses) + ctx->check_accesses(ctx); + + if (kdamond_aggregate_interval_passed(ctx)) { + if (ctx->aggregate_cb) + ctx->aggregate_cb(ctx); + kdamond_reset_aggregated(ctx); + } + + } + damon_for_each_task(t, ctx) { + damon_for_each_region_safe(r, next, t) + damon_destroy_region(r); + } + pr_debug("kdamond (%d) finishes\n", ctx->kdamond->pid); + mutex_lock(&ctx->kdamond_lock); + ctx->kdamond = NULL; + mutex_unlock(&ctx->kdamond_lock); + + do_exit(0); +} + +/* + * Functions for the DAMON programming interface + */ + +static bool damon_kdamond_running(struct damon_ctx *ctx) +{ + bool running; + + mutex_lock(&ctx->kdamond_lock); + running = ctx->kdamond != NULL; + mutex_unlock(&ctx->kdamond_lock); + + return running; +} + +/** + * damon_start() - Starts monitoring with given context. + * @ctx: monitoring context + * + * Return: 0 on success, negative error code otherwise. + */ +int damon_start(struct damon_ctx *ctx) +{ + int err = -EBUSY; + + mutex_lock(&ctx->kdamond_lock); + if (!ctx->kdamond) { + err = 0; + ctx->kdamond_stop = false; + ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond"); + if (IS_ERR(ctx->kdamond)) + err = PTR_ERR(ctx->kdamond); + } + mutex_unlock(&ctx->kdamond_lock); + + return err; +} + +/** + * damon_stop() - Stops monitoring of given context. + * @ctx: monitoring context + * + * Return: 0 on success, negative error code otherwise. + */ +int damon_stop(struct damon_ctx *ctx) +{ + mutex_lock(&ctx->kdamond_lock); + if (ctx->kdamond) { + ctx->kdamond_stop = true; + mutex_unlock(&ctx->kdamond_lock); + while (damon_kdamond_running(ctx)) + usleep_range(ctx->sample_interval, + ctx->sample_interval * 2); + return 0; + } + mutex_unlock(&ctx->kdamond_lock); + + return -EPERM; +} + +/** + * damon_set_pids() - Set monitoring target processes. + * @ctx: monitoring context + * @pids: array of target processes pids + * @nr_pids: number of entries in @pids + * + * This function should not be called while the kdamond is running. + * + * Return: 0 on success, negative error code otherwise. + */ +int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids) +{ + ssize_t i; + struct damon_task *t, *next; + + damon_for_each_task_safe(t, next, ctx) + damon_destroy_task(t); + + for (i = 0; i < nr_pids; i++) { + t = damon_new_task(pids[i]); + if (!t) { + pr_err("Failed to alloc damon_task\n"); + return -ENOMEM; + } + damon_add_task(ctx, t); + } + + return 0; +} + +/** + * damon_set_attrs() - Set attributes for the monitoring. + * @ctx: monitoring context + * @sample_int: time interval between samplings + * @aggr_int: time interval between aggregations + * @nr_reg: number of regions + * + * This function should not be called while the kdamond is running. + * Every time interval is in micro-seconds. + * + * Return: 0 on success, negative error code otherwise. + */ +int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, + unsigned long aggr_int, unsigned long nr_reg) +{ + if (nr_reg < 3) { + pr_err("nr_regions (%lu) must be at least 3\n", + nr_reg); + return -EINVAL; + } + + ctx->sample_interval = sample_int; + ctx->aggr_interval = aggr_int; + ctx->nr_regions = nr_reg; + + return 0; +} + /* * Functions for the module loading/unloading */

[v17,03/15] mm/damon: Implement region based sampling

Commit Message

Patch