From patchwork Tue Feb 4 06:23:01 2020
X-Patchwork-Submitter: SeongJae Park
X-Patchwork-Id: 11364069
From: sj38.park@gmail.com
To: akpm@linux-foundation.org
Cc: SeongJae Park, acme@kernel.org, alexander.shishkin@linux.intel.com,
    amit@kernel.org, brendan.d.gregg@gmail.com, brendanhiggins@google.com,
    cai@lca.pw, colin.king@canonical.com, corbet@lwn.net, dwmw@amazon.com,
    jolsa@redhat.com, kirill@shutemov.name, mark.rutland@arm.com,
    mgorman@suse.de, minchan@kernel.org, mingo@redhat.com,
    namhyung@kernel.org, peterz@infradead.org, rdunlap@infradead.org,
    rostedt@goodmis.org, sj38.park@gmail.com, vdavydov.dev@gmail.com,
    linux-mm@kvack.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH v3 00/11] Introduce Data Access MONitor (DAMON)
Date: Tue, 4 Feb 2020 06:23:01 +0000
Message-Id: <20200204062312.19913-1-sj38.park@gmail.com>

From: SeongJae Park

Introduction
============

Memory management decisions can normally be more efficient if finer data
access information is available.  However, because finer information
usually comes with higher overhead, most systems including Linux have made
a tradeoff: forgo some wise decisions and use coarse information and/or
light-weight heuristics.

A number of experimental data access pattern aware memory management
optimizations (refer to 'Appendix D' for more detail) show that the
sacrifice is huge (up to 2.55x slowdown).  However, none of those has been
successfully adopted into the Linux kernel, mainly due to the absence of a
scalable and efficient data access monitoring mechanism.  Refer to
'Appendix C' to see the limitations of existing memory monitoring
mechanisms.

DAMON is a data access monitoring solution for the problem.  It is 1)
accurate enough for DRAM level memory management, 2) light-weight enough
to be applied online, and 3) keeps a predefined upper-bound overhead
regardless of the size of target workloads (thus scalable).  Refer to
'Appendix A: Mechanisms of DAMON' if you are interested in how this is
possible.

DAMON is implemented as a standalone kernel module and provides several
simple interfaces.  Owing to that, though it has mainly been designed for
the kernel's memory management mechanisms, it can also be used by a wide
range of user space programs and people.  Refer to 'Appendix B: Expected
Use-cases' for more detailed expected usages of DAMON.


Frequently Asked Questions
==========================

Q: Why is it not integrated with perf?
A: From the perspective of perf-like profilers, DAMON can be thought of as
a data source in the kernel, like tracepoints, pressure stall information
(psi), or idle page tracking.  Thus, it can easily be integrated with
those.  However, this patchset doesn't provide a fancy perf integration
because the current step of DAMON development is focused on its core logic
only.
That said, DAMON already provides two interfaces for user space programs,
which are based on debugfs and a tracepoint, respectively.  Using the
tracepoint interface, you can use DAMON with perf.  This patchset also
provides a user space tool for DAMON that is based on the debugfs
interface.  It can be used to record, visualize, and analyze the data
access patterns of target processes in a convenient way.

Q: Why a new module, instead of extending perf or other tools?
A: First, DAMON aims to be used by other programs including the kernel.
Therefore, having a dependency on specific tools like perf is not
desirable.  Second, because it needs to be as lightweight as possible so
that it can be used online, any unnecessary overhead such as the cost of
context switching between kernel and user space should be avoided.  These
are the two biggest reasons why DAMON is implemented in the kernel space.

The idle page tracking subsystem would be the kernel module that seems
most similar to DAMON.  However, its own interface is not compatible with
DAMON.  Also, its internal implementation has no common part that could be
reused by DAMON.

Q: Can 'perf mem' provide the data required for DAMON?
A: On the systems supporting 'perf mem', yes.  DAMON uses the PTE Accessed
bits at the low level.  Other H/W or S/W features that can serve the same
purpose could be used instead.  However, as explained in the answer to the
question above, DAMON needs to be implemented in the kernel space.


Evaluations
===========

A prototype of DAMON has been evaluated on an Intel Xeon E7-8837 machine
using 20 benchmarks picked from the SPEC CPU 2006, NAS, Tensorflow
Benchmark, SPLASH-2X, and PARSEC 3 benchmark suites.  This section
provides only a summary of the results.  For more detail, please refer to
the slides used for the introduction of DAMON at the Linux Plumbers
Conference 2019[1] or the MIDDLEWARE'19 industrial track paper[2].


Quality
-------

We first traced and visualized the data access pattern of each workload.
We were able to confirm that the visualized results are reasonably
accurate by manually comparing them with the source code of the workloads.

To see the usefulness of the monitoring, we optimized 9 memory intensive
workloads among them for memory pressure situations using the DAMON
outputs.  In detail, we identified frequently accessed memory regions in
each workload based on the DAMON results and protected them with
``mlock()`` system calls.  The optimized versions consistently show
speedup (2.55x in the best case, 1.65x on average) under memory pressure.


Overhead
--------

We also measured the overhead of DAMON.  It was not only under the upper
bound we set, but much lower (0.6 percent of the bound in the best case,
13.288 percent of the bound on average).  This reduction of the overhead
mainly results from the adaptive regions adjustment.  We also compared the
overhead with that of the straightforward periodic Accessed bit
check-based monitoring, which checks the access of every page frame.
DAMON's overhead was smaller than that of the straightforward mechanism by
94,242.42x in the best case and 3,159.61x on average.


References
==========

Prototypes of DAMON have been introduced by an LPC kernel summit track
talk[1] and two academic papers[2,3].  Please refer to those for more
detailed information, especially the evaluations.

[1] SeongJae Park, Tracing Data Access Pattern with Bounded Overhead and
    Best-effort Accuracy. In The Linux Kernel Summit, September 2019.
    https://linuxplumbersconf.org/event/4/contributions/548/
[2] SeongJae Park, Yunjae Lee, Heon Y. Yeom, Profiling Dynamic Data Access
    Patterns with Controlled Overhead and Quality. In 20th ACM/IFIP
    International Middleware Conference Industry, December 2019.
    https://dl.acm.org/doi/10.1145/3366626.3368125
[3] SeongJae Park, Yunjae Lee, Yunhee Kim, Heon Y. Yeom, Profiling Dynamic
    Data Access Patterns with Bounded Overhead and Accuracy.
    In IEEE International Workshop on Foundations and Applications of
    Self-* Systems (FAS* 2019), June 2019.


Sequence Of Patches
===================

The patches are organized in the following sequence.  The first patch
introduces the DAMON module and its small common functions.  The following
three patches (2nd to 4th) implement the core logic of DAMON one by one:
region based sampling, adaptive regions adjustment, and adoption of
dynamic memory mapping changes.  The next three patches (5th to 7th) add
the interfaces of DAMON.  They add, respectively, an API for other kernel
code, a debugfs interface for super users, and a tracepoint for
tracepoint-supporting tracers such as perf.  To provide a minimal
reference for the debugfs interface and to make using and testing DAMON
more convenient, the next patch (8th) implements a user space tool.  The
9th patch adds a document for administrators of DAMON, and the 10th patch
provides DAMON's kunit tests.  Finally, the last patch (11th) updates the
MAINTAINERS file.

The patches are based on v5.5.
You can also clone the complete git tree:

    $ git clone git://github.com/sjp38/linux -b damon/patches/v3

It is also available on the web:
https://github.com/sjp38/linux/releases/tag/damon/patches/v3


Patch History
=============

Changes from v2
(https://lore.kernel.org/linux-mm/20200128085742.14566-1-sjpark@amazon.com/)
 - Move MAINTAINERS changes to last commit (Brendan Higgins)
 - Add descriptions for kunittest: why not only entire mappings and what
   the 4 input sets are trying to test (Brendan Higgins)
 - Remove 'kdamond_need_stop()' test (Brendan Higgins)
 - Discuss 'perf mem' and DAMON (Peter Zijlstra)
 - Make CV clearly say what it actually does (Peter Zijlstra)
 - Answer why new module (Qian Cai)
 - Disable DAMON by default (Randy Dunlap)
 - Change the interface: Separate recording attributes (attrs, record,
   rules) and allow multiple kdamond instances
 - Implement kernel API interface

Changes from v1
(https://lore.kernel.org/linux-mm/20200120162757.32375-1-sjpark@amazon.com/)
 - Rebase on v5.5
 - Add a tracepoint for integration with other tracers (Kirill A. Shutemov)
 - document: Add more description for the user space tool (Brendan Higgins)
 - unittest: Improve readability (Brendan Higgins)
 - unittest: Use consistent names and helper functions (Brendan Higgins)
 - Update PG_Young to avoid reclaim logic interference (Yunjae Lee)

Changes from RFC
(https://lore.kernel.org/linux-mm/20200110131522.29964-1-sjpark@amazon.com/)
 - Specify an ambiguous plan of access pattern based mm optimizations
 - Support loadable module build
 - Cleanup code

SeongJae Park (11):
  Introduce Data Access MONitor (DAMON)
  mm/damon: Implement region based sampling
  mm/damon: Adaptively adjust regions
  mm/damon: Apply dynamic memory mapping changes
  mm/damon: Implement kernel space API
  mm/damon: Add debugfs interface
  mm/damon: Add a tracepoint for result writing
  mm/damon: Add minimal user-space tools
  Documentation/admin-guide/mm: Add a document for DAMON
  mm/damon: Add kunit tests
  MAINTAINERS: Update for DAMON

 .../admin-guide/mm/data_access_monitor.rst |  414 +++++
 Documentation/admin-guide/mm/index.rst     |    1 +
 MAINTAINERS                                |   11 +
 include/linux/damon.h                      |   71 +
 include/trace/events/damon.h               |   32 +
 mm/Kconfig                                 |   23 +
 mm/Makefile                                |    1 +
 mm/damon-test.h                            |  604 +++++++
 mm/damon.c                                 | 1412 +++++++++++++++++
 tools/damon/.gitignore                     |    1 +
 tools/damon/_dist.py                       |   35 +
 tools/damon/bin2txt.py                     |   64 +
 tools/damon/damo                           |   37 +
 tools/damon/heats.py                       |  358 +++++
 tools/damon/nr_regions.py                  |   88 +
 tools/damon/record.py                      |  219 +++
 tools/damon/report.py                      |   45 +
 tools/damon/wss.py                         |   94 ++
 18 files changed, 3510 insertions(+)
 create mode 100644 Documentation/admin-guide/mm/data_access_monitor.rst
 create mode 100644 include/linux/damon.h
 create mode 100644 include/trace/events/damon.h
 create mode 100644 mm/damon-test.h
 create mode 100644 mm/damon.c
 create mode 100644 tools/damon/.gitignore
 create mode 100644 tools/damon/_dist.py
 create mode 100644 tools/damon/bin2txt.py
 create mode 100755 tools/damon/damo
 create mode 100644 tools/damon/heats.py
 create mode 100644 tools/damon/nr_regions.py
 create mode 100644 tools/damon/record.py
 create mode 100644 tools/damon/report.py
 create mode 100644 tools/damon/wss.py
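
Illustration: protecting a hot region with mlock()
--------------------------------------------------

The 'Quality' section above mentions that frequently accessed regions were
protected with ``mlock()`` system calls.  The following is only a minimal
sketch of that idea, not the code used in the evaluation.  It assumes a
Linux system, reaches ``mlock()`` through Python's ctypes instead of
calling it from the workload's own source, and uses hypothetical names
(``page_align``, ``mlock_hot_region``) and a made-up hot region.  In
practice the region boundaries would come from the monitoring results.

```python
import ctypes
import mmap

PAGE_SIZE = mmap.PAGESIZE


def page_align(offset, length, page_size=PAGE_SIZE):
    """Expand [offset, offset + length) outward to page boundaries.

    Returns (aligned_offset, aligned_length).
    """
    start = offset - (offset % page_size)
    end = -(-(offset + length) // page_size) * page_size  # round up
    return start, end - start


def mlock_hot_region(buf, offset, length):
    """Try to pin the pages of 'buf' that cover the given hot region.

    'buf' is assumed to be a writable, page-aligned buffer such as an
    anonymous mmap.  Returns True on success and False if mlock() is
    unavailable or fails (for example, when RLIMIT_MEMLOCK is too low).
    """
    start, size = page_align(offset, length)
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        mlock = libc.mlock
    except (OSError, AttributeError, TypeError):
        return False
    mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    mlock.restype = ctypes.c_int
    # Borrow the buffer only long enough to learn its address.
    anchor = ctypes.c_char.from_buffer(buf, start)
    try:
        return mlock(ctypes.addressof(anchor), size) == 0
    finally:
        del anchor


if __name__ == "__main__":
    buf = mmap.mmap(-1, 16 * PAGE_SIZE)
    # Hypothetical hot region; real boundaries would come from monitoring.
    ok = mlock_hot_region(buf, 3 * PAGE_SIZE + 100, 2 * PAGE_SIZE)
    print("hot region locked" if ok else "mlock() failed or unavailable")
    buf.close()
```

Expanding the region to page boundaries keeps the sketch explicit about
which pages get pinned, since ``mlock()`` works at page granularity.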