From: Nhat Pham <nphamcs@gmail.com>
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    bfoster@redhat.com, willy@infradead.org, kernel-team@meta.com
Subject: [PATCH v5 0/3] cachestat: a new syscall for page cache state of files
Date: Wed, 4 Jan 2023 15:10:54 -0800
Message-Id: <20230104231057.2632639-1-nphamcs@gmail.com>

Changelog:
v5:
  * Separate first patch into its own series.
    (suggested by Andrew Morton)
  * Expose filemap_cachestat() to non-syscall usage
    (patch 2) (suggested by Brian Foster).
  * Fix some build errors from last version.
    (patch 2)
  * Explain eviction and recent eviction in the draft man page and
    documentation (suggested by Andrew Morton).
    (patch 2)
v4:
  * Refactor cachestat and move it to mm/filemap.c
    (patch 3) (suggested by Brian Foster)
  * Remove redundant checks (!folio, access_ok)
    (patch 3) (suggested by Matthew Wilcox and Al Viro)
  * Fix a bug in handling multi-page folios.
    (patch 3) (suggested by Matthew Wilcox)
  * Add a selftest for shmem files, which can be used to test huge pages
    (patch 4) (suggested by Johannes Weiner)
v3:
  * Fix some minor formatting issues and build errors.
  * Add the new syscall entry to missing architecture syscall tables.
    (patch 3).
  * Add flags argument for the syscall. (patch 3).
  * Clean up the recency refactoring (patch 2) (suggested by Yu Zhao)
  * Add the new Kconfig (CONFIG_CACHESTAT) to disable the syscall.
    (patch 3) (suggested by Josh Triplett)
v2:
  * len == 0 means query to EOF. len < 0 is invalid.
    (patch 3) (suggested by Brian Foster)
  * Make cachestat extensible by adding the `cstat_size` argument in the
    syscall (patch 3)

There is currently no good way to query the page cache state of large
file sets and directory trees. There is mincore(), but it scales poorly:
the kernel writes out a lot of bitmap data that userspace has to
aggregate, when the user really does not care about per-page information
in that case. The user also needs to mmap and unmap each file as it goes
along, which can be quite slow as well.

This series of patches introduces a new system call, cachestat, that
summarizes the page cache statistics (number of cached pages, dirty
pages, pages marked for writeback, evicted pages etc.) of a file, in a
specified range of bytes. It also includes a selftest suite that tests
some typical usage.

This interface is inspired by past discussion and concerns with fincore,
which has a similar design to mincore (and, as a result, similar issues).
Relevant links:

https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04207.html
https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04209.html

For comparison with mincore, I ran both syscalls on a 2TB sparse file:

Using mincore:
real    0m37.510s
user    0m2.934s
sys     0m34.558s

Using cachestat:
real    0m0.009s
user    0m0.000s
sys     0m0.009s

This series should be applied on top of:

workingset: fix confusion around eviction vs refault container
https://lkml.org/lkml/2023/1/4/1066

This series consists of 3 patches:

Nhat Pham (3):
  workingset: refactor LRU refault to expose refault recency check
  cachestat: implement cachestat syscall
  selftests: Add selftests for cachestat

 MAINTAINERS                                  |   7 +
 arch/alpha/kernel/syscalls/syscall.tbl       |   1 +
 arch/arm/tools/syscall.tbl                   |   1 +
 arch/ia64/kernel/syscalls/syscall.tbl        |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl        |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl  |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl      |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl     |   1 +
 arch/s390/kernel/syscalls/syscall.tbl        |   1 +
 arch/sh/kernel/syscalls/syscall.tbl          |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl       |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl       |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl       |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl      |   1 +
 include/linux/fs.h                           |   3 +
 include/linux/swap.h                         |   1 +
 include/linux/syscalls.h                     |   3 +
 include/uapi/asm-generic/unistd.h            |   5 +-
 include/uapi/linux/mman.h                    |   9 +
 init/Kconfig                                 |  10 +
 kernel/sys_ni.c                              |   1 +
 mm/filemap.c                                 | 143 ++++++++++
 mm/workingset.c                              | 129 ++++++---
 tools/testing/selftests/Makefile             |   1 +
 tools/testing/selftests/cachestat/.gitignore |   2 +
 tools/testing/selftests/cachestat/Makefile   |   8 +
 .../selftests/cachestat/test_cachestat.c     | 259 ++++++++++++++++++
 27 files changed, 555 insertions(+), 39 deletions(-)
 create mode 100644 tools/testing/selftests/cachestat/.gitignore
 create mode 100644 tools/testing/selftests/cachestat/Makefile
 create mode 100644 tools/testing/selftests/cachestat/test_cachestat.c

base-commit: 1440f576022887004f719883acb094e7e0dd4944
prerequisite-patch-id: 171a43d333e1b267ce14188a5beaea2f313787fb
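
For readers unfamiliar with the mincore() workflow that the cover letter
compares against, here is a minimal sketch of it. It is not taken from this
series or its selftests: the helper name count_resident_pages() is
hypothetical, and the code only illustrates the steps named above (mmap the
file, have the kernel fill a one-byte-per-page vector, aggregate it in
userspace, unmap).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch series): count the pages of
 * a file that are resident in the page cache using mmap() + mincore().
 *
 * Returns the number of resident pages, or -1 on error. On success of
 * fstat(), *total_pages is set to the number of pages the file spans.
 */
long count_resident_pages(const char *path, size_t *total_pages)
{
	long resident = -1;
	size_t page_size, len, npages, i;
	unsigned char *vec;
	struct stat st;
	void *addr;
	int fd;

	assert(path != NULL && total_pages != NULL);

	page_size = (size_t)sysconf(_SC_PAGESIZE);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}

	len = (size_t)st.st_size;
	npages = (len + page_size - 1) / page_size;
	*total_pages = npages;

	/* mmap() rejects zero-length mappings; an empty file has no pages. */
	if (npages == 0) {
		close(fd);
		return 0;
	}

	/* Step 1: the file has to be mapped just to be queried. */
	addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* Step 2: the kernel writes one byte per page into vec. */
	vec = malloc(npages);
	if (vec != NULL && mincore(addr, len, vec) == 0) {
		/* Step 3: userspace aggregates; bit 0 means "resident". */
		resident = 0;
		for (i = 0; i < npages; i++)
			resident += vec[i] & 1;
	}

	free(vec);
	munmap(addr, len);
	close(fd);
	return resident;
}
```

A caller would pass a path and a size_t to receive the page count, e.g.
count_resident_pages("/var/log/syslog", &total). Assuming 4 KiB pages and
reading "2TB" as 2 TiB, the vector alone for the sparse file in the benchmark
above would be about 512 MiB (2^41 / 2^12 bytes), all of it written by the
kernel and then summed in userspace to produce a single number.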