From patchwork Fri Dec 16 19:21:45 2022
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 13075336
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    bfoster@redhat.com, willy@infradead.org, kernel-team@meta.com
Subject: [PATCH v4 0/4] cachestat: a new syscall for page cache state of files
Date: Fri, 16 Dec 2022 11:21:45 -0800
Message-Id: <20221216192149.3902877-1-nphamcs@gmail.com>

Changelog:
v4:
  * Refactor cachestat and move it to mm/filemap.c (patch 3)
    (suggested by Brian Foster)
  * Remove redundant checks (!folio, access_ok) (patch 3)
    (suggested by Matthew Wilcox and Al Viro)
  * Fix a bug in handling multipages folio. (patch 3)
    (suggested by Matthew Wilcox)
  * Add a selftest for shmem files, which can be used to test huge
    pages (patch 4) (suggested by Johannes Weiner)
v3:
  * Fix some minor formatting issues and build errors.
  * Add the new syscall entry to missing architecture syscall tables.
    (patch 3).
  * Add flags argument for the syscall. (patch 3).
  * Clean up the recency refactoring (patch 2) (suggested by Yu Zhao)
  * Add the new Kconfig (CONFIG_CACHESTAT) to disable the syscall.
    (patch 3) (suggested by Josh Triplett)
v2:
  * len == 0 means query to EOF. len < 0 is invalid.
    (patch 3) (suggested by Brian Foster)
  * Make cachestat extensible by adding the `cstat_size` argument in the
    syscall (patch 3)

There is currently no good way to query the page cache state of large
file sets and directory trees. There is mincore(), but it scales poorly:
the kernel writes out a lot of bitmap data that userspace has to
aggregate, when the user really does not care about per-page information
in that case. The user also needs to mmap and unmap each file as it goes
along, which can be quite slow as well.

This series of patches introduces a new system call, cachestat, that
summarizes the page cache statistics (number of cached pages, dirty
pages, pages marked for writeback, evicted pages etc.) of a file, in a
specified range of bytes. It also includes a selftest suite that tests
some typical usage.

This interface is inspired by past discussion and concerns with fincore,
which has a similar design to mincore (and, as a result, similar issues).
Relevant links:

https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04207.html
https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04209.html

For comparison with mincore, I ran both syscalls on a 2TB sparse file:

Using mincore:
real    0m37.510s
user    0m2.934s
sys     0m34.558s

Using cachestat:
real    0m0.009s
user    0m0.000s
sys     0m0.009s

This series consists of 4 patches:

Johannes Weiner (1):
  workingset: fix confusion around eviction vs refault container

Nhat Pham (3):
  workingset: refactor LRU refault to expose refault recency check
  cachestat: implement cachestat syscall
  selftests: Add selftests for cachestat

 MAINTAINERS                                  |   7 +
 arch/alpha/kernel/syscalls/syscall.tbl       |   1 +
 arch/arm/tools/syscall.tbl                   |   1 +
 arch/ia64/kernel/syscalls/syscall.tbl        |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl        |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl  |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl      |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl     |   1 +
 arch/s390/kernel/syscalls/syscall.tbl        |   1 +
 arch/sh/kernel/syscalls/syscall.tbl          |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl       |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl       |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl       |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl      |   1 +
 include/linux/fs.h                           |   3 +
 include/linux/swap.h                         |   1 +
 include/linux/syscalls.h                     |   3 +
 include/uapi/asm-generic/unistd.h            |   5 +-
 include/uapi/linux/mman.h                    |   9 +
 init/Kconfig                                 |  10 +
 kernel/sys_ni.c                              |   1 +
 mm/filemap.c                                 | 137 +++++++++
 mm/workingset.c                              | 130 ++++++---
 tools/testing/selftests/Makefile             |   1 +
 tools/testing/selftests/cachestat/.gitignore |   2 +
 tools/testing/selftests/cachestat/Makefile   |   8 +
 .../selftests/cachestat/test_cachestat.c     | 259 ++++++++++++++++++
 27 files changed, 550 insertions(+), 39 deletions(-)
 create mode 100644 tools/testing/selftests/cachestat/.gitignore
 create mode 100644 tools/testing/selftests/cachestat/Makefile
 create mode 100644 tools/testing/selftests/cachestat/test_cachestat.c
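
For readers unfamiliar with the mincore() workflow criticized in the
cover letter, here is a minimal userspace sketch of it. It is not code
from this series: count_resident_pages() is a hypothetical helper name,
and the sketch assumes a regular file small enough to map in one go. It
only illustrates the per-file mmap/munmap and the per-page vector that
userspace has to aggregate itself.

```c
#define _DEFAULT_SOURCE /* exposes mincore() on glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the series): count how many pages of
 * `path` are resident in the page cache using mmap() + mincore().
 * Returns the resident page count, or -1 on error. Once fstat() has
 * succeeded, *total_pages holds the number of pages backing the file.
 */
long count_resident_pages(const char *path, size_t *total_pages)
{
	long page_size = sysconf(_SC_PAGESIZE);
	long resident = -1;
	unsigned char *vec = NULL;
	void *map = MAP_FAILED;
	size_t pages = 0, i;
	struct stat st;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0)
		goto out;

	pages = ((size_t)st.st_size + (size_t)page_size - 1) / (size_t)page_size;
	*total_pages = pages;
	if (pages == 0) {	/* empty file: nothing to map */
		resident = 0;
		goto out;
	}

	/* The file must be mapped just to ask about its cache state. */
	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	/* The kernel fills one byte per page; userspace does the counting. */
	vec = malloc(pages);
	if (map == MAP_FAILED || !vec)
		goto out;
	if (mincore(map, (size_t)st.st_size, vec) < 0)
		goto out;

	resident = 0;
	for (i = 0; i < pages; i++)
		resident += vec[i] & 1;	/* bit 0 set: page is resident */
out:
	free(vec);
	if (map != MAP_FAILED)
		munmap(map, (size_t)st.st_size);
	close(fd);
	return resident;
}
```

A caller walking a directory tree would repeat all of this per file, so
the cost grows with the number of pages covered rather than with the
number of pages actually cached, which is consistent with the sparse-file
timings quoted above.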