From: Nhat Pham <nphamcs@gmail.com>
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    bfoster@redhat.com, willy@infradead.org, arnd@arndb.de,
    linux-api@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH v10 0/3] cachestat: a new syscall for page cache state of files
Date: Sat, 18 Feb 2023 23:33:15 -0800
Message-Id: <20230219073318.366189-1-nphamcs@gmail.com>
X-Patchwork-Id: 13145801

Changelog:
v10:
  * Reorder the arguments for archs with alignment requirements.
    (patch 2) (suggested by Arnd Bergmann)
v9:
  * Remove syscall from all the architectures syscall table except x86
    (patch 2)
  * API change: handle different cases for offset and add compat syscall.
    (patch 2) (suggested by Johannes Weiner and Arnd Bergmann)
v8:
  * Add syscall to mips syscall tables (detected by kernel test robot)
    (patch 2)
  * Add a missing return (suggested by Yu Zhao) (patch 2)
v7:
  * Fix and use lru_gen_test_recent (suggested by Brian Foster) (patch 2)
  * Small formatting and organizational fixes
v6:
  * Add a missing fdput() (suggested by Brian Foster) (patch 2)
  * Replace cstat_size with cstat_version (suggested by Brian Foster)
    (patch 2)
  * Add conditional resched to the xas walk. (suggested by Hillf Danton)
    (patch 2)
v5:
  * Separate first patch into its own series.
    (suggested by Andrew Morton)
  * Expose filemap_cachestat() to non-syscall usage (patch 2)
    (suggested by Brian Foster).
  * Fix some build errors from last version. (patch 2)
  * Explain eviction and recent eviction in the draft man page and
    documentation (suggested by Andrew Morton). (patch 2)
v4:
  * Refactor cachestat and move it to mm/filemap.c (patch 3)
    (suggested by Brian Foster)
  * Remove redundant checks (!folio, access_ok) (patch 3)
    (suggested by Matthew Wilcox and Al Viro)
  * Fix a bug in handling multipages folio. (patch 3)
    (suggested by Matthew Wilcox)
  * Add a selftest for shmem files, which can be used to test huge pages
    (patch 4) (suggested by Johannes Weiner)
v3:
  * Fix some minor formatting issues and build errors.
  * Add the new syscall entry to missing architecture syscall tables.
    (patch 3).
  * Add flags argument for the syscall. (patch 3).
  * Clean up the recency refactoring (patch 2) (suggested by Yu Zhao)
  * Add the new Kconfig (CONFIG_CACHESTAT) to disable the syscall.
    (patch 3) (suggested by Josh Triplett)
v2:
  * len == 0 means query to EOF. len < 0 is invalid.
    (patch 3) (suggested by Brian Foster)
  * Make cachestat extensible by adding the `cstat_size` argument in the
    syscall (patch 3)

There is currently no good way to query the page cache state of large
file sets and directory trees. There is mincore(), but it scales poorly:
the kernel writes out a lot of bitmap data that userspace has to
aggregate, when the user really does not care about per-page information
in that case. The user also needs to mmap and unmap each file as it goes
along, which can be quite slow as well.

This series of patches introduces a new system call, cachestat, that
summarizes the page cache statistics (number of cached pages, dirty
pages, pages marked for writeback, evicted pages, etc.) of a file, in a
specified range of bytes. It also includes a selftest suite that tests
some typical usage. Currently, the syscall is only wired up for the x86
architecture.

This interface is inspired by past discussion and concerns with fincore,
which has a similar design to mincore (and, as a result, similar issues).
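
For illustration, the mincore() workflow described above looks roughly
like the sketch below. It is not taken from this series:
count_resident_pages() is a hypothetical helper name, and error handling
is kept minimal. It shows the per-file mmap/munmap and the per-page
vector that userspace has to walk just to obtain a single count.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* exposes mincore() on glibc in strict modes */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Count how many pages of the file at `path` are resident in the page
 * cache, using mmap() + mincore(). Returns 0 on success, -1 on error.
 * Hypothetical helper for illustration; not part of this series.
 */
int count_resident_pages(const char *path, size_t *resident, size_t *total)
{
	long page_size = sysconf(_SC_PAGESIZE);
	struct stat st;
	unsigned char *vec = NULL;
	void *map = MAP_FAILED;
	size_t npages, i;
	int fd, ret = -1;

	*resident = 0;
	*total = 0;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0)
		goto out;
	if (st.st_size == 0) {
		ret = 0;	/* empty file: nothing to map */
		goto out;
	}

	npages = ((size_t)st.st_size + page_size - 1) / page_size;

	/* Each file has to be mapped before mincore() can inspect it. */
	map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;

	/* The kernel fills in one byte per page of the mapping. */
	vec = malloc(npages);
	if (!vec)
		goto out;
	if (mincore(map, st.st_size, vec) < 0)
		goto out;

	/* Userspace aggregates per-page data it did not really want. */
	for (i = 0; i < npages; i++)
		if (vec[i] & 1)
			(*resident)++;
	*total = npages;
	ret = 0;
out:
	free(vec);
	if (map != MAP_FAILED)
		munmap(map, st.st_size);
	close(fd);
	return ret;
}
```

Walking a directory tree this way costs one open/mmap/mincore/munmap
cycle and one vector allocation per file, which is the overhead the
cachestat syscall is meant to avoid.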

Relevant links (fincore discussion):

https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04207.html
https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04209.html

For comparison with mincore, I ran both syscalls on a 2TB sparse file:

Using mincore:
real    0m37.510s
user    0m2.934s
sys     0m34.558s

Using cachestat:
real    0m0.009s
user    0m0.000s
sys     0m0.009s

This series should be applied on top of:

workingset: fix confusion around eviction vs refault container
https://lkml.org/lkml/2023/1/4/1066

This series consists of 3 patches:

Nhat Pham (3):
  workingset: refactor LRU refault to expose refault recency check
  cachestat: implement cachestat syscall
  selftests: Add selftests for cachestat

 MAINTAINERS                                   |   7 +
 arch/x86/entry/syscalls/syscall_32.tbl        |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl        |   1 +
 include/linux/compat.h                        |   4 +-
 include/linux/fs.h                            |   3 +
 include/linux/swap.h                          |   1 +
 include/linux/syscalls.h                      |   3 +
 include/uapi/asm-generic/unistd.h             |   5 +-
 include/uapi/linux/mman.h                     |   9 +
 init/Kconfig                                  |  10 +
 kernel/sys_ni.c                               |   1 +
 mm/filemap.c                                  | 158 +++++++++++
 mm/workingset.c                               | 142 ++++++----
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/cachestat/.gitignore  |   2 +
 tools/testing/selftests/cachestat/Makefile    |   8 +
 .../selftests/cachestat/test_cachestat.c      | 256 ++++++++++++++++++
 17 files changed, 564 insertions(+), 48 deletions(-)
 create mode 100644 tools/testing/selftests/cachestat/.gitignore
 create mode 100644 tools/testing/selftests/cachestat/Makefile
 create mode 100644 tools/testing/selftests/cachestat/test_cachestat.c

base-commit: 1440f576022887004f719883acb094e7e0dd4944
prerequisite-patch-id: 171a43d333e1b267ce14188a5beaea2f313787fb
---
2.39.1