From patchwork Tue Apr 14 15:02:08 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11488153
From: Matthew Wilcox
To: Andrew Morton
Date: Tue, 14 Apr 2020 08:02:08 -0700
Message-Id: <20200414150233.24495-1-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
 linux-mm@kvack.org, ocfs2-devel@oss.oracle.com,
 linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
 linux-erofs@lists.ozlabs.org, linux-btrfs@vger.kernel.org
Subject: [Ocfs2-devel] [PATCH v11 00/25] Change readahead API

From: "Matthew Wilcox (Oracle)"

This series adds a readahead address_space operation to replace the
readpages operation.  The key difference is that pages are added to the
page cache as they are allocated (and then looked up by the filesystem)
instead of passing them on a list to the readpages operation and having
the filesystem add them to the page cache.
It's a net reduction in code for each implementation, more efficient
than walking a list, and solves the direct-write vs buffered-read
problem reported by yu kuai at
https://lore.kernel.org/linux-fsdevel/20200116063601.39201-1-yukuai3@huawei.com/

The only unconverted filesystems are those which use fscache.  Their
conversion is pending Dave Howells' rewrite which will make the
conversion substantially easier.  This should be completed by the end
of the year.

I want to thank the reviewers/testers; Dave Chinner, John Hubbard,
Eric Biggers, Johannes Thumshirn, Dave Sterba, Zi Yan, Christoph
Hellwig and Miklos Szeredi have done a marvellous job of providing
constructive criticism.

These patches pass an xfstests run on ext4, xfs & btrfs with no
regressions that I can tell (some of the tests seem a little flaky
before and remain flaky afterwards).

This series can also be found at
http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/tags/readahead_v11

v11: Rebased on v5.7-rc1
 - Rewrote the fuse conversion to use __readahead_batch() and fix some
   bugs.

v10: Rebased on linux-next 20200323
 - Collected some more reviewed-by tags
 - Simplify nr_to_read limits (Eric Biggers)
 - Convert fs/exfat instead of drivers/staging/exfat (Namjae Jeon)
 - Explicitly convert a pointer to a boolean in f2fs (Eric Biggers)

v9: No code changes.  Fixed a changelog and added some reviewed-by tags.

v8:
 - btrfs, ext4 and xfs all survive an xfstests run (thanks to Kent
   Overstreet for providing the ktest framework)
 - iomap restructuring dropped due to Christoph's opposition and the
   redesign of readahead_page() meaning it wasn't needed any more.
 - f2fs_mpage_readpages() made static again
 - Made iomap_readahead() comment more useful
 - Added kernel-doc for the entire readahead_control API
 - Conditionally zero batch_count in readahead_page() (requested by John)
 - Hold RCU read lock while iterating over the xarray in
   readahead_page_batch()
 - Iterate over the correct pages in readahead_page_batch()
 - Correct the return type of readahead_index() (spotted by Zi Yan)
 - Added a 'skip_page' parameter to read_pages for better documentation
   purposes and so we can reuse the readahead_control higher in the call
   chain in future.
 - Removed the use_list bool (requested by Christoph)
 - Removed the explicit initialisation of _nr_pages to 0 (requested by
   Christoph & John)
 - Add comments explaining why nr_to_read is being capped (requested by
   John)
 - Reshuffled some of the patches:
   - Split out adding the readahead_control API from the three patches
     which added it piecemeal
   - Shift the final two mm patches to be with the other mm patches
   - Split the f2fs "pass the inode" patch from the "convert to
     readahead" patch, like ext4

v7:
 - Now passes an xfstests run on ext4!
 - Documentation improvements
 - Move the readahead prototypes out of mm.h (new patch)
 - readahead_for_each* iterators are gone; replaced with
   readahead_page() and readahead_page_batch()
 - page_cache_readahead_limit() renamed to
   page_cache_readahead_unbounded() and arguments changed
 - iomap_readahead_actor() restructured differently
 - The readahead code no longer uses the word 'offset' to reduce
   ambiguity
 - read_pages() now maintains the rac so we can just call it and
   continue instead of mucking around with branches
 - More assertions
 - More readahead functions return void

v6:
 - Name the private members of readahead_control with a leading
   underscore (suggested by Christoph Hellwig)
 - Fix whitespace in rst file
 - Remove misleading comment in btrfs patch
 - Add readahead_next() API and use it in iomap
 - Add iomap_readahead kerneldoc.
 - Fix the mpage_readahead kerneldoc
 - Make various readahead functions return void
 - Keep readahead_index() and readahead_offset() pointing to the start
   of this batch through the body.  No current user requires this, but
   it's less surprising.
 - Add kerneldoc for page_cache_readahead_limit
 - Make page_idx an unsigned long, and rename it to just 'i'
 - Get rid of page_offset local variable
 - Add patch to call memalloc_nofs_save() before allocating pages
   (suggested by Michal Hocko)
 - Resplit a lot of patches for more logical progression and easier
   review (suggested by John Hubbard)
 - Added sign-offs where received, and I deemed still relevant

v5 switched to passing a readahead_control struct (mirroring the
writeback_control struct passed to writepages).  This has a number of
advantages:
 - It fixes a number of bugs in various implementations, e.g. forgetting
   to increment 'start', an off-by-one error in 'nr_pages' or treating
   'start' as a byte offset instead of a page offset.
 - It allows us to change the arguments without changing all the
   implementations of ->readahead which just call mpage_readahead() or
   iomap_readahead()
 - Figuring out which pages haven't been attempted by the implementation
   is more natural this way.
 - There's less code in each implementation.
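To illustrate the advantages listed for the control-struct approach,
here is a rough userspace sketch.  It is a toy model only and not the
kernel code from this series: every name in it (toy_page, toy_cache,
toy_ra_ctl, toy_ra_next, toy_fs_readahead, toy_read_pages) is
hypothetical and invented for this example.  The point it tries to show
is that when the core owns the index/nr_pages bookkeeping and the
filesystem only asks for "the next page", the implementation cannot
forget to advance the index, and the core can count afterwards how many
pages were never attempted.

```c
/* Userspace toy model; none of these names are the kernel's. */
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_PAGES 16

struct toy_page {
	unsigned long index;	/* page index in the file, not a byte offset */
	int uptodate;		/* set once the pretend I/O has filled the page */
};

/* Stand-in for the page cache: the core inserts pages before calling the fs. */
struct toy_cache {
	struct toy_page pages[TOY_CACHE_PAGES];
	int present[TOY_CACHE_PAGES];
};

/* Control struct: the core owns the bookkeeping, the fs only iterates. */
struct toy_ra_ctl {
	struct toy_cache *cache;
	unsigned long index;	/* next page index to hand out */
	unsigned int nr_pages;	/* pages not yet handed out */
};

/* Hand out the next page, advancing index and nr_pages in one place. */
static struct toy_page *toy_ra_next(struct toy_ra_ctl *rac)
{
	struct toy_page *page;

	if (rac->nr_pages == 0)
		return NULL;
	page = &rac->cache->pages[rac->index];
	rac->index++;
	rac->nr_pages--;
	return page;
}

/* A pretend filesystem that stops after 'max' pages, as one might on error. */
static void toy_fs_readahead(struct toy_ra_ctl *rac, unsigned int max)
{
	struct toy_page *page;

	while (max-- > 0 && (page = toy_ra_next(rac)) != NULL)
		page->uptodate = 1;
}

/*
 * Core side: put the pages in the cache first, call the fs, then count
 * whatever the fs left behind.  Returns the number of unattempted pages.
 */
static unsigned int toy_read_pages(struct toy_cache *cache, unsigned long start,
				   unsigned int nr, unsigned int fs_max)
{
	struct toy_ra_ctl rac = { .cache = cache, .index = start, .nr_pages = nr };
	unsigned int skipped = 0;
	unsigned long i;

	if (start > TOY_CACHE_PAGES || nr > TOY_CACHE_PAGES - start)
		return 0;	/* request does not fit in the toy cache */

	for (i = start; i < start + nr; i++) {
		cache->pages[i].index = i;
		cache->pages[i].uptodate = 0;
		cache->present[i] = 1;
	}

	toy_fs_readahead(&rac, fs_max);

	/* Anything still left in rac was never attempted by the fs. */
	while (toy_ra_next(&rac) != NULL)
		skipped++;
	return skipped;
}
```

Calling toy_read_pages(&cache, 4, 8, 5) on a zeroed toy_cache marks
pages 4 to 8 uptodate and reports the remaining three as unattempted,
without the pretend filesystem ever touching 'index' or 'nr_pages'
itself.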
Matthew Wilcox (Oracle) (25):
  mm: Move readahead prototypes from mm.h
  mm: Return void from various readahead functions
  mm: Ignore return value of ->readpages
  mm: Move readahead nr_pages check into read_pages
  mm: Add new readahead_control API
  mm: Use readahead_control to pass arguments
  mm: Rename various 'offset' parameters to 'index'
  mm: rename readahead loop variable to 'i'
  mm: Remove 'page_offset' from readahead loop
  mm: Put readahead pages in cache earlier
  mm: Add readahead address space operation
  mm: Move end_index check out of readahead loop
  mm: Add page_cache_readahead_unbounded
  mm: Document why we don't set PageReadahead
  mm: Use memalloc_nofs_save in readahead path
  fs: Convert mpage_readpages to mpage_readahead
  btrfs: Convert from readpages to readahead
  erofs: Convert uncompressed files from readpages to readahead
  erofs: Convert compressed files from readpages to readahead
  ext4: Convert from readpages to readahead
  ext4: Pass the inode to ext4_mpage_readpages
  f2fs: Convert from readpages to readahead
  f2fs: Pass the inode to f2fs_mpage_readpages
  fuse: Convert from readpages to readahead
  iomap: Convert from readpages to readahead

 Documentation/filesystems/locking.rst |   6 +-
 Documentation/filesystems/vfs.rst     |  15 ++
 block/blk-core.c                      |   1 +
 fs/block_dev.c                        |   7 +-
 fs/btrfs/extent_io.c                  |  43 ++--
 fs/btrfs/extent_io.h                  |   3 +-
 fs/btrfs/inode.c                      |  16 +-
 fs/erofs/data.c                       |  39 ++--
 fs/erofs/zdata.c                      |  29 +--
 fs/exfat/inode.c                      |   7 +-
 fs/ext2/inode.c                       |  10 +-
 fs/ext4/ext4.h                        |   5 +-
 fs/ext4/inode.c                       |  21 +-
 fs/ext4/readpage.c                    |  25 +--
 fs/ext4/verity.c                      |  35 +---
 fs/f2fs/data.c                        |  50 ++---
 fs/f2fs/f2fs.h                        |   3 -
 fs/f2fs/verity.c                      |  35 +---
 fs/fat/inode.c                        |   7 +-
 fs/fuse/file.c                        |  99 +++------
 fs/gfs2/aops.c                        |  23 +--
 fs/hpfs/file.c                        |   7 +-
 fs/iomap/buffered-io.c                |  92 +++------
 fs/iomap/trace.h                      |   2 +-
 fs/isofs/inode.c                      |   7 +-
 fs/jfs/inode.c                        |   7 +-
 fs/mpage.c                            |  38 ++--
 fs/nilfs2/inode.c                     |  15 +-
 fs/ocfs2/aops.c                       |  34 ++--
 fs/omfs/file.c                        |   7 +-
 fs/qnx6/inode.c                       |   7 +-
 fs/reiserfs/inode.c                   |   8 +-
 fs/udf/inode.c                        |   7 +-
 fs/xfs/xfs_aops.c                     |  13 +-
 fs/zonefs/super.c                     |   7 +-
 include/linux/fs.h                    |   2 +
 include/linux/iomap.h                 |   3 +-
 include/linux/mm.h                    |  19 --
 include/linux/mpage.h                 |   4 +-
 include/linux/pagemap.h               | 151 ++++++++++++++
 include/trace/events/erofs.h          |   6 +-
 include/trace/events/f2fs.h           |   6 +-
 mm/fadvise.c                          |   6 +-
 mm/internal.h                         |  12 +-
 mm/migrate.c                          |   2 +-
 mm/readahead.c                        | 275 ++++++++++++++++----------
 46 files changed, 583 insertions(+), 633 deletions(-)

base-commit: 8f3d9f354286745c751374f5f1fcafee6b3f3136