From patchwork Mon Feb 17 18:45:42 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387231
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DE338109A
	for ; Mon, 17 Feb 2020 18:46:47 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id BC26922B48
	for ; Mon, 17 Feb 2020 18:46:47 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="ED0OnjLT"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1730124AbgBQSqq (ORCPT );
	Mon, 17 Feb 2020 13:46:46 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48410 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1730047AbgBQSqe (ORCPT );
	Mon, 17 Feb 2020 13:46:34 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=DP26YoHmQVKFeLAP9Bhl6lTFH6pmN1Lp+nkcg7gbG1k=; b=ED0OnjLTGnEOOo2CZefh2G88KH
	3jrEzs4VHYd1wXeUpU6HsfbK77yiq/5cy+zhME6scGgUb7JxdN+7S2/GOG2Z/qiKlQ9PPNooPIlDL
	OybRjdLpztJcOs3levap4q/E2MTBAw6qzkCsJmjdII63L2Pli1xrIi6t1XVfm74+zw3q/HKqpqI92
	ku/YMLp3rt2u0Ir0YBDD5xB8VqqHbl17yEZkNT0jKFY8WYkS3lRvPHPHpHSy5cUS26ajALVTZAmnr
	JlLOA3WaW8WbFQ9Nk1AytNed39mHzG9AvcRmp2ibIg6GFMUg+LpBV8EZtkxY4LD7oQACkug45a17s
	61imUM/A==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red
	Hat Linux)) id 1j3lPL-00058e-AJ; Mon, 17 Feb 2020 18:46:15 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org
Subject: [PATCH v6 01/19] mm: Return void from various readahead functions
Date: Mon, 17 Feb 2020 10:45:42 -0800
Message-Id: <20200217184613.19668-2-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

ondemand_readahead has two callers, neither of which use the return value.
That means that both ra_submit and __do_page_cache_readahead() can return
void, and we don't need to worry that a present page in the readahead
window causes us to return a smaller nr_pages than we ought to have.

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Dave Chinner
---
 mm/internal.h  |  8 ++++----
 mm/readahead.c | 24 ++++++++++--------------
 2 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 3cf20ab3ca01..f779f058118b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -49,18 +49,18 @@ void unmap_page_range(struct mmu_gather *tlb,
 			     unsigned long addr, unsigned long end,
 			     struct zap_details *details);
 
-extern unsigned int __do_page_cache_readahead(struct address_space *mapping,
+extern void __do_page_cache_readahead(struct address_space *mapping,
 		struct file *filp, pgoff_t offset, unsigned long nr_to_read,
 		unsigned long lookahead_size);
 
 /*
  * Submit IO for the read-ahead request in file_ra_state.
  */
-static inline unsigned long ra_submit(struct file_ra_state *ra,
+static inline void ra_submit(struct file_ra_state *ra,
 		struct address_space *mapping, struct file *filp)
 {
-	return __do_page_cache_readahead(mapping, filp,
-					ra->start, ra->size, ra->async_size);
+	__do_page_cache_readahead(mapping, filp,
+			ra->start, ra->size, ra->async_size);
 }
 
 /*
diff --git a/mm/readahead.c b/mm/readahead.c
index 2fe72cd29b47..8ce46d69e6ae 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -149,10 +149,8 @@ static int read_pages(struct address_space *mapping, struct file *filp,
  * the pages first, then submits them for I/O. This avoids the very bad
  * behaviour which would occur if page allocations are causing VM writeback.
  * We really don't want to intermingle reads and writes like that.
- *
- * Returns the number of pages requested, or the maximum amount of I/O allowed.
  */
-unsigned int __do_page_cache_readahead(struct address_space *mapping,
+void __do_page_cache_readahead(struct address_space *mapping,
 		struct file *filp, pgoff_t offset, unsigned long nr_to_read,
 		unsigned long lookahead_size)
 {
@@ -166,7 +164,7 @@ unsigned int __do_page_cache_readahead(struct address_space *mapping,
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 
 	if (isize == 0)
-		goto out;
+		return;
 
 	end_index = ((isize - 1) >> PAGE_SHIFT);
 
@@ -211,8 +209,6 @@ unsigned int __do_page_cache_readahead(struct address_space *mapping,
 	if (nr_pages)
 		read_pages(mapping, filp, &page_pool, nr_pages, gfp_mask);
 	BUG_ON(!list_empty(&page_pool));
-out:
-	return nr_pages;
 }
 
 /*
@@ -378,11 +374,10 @@ static int try_context_readahead(struct address_space *mapping,
 /*
  * A minimal readahead algorithm for trivial sequential/random reads.
  */
-static unsigned long
-ondemand_readahead(struct address_space *mapping,
-		   struct file_ra_state *ra, struct file *filp,
-		   bool hit_readahead_marker, pgoff_t offset,
-		   unsigned long req_size)
+static void ondemand_readahead(struct address_space *mapping,
+		struct file_ra_state *ra, struct file *filp,
+		bool hit_readahead_marker, pgoff_t offset,
+		unsigned long req_size)
 {
 	struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
 	unsigned long max_pages = ra->ra_pages;
@@ -428,7 +423,7 @@ ondemand_readahead(struct address_space *mapping,
 		rcu_read_unlock();
 
 		if (!start || start - offset > max_pages)
-			return 0;
+			return;
 
 		ra->start = start;
 		ra->size = start - offset;	/* old async_size */
@@ -464,7 +459,8 @@ ondemand_readahead(struct address_space *mapping,
 	 * standalone, small random read
 	 * Read as is, and do not pollute the readahead state.
 	 */
-	return __do_page_cache_readahead(mapping, filp, offset, req_size, 0);
+	__do_page_cache_readahead(mapping, filp, offset, req_size, 0);
+	return;
 
 initial_readahead:
 	ra->start = offset;
@@ -489,7 +485,7 @@ ondemand_readahead(struct address_space *mapping,
 		}
 	}
 
-	return ra_submit(ra, mapping, filp);
+	ra_submit(ra, mapping, filp);
 }
 
 /**

From patchwork Mon Feb 17 18:45:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387403
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 53F2917F0
	for ; Mon, 17 Feb 2020 18:48:58 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 3472A227BF
	for ; Mon, 17 Feb 2020 18:48:58 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="KwiTBfAm"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1730250AbgBQSs5 (ORCPT );
	Mon, 17 Feb 2020 13:48:57 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48056 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1729788AbgBQSqT (ORCPT );
	Mon, 17 Feb 2020 13:46:19 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=+TXbfoC7OhLaTWBRfdmrI045E/aNC188d51jOi0q5AY=; b=KwiTBfAmNSramg93zsbmeMe93l
	ZrqnhSVfpTjg1D+RZ65c/0U+Gzn6kYjk8i98GZCTSnNxpNztwTENNatDJ6HdQDKXjmzilkYKcMSUr
	+lO9Xs8zhoQQf02i4Im2eZpLW+0bpjyxbYdZPqv43MIYmbmwXnrCvbi4rvMZBjnmHBn5YE/N+YoJC
	Zczha9h/rPLXmuCp/L/K+qpguhk9+Z21I0xrPieUJ49fusKlLYnmksE4sHTjNVRPLlLJlVB6bV6N4
	8+B4iJV9GlxmS9pkW8wqfJ7SJe5gFiLNb/A99vIJeJ/uLIqdZFaRAYKCFVCaXiBob2S6z5qTej1B6
	up3uu9Ig==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red
	Hat Linux)) id 1j3lPL-00058j-BO; Mon, 17 Feb 2020 18:46:15 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org, Christoph Hellwig
Subject: [PATCH v6 02/19] mm: Ignore return value of ->readpages
Date: Mon, 17 Feb 2020 10:45:43 -0800
Message-Id: <20200217184613.19668-3-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

We used to assign the return value to a variable, which we then
ignored. Remove the pretence of caring.

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
Reviewed-by: Dave Chinner
---
 mm/readahead.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 8ce46d69e6ae..12d13b7792da 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -113,17 +113,16 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 
 EXPORT_SYMBOL(read_cache_pages);
 
-static int read_pages(struct address_space *mapping, struct file *filp,
+static void read_pages(struct address_space *mapping, struct file *filp,
 		struct list_head *pages, unsigned int nr_pages, gfp_t gfp)
 {
 	struct blk_plug plug;
 	unsigned page_idx;
-	int ret;
 
 	blk_start_plug(&plug);
 
 	if (mapping->a_ops->readpages) {
-		ret = mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
+		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
 		goto out;
@@ -136,12 +135,9 @@ static int read_pages(struct address_space *mapping, struct file *filp,
 			mapping->a_ops->readpage(filp, page);
 		put_page(page);
 	}
-	ret = 0;
 
 out:
 	blk_finish_plug(&plug);
-
-	return ret;
 }
 
 /*

From patchwork Mon Feb 17 18:45:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387397
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 08970109A
	for ; Mon, 17 Feb 2020 18:48:54 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id DC17B227BF
	for ; Mon, 17 Feb 2020 18:48:53 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="Y7W3hykN"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1729818AbgBQSqU (ORCPT );
	Mon, 17 Feb 2020 13:46:20 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48054 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1729784AbgBQSqT (ORCPT );
	Mon, 17 Feb 2020 13:46:19 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=eXmGOwIH2jpYulUEG8Zc1CpZzmFCoe4ryYpWkEu7ztU=; b=Y7W3hykN9PWiGQMVy6Nf1Iq07d
	TX9lYS+aB4W/H6JX5GOUvRAO+zA72+02fKLSnf50aPbdXg8LzxxdxXjkmFWwCHMvxpZOT3tlzmJuB
	172pIKk4xlwuVUmRYtUHACAYK9Bh+xwcDRCeSLG64CvWp3090uIfidiMp3wyXCl9kdp2OOjGwpl92
	uyD8R1AKIdVtiFyhqmZIr5XtIGnC2sfSkEXAnFghbfjJWrSA9bi7mHKAo4LkpiOr8bKL2ngIFmsGv
	ljsfAJfGg8WbI3W14skTLYLZnQ9OWS8X/CkNZU7EnFJFlKh9jiTnJq0w21BYfWSNhEMKAUWUG1VfI
	uLiQX3mw==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red
	Hat Linux)) id 1j3lPL-00058n-CQ; Mon, 17 Feb 2020 18:46:15 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org
Subject: [PATCH v6 03/19] mm: Use readahead_control to pass arguments
Date: Mon, 17 Feb 2020 10:45:44 -0800
Message-Id: <20200217184613.19668-4-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

In this patch, only between __do_page_cache_readahead() and
read_pages(), but it will be extended in upcoming patches. Also add
the readahead_count() accessor.

Signed-off-by: Matthew Wilcox (Oracle)
---
 include/linux/pagemap.h | 17 +++++++++++++++++
 mm/readahead.c          | 36 +++++++++++++++++++++---------------
 2 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ccb14b6a16b5..982ecda2d4a2 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -630,6 +630,23 @@ static inline int add_to_page_cache(struct page *page,
 	return error;
 }
 
+/*
+ * Readahead is of a block of consecutive pages.
+ */
+struct readahead_control {
+	struct file *file;
+	struct address_space *mapping;
+/* private: use the readahead_* accessors instead */
+	pgoff_t _start;
+	unsigned int _nr_pages;
+};
+
+/* The number of pages in this readahead block */
+static inline unsigned int readahead_count(struct readahead_control *rac)
+{
+	return rac->_nr_pages;
+}
+
 static inline unsigned long dir_pages(struct inode *inode)
 {
 	return (unsigned long)(inode->i_size + PAGE_SIZE - 1) >>
diff --git a/mm/readahead.c b/mm/readahead.c
index 12d13b7792da..15329309231f 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -113,26 +113,29 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 
 EXPORT_SYMBOL(read_cache_pages);
 
-static void read_pages(struct address_space *mapping, struct file *filp,
-		struct list_head *pages, unsigned int nr_pages, gfp_t gfp)
+static void read_pages(struct readahead_control *rac, struct list_head *pages,
+		gfp_t gfp)
 {
+	const struct address_space_operations *aops = rac->mapping->a_ops;
 	struct blk_plug plug;
 	unsigned page_idx;
 
 	blk_start_plug(&plug);
 
-	if (mapping->a_ops->readpages) {
-		mapping->a_ops->readpages(filp, mapping, pages, nr_pages);
+	if (aops->readpages) {
+		aops->readpages(rac->file, rac->mapping, pages,
+				readahead_count(rac));
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
 		goto out;
 	}
 
-	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
+	for (page_idx = 0; page_idx < readahead_count(rac); page_idx++) {
 		struct page *page = lru_to_page(pages);
 		list_del(&page->lru);
-		if (!add_to_page_cache_lru(page, mapping, page->index, gfp))
-			mapping->a_ops->readpage(filp, page);
+		if (!add_to_page_cache_lru(page, rac->mapping, page->index,
+				gfp))
+			aops->readpage(rac->file, page);
 		put_page(page);
 	}
 
@@ -155,9 +158,13 @@ void __do_page_cache_readahead(struct address_space *mapping,
 	unsigned long end_index;	/* The last page we want to read */
 	LIST_HEAD(page_pool);
 	int page_idx;
-	unsigned int nr_pages = 0;
 	loff_t isize = i_size_read(inode);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
+	struct readahead_control rac = {
+		.mapping = mapping,
+		.file = filp,
+		._nr_pages = 0,
+	};
 
 	if (isize == 0)
 		return;
@@ -180,10 +187,9 @@ void __do_page_cache_readahead(struct address_space *mapping,
 			 * contiguous pages before continuing with the next
 			 * batch.
 			 */
-			if (nr_pages)
-				read_pages(mapping, filp, &page_pool, nr_pages,
-						gfp_mask);
-			nr_pages = 0;
+			if (readahead_count(&rac))
+				read_pages(&rac, &page_pool, gfp_mask);
+			rac._nr_pages = 0;
 			continue;
 		}
 
@@ -194,7 +200,7 @@ void __do_page_cache_readahead(struct address_space *mapping,
 		list_add(&page->lru, &page_pool);
 		if (page_idx == nr_to_read - lookahead_size)
 			SetPageReadahead(page);
-		nr_pages++;
+		rac._nr_pages++;
 	}
 
 	/*
@@ -202,8 +208,8 @@ void __do_page_cache_readahead(struct address_space *mapping,
 	 * uptodate then the caller will launch readpage again, and
 	 * will then handle the error.
 	 */
-	if (nr_pages)
-		read_pages(mapping, filp, &page_pool, nr_pages, gfp_mask);
+	if (readahead_count(&rac))
+		read_pages(&rac, &page_pool, gfp_mask);
 	BUG_ON(!list_empty(&page_pool));
 }
 

From patchwork Mon Feb 17 18:45:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387209
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D8286109A
	for ; Mon, 17 Feb 2020 18:46:30 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id B7E3A24670
	for ; Mon, 17 Feb 2020 18:46:30 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="mG3ezOVT"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1730007AbgBQSqa (ORCPT );
	Mon, 17 Feb 2020 13:46:30 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48328 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1729977AbgBQSq2 (ORCPT );
	Mon, 17 Feb 2020 13:46:28 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=GsjiaZugI72vbGelDXh1gHsSqzMvkN1OtgDj3kywYLo=; b=mG3ezOVTlUwDZANXsCHvcKdwJB
	z3OlpT7AjCBCJphbep/jJMnYwWkqQCv4wTdvjax0x0CW6Nvu8+8tccy4vlbefujrJjJEj82Cnsx1d
	0Z0dVex246tKQbAITwLWjKWZ5u/u/QkHJfmmheFgDcrAuXqy1jKQgdOFfs9OG8e4Jv179uA6lHT0C
	rj7l1UtOFC7pBjOM+uPFkUB4PEe+Mm5ggHB2nPVgYYpq1iZJM1/7wiOXsE7PHbRws4qZi4knTfp9M
	Abo7HX1ADfB8vHy4nwtU2Q4tJcbww1fj8NQ9P/NHWgtRpkc995sdbYd50dglXP3xbP8Qt8cWX4AgS
	6OQh+pkw==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red
	Hat Linux)) id 1j3lPL-00058r-DR; Mon, 17 Feb 2020 18:46:15 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org
Subject: [PATCH v6 04/19] mm: Rearrange readahead loop
Date: Mon, 17 Feb 2020 10:45:45 -0800
Message-Id: <20200217184613.19668-5-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

Move the declaration of 'page' to inside the loop and move the 'kick
off a fresh batch' code to the end of the function for easier use in
subsequent patches.

Signed-off-by: Matthew Wilcox (Oracle)
---
 mm/readahead.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 15329309231f..3eca59c43a45 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -154,7 +154,6 @@ void __do_page_cache_readahead(struct address_space *mapping,
 		unsigned long lookahead_size)
 {
 	struct inode *inode = mapping->host;
-	struct page *page;
 	unsigned long end_index;	/* The last page we want to read */
 	LIST_HEAD(page_pool);
 	int page_idx;
@@ -175,6 +174,7 @@ void __do_page_cache_readahead(struct address_space *mapping,
 	 * Preallocate as many pages as we will need.
 	 */
 	for (page_idx = 0; page_idx < nr_to_read; page_idx++) {
+		struct page *page;
 		pgoff_t page_offset = offset + page_idx;
 
 		if (page_offset > end_index)
@@ -183,14 +183,14 @@ void __do_page_cache_readahead(struct address_space *mapping,
 		page = xa_load(&mapping->i_pages, page_offset);
 		if (page && !xa_is_value(page)) {
 			/*
-			 * Page already present? Kick off the current batch of
-			 * contiguous pages before continuing with the next
-			 * batch.
+			 * Page already present? Kick off the current batch
+			 * of contiguous pages before continuing with the
+			 * next batch. This page may be the one we would
+			 * have intended to mark as Readahead, but we don't
+			 * have a stable reference to this page, and it's
+			 * not worth getting one just for that.
 			 */
-			if (readahead_count(&rac))
-				read_pages(&rac, &page_pool, gfp_mask);
-			rac._nr_pages = 0;
-			continue;
+			goto read;
 		}
 
 		page = __page_cache_alloc(gfp_mask);
@@ -201,6 +201,11 @@ void __do_page_cache_readahead(struct address_space *mapping,
 		if (page_idx == nr_to_read - lookahead_size)
 			SetPageReadahead(page);
 		rac._nr_pages++;
+		continue;
+read:
+		if (readahead_count(&rac))
+			read_pages(&rac, &page_pool, gfp_mask);
+		rac._nr_pages = 0;
 	}
 
 	/*

From patchwork Mon Feb 17 18:45:48 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387219
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2B50A138D
	for ; Mon, 17 Feb 2020 18:46:40 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 0AEBB22527
	for ; Mon, 17 Feb 2020 18:46:39 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="h1IUDmbb"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1730089AbgBQSqj (ORCPT );
	Mon, 17 Feb 2020 13:46:39 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48446 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1730071AbgBQSqh (ORCPT );
	Mon, 17 Feb 2020 13:46:37 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=7Dxhm80Fw42pmsWKiiJ2urXHPFwIoqKidnXP7wm+j3k=; b=h1IUDmbbrv0kbZmyB9xlf7gQNw
	iMrB/vhHv9BYzVoWzQPC+jK351aSsx/1tI7T9Gce4M/7CUHI/UIaRAFhGEbR3lwEirF55IcM+Vs+X
	Bm8uZ1XrLsfUdE2x3+ELxQ/rJ7ALniYcN2AVIpnvKZ3DyOHEGM8UXpMA4l0nuYVZZdFdJYUQfQLZn
	yfVyAvLe2FHaf+DmWFGGpf/QV3h664luosyXmbBh4lpibf+0zKbBPgIXClJKPrdEKmbfHgiDetLqU
	uVT0S1+epzjtcroqUq8aqdLWz0AR/IYP8HGUERcKOHeuIPPG7sSFsjIrEq8VQPInemww00PmubEW/
	78Ml4R8Q==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red
	Hat Linux)) id 1j3lPL-000593-GV; Mon, 17 Feb 2020 18:46:15 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org
Subject: [PATCH v6 05/19] mm: Remove 'page_offset' from readahead loop
Date: Mon, 17 Feb 2020 10:45:48 -0800
Message-Id: <20200217184613.19668-8-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

Eliminate the page_offset variable which was confusing with the
'offset' parameter and record the start of each consecutive run of
pages in the readahead_control.

Signed-off-by: Matthew Wilcox (Oracle)
---
 mm/readahead.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 3eca59c43a45..74791b96013f 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -162,6 +162,7 @@ void __do_page_cache_readahead(struct address_space *mapping,
 	struct readahead_control rac = {
 		.mapping = mapping,
 		.file = filp,
+		._start = offset,
 		._nr_pages = 0,
 	};
 
@@ -175,12 +176,11 @@ void __do_page_cache_readahead(struct address_space *mapping,
 	 */
 	for (page_idx = 0; page_idx < nr_to_read; page_idx++) {
 		struct page *page;
-		pgoff_t page_offset = offset + page_idx;
 
-		if (page_offset > end_index)
+		if (offset > end_index)
 			break;
 
-		page = xa_load(&mapping->i_pages, page_offset);
+		page = xa_load(&mapping->i_pages, offset);
 		if (page && !xa_is_value(page)) {
 			/*
 			 * Page already present? Kick off the current batch
@@ -196,16 +196,18 @@ void __do_page_cache_readahead(struct address_space *mapping,
 		page = __page_cache_alloc(gfp_mask);
 		if (!page)
 			break;
-		page->index = page_offset;
+		page->index = offset;
 		list_add(&page->lru, &page_pool);
 		if (page_idx == nr_to_read - lookahead_size)
 			SetPageReadahead(page);
 		rac._nr_pages++;
+		offset++;
 		continue;
 read:
 		if (readahead_count(&rac))
 			read_pages(&rac, &page_pool, gfp_mask);
 		rac._nr_pages = 0;
+		rac._start = ++offset;
 	}
 
 	/*

From patchwork Mon Feb 17 18:45:50 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387251
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E5C8817F0
	for ; Mon, 17 Feb 2020 18:47:01 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id BBBB220836
	for ; Mon, 17 Feb 2020 18:47:01 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="kgYGh8XV"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1730137AbgBQSrB (ORCPT );
	Mon, 17 Feb 2020 13:47:01 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48382 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1730017AbgBQSqb (ORCPT );
	Mon, 17 Feb 2020 13:46:31 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=TtX4KO46nep3/dtMPTsaoMXZVI5q6MpM39G5vt8u6vs=; b=kgYGh8XV0tmtk1SvQtt5T9mE+L
	iEDIhgpY9fv8FYxnRE5lKUm/CKMw69PH0iRy/036y/O1hn9EM0Me79CFQVWsn1Hj9XLqb9TazLMoB
	Nx4Aulf/c8G9KxbxDO6v5QvW73Z1jLAyoUjAA/NlImrTHmswWtpboVcd0agWfeUUhOXgkKbn6FsH6
	jReeG0AJ9aUz5vHXMeh2AUU0ZsZDpNgz+UNQEHcvRlCNnfbU+jzskRtuob1OCfFa71cbmy9I9hQs+
	eWVqPk7XmQrKJePd3wPdtxGMhnvxR8J1X8LZq3P4nhxkzrZ5te8xo93dDRwREmzFFZBX/4PnkORvk
	AvwHIyWg==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red
	Hat Linux)) id 1j3lPL-00059K-JQ; Mon, 17 Feb 2020 18:46:15 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org, John Hubbard
Subject: [PATCH v6 06/19] mm: rename readahead loop variable to 'i'
Date: Mon, 17 Feb 2020 10:45:50 -0800
Message-Id: <20200217184613.19668-10-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

Change the type of page_idx to unsigned long, and rename it -- it's
just a loop counter, not a page index.

Suggested-by: John Hubbard
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Dave Chinner
---
 mm/readahead.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 74791b96013f..bdc5759000d3 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -156,7 +156,7 @@ void __do_page_cache_readahead(struct address_space *mapping,
 	struct inode *inode = mapping->host;
 	unsigned long end_index;	/* The last page we want to read */
 	LIST_HEAD(page_pool);
-	int page_idx;
+	unsigned long i;
 	loff_t isize = i_size_read(inode);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 	struct readahead_control rac = {
@@ -174,7 +174,7 @@ void __do_page_cache_readahead(struct address_space *mapping,
 	/*
 	 * Preallocate as many pages as we will need.
 	 */
-	for (page_idx = 0; page_idx < nr_to_read; page_idx++) {
+	for (i = 0; i < nr_to_read; i++) {
 		struct page *page;
 
 		if (offset > end_index)
@@ -198,7 +198,7 @@ void __do_page_cache_readahead(struct address_space *mapping,
 			break;
 		page->index = offset;
 		list_add(&page->lru, &page_pool);
-		if (page_idx == nr_to_read - lookahead_size)
+		if (i == nr_to_read - lookahead_size)
 			SetPageReadahead(page);
 		rac._nr_pages++;
 		offset++;

From patchwork Mon Feb 17 18:45:52 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387299
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2B44417F0
	for ; Mon, 17 Feb 2020 18:47:29 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 0BBFE222D9
	for ; Mon, 17 Feb 2020 18:47:29 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="MwopgXm2"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1729960AbgBQSq0 (ORCPT );
	Mon, 17 Feb 2020 13:46:26 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48258 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1729945AbgBQSqZ (ORCPT );
	Mon, 17 Feb 2020 13:46:25 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=JHQZUif7GrXF5c6dt1x94JQ/eERbsKsl9szla620yGg=; b=MwopgXm2jrH3SI/KCNCU9kTcM/
	dbE6z4ghybS40X+Y522lD+E/Px6QZSdy/WskVldqgb8kNys8BTLFOs9Ix5lX1cQCMVQqBNY0OrA6M
	SCv5aSifERzuMP8/yzWjikX6Z3wB4tFF1SHtcHtmSpq9fT6oDSWicm0Vbtr37+iu0KFIc29PidL38
	kVlwFzmJW+ra21Pw8v6Lc39a5gLT7BcDAW6QvyFTpsZr4vCDl95IvI+nDZChH0dxeoi3k7rD0QpXl
	m3s4AqkENlFwUpKkFceab08z7tasn1FQzTNd2VM/s+dD0OfE8rlbWrz9U3nQSTby9xGZSfXDdTYgJ
	8qXWRWEg==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red
	Hat Linux)) id 1j3lPL-00059e-Lj; Mon, 17 Feb 2020 18:46:15 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org
Subject: [PATCH v6 07/19] mm: Put readahead pages in cache earlier
Date: Mon, 17 Feb 2020 10:45:52 -0800
Message-Id: <20200217184613.19668-12-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

At allocation time, put the pages in the cache unless we're using
->readpages. Add the readahead_for_each() iterator for the benefit of
the ->readpage fallback. This iterator supports huge pages, even though
none of the filesystems to be converted do yet.

Signed-off-by: Matthew Wilcox (Oracle)
---
 include/linux/pagemap.h | 24 ++++++++++++++++++++++++
 mm/readahead.c          | 34 +++++++++++++++++-----------------
 2 files changed, 41 insertions(+), 17 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 982ecda2d4a2..3613154e79e4 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -639,8 +639,32 @@ struct readahead_control {
 /* private: use the readahead_* accessors instead */
 	pgoff_t _start;
 	unsigned int _nr_pages;
+	unsigned int _batch_count;
 };
 
+static inline struct page *readahead_page(struct readahead_control *rac)
+{
+	struct page *page;
+
+	if (!rac->_nr_pages)
+		return NULL;
+
+	page = xa_load(&rac->mapping->i_pages, rac->_start);
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	rac->_batch_count = hpage_nr_pages(page);
+
+	return page;
+}
+
+static inline void readahead_next(struct readahead_control *rac)
+{
+	rac->_nr_pages -= rac->_batch_count;
+	rac->_start += rac->_batch_count;
+}
+
+#define readahead_for_each(rac, page)					\
+	for (; (page = readahead_page(rac)); readahead_next(rac))
+
 /* The number of pages in this readahead block */
 static inline unsigned int readahead_count(struct readahead_control *rac)
 {
diff --git a/mm/readahead.c b/mm/readahead.c
index bdc5759000d3..9e430daae42f 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -113,12 +113,11 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 
 EXPORT_SYMBOL(read_cache_pages);
 
-static void read_pages(struct readahead_control *rac, struct list_head *pages,
-		gfp_t gfp)
+static void read_pages(struct readahead_control *rac, struct list_head *pages)
 {
 	const struct address_space_operations *aops = rac->mapping->a_ops;
+	struct page *page;
 	struct blk_plug plug;
-	unsigned page_idx;
 
 	blk_start_plug(&plug);
 
@@ -127,19 +126,13 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages,
 				readahead_count(rac));
 		/* Clean up the remaining pages */
 		put_pages_list(pages);
-		goto out;
-	}
-
-	for (page_idx = 0; page_idx < readahead_count(rac); page_idx++) {
-		struct page *page = lru_to_page(pages);
-		list_del(&page->lru);
-		if (!add_to_page_cache_lru(page, rac->mapping, page->index,
-				gfp))
+	} else {
+		readahead_for_each(rac, page) {
 			aops->readpage(rac->file, page);
-		put_page(page);
+			put_page(page);
+		}
 	}
 
-out:
 	blk_finish_plug(&plug);
 }
 
@@ -159,6 +152,7 @@ void __do_page_cache_readahead(struct address_space *mapping,
 	unsigned long i;
 	loff_t isize = i_size_read(inode);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
+	bool use_list = mapping->a_ops->readpages;
 	struct readahead_control rac = {
 		.mapping = mapping,
 		.file = filp,
@@ -196,8 +190,14 @@ void __do_page_cache_readahead(struct address_space *mapping,
 		page = __page_cache_alloc(gfp_mask);
 		if (!page)
 			break;
-		page->index = offset;
-		list_add(&page->lru, &page_pool);
+		if (use_list) {
+			page->index = offset;
+			list_add(&page->lru, &page_pool);
+		} else if (add_to_page_cache_lru(page, mapping, offset,
+					gfp_mask) < 0) {
+			put_page(page);
+			goto read;
+		}
 		if (i == nr_to_read - lookahead_size)
 			SetPageReadahead(page);
 		rac._nr_pages++;
@@ -205,7 +205,7 @@ void __do_page_cache_readahead(struct address_space *mapping,
 		continue;
 read:
 		if (readahead_count(&rac))
-			read_pages(&rac, &page_pool, gfp_mask);
+			read_pages(&rac, &page_pool);
 		rac._nr_pages = 0;
 		rac._start = ++offset;
 	}
@@ -216,7 +216,7 @@ void __do_page_cache_readahead(struct address_space *mapping,
 	 * will then handle the error.
*/ if (readahead_count(&rac)) - read_pages(&rac, &page_pool, gfp_mask); + read_pages(&rac, &page_pool); BUG_ON(!list_empty(&page_pool)); } From patchwork Mon Feb 17 18:45:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11387281 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6FA0917F0 for ; Mon, 17 Feb 2020 18:47:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 471F624125 for ; Mon, 17 Feb 2020 18:47:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="s4PzWohV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730144AbgBQSrT (ORCPT ); Mon, 17 Feb 2020 13:47:19 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:48334 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729979AbgBQSq2 (ORCPT ); Mon, 17 Feb 2020 13:46:28 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=yBVyK5Z4hyGJLxZQcWLKM0KimTDAONGE11+VRhSma3M=; b=s4PzWohVzfJ1ELMbBaF0lsGVUz WB3gU8243ONu4dpwTn9pqpA11B72j6RG1yY3GXOTKsC9mSL9x6DjjOgc/8lo+QMIW51KSO2cjDNXL f9NSc1nFBH/DVHO/hKa+o6C3pQIWFd9kczHuIxCiGELrBzhYfHSHBeUc24hgmxnVUZq1r5VcpKALP 0uvJ8GShnMI68seHhrJD0tJMdXJ1X126QfvHQc0ikIBSIu6FOVU8pWHUjG5AbOwDUiHxvEJaADykP TXdBs4lf64bOrdoZxODHzthtx28ULt3YGnEpbXi9saUTpQthCMOe3Vcx9afwnuxlkkhFXr4lIrAyI I7sPpxFg==; Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 
1j3lPL-0005A2-OZ; Mon, 17 Feb 2020 18:46:15 +0000 From: Matthew Wilcox To: linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org Subject: [PATCH v6 08/19] mm: Add readahead address space operation Date: Mon, 17 Feb 2020 10:45:54 -0800 Message-Id: <20200217184613.19668-14-willy@infradead.org> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20200217184613.19668-1-willy@infradead.org> References: <20200217184613.19668-1-willy@infradead.org> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: "Matthew Wilcox (Oracle)" This replaces ->readpages with a saner interface: - Return void instead of an ignored error code. - Pages are already in the page cache when ->readahead is called. - Implementation looks up the pages in the page cache instead of having them passed in a linked list. 
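
To make the division of labour concrete, here is a userspace sketch of the calling convention as described in this patch: the operation starts I/O on as many pages as it likes and drops its reference on each, and the caller unlocks and releases whatever is left in the window. All model_ names are hypothetical stand-ins for this example, I/O completion is modelled as immediate, and the window is consumed by a simplified helper rather than readahead_for_each().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a page cache page; not a kernel type. */
struct model_page {
	int refcount;
	int locked;
	int io_started;
};

/* Stand-in for struct readahead_control over a flat array of pages. */
struct model_rac {
	struct model_page *pages;
	unsigned long start;
	unsigned int nr_pages;
};

typedef void (*model_readahead_fn)(struct model_rac *rac);

/* Take the next page out of the window, or NULL when it is empty. */
static struct model_page *model_next_page(struct model_rac *rac)
{
	if (!rac->nr_pages)
		return NULL;
	rac->nr_pages--;
	return &rac->pages[rac->start++];
}

/* A hypothetical filesystem operation that gives up after two pages. */
static void model_fs_readahead(struct model_rac *rac)
{
	struct model_page *page;
	int budget = 2;

	while (budget-- > 0 && (page = model_next_page(rac))) {
		page->io_started = 1;
		page->locked = 0;	/* I/O completion, modelled as immediate */
		page->refcount--;	/* the operation drops its reference */
	}
}

/* Caller side: run the operation, then unlock and release the pages it
 * did not attempt. */
static void model_read_pages(struct model_rac *rac, model_readahead_fn op)
{
	struct model_page *page;

	op(rac);
	while ((page = model_next_page(rac))) {
		assert(page->locked);
		page->locked = 0;
		page->refcount--;
	}
}

/* Returns how many pages had I/O started, or -1 if any page was left
 * locked or with an unexpected reference count. */
static int model_demo_io_started(void)
{
	struct model_page pages[5];
	struct model_rac rac = { pages, 0, 5 };
	int i, started = 0;

	/* One reference for the page cache, one for the readahead window. */
	for (i = 0; i < 5; i++)
		pages[i] = (struct model_page){ .refcount = 2, .locked = 1 };

	model_read_pages(&rac, model_fs_readahead);

	for (i = 0; i < 5; i++) {
		if (pages[i].locked || pages[i].refcount != 1)
			return -1;
		started += pages[i].io_started;
	}
	return started;
}
```

In this model, two of the five pages have I/O started by the operation, and the remaining three are unlocked and released by the caller, so every page ends up unlocked with only the page cache reference held.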
Signed-off-by: Matthew Wilcox (Oracle) --- Documentation/filesystems/locking.rst | 6 +++++- Documentation/filesystems/vfs.rst | 13 +++++++++++++ include/linux/fs.h | 2 ++ include/linux/pagemap.h | 18 ++++++++++++++++++ mm/readahead.c | 8 +++++++- 5 files changed, 45 insertions(+), 2 deletions(-) diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index 5057e4d9dcd1..0ebc4491025a 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -239,6 +239,7 @@ prototypes:: int (*readpage)(struct file *, struct page *); int (*writepages)(struct address_space *, struct writeback_control *); int (*set_page_dirty)(struct page *page); + void (*readahead)(struct readahead_control *); int (*readpages)(struct file *filp, struct address_space *mapping, struct list_head *pages, unsigned nr_pages); int (*write_begin)(struct file *, struct address_space *mapping, @@ -271,7 +272,8 @@ writepage: yes, unlocks (see below) readpage: yes, unlocks writepages: set_page_dirty no -readpages: +readahead: yes, unlocks +readpages: no write_begin: locks the page exclusive write_end: yes, unlocks exclusive bmap: @@ -295,6 +297,8 @@ the request handler (/dev/loop). ->readpage() unlocks the page, either synchronously or via I/O completion. +->readahead() unlocks the pages like ->readpage(). + ->readpages() populates the pagecache with the passed pages and starts I/O against them. They come unlocked upon I/O completion. diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index 7d4d09dd5e6d..81ab30fbe45c 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -706,6 +706,7 @@ cache in your filesystem. 
The following members are defined: int (*readpage)(struct file *, struct page *); int (*writepages)(struct address_space *, struct writeback_control *); int (*set_page_dirty)(struct page *page); + void (*readahead)(struct readahead_control *); int (*readpages)(struct file *filp, struct address_space *mapping, struct list_head *pages, unsigned nr_pages); int (*write_begin)(struct file *, struct address_space *mapping, @@ -781,12 +782,24 @@ cache in your filesystem. The following members are defined: If defined, it should set the PageDirty flag, and the PAGECACHE_TAG_DIRTY tag in the radix tree. +``readahead`` + Called by the VM to read pages associated with the address_space + object. The pages are consecutive in the page cache and are + locked. The implementation should decrement the page refcount + after starting I/O on each page. Usually the page will be + unlocked by the I/O completion handler. If the function does + not attempt I/O on some pages, the caller will decrement the page + refcount and unlock the pages for you. Set PageUptodate if the + I/O completes successfully. Setting PageError on any page will + be ignored; simply unlock the page if an I/O error occurs. + ``readpages`` called by the VM to read pages associated with the address_space object. This is essentially just a vector version of readpage. Instead of just one page, several pages are requested. readpages is only used for read-ahead, so read errors are ignored. If anything goes wrong, feel free to give up. + This interface is deprecated; implement readahead instead. ``write_begin`` Called by the generic buffered write code to ask the filesystem diff --git a/include/linux/fs.h b/include/linux/fs.h index 3cd4fe6b845e..d4e2d2964346 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -292,6 +292,7 @@ enum positive_aop_returns { struct page; struct address_space; struct writeback_control; +struct readahead_control; /* * Write life time hint values. 
@@ -375,6 +376,7 @@ struct address_space_operations { */ int (*readpages)(struct file *filp, struct address_space *mapping, struct list_head *pages, unsigned nr_pages); + void (*readahead)(struct readahead_control *); int (*write_begin)(struct file *, struct address_space *mapping, loff_t pos, unsigned len, unsigned flags, diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 3613154e79e4..bd4291f78f41 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -665,6 +665,24 @@ static inline void readahead_next(struct readahead_control *rac) #define readahead_for_each(rac, page) \ for (; (page = readahead_page(rac)); readahead_next(rac)) +/* The byte offset into the file of this readahead block */ +static inline loff_t readahead_offset(struct readahead_control *rac) +{ + return (loff_t)rac->_start * PAGE_SIZE; +} + +/* The number of bytes in this readahead block */ +static inline loff_t readahead_length(struct readahead_control *rac) +{ + return (loff_t)rac->_nr_pages * PAGE_SIZE; +} + +/* The index of the first page in this readahead block */ +static inline pgoff_t readahead_index(struct readahead_control *rac) +{ + return rac->_start; +} + /* The number of pages in this readahead block */ static inline unsigned int readahead_count(struct readahead_control *rac) { diff --git a/mm/readahead.c b/mm/readahead.c index 9e430daae42f..975ff5e387be 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -121,7 +121,13 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages) blk_start_plug(&plug); - if (aops->readpages) { + if (aops->readahead) { + aops->readahead(rac); + readahead_for_each(rac, page) { + unlock_page(page); + put_page(page); + } + } else if (aops->readpages) { aops->readpages(rac->file, rac->mapping, pages, readahead_count(rac)); /* Clean up the remaining pages */ From patchwork Mon Feb 17 18:45:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11387199 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1D09A109A for ; Mon, 17 Feb 2020 18:46:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E886C208C4 for ; Mon, 17 Feb 2020 18:46:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="i7Z6Vj6w" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729842AbgBQSqU (ORCPT ); Mon, 17 Feb 2020 13:46:20 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:48050 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729786AbgBQSqT (ORCPT ); Mon, 17 Feb 2020 13:46:19 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=IdbzKhHNnhdRYLMVi5hhCdd4h8STLZkROVG6Wr3MY44=; b=i7Z6Vj6wwMTAXfgAvdOo2oaMcJ LqeO+lH8TMReny2uW/bz8R7NrjhQMCZqQlovCEjnNTyLqmMQ5II9orC/y2cXO351y/6QfP/eq9SB7 k7mGlh68xoLUqmuu8w09NYV1CHY3kMnxzhv2y3IWUOcCOiJsjCUWC9k9AcpXUh7eaaZCOITgKAxhR fZWUrVojffqwx2VYvkTw7fzwNCMrwSbu+UDrYys3swceDTofPrN0MSqQpsrYBfX0GbYZ6A7yN84L7 2bd1u1HT7USizpCacyRpB+T6OYW99+W+nD6XoejPSt6ZG1DWxn7U2iCE3DFon9qF4AlFPOq8lsKzk R9ByIgAQ==; Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1j3lPL-0005AD-Pu; Mon, 17 Feb 2020 18:46:15 +0000 From: Matthew Wilcox To: linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, 
linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org Subject: [PATCH v6 09/16] btrfs: Convert from readpages to readahead Date: Mon, 17 Feb 2020 10:45:55 -0800 Message-Id: <20200217184613.19668-15-willy@infradead.org> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20200217184613.19668-1-willy@infradead.org> References: <20200217184613.19668-1-willy@infradead.org> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: "Matthew Wilcox (Oracle)" Use the new readahead operation in btrfs. Add a readahead_for_each_batch() iterator to optimise the loop in the XArray. Signed-off-by: Matthew Wilcox (Oracle) --- fs/btrfs/extent_io.c | 48 ++++++++++++++--------------------------- fs/btrfs/extent_io.h | 3 +-- fs/btrfs/inode.c | 16 ++++++-------- include/linux/pagemap.h | 27 +++++++++++++++++++++++ 4 files changed, 51 insertions(+), 43 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index c0f202741e09..d9f66058e0a7 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4278,52 +4278,36 @@ int extent_writepages(struct address_space *mapping, return ret; } -int extent_readpages(struct address_space *mapping, struct list_head *pages, - unsigned nr_pages) +void extent_readahead(struct readahead_control *rac) { struct bio *bio = NULL; unsigned long bio_flags = 0; struct page *pagepool[16]; struct extent_map *em_cached = NULL; - struct extent_io_tree *tree = &BTRFS_I(mapping->host)->io_tree; - int nr = 0; + struct extent_io_tree *tree = &BTRFS_I(rac->mapping->host)->io_tree; u64 prev_em_start = (u64)-1; + int nr; - while (!list_empty(pages)) { - u64 contig_end = 0; - - for (nr = 0; nr < ARRAY_SIZE(pagepool) && !list_empty(pages);) { - struct page *page = lru_to_page(pages); - - prefetchw(&page->flags); - list_del(&page->lru); - if (add_to_page_cache_lru(page, mapping, page->index, - 
readahead_gfp_mask(mapping))) { - put_page(page); - break; - } - - pagepool[nr++] = page; - contig_end = page_offset(page) + PAGE_SIZE - 1; - } - - if (nr) { - u64 contig_start = page_offset(pagepool[0]); + readahead_for_each_batch(rac, pagepool, ARRAY_SIZE(pagepool), nr) { + u64 contig_start = page_offset(pagepool[0]); + u64 contig_end = page_offset(pagepool[nr - 1]) + PAGE_SIZE - 1; - ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end); + ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end); - contiguous_readpages(tree, pagepool, nr, contig_start, - contig_end, &em_cached, &bio, &bio_flags, - &prev_em_start); - } + contiguous_readpages(tree, pagepool, nr, contig_start, + contig_end, &em_cached, &bio, &bio_flags, + &prev_em_start); } if (em_cached) free_extent_map(em_cached); - if (bio) - return submit_one_bio(bio, 0, bio_flags); - return 0; + if (bio) { + int ret = submit_one_bio(bio, 0, bio_flags); + if (ret < 0) { + /* XXX: unlock the pages here? */ + } + } } /* diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 5d205bbaafdc..bddac32948c7 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -198,8 +198,7 @@ int extent_writepages(struct address_space *mapping, struct writeback_control *wbc); int btree_write_cache_pages(struct address_space *mapping, struct writeback_control *wbc); -int extent_readpages(struct address_space *mapping, struct list_head *pages, - unsigned nr_pages); +void extent_readahead(struct readahead_control *rac); int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, __u64 start, __u64 len); void set_page_extent_mapped(struct page *page); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 5b3ec93ff911..d964b2a78ed8 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4794,8 +4794,8 @@ static void evict_inode_truncate_pages(struct inode *inode) /* * Keep looping until we have no more ranges in the io tree. 
- * We can have ongoing bios started by readpages (called from readahead) - * that have their endio callback (extent_io.c:end_bio_extent_readpage) + * We can have ongoing bios started by readahead that have + * their endio callback (extent_io.c:end_bio_extent_readpage) * still in progress (unlocked the pages in the bio but did not yet * unlocked the ranges in the io tree). Therefore this means some * ranges can still be locked and eviction started because before @@ -6996,11 +6996,11 @@ static int lock_extent_direct(struct inode *inode, u64 lockstart, u64 lockend, * for it to complete) and then invalidate the pages for * this range (through invalidate_inode_pages2_range()), * but that can lead us to a deadlock with a concurrent - * call to readpages() (a buffered read or a defrag call + * call to readahead (a buffered read or a defrag call * triggered a readahead) on a page lock due to an * ordered dio extent we created before but did not have * yet a corresponding bio submitted (whence it can not - * complete), which makes readpages() wait for that + * complete), which makes readahead wait for that * ordered extent to complete while holding a lock on * that page. 
*/ @@ -8239,11 +8239,9 @@ static int btrfs_writepages(struct address_space *mapping, return extent_writepages(mapping, wbc); } -static int -btrfs_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static void btrfs_readahead(struct readahead_control *rac) { - return extent_readpages(mapping, pages, nr_pages); + extent_readahead(rac); } static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags) @@ -10448,7 +10446,7 @@ static const struct address_space_operations btrfs_aops = { .readpage = btrfs_readpage, .writepage = btrfs_writepage, .writepages = btrfs_writepages, - .readpages = btrfs_readpages, + .readahead = btrfs_readahead, .direct_IO = btrfs_direct_IO, .invalidatepage = btrfs_invalidatepage, .releasepage = btrfs_releasepage, diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 4f36c06d064d..1bbb60a0bf16 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -669,6 +669,33 @@ static inline void readahead_next(struct readahead_control *rac) #define readahead_for_each(rac, page) \ for (; (page = readahead_page(rac)); readahead_next(rac)) +static inline unsigned int readahead_page_batch(struct readahead_control *rac, + struct page **array, unsigned int size) +{ + unsigned int batch = 0; + XA_STATE(xas, &rac->mapping->i_pages, rac->_start); + struct page *page; + + rac->_batch_count = 0; + xas_for_each(&xas, page, rac->_start + rac->_nr_pages - 1) { + VM_BUG_ON_PAGE(!PageLocked(page), page); + VM_BUG_ON_PAGE(PageTail(page), page); + array[batch++] = page; + rac->_batch_count += hpage_nr_pages(page); + if (PageHead(page)) + xas_set(&xas, rac->_start + rac->_batch_count); + + if (batch == size) + break; + } + + return batch; +} + +#define readahead_for_each_batch(rac, array, size, nr) \ + for (; (nr = readahead_page_batch(rac, array, size)); \ + readahead_next(rac)) + /* The byte offset into the file of this readahead block */ static inline loff_t readahead_offset(struct 
readahead_control *rac) { From patchwork Mon Feb 17 18:45:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11387341 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9E6E7109A for ; Mon, 17 Feb 2020 18:47:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 643EF208C4 for ; Mon, 17 Feb 2020 18:47:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="gnZCwIeT" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730162AbgBQSri (ORCPT ); Mon, 17 Feb 2020 13:47:38 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:48252 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729946AbgBQSqZ (ORCPT ); Mon, 17 Feb 2020 13:46:25 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=/P87Iod30pSEIH33C0cIOKvhfufYbgxpSJWgi1gniro=; b=gnZCwIeTnxQkZGderW7FNO5hKY vL1YURmv7rHmcLKNqq41thyoVxgLsSuwE5c1xaWkXSTODsa8I2hlw639lDGf9c2Gx2uNJ+B1ktw+x uuUnq2yQ7Z7OcmMLxwKjFSoalCLnVGC8K1diWXzRqzbr7j2MCuriIvbngVGOztYn7j2KdjqB5z02c Cq13v4zVeVTFTOAQSwY+XC2fOffa+Qvv5Ash/VPO3rFPLpLaru3DrI2qBVi2WUnQCF9v45RoXgYbv qKkHxnP8WSKjyAL0t68elboLs5IPyfYX2bHVjMpOrl2uxFvLKvYeKTpEI+GU2FGe8+tTWeXLPY608 f6/g80PQ==; Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1j3lPL-0005Ao-Tx; Mon, 17 Feb 2020 18:46:15 +0000 From: Matthew Wilcox To: linux-fsdevel@vger.kernel.org Cc: "Matthew 
Wilcox (Oracle)" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org, Junxiao Bi Subject: [PATCH v6 10/19] fs: Convert mpage_readpages to mpage_readahead Date: Mon, 17 Feb 2020 10:45:58 -0800 Message-Id: <20200217184613.19668-18-willy@infradead.org> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20200217184613.19668-1-willy@infradead.org> References: <20200217184613.19668-1-willy@infradead.org> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: "Matthew Wilcox (Oracle)" Implement the new readahead aop and convert all callers (block_dev, exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6, reiserfs & udf). The callers are all trivial except for GFS2 & OCFS2. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Junxiao Bi # ocfs2 Reviewed-by: Joseph Qi Reviewed-by: Dave Chinner --- drivers/staging/exfat/exfat_super.c | 7 +++--- fs/block_dev.c | 7 +++--- fs/ext2/inode.c | 10 +++----- fs/fat/inode.c | 7 +++--- fs/gfs2/aops.c | 23 ++++++----------- fs/hpfs/file.c | 7 +++--- fs/iomap/buffered-io.c | 2 +- fs/isofs/inode.c | 7 +++--- fs/jfs/inode.c | 7 +++--- fs/mpage.c | 38 +++++++++-------------------- fs/nilfs2/inode.c | 15 +++--------- fs/ocfs2/aops.c | 34 ++++++++++---------------- fs/omfs/file.c | 7 +++--- fs/qnx6/inode.c | 7 +++--- fs/reiserfs/inode.c | 8 +++--- fs/udf/inode.c | 7 +++--- include/linux/mpage.h | 4 +-- mm/migrate.c | 2 +- 18 files changed, 73 insertions(+), 126 deletions(-) diff --git a/drivers/staging/exfat/exfat_super.c b/drivers/staging/exfat/exfat_super.c index b81d2a87b82e..96aad9b16d31 100644 --- a/drivers/staging/exfat/exfat_super.c +++ b/drivers/staging/exfat/exfat_super.c @@ -3002,10 +3002,9 @@ static int exfat_readpage(struct file *file, 
struct page *page) return mpage_readpage(page, exfat_get_block); } -static int exfat_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned int nr_pages) +static void exfat_readahead(struct readahead_control *rac) { - return mpage_readpages(mapping, pages, nr_pages, exfat_get_block); + mpage_readahead(rac, exfat_get_block); } static int exfat_writepage(struct page *page, struct writeback_control *wbc) @@ -3104,7 +3103,7 @@ static sector_t _exfat_bmap(struct address_space *mapping, sector_t block) static const struct address_space_operations exfat_aops = { .readpage = exfat_readpage, - .readpages = exfat_readpages, + .readahead = exfat_readahead, .writepage = exfat_writepage, .writepages = exfat_writepages, .write_begin = exfat_write_begin, diff --git a/fs/block_dev.c b/fs/block_dev.c index 69bf2fb6f7cd..2fd9c7bd61f6 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -614,10 +614,9 @@ static int blkdev_readpage(struct file * file, struct page * page) return block_read_full_page(page, blkdev_get_block); } -static int blkdev_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static void blkdev_readahead(struct readahead_control *rac) { - return mpage_readpages(mapping, pages, nr_pages, blkdev_get_block); + mpage_readahead(rac, blkdev_get_block); } static int blkdev_write_begin(struct file *file, struct address_space *mapping, @@ -2062,7 +2061,7 @@ static int blkdev_writepages(struct address_space *mapping, static const struct address_space_operations def_blk_aops = { .readpage = blkdev_readpage, - .readpages = blkdev_readpages, + .readahead = blkdev_readahead, .writepage = blkdev_writepage, .write_begin = blkdev_write_begin, .write_end = blkdev_write_end, diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index c885cf7d724b..2875c0a705b5 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -877,11 +877,9 @@ static int ext2_readpage(struct file *file, struct page *page) 
return mpage_readpage(page, ext2_get_block); } -static int -ext2_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static void ext2_readahead(struct readahead_control *rac) { - return mpage_readpages(mapping, pages, nr_pages, ext2_get_block); + mpage_readahead(rac, ext2_get_block); } static int @@ -967,7 +965,7 @@ ext2_dax_writepages(struct address_space *mapping, struct writeback_control *wbc const struct address_space_operations ext2_aops = { .readpage = ext2_readpage, - .readpages = ext2_readpages, + .readahead = ext2_readahead, .writepage = ext2_writepage, .write_begin = ext2_write_begin, .write_end = ext2_write_end, @@ -981,7 +979,7 @@ const struct address_space_operations ext2_aops = { const struct address_space_operations ext2_nobh_aops = { .readpage = ext2_readpage, - .readpages = ext2_readpages, + .readahead = ext2_readahead, .writepage = ext2_nobh_writepage, .write_begin = ext2_nobh_write_begin, .write_end = nobh_write_end, diff --git a/fs/fat/inode.c b/fs/fat/inode.c index 594b05ae16c9..3496f5fc3e6d 100644 --- a/fs/fat/inode.c +++ b/fs/fat/inode.c @@ -210,10 +210,9 @@ static int fat_readpage(struct file *file, struct page *page) return mpage_readpage(page, fat_get_block); } -static int fat_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static void fat_readahead(struct readahead_control *rac) { - return mpage_readpages(mapping, pages, nr_pages, fat_get_block); + mpage_readahead(rac, fat_get_block); } static void fat_write_failed(struct address_space *mapping, loff_t to) @@ -344,7 +343,7 @@ int fat_block_truncate_page(struct inode *inode, loff_t from) static const struct address_space_operations fat_aops = { .readpage = fat_readpage, - .readpages = fat_readpages, + .readahead = fat_readahead, .writepage = fat_writepage, .writepages = fat_writepages, .write_begin = fat_write_begin, diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c index 
ba83b49ce18c..5e63c13c12c1 100644 --- a/fs/gfs2/aops.c +++ b/fs/gfs2/aops.c @@ -577,7 +577,7 @@ int gfs2_internal_read(struct gfs2_inode *ip, char *buf, loff_t *pos, } /** - * gfs2_readpages - Read a bunch of pages at once + * gfs2_readahead - Read a bunch of pages at once * @file: The file to read from * @mapping: Address space info * @pages: List of pages to read @@ -590,31 +590,24 @@ int gfs2_internal_read(struct gfs2_inode *ip, char *buf, loff_t *pos, * obviously not something we'd want to do on too regular a basis. * Any I/O we ignore at this time will be done via readpage later. * 2. We don't handle stuffed files here we let readpage do the honours. - * 3. mpage_readpages() does most of the heavy lifting in the common case. + * 3. mpage_readahead() does most of the heavy lifting in the common case. * 4. gfs2_block_map() is relied upon to set BH_Boundary in the right places. */ -static int gfs2_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static void gfs2_readahead(struct readahead_control *rac) { - struct inode *inode = mapping->host; + struct inode *inode = rac->mapping->host; struct gfs2_inode *ip = GFS2_I(inode); - struct gfs2_sbd *sdp = GFS2_SB(inode); struct gfs2_holder gh; - int ret; gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh); - ret = gfs2_glock_nq(&gh); - if (unlikely(ret)) + if (gfs2_glock_nq(&gh)) goto out_uninit; if (!gfs2_is_stuffed(ip)) - ret = mpage_readpages(mapping, pages, nr_pages, gfs2_block_map); + mpage_readahead(rac, gfs2_block_map); gfs2_glock_dq(&gh); out_uninit: gfs2_holder_uninit(&gh); - if (unlikely(gfs2_withdrawn(sdp))) - ret = -EIO; - return ret; } /** @@ -828,7 +821,7 @@ static const struct address_space_operations gfs2_aops = { .writepage = gfs2_writepage, .writepages = gfs2_writepages, .readpage = gfs2_readpage, - .readpages = gfs2_readpages, + .readahead = gfs2_readahead, .bmap = gfs2_bmap, .invalidatepage = gfs2_invalidatepage, .releasepage = gfs2_releasepage, 
@@ -842,7 +835,7 @@ static const struct address_space_operations gfs2_jdata_aops = { .writepage = gfs2_jdata_writepage, .writepages = gfs2_jdata_writepages, .readpage = gfs2_readpage, - .readpages = gfs2_readpages, + .readahead = gfs2_readahead, .set_page_dirty = jdata_set_page_dirty, .bmap = gfs2_bmap, .invalidatepage = gfs2_invalidatepage, diff --git a/fs/hpfs/file.c b/fs/hpfs/file.c index b36abf9cb345..2de0d3492d15 100644 --- a/fs/hpfs/file.c +++ b/fs/hpfs/file.c @@ -125,10 +125,9 @@ static int hpfs_writepage(struct page *page, struct writeback_control *wbc) return block_write_full_page(page, hpfs_get_block, wbc); } -static int hpfs_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static void hpfs_readahead(struct readahead_control *rac) { - return mpage_readpages(mapping, pages, nr_pages, hpfs_get_block); + mpage_readahead(rac, hpfs_get_block); } static int hpfs_writepages(struct address_space *mapping, @@ -198,7 +197,7 @@ static int hpfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, const struct address_space_operations hpfs_aops = { .readpage = hpfs_readpage, .writepage = hpfs_writepage, - .readpages = hpfs_readpages, + .readahead = hpfs_readahead, .writepages = hpfs_writepages, .write_begin = hpfs_write_begin, .write_end = hpfs_write_end, diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 7c84c4c027c4..cb3511eb152a 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -359,7 +359,7 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops) } /* - * Just like mpage_readpages and block_read_full_page we always + * Just like mpage_readahead and block_read_full_page we always * return 0 and just mark the page as PageError on errors. This * should be cleaned up all through the stack eventually. 
  */
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index 62c0462dc89f..95b1f377ad09 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -1185,10 +1185,9 @@ static int isofs_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, isofs_get_block);
 }
 
-static int isofs_readpages(struct file *file, struct address_space *mapping,
-			struct list_head *pages, unsigned nr_pages)
+static void isofs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, isofs_get_block);
+	mpage_readahead(rac, isofs_get_block);
 }
 
 static sector_t _isofs_bmap(struct address_space *mapping, sector_t block)
@@ -1198,7 +1197,7 @@ static sector_t _isofs_bmap(struct address_space *mapping, sector_t block)
 
 static const struct address_space_operations isofs_aops = {
 	.readpage = isofs_readpage,
-	.readpages = isofs_readpages,
+	.readahead = isofs_readahead,
 	.bmap = _isofs_bmap
 };
 
diff --git a/fs/jfs/inode.c b/fs/jfs/inode.c
index 9486afcdac76..6f65bfa9f18d 100644
--- a/fs/jfs/inode.c
+++ b/fs/jfs/inode.c
@@ -296,10 +296,9 @@ static int jfs_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, jfs_get_block);
 }
 
-static int jfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void jfs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, jfs_get_block);
+	mpage_readahead(rac, jfs_get_block);
 }
 
 static void jfs_write_failed(struct address_space *mapping, loff_t to)
@@ -358,7 +357,7 @@ static ssize_t jfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 const struct address_space_operations jfs_aops = {
 	.readpage = jfs_readpage,
-	.readpages = jfs_readpages,
+	.readahead = jfs_readahead,
 	.writepage = jfs_writepage,
 	.writepages = jfs_writepages,
 	.write_begin = jfs_write_begin,
diff --git a/fs/mpage.c b/fs/mpage.c
index ccba3c4c4479..8a09e6002dc2 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -91,7 +91,7 @@ mpage_alloc(struct block_device *bdev,
 }
 
 /*
- * support function for mpage_readpages. The fs supplied get_block might
+ * support function for mpage_readahead. The fs supplied get_block might
  * return an up to date buffer. This is used to map that buffer into
  * the page, which allows readpage to avoid triggering a duplicate call
  * to get_block.
@@ -338,13 +338,8 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
 }
 
 /**
- * mpage_readpages - populate an address space with some pages & start reads against them
- * @mapping: the address_space
- * @pages: The address of a list_head which contains the target pages. These
- *   pages have their ->index populated and are otherwise uninitialised.
- *   The page at @pages->prev has the lowest file offset, and reads should be
- *   issued in @pages->prev to @pages->next order.
- * @nr_pages: The number of pages at *@pages
+ * mpage_readahead - start reads against pages
+ * @rac: Describes which pages to read.
  * @get_block: The filesystem's block mapper function.
  *
  * This function walks the pages and the blocks within each page, building and
@@ -381,36 +376,25 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
  *
  * This all causes the disk requests to be issued in the correct order.
  */
-int
-mpage_readpages(struct address_space *mapping, struct list_head *pages,
-				unsigned nr_pages, get_block_t get_block)
+void mpage_readahead(struct readahead_control *rac, get_block_t get_block)
 {
+	struct page *page;
 	struct mpage_readpage_args args = {
 		.get_block = get_block,
 		.is_readahead = true,
 	};
-	unsigned page_idx;
-
-	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
-		struct page *page = lru_to_page(pages);
 
+	readahead_for_each(rac, page) {
 		prefetchw(&page->flags);
-		list_del(&page->lru);
-		if (!add_to_page_cache_lru(page, mapping,
-					page->index,
-					readahead_gfp_mask(mapping))) {
-			args.page = page;
-			args.nr_pages = nr_pages - page_idx;
-			args.bio = do_mpage_readpage(&args);
-		}
+		args.page = page;
+		args.nr_pages = readahead_count(rac);
+		args.bio = do_mpage_readpage(&args);
 		put_page(page);
 	}
-	BUG_ON(!list_empty(pages));
 	if (args.bio)
 		mpage_bio_submit(REQ_OP_READ, REQ_RAHEAD, args.bio);
-	return 0;
 }
-EXPORT_SYMBOL(mpage_readpages);
+EXPORT_SYMBOL(mpage_readahead);
 
 /*
  * This isn't called much at all
@@ -563,7 +547,7 @@ static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
 		 * Page has buffers, but they are all unmapped. The page was
 		 * created by pagein or read over a hole which was handled by
 		 * block_read_full_page(). If this address_space is also
-		 * using mpage_readpages then this can rarely happen.
+		 * using mpage_readahead then this can rarely happen.
 		 */
 		goto confused;
 	}
diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index 671085512e0f..ceeb3b441844 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -145,18 +145,9 @@ static int nilfs_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, nilfs_get_block);
 }
 
-/**
- * nilfs_readpages() - implement readpages() method of nilfs_aops {}
- * address_space_operations.
- * @file - file struct of the file to be read
- * @mapping - address_space struct used for reading multiple pages
- * @pages - the pages to be read
- * @nr_pages - number of pages to be read
- */
-static int nilfs_readpages(struct file *file, struct address_space *mapping,
-			   struct list_head *pages, unsigned int nr_pages)
+static void nilfs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, nilfs_get_block);
+	mpage_readahead(rac, nilfs_get_block);
 }
 
 static int nilfs_writepages(struct address_space *mapping,
@@ -308,7 +299,7 @@ const struct address_space_operations nilfs_aops = {
 	.readpage = nilfs_readpage,
 	.writepages = nilfs_writepages,
 	.set_page_dirty = nilfs_set_page_dirty,
-	.readpages = nilfs_readpages,
+	.readahead = nilfs_readahead,
 	.write_begin = nilfs_write_begin,
 	.write_end = nilfs_write_end,
 	/* .releasepage = nilfs_releasepage, */
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 3a67a6518ddf..e8137efaafec 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -350,14 +350,11 @@ static int ocfs2_readpage(struct file *file, struct page *page)
  * grow out to a tree. If need be, detecting boundary extents could
  * trivially be added in a future version of ocfs2_get_block().
  */
-static int ocfs2_readpages(struct file *filp, struct address_space *mapping,
-			   struct list_head *pages, unsigned nr_pages)
+static void ocfs2_readahead(struct readahead_control *rac)
 {
-	int ret, err = -EIO;
-	struct inode *inode = mapping->host;
+	int ret;
+	struct inode *inode = rac->mapping->host;
 	struct ocfs2_inode_info *oi = OCFS2_I(inode);
-	loff_t start;
-	struct page *last;
 
 	/*
 	 * Use the nonblocking flag for the dlm code to avoid page
@@ -365,36 +362,31 @@ static int ocfs2_readpages(struct file *filp, struct address_space *mapping,
 	 */
 	ret = ocfs2_inode_lock_full(inode, NULL, 0, OCFS2_LOCK_NONBLOCK);
 	if (ret)
-		return err;
+		return;
 
-	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
-		ocfs2_inode_unlock(inode, 0);
-		return err;
-	}
+	if (down_read_trylock(&oi->ip_alloc_sem) == 0)
+		goto out_unlock;
 
 	/*
 	 * Don't bother with inline-data. There isn't anything
 	 * to read-ahead in that case anyway...
 	 */
 	if (oi->ip_dyn_features & OCFS2_INLINE_DATA_FL)
-		goto out_unlock;
+		goto out_up;
 
 	/*
 	 * Check whether a remote node truncated this file - we just
 	 * drop out in that case as it's not worth handling here.
 	 */
-	last = lru_to_page(pages);
-	start = (loff_t)last->index << PAGE_SHIFT;
-	if (start >= i_size_read(inode))
-		goto out_unlock;
+	if (readahead_offset(rac) >= i_size_read(inode))
+		goto out_up;
 
-	err = mpage_readpages(mapping, pages, nr_pages, ocfs2_get_block);
+	mpage_readahead(rac, ocfs2_get_block);
 
-out_unlock:
+out_up:
 	up_read(&oi->ip_alloc_sem);
+out_unlock:
 	ocfs2_inode_unlock(inode, 0);
-
-	return err;
 }
 
 /* Note: Because we don't support holes, our allocation has
@@ -2474,7 +2466,7 @@ static ssize_t ocfs2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 const struct address_space_operations ocfs2_aops = {
 	.readpage = ocfs2_readpage,
-	.readpages = ocfs2_readpages,
+	.readahead = ocfs2_readahead,
 	.writepage = ocfs2_writepage,
 	.write_begin = ocfs2_write_begin,
 	.write_end = ocfs2_write_end,
diff --git a/fs/omfs/file.c b/fs/omfs/file.c
index d640b9388238..d7b5f09d298c 100644
--- a/fs/omfs/file.c
+++ b/fs/omfs/file.c
@@ -289,10 +289,9 @@ static int omfs_readpage(struct file *file, struct page *page)
 	return block_read_full_page(page, omfs_get_block);
 }
 
-static int omfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void omfs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, omfs_get_block);
+	mpage_readahead(rac, omfs_get_block);
 }
 
 static int omfs_writepage(struct page *page, struct writeback_control *wbc)
@@ -373,7 +372,7 @@ const struct inode_operations omfs_file_inops = {
 
 const struct address_space_operations omfs_aops = {
 	.readpage = omfs_readpage,
-	.readpages = omfs_readpages,
+	.readahead = omfs_readahead,
 	.writepage = omfs_writepage,
 	.writepages = omfs_writepages,
 	.write_begin = omfs_write_begin,
diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 345db56c98fd..755293c8c71a 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -99,10 +99,9 @@ static int qnx6_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, qnx6_get_block);
 }
 
-static int qnx6_readpages(struct file *file, struct address_space *mapping,
-		   struct list_head *pages, unsigned nr_pages)
+static void qnx6_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, qnx6_get_block);
+	mpage_readahead(rac, qnx6_get_block);
 }
 
 /*
@@ -499,7 +498,7 @@ static sector_t qnx6_bmap(struct address_space *mapping, sector_t block)
 }
 static const struct address_space_operations qnx6_aops = {
 	.readpage = qnx6_readpage,
-	.readpages = qnx6_readpages,
+	.readahead = qnx6_readahead,
 	.bmap = qnx6_bmap
 };
 
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index 6419e6dacc39..0031070b3692 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -1160,11 +1160,9 @@ int reiserfs_get_block(struct inode *inode, sector_t block,
 	return retval;
 }
 
-static int
-reiserfs_readpages(struct file *file, struct address_space *mapping,
-		   struct list_head *pages, unsigned nr_pages)
+static void reiserfs_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, reiserfs_get_block);
+	mpage_readahead(rac, reiserfs_get_block);
 }
 
 /*
@@ -3434,7 +3432,7 @@ int reiserfs_setattr(struct dentry *dentry, struct iattr *attr)
 const struct address_space_operations reiserfs_address_space_operations = {
 	.writepage = reiserfs_writepage,
 	.readpage = reiserfs_readpage,
-	.readpages = reiserfs_readpages,
+	.readahead = reiserfs_readahead,
 	.releasepage = reiserfs_releasepage,
 	.invalidatepage = reiserfs_invalidatepage,
 	.write_begin = reiserfs_write_begin,
diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index e875bc5668ee..adaba8e8b326 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -195,10 +195,9 @@ static int udf_readpage(struct file *file, struct page *page)
 	return mpage_readpage(page, udf_get_block);
 }
 
-static int udf_readpages(struct file *file, struct address_space *mapping,
-			struct list_head *pages, unsigned nr_pages)
+static void udf_readahead(struct readahead_control *rac)
 {
-	return mpage_readpages(mapping, pages, nr_pages, udf_get_block);
+	mpage_readahead(rac, udf_get_block);
 }
 
 static int udf_write_begin(struct file *file, struct address_space *mapping,
@@ -234,7 +233,7 @@ static sector_t udf_bmap(struct address_space *mapping, sector_t block)
 
 const struct address_space_operations udf_aops = {
 	.readpage = udf_readpage,
-	.readpages = udf_readpages,
+	.readahead = udf_readahead,
 	.writepage = udf_writepage,
 	.writepages = udf_writepages,
 	.write_begin = udf_write_begin,
diff --git a/include/linux/mpage.h b/include/linux/mpage.h
index 001f1fcf9836..f4f5e90a6844 100644
--- a/include/linux/mpage.h
+++ b/include/linux/mpage.h
@@ -13,9 +13,9 @@
 #ifdef CONFIG_BLOCK
 
 struct writeback_control;
+struct readahead_control;
 
-int mpage_readpages(struct address_space *mapping, struct list_head *pages,
-				unsigned nr_pages, get_block_t get_block);
+void mpage_readahead(struct readahead_control *, get_block_t get_block);
 int mpage_readpage(struct page *page, get_block_t get_block);
 int mpage_writepages(struct address_space *mapping,
 		struct writeback_control *wbc, get_block_t get_block);
diff --git a/mm/migrate.c b/mm/migrate.c
index b1092876e537..a32122095702 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1020,7 +1020,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		 * to the LRU. Later, when the IO completes the pages are
 		 * marked uptodate and unlocked. However, the queueing
 		 * could be merging multiple pages for one bio (e.g.
-		 * mpage_readpages). If an allocation happens for the
+		 * mpage_readahead). If an allocation happens for the
 		 * second or third page, the process can end up locking
 		 * the same page twice and deadlocking. Rather than
 		 * trying to be clever about what pages can be locked,

From patchwork Mon Feb 17 18:45:59 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387379
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 10BE3109A
	for ; Mon, 17 Feb 2020 18:48:33 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id DC03222527
	for ; Mon, 17 Feb 2020 18:48:32 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="CWFqtE8+"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1729889AbgBQSqV (ORCPT ); Mon, 17 Feb 2020 13:46:21 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48100 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1729851AbgBQSqV (ORCPT ); Mon, 17 Feb 2020 13:46:21 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=LRP8iW3fX7X8gMRjaL8r0/h5O3SYseGf5THdZm5lDf4=; b=CWFqtE8+YByR6rLztwBmOt8VmN
	Ztsago0Z6HhquLJO7jgr3b9xW9olWopZTuvhYnw6fRr3KOx1NgMdQk0NcSnqNxA3zb4lzNxCyjEi7
	qIFLPmGwyCwOipFX5Dj6vewwUtfCuQR6zLbkvIdVdGqfwdCPw7Thh3R+z6+Q5VVa67/Ny1L6HjIYB
	M53eFz6Fwy19kWlIkv3xUS7TaPlsHL83Sht520SvtqISLkauaq/fCKtl4CglnB1jiTxYSgLUeli6t
	l3CrfQPKNQvGNXZouEXnD0zddfBsW17Wo6obmuHyzsv+TgIe1TxUJjgHMdB5dYViDKes0GellYriP
	iYbijsKg==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux))
	id 1j3lPL-0005Ay-VQ; Mon, 17 Feb 2020 18:46:15 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org
Subject: [PATCH v6 11/19] btrfs: Convert from readpages to readahead
Date: Mon, 17 Feb 2020 10:45:59 -0800
Message-Id: <20200217184613.19668-19-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

Use the new readahead operation in btrfs. Add a
readahead_for_each_batch() iterator to optimise the loop in the XArray.

Signed-off-by: Matthew Wilcox (Oracle)
---
 fs/btrfs/extent_io.c    | 46 +++++++++++++----------------------------
 fs/btrfs/extent_io.h    |  3 +--
 fs/btrfs/inode.c        | 16 +++++++-------
 include/linux/pagemap.h | 27 ++++++++++++++++++++++++
 4 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c0f202741e09..e97a6acd6f5d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4278,52 +4278,34 @@ int extent_writepages(struct address_space *mapping,
 	return ret;
 }
 
-int extent_readpages(struct address_space *mapping, struct list_head *pages,
-		     unsigned nr_pages)
+void extent_readahead(struct readahead_control *rac)
 {
 	struct bio *bio = NULL;
 	unsigned long bio_flags = 0;
 	struct page *pagepool[16];
 	struct extent_map *em_cached = NULL;
-	struct extent_io_tree *tree = &BTRFS_I(mapping->host)->io_tree;
-	int nr = 0;
+	struct extent_io_tree *tree = &BTRFS_I(rac->mapping->host)->io_tree;
 	u64 prev_em_start = (u64)-1;
+	int nr;
 
-	while (!list_empty(pages)) {
-		u64 contig_end = 0;
-
-		for (nr = 0; nr < ARRAY_SIZE(pagepool) && !list_empty(pages);) {
-			struct page *page = lru_to_page(pages);
-
-			prefetchw(&page->flags);
-			list_del(&page->lru);
-			if (add_to_page_cache_lru(page, mapping, page->index,
-						readahead_gfp_mask(mapping))) {
-				put_page(page);
-				break;
-			}
-
-			pagepool[nr++] = page;
-			contig_end = page_offset(page) + PAGE_SIZE - 1;
-		}
+	readahead_for_each_batch(rac, pagepool, ARRAY_SIZE(pagepool), nr) {
+		u64 contig_start = page_offset(pagepool[0]);
+		u64 contig_end = page_offset(pagepool[nr - 1]) + PAGE_SIZE - 1;
 
-		if (nr) {
-			u64 contig_start = page_offset(pagepool[0]);
+		ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end);
 
-			ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end);
-
-			contiguous_readpages(tree, pagepool, nr, contig_start,
-				     contig_end, &em_cached, &bio, &bio_flags,
-				     &prev_em_start);
-		}
+		contiguous_readpages(tree, pagepool, nr, contig_start,
+				contig_end, &em_cached, &bio, &bio_flags,
+				&prev_em_start);
 	}
 
 	if (em_cached)
 		free_extent_map(em_cached);
 
-	if (bio)
-		return submit_one_bio(bio, 0, bio_flags);
-	return 0;
+	if (bio) {
+		if (submit_one_bio(bio, 0, bio_flags))
+			return;
+	}
 }
 
 /*
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 5d205bbaafdc..bddac32948c7 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -198,8 +198,7 @@ int extent_writepages(struct address_space *mapping,
 		      struct writeback_control *wbc);
 int btree_write_cache_pages(struct address_space *mapping,
 			    struct writeback_control *wbc);
-int extent_readpages(struct address_space *mapping, struct list_head *pages,
-		     unsigned nr_pages);
+void extent_readahead(struct readahead_control *rac);
 int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		__u64 start, __u64 len);
 void set_page_extent_mapped(struct page *page);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7d26b4bfb2c6..61d5137ce4e9 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4802,8 +4802,8 @@ static void evict_inode_truncate_pages(struct inode *inode)
 
 	/*
 	 * Keep looping until we have no more ranges in the io tree.
-	 * We can have ongoing bios started by readpages (called from readahead)
-	 * that have their endio callback (extent_io.c:end_bio_extent_readpage)
+	 * We can have ongoing bios started by readahead that have
+	 * their endio callback (extent_io.c:end_bio_extent_readpage)
 	 * still in progress (unlocked the pages in the bio but did not yet
 	 * unlocked the ranges in the io tree). Therefore this means some
 	 * ranges can still be locked and eviction started because before
@@ -7004,11 +7004,11 @@ static int lock_extent_direct(struct inode *inode, u64 lockstart, u64 lockend,
 			 * for it to complete) and then invalidate the pages for
 			 * this range (through invalidate_inode_pages2_range()),
 			 * but that can lead us to a deadlock with a concurrent
-			 * call to readpages() (a buffered read or a defrag call
+			 * call to readahead (a buffered read or a defrag call
 			 * triggered a readahead) on a page lock due to an
 			 * ordered dio extent we created before but did not have
 			 * yet a corresponding bio submitted (whence it can not
-			 * complete), which makes readpages() wait for that
+			 * complete), which makes readahead wait for that
 			 * ordered extent to complete while holding a lock on
 			 * that page.
 			 */
@@ -8247,11 +8247,9 @@ static int btrfs_writepages(struct address_space *mapping,
 	return extent_writepages(mapping, wbc);
 }
 
-static int
-btrfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void btrfs_readahead(struct readahead_control *rac)
 {
-	return extent_readpages(mapping, pages, nr_pages);
+	extent_readahead(rac);
 }
 
 static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags)
@@ -10456,7 +10454,7 @@ static const struct address_space_operations btrfs_aops = {
 	.readpage = btrfs_readpage,
 	.writepage = btrfs_writepage,
 	.writepages = btrfs_writepages,
-	.readpages = btrfs_readpages,
+	.readahead = btrfs_readahead,
 	.direct_IO = btrfs_direct_IO,
 	.invalidatepage = btrfs_invalidatepage,
 	.releasepage = btrfs_releasepage,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 4f36c06d064d..1bbb60a0bf16 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -669,6 +669,33 @@ static inline void readahead_next(struct readahead_control *rac)
 #define readahead_for_each(rac, page)					\
 	for (; (page = readahead_page(rac)); readahead_next(rac))
 
+static inline unsigned int readahead_page_batch(struct readahead_control *rac,
+		struct page **array, unsigned int size)
+{
+	unsigned int batch = 0;
+	XA_STATE(xas, &rac->mapping->i_pages, rac->_start);
+	struct page *page;
+
+	rac->_batch_count = 0;
+	xas_for_each(&xas, page, rac->_start + rac->_nr_pages - 1) {
+		VM_BUG_ON_PAGE(!PageLocked(page), page);
+		VM_BUG_ON_PAGE(PageTail(page), page);
+		array[batch++] = page;
+		rac->_batch_count += hpage_nr_pages(page);
+		if (PageHead(page))
+			xas_set(&xas, rac->_start + rac->_batch_count);
+
+		if (batch == size)
+			break;
+	}
+
+	return batch;
+}
+
+#define readahead_for_each_batch(rac, array, size, nr)			\
+	for (; (nr = readahead_page_batch(rac, array, size));		\
+			readahead_next(rac))
+
 /* The byte offset into the file of this readahead block */
 static inline loff_t readahead_offset(struct readahead_control *rac)
 {

From patchwork Mon Feb 17 18:46:01 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387407
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 29856138D
	for ; Mon, 17 Feb 2020 18:49:07 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 09F6620836
	for ; Mon, 17 Feb 2020 18:49:07 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="PilV6Nbn"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1730240AbgBQSs4 (ORCPT ); Mon, 17 Feb 2020 13:48:56 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48058 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1729789AbgBQSqT (ORCPT ); Mon, 17 Feb 2020 13:46:19 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=c1RHQNQxAXbf5ukt2BNF7izx4iTSuNQC5EsGyGLZYjM=; b=PilV6Nbnp5jeR3BGkYdLmIHP+0
	PemIro+LMam5FDrFdQV47s7w0ljGT9Umtu4ajE78hjqlTcMnO1ZqjwyB6UNIiDpsqAq6hVz7y+yPi
	vKmTQ4UgvuffWOVq3liQTyr+/bXLQzOSae98w3a8DvliVoYSYsyupobLHKFqKbEytaDORWEaIpfY7
	yGq4FDarj0LzoQek3vMA3QotZUSztvbszSae+Homo9HXkO747iu2h2BFYhXvrgGzKKpAC9HcZO+MV
	7FkaahGbzj8tYko/oTcUrCmw4CDvwZPp3U1buusRJz9oy83OGg4g6GGuFPxZBmY6G9Mg9MKTI8gFM
	ozQ2KRbA==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux))
	id 1j3lPM-0005BL-1r; Mon, 17 Feb 2020 18:46:16 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org
Subject: [PATCH v6 12/19] erofs: Convert uncompressed files from readpages to readahead
Date: Mon, 17 Feb 2020 10:46:01 -0800
Message-Id: <20200217184613.19668-21-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

Use the new readahead operation in erofs

Signed-off-by: Matthew Wilcox (Oracle)
Acked-by: Gao Xiang
Reviewed-by: Dave Chinner
---
 fs/erofs/data.c              | 39 +++++++++++++-----------------------
 fs/erofs/zdata.c             |  2 +-
 include/trace/events/erofs.h |  6 +++---
 3 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index fc3a8d8064f8..82ebcee9d178 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -280,47 +280,36 @@ static int erofs_raw_access_readpage(struct file *file, struct page *page)
 	return 0;
 }
 
-static int erofs_raw_access_readpages(struct file *filp,
-				      struct address_space *mapping,
-				      struct list_head *pages,
-				      unsigned int nr_pages)
+static void erofs_raw_access_readahead(struct readahead_control *rac)
 {
 	erofs_off_t last_block;
 	struct bio *bio = NULL;
-	gfp_t gfp = readahead_gfp_mask(mapping);
-	struct page *page = list_last_entry(pages, struct page, lru);
-
-	trace_erofs_readpages(mapping->host, page, nr_pages, true);
+	struct page *page;
 
-	for (; nr_pages; --nr_pages) {
-		page = list_entry(pages->prev, struct page, lru);
+	trace_erofs_readpages(rac->mapping->host, readahead_index(rac),
+			readahead_count(rac), true);
 
+	readahead_for_each(rac, page) {
 		prefetchw(&page->flags);
-		list_del(&page->lru);
 
-		if (!add_to_page_cache_lru(page, mapping, page->index, gfp)) {
-			bio = erofs_read_raw_page(bio, mapping, page,
-						  &last_block, nr_pages, true);
+		bio = erofs_read_raw_page(bio, rac->mapping, page, &last_block,
+				readahead_count(rac), true);
 
-			/* all the page errors are ignored when readahead */
-			if (IS_ERR(bio)) {
-				pr_err("%s, readahead error at page %lu of nid %llu\n",
-				       __func__, page->index,
-				       EROFS_I(mapping->host)->nid);
+		/* all the page errors are ignored when readahead */
+		if (IS_ERR(bio)) {
+			pr_err("%s, readahead error at page %lu of nid %llu\n",
+			       __func__, page->index,
+			       EROFS_I(rac->mapping->host)->nid);
 
-				bio = NULL;
-			}
+			bio = NULL;
 		}
 
-		/* pages could still be locked */
 		put_page(page);
 	}
-	DBG_BUGON(!list_empty(pages));
 
 	/* the rare case (end in gaps) */
 	if (bio)
 		submit_bio(bio);
-	return 0;
 }
 
 static int erofs_get_block(struct inode *inode, sector_t iblock,
@@ -358,7 +347,7 @@ static sector_t erofs_bmap(struct address_space *mapping, sector_t block)
 /* for uncompressed (aligned) files and raw access for other files */
 const struct address_space_operations erofs_raw_access_aops = {
 	.readpage = erofs_raw_access_readpage,
-	.readpages = erofs_raw_access_readpages,
+	.readahead = erofs_raw_access_readahead,
 	.bmap = erofs_bmap,
 };
 
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 80e47f07d946..17f45fcb8c5c 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -1315,7 +1315,7 @@ static int z_erofs_readpages(struct file *filp, struct address_space *mapping,
 	struct page *head = NULL;
 	LIST_HEAD(pagepool);
 
-	trace_erofs_readpages(mapping->host, lru_to_page(pages),
+	trace_erofs_readpages(mapping->host, lru_to_page(pages)->index,
 			      nr_pages, false);
 
 	f.headoffset = (erofs_off_t)lru_to_page(pages)->index << PAGE_SHIFT;
diff --git a/include/trace/events/erofs.h b/include/trace/events/erofs.h
index 27f5caa6299a..bf9806fd1306 100644
--- a/include/trace/events/erofs.h
+++ b/include/trace/events/erofs.h
@@ -113,10 +113,10 @@ TRACE_EVENT(erofs_readpage,
 
 TRACE_EVENT(erofs_readpages,
 
-	TP_PROTO(struct inode *inode, struct page *page, unsigned int nrpage,
+	TP_PROTO(struct inode *inode, pgoff_t start, unsigned int nrpage,
 		bool raw),
 
-	TP_ARGS(inode, page, nrpage, raw),
+	TP_ARGS(inode, start, nrpage, raw),
 
 	TP_STRUCT__entry(
 		__field(dev_t, dev )
@@ -129,7 +129,7 @@ TRACE_EVENT(erofs_readpages,
 	TP_fast_assign(
 		__entry->dev = inode->i_sb->s_dev;
 		__entry->nid = EROFS_I(inode)->nid;
-		__entry->start = page->index;
+		__entry->start = start;
 		__entry->nrpage = nrpage;
 		__entry->raw = raw;
 	),

From patchwork Mon Feb 17 18:46:04 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11387213
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1611317F0
	for ; Mon, 17 Feb 2020 18:46:33 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id E152E20725
	for ; Mon, 17 Feb 2020 18:46:32 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="be4JfOkb"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1730031AbgBQSqc (ORCPT ); Mon, 17 Feb 2020 13:46:32 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:48380 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1730016AbgBQSqb (ORCPT ); Mon, 17 Feb 2020 13:46:31 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description;
	bh=Cu/jfqkew46d05JGO+29Z0vrqq1/fRZdHvgOzZKjCA0=; b=be4JfOkbzD49JYAiknAZutrluY
	1EIxYedMyKfRjL1xpPpovxulsPRclg28TGXxNhUXnZgjyvm4Yce97DFjlTIeAmz2dLq5jS5SxRUCk
	meKx6aKtc/KwAtGXPmxkyzlg45fYlTyFklcJ0UHZxbUQkwZocpMlNk+17OJgYvBj+o/I0SJBCcWXn
	6WxzfaP2WjRpIBuuzjlhAMYKL6DvUzCJ/oApt+X5UYzfU23iFTybfUfCtds3Erh8nOiESSmKXwEdk
	Usd/Fsr60a28TlRuOSq4LvkgskpigpIds3IE+czKwr9t2uXzKqtoKSY3LBbNy1Oop+Ldh0fbYhpy0
	WLn7+G8g==;
Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux))
	id 1j3lPM-0005Bp-5V; Mon, 17 Feb 2020 18:46:16 +0000
From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com,
	linux-xfs@vger.kernel.org
Subject: [PATCH v6 13/16] f2fs: Convert from readpages to readahead
Date: Mon, 17 Feb 2020 10:46:04 -0800
Message-Id: <20200217184613.19668-24-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: "Matthew Wilcox (Oracle)"

Use the new readahead operation in f2fs

Signed-off-by: Matthew Wilcox (Oracle)
---
 fs/f2fs/data.c              | 50 +++++++++++++++----------------------
 fs/f2fs/f2fs.h              |  5 ++--
 include/trace/events/f2fs.h |  6 ++---
 3 files changed, 25 insertions(+), 36 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index b27b72107911..87964e4cb6b8 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2159,13 +2159,11 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret,
  * use ->readpage() or do the necessary surgery to decouple ->readpages()
  * from read-ahead.
*/ -int f2fs_mpage_readpages(struct address_space *mapping, - struct list_head *pages, struct page *page, - unsigned nr_pages, bool is_readahead) +int f2fs_mpage_readpages(struct inode *inode, struct readahead_control *rac, + struct page *page) { struct bio *bio = NULL; sector_t last_block_in_bio = 0; - struct inode *inode = mapping->host; struct f2fs_map_blocks map; #ifdef CONFIG_F2FS_FS_COMPRESSION struct compress_ctx cc = { @@ -2179,6 +2177,7 @@ int f2fs_mpage_readpages(struct address_space *mapping, .nr_cpages = 0, }; #endif + unsigned nr_pages = rac ? readahead_count(rac) : 1; unsigned max_nr_pages = nr_pages; int ret = 0; @@ -2192,15 +2191,9 @@ int f2fs_mpage_readpages(struct address_space *mapping, map.m_may_create = false; for (; nr_pages; nr_pages--) { - if (pages) { - page = list_last_entry(pages, struct page, lru); - + if (rac) { + page = readahead_page(rac); prefetchw(&page->flags); - list_del(&page->lru); - if (add_to_page_cache_lru(page, mapping, - page_index(page), - readahead_gfp_mask(mapping))) - goto next_page; } #ifdef CONFIG_F2FS_FS_COMPRESSION @@ -2210,7 +2203,7 @@ int f2fs_mpage_readpages(struct address_space *mapping, ret = f2fs_read_multi_pages(&cc, &bio, max_nr_pages, &last_block_in_bio, - is_readahead); + rac); f2fs_destroy_compress_ctx(&cc); if (ret) goto set_error_page; @@ -2233,7 +2226,7 @@ int f2fs_mpage_readpages(struct address_space *mapping, #endif ret = f2fs_read_single_page(inode, page, max_nr_pages, &map, - &bio, &last_block_in_bio, is_readahead); + &bio, &last_block_in_bio, rac); if (ret) { #ifdef CONFIG_F2FS_FS_COMPRESSION set_error_page: @@ -2242,8 +2235,10 @@ int f2fs_mpage_readpages(struct address_space *mapping, zero_user_segment(page, 0, PAGE_SIZE); unlock_page(page); } +#ifdef CONFIG_F2FS_FS_COMPRESSION next_page: - if (pages) +#endif + if (rac) put_page(page); #ifdef CONFIG_F2FS_FS_COMPRESSION @@ -2253,16 +2248,15 @@ int f2fs_mpage_readpages(struct address_space *mapping, ret = f2fs_read_multi_pages(&cc, &bio, 
max_nr_pages, &last_block_in_bio, - is_readahead); + rac); f2fs_destroy_compress_ctx(&cc); } } #endif } - BUG_ON(pages && !list_empty(pages)); if (bio) __submit_bio(F2FS_I_SB(inode), bio, DATA); - return pages ? 0 : ret; + return ret; } static int f2fs_read_data_page(struct file *file, struct page *page) @@ -2281,28 +2275,24 @@ static int f2fs_read_data_page(struct file *file, struct page *page) if (f2fs_has_inline_data(inode)) ret = f2fs_read_inline_data(inode, page); if (ret == -EAGAIN) - ret = f2fs_mpage_readpages(page_file_mapping(page), - NULL, page, 1, false); + ret = f2fs_mpage_readpages(inode, NULL, page); return ret; } -static int f2fs_read_data_pages(struct file *file, - struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static void f2fs_readahead(struct readahead_control *rac) { - struct inode *inode = mapping->host; - struct page *page = list_last_entry(pages, struct page, lru); + struct inode *inode = rac->mapping->host; - trace_f2fs_readpages(inode, page, nr_pages); + trace_f2fs_readpages(inode, readahead_index(rac), readahead_count(rac)); if (!f2fs_is_compress_backend_ready(inode)) - return 0; + return; /* If the file has inline data, skip readpages */ if (f2fs_has_inline_data(inode)) - return 0; + return; - return f2fs_mpage_readpages(mapping, pages, NULL, nr_pages, true); + f2fs_mpage_readpages(inode, rac, NULL); } int f2fs_encrypt_one_page(struct f2fs_io_info *fio) @@ -3784,7 +3774,7 @@ static void f2fs_swap_deactivate(struct file *file) const struct address_space_operations f2fs_dblock_aops = { .readpage = f2fs_read_data_page, - .readpages = f2fs_read_data_pages, + .readahead = f2fs_readahead, .writepage = f2fs_write_data_page, .writepages = f2fs_write_data_pages, .write_begin = f2fs_write_begin, diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 5355be6b6755..b5e72dee8826 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3344,9 +3344,8 @@ int f2fs_reserve_new_block(struct dnode_of_data *dn); int 
f2fs_get_block(struct dnode_of_data *dn, pgoff_t index); int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from); int f2fs_reserve_block(struct dnode_of_data *dn, pgoff_t index); -int f2fs_mpage_readpages(struct address_space *mapping, - struct list_head *pages, struct page *page, - unsigned nr_pages, bool is_readahead); +int f2fs_mpage_readpages(struct inode *inode, struct readahead_control *rac, + struct page *page); struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index, int op_flags, bool for_write); struct page *f2fs_find_data_page(struct inode *inode, pgoff_t index); diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h index 67a97838c2a0..d72da4a33883 100644 --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -1375,9 +1375,9 @@ TRACE_EVENT(f2fs_writepages, TRACE_EVENT(f2fs_readpages, - TP_PROTO(struct inode *inode, struct page *page, unsigned int nrpage), + TP_PROTO(struct inode *inode, pgoff_t start, unsigned int nrpage), - TP_ARGS(inode, page, nrpage), + TP_ARGS(inode, start, nrpage), TP_STRUCT__entry( __field(dev_t, dev) @@ -1389,7 +1389,7 @@ TRACE_EVENT(f2fs_readpages, TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; - __entry->start = page->index; + __entry->start = start; __entry->nrpage = nrpage; ),
From patchwork Mon Feb 17 18:46:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11387431 From: Matthew Wilcox To: linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org Subject: [PATCH v6 14/19] ext4: Convert from readpages to readahead Date: Mon, 17 Feb 2020 10:46:05 -0800 Message-Id: <20200217184613.19668-25-willy@infradead.org> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20200217184613.19668-1-willy@infradead.org> References: <20200217184613.19668-1-willy@infradead.org> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List:
linux-btrfs@vger.kernel.org From: "Matthew Wilcox (Oracle)" Use the new readahead operation in ext4 Signed-off-by: Matthew Wilcox (Oracle) --- fs/ext4/ext4.h | 3 +-- fs/ext4/inode.c | 23 ++++++++++------------- fs/ext4/readpage.c | 22 ++++++++-------------- 3 files changed, 19 insertions(+), 29 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 4441331d06cc..1570a0b51b73 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -3279,8 +3279,7 @@ static inline void ext4_set_de_type(struct super_block *sb, /* readpages.c */ extern int ext4_mpage_readpages(struct address_space *mapping, - struct list_head *pages, struct page *page, - unsigned nr_pages, bool is_readahead); + struct readahead_control *rac, struct page *page); extern int __init ext4_init_post_read_processing(void); extern void ext4_exit_post_read_processing(void); diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index e60aca791d3f..b3349bfb75b8 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3218,7 +3218,7 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block) static int ext4_readpage(struct file *file, struct page *page) { int ret = -EAGAIN; - struct inode *inode = page->mapping->host; + struct inode *inode = file_inode(file); trace_ext4_readpage(page); @@ -3226,23 +3226,20 @@ static int ext4_readpage(struct file *file, struct page *page) ret = ext4_readpage_inline(inode, page); if (ret == -EAGAIN) - return ext4_mpage_readpages(page->mapping, NULL, page, 1, - false); + return ext4_mpage_readpages(page->mapping, NULL, page); return ret; } -static int -ext4_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static void ext4_readahead(struct readahead_control *rac) { - struct inode *inode = mapping->host; + struct inode *inode = rac->mapping->host; - /* If the file has inline data, no need to do readpages. */ + /* If the file has inline data, no need to do readahead. 
*/ if (ext4_has_inline_data(inode)) - return 0; + return; - return ext4_mpage_readpages(mapping, pages, NULL, nr_pages, true); + ext4_mpage_readpages(rac->mapping, rac, NULL); } static void ext4_invalidatepage(struct page *page, unsigned int offset, @@ -3587,7 +3584,7 @@ static int ext4_set_page_dirty(struct page *page) static const struct address_space_operations ext4_aops = { .readpage = ext4_readpage, - .readpages = ext4_readpages, + .readahead = ext4_readahead, .writepage = ext4_writepage, .writepages = ext4_writepages, .write_begin = ext4_write_begin, @@ -3604,7 +3601,7 @@ static const struct address_space_operations ext4_aops = { static const struct address_space_operations ext4_journalled_aops = { .readpage = ext4_readpage, - .readpages = ext4_readpages, + .readahead = ext4_readahead, .writepage = ext4_writepage, .writepages = ext4_writepages, .write_begin = ext4_write_begin, @@ -3620,7 +3617,7 @@ static const struct address_space_operations ext4_journalled_aops = { static const struct address_space_operations ext4_da_aops = { .readpage = ext4_readpage, - .readpages = ext4_readpages, + .readahead = ext4_readahead, .writepage = ext4_writepage, .writepages = ext4_writepages, .write_begin = ext4_da_write_begin, diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c index c1769afbf799..e14841ade612 100644 --- a/fs/ext4/readpage.c +++ b/fs/ext4/readpage.c @@ -7,8 +7,8 @@ * * This was originally taken from fs/mpage.c * - * The intent is the ext4_mpage_readpages() function here is intended - * to replace mpage_readpages() in the general case, not just for + * The ext4_mpage_readahead() function here is intended to + * replace mpage_readahead() in the general case, not just for * encrypted files. It has some limitations (see below), where it * will fall back to read_block_full_page(), but these limitations * should only be hit when page_size != block_size. 
@@ -222,8 +222,7 @@ static inline loff_t ext4_readpage_limit(struct inode *inode) } int ext4_mpage_readpages(struct address_space *mapping, - struct list_head *pages, struct page *page, - unsigned nr_pages, bool is_readahead) + struct readahead_control *rac, struct page *page) { struct bio *bio = NULL; sector_t last_block_in_bio = 0; @@ -241,6 +240,7 @@ int ext4_mpage_readpages(struct address_space *mapping, int length; unsigned relative_block = 0; struct ext4_map_blocks map; + unsigned int nr_pages = rac ? readahead_count(rac) : 1; map.m_pblk = 0; map.m_lblk = 0; @@ -251,14 +251,9 @@ int ext4_mpage_readpages(struct address_space *mapping, int fully_mapped = 1; unsigned first_hole = blocks_per_page; - if (pages) { - page = lru_to_page(pages); - + if (rac) { + page = readahead_page(rac); prefetchw(&page->flags); - list_del(&page->lru); - if (add_to_page_cache_lru(page, mapping, page->index, - readahead_gfp_mask(mapping))) - goto next_page; } if (page_has_buffers(page)) @@ -381,7 +376,7 @@ int ext4_mpage_readpages(struct address_space *mapping, bio->bi_iter.bi_sector = blocks[0] << (blkbits - 9); bio->bi_end_io = mpage_end_io; bio_set_op_attrs(bio, REQ_OP_READ, - is_readahead ? REQ_RAHEAD : 0); + rac ? 
REQ_RAHEAD : 0); } length = first_hole << blkbits; @@ -406,10 +401,9 @@ int ext4_mpage_readpages(struct address_space *mapping, else unlock_page(page); next_page: - if (pages) + if (rac) put_page(page); } - BUG_ON(pages && !list_empty(pages)); if (bio) submit_bio(bio); return 0;
From patchwork Mon Feb 17 18:46:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11387425 From: Matthew Wilcox To: linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org Subject: [PATCH v6 15/16] iomap: Convert from readpages to readahead Date: Mon, 17 Feb 2020 10:46:08 -0800 Message-Id: <20200217184613.19668-28-willy@infradead.org> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20200217184613.19668-1-willy@infradead.org> References: <20200217184613.19668-1-willy@infradead.org> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: "Matthew Wilcox (Oracle)" Use the new readahead operation in iomap. Convert XFS and ZoneFS to use it.
Signed-off-by: Matthew Wilcox (Oracle) --- fs/iomap/buffered-io.c | 116 ++++++++++++++++------------------------- fs/iomap/trace.h | 2 +- fs/xfs/xfs_aops.c | 13 ++--- fs/zonefs/super.c | 7 ++- include/linux/iomap.h | 3 +- 5 files changed, 54 insertions(+), 87 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index cb3511eb152a..2bfcd5242264 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -214,9 +214,8 @@ iomap_read_end_io(struct bio *bio) struct iomap_readpage_ctx { struct page *cur_page; bool cur_page_in_bio; - bool is_readahead; struct bio *bio; - struct list_head *pages; + struct readahead_control *rac; }; static void @@ -307,11 +306,11 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data, if (ctx->bio) submit_bio(ctx->bio); - if (ctx->is_readahead) /* same as readahead_gfp_mask */ + if (ctx->rac) /* same as readahead_gfp_mask */ gfp |= __GFP_NORETRY | __GFP_NOWARN; ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs)); ctx->bio->bi_opf = REQ_OP_READ; - if (ctx->is_readahead) + if (ctx->rac) ctx->bio->bi_opf |= REQ_RAHEAD; ctx->bio->bi_iter.bi_sector = sector; bio_set_dev(ctx->bio, iomap->bdev); @@ -367,104 +366,77 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops) } EXPORT_SYMBOL_GPL(iomap_readpage); -static struct page * -iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos, - loff_t length, loff_t *done) -{ - while (!list_empty(pages)) { - struct page *page = lru_to_page(pages); - - if (page_offset(page) >= (u64)pos + length) - break; - - list_del(&page->lru); - if (!add_to_page_cache_lru(page, inode->i_mapping, page->index, - GFP_NOFS)) - return page; - - /* - * If we already have a page in the page cache at index we are - * done. Upper layers don't care if it is uptodate after the - * readpages call itself as every page gets checked again once - * actually needed. 
- */ - *done += PAGE_SIZE; - put_page(page); - } - - return NULL; -} - static loff_t -iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length, +iomap_readahead_actor(struct inode *inode, loff_t pos, loff_t length, void *data, struct iomap *iomap, struct iomap *srcmap) { struct iomap_readpage_ctx *ctx = data; - loff_t done, ret; + loff_t ret, done = 0; - for (done = 0; done < length; done += ret) { - if (ctx->cur_page && offset_in_page(pos + done) == 0) { - if (!ctx->cur_page_in_bio) - unlock_page(ctx->cur_page); - put_page(ctx->cur_page); - ctx->cur_page = NULL; - } + while (done < length) { if (!ctx->cur_page) { - ctx->cur_page = iomap_next_page(inode, ctx->pages, - pos, length, &done); - if (!ctx->cur_page) - break; + ctx->cur_page = readahead_page(ctx->rac); ctx->cur_page_in_bio = false; } ret = iomap_readpage_actor(inode, pos + done, length - done, ctx, iomap, srcmap); + if (WARN_ON(ret == 0)) + break; + done += ret; + if (offset_in_page(pos + done) == 0) { + readahead_next(ctx->rac); + if (!ctx->cur_page_in_bio) + unlock_page(ctx->cur_page); + put_page(ctx->cur_page); + ctx->cur_page = NULL; + } } return done; } -int -iomap_readpages(struct address_space *mapping, struct list_head *pages, - unsigned nr_pages, const struct iomap_ops *ops) +/** + * iomap_readahead - Attempt to read pages from a file. + * @rac: Describes the pages to be read. + * @ops: The operations vector for the filesystem. + * + * This function is for filesystems to call to implement their readahead + * address_space operation. + * + * Context: The file is pinned by the caller, and the pages to be read are + * all locked and have an elevated refcount. This function will unlock + * the pages (once I/O has completed on them, or I/O has been determined to + * not be necessary). It will also decrease the refcount once the pages + * have been submitted for I/O. After this point, the page may be removed + * from the page cache, and should not be referenced. 
+ */ +void iomap_readahead(struct readahead_control *rac, const struct iomap_ops *ops) { + struct inode *inode = rac->mapping->host; struct iomap_readpage_ctx ctx = { - .pages = pages, - .is_readahead = true, + .rac = rac, }; - loff_t pos = page_offset(list_entry(pages->prev, struct page, lru)); - loff_t last = page_offset(list_entry(pages->next, struct page, lru)); - loff_t length = last - pos + PAGE_SIZE, ret = 0; + loff_t pos = readahead_offset(rac); + loff_t length = readahead_length(rac); - trace_iomap_readpages(mapping->host, nr_pages); + trace_iomap_readahead(inode, readahead_count(rac)); while (length > 0) { - ret = iomap_apply(mapping->host, pos, length, 0, ops, - &ctx, iomap_readpages_actor); + loff_t ret = iomap_apply(inode, pos, length, 0, ops, + &ctx, iomap_readahead_actor); if (ret <= 0) { WARN_ON_ONCE(ret == 0); - goto done; + break; } pos += ret; length -= ret; } - ret = 0; -done: + if (ctx.bio) submit_bio(ctx.bio); - if (ctx.cur_page) { - if (!ctx.cur_page_in_bio) - unlock_page(ctx.cur_page); - put_page(ctx.cur_page); - } - - /* - * Check that we didn't lose a page due to the arcance calling - * conventions.. 
- */ - WARN_ON_ONCE(!ret && !list_empty(ctx.pages)); - return ret; + BUG_ON(ctx.cur_page); } -EXPORT_SYMBOL_GPL(iomap_readpages); +EXPORT_SYMBOL_GPL(iomap_readahead); /* * iomap_is_partially_uptodate checks whether blocks within a page are diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h index 6dc227b8c47e..d6ba705f938a 100644 --- a/fs/iomap/trace.h +++ b/fs/iomap/trace.h @@ -39,7 +39,7 @@ DEFINE_EVENT(iomap_readpage_class, name, \ TP_PROTO(struct inode *inode, int nr_pages), \ TP_ARGS(inode, nr_pages)) DEFINE_READPAGE_EVENT(iomap_readpage); -DEFINE_READPAGE_EVENT(iomap_readpages); +DEFINE_READPAGE_EVENT(iomap_readahead); DECLARE_EVENT_CLASS(iomap_page_class, TP_PROTO(struct inode *inode, struct page *page, unsigned long off, diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 58e937be24ce..6e68eeb50b07 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -621,14 +621,11 @@ xfs_vm_readpage( return iomap_readpage(page, &xfs_read_iomap_ops); } -STATIC int -xfs_vm_readpages( - struct file *unused, - struct address_space *mapping, - struct list_head *pages, - unsigned nr_pages) +STATIC void +xfs_vm_readahead( + struct readahead_control *rac) { - return iomap_readpages(mapping, pages, nr_pages, &xfs_read_iomap_ops); + iomap_readahead(rac, &xfs_read_iomap_ops); } static int @@ -644,7 +641,7 @@ xfs_iomap_swapfile_activate( const struct address_space_operations xfs_address_space_operations = { .readpage = xfs_vm_readpage, - .readpages = xfs_vm_readpages, + .readahead = xfs_vm_readahead, .writepage = xfs_vm_writepage, .writepages = xfs_vm_writepages, .set_page_dirty = iomap_set_page_dirty, diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index 8bc6ef82d693..8327a01d3bac 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -78,10 +78,9 @@ static int zonefs_readpage(struct file *unused, struct page *page) return iomap_readpage(page, &zonefs_iomap_ops); } -static int zonefs_readpages(struct file *unused, struct address_space *mapping, - struct list_head 
*pages, unsigned int nr_pages) +static void zonefs_readahead(struct readahead_control *rac) { - return iomap_readpages(mapping, pages, nr_pages, &zonefs_iomap_ops); + iomap_readahead(rac, &zonefs_iomap_ops); } /* @@ -128,7 +127,7 @@ static int zonefs_writepages(struct address_space *mapping, static const struct address_space_operations zonefs_file_aops = { .readpage = zonefs_readpage, - .readpages = zonefs_readpages, + .readahead = zonefs_readahead, .writepage = zonefs_writepage, .writepages = zonefs_writepages, .set_page_dirty = iomap_set_page_dirty, diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 8b09463dae0d..bc20bd04c2a2 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -155,8 +155,7 @@ loff_t iomap_apply(struct inode *inode, loff_t pos, loff_t length, ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from, const struct iomap_ops *ops); int iomap_readpage(struct page *page, const struct iomap_ops *ops); -int iomap_readpages(struct address_space *mapping, struct list_head *pages, - unsigned nr_pages, const struct iomap_ops *ops); +void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops); int iomap_set_page_dirty(struct page *page); int iomap_is_partially_uptodate(struct page *page, unsigned long from, unsigned long count);
From patchwork Mon Feb 17 18:46:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11387439 From: Matthew Wilcox To: linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org Subject: [PATCH v6 16/19] fuse: Convert from readpages to readahead Date: Mon, 17 Feb 2020 10:46:09 -0800 Message-Id: <20200217184613.19668-29-willy@infradead.org> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20200217184613.19668-1-willy@infradead.org> References: <20200217184613.19668-1-willy@infradead.org> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List:
linux-btrfs@vger.kernel.org From: "Matthew Wilcox (Oracle)" Use the new readahead operation in fuse. Switching away from the read_cache_pages() helper gets rid of an implicit call to put_page(), so we can get rid of the get_page() call in fuse_readpages_fill(). Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Dave Chinner --- fs/fuse/file.c | 46 +++++++++++++++++++--------------------------- 1 file changed, 19 insertions(+), 27 deletions(-) diff --git a/fs/fuse/file.c b/fs/fuse/file.c index 9d67b830fb7a..f64f98708b5e 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -923,9 +923,8 @@ struct fuse_fill_data { unsigned int max_pages; }; -static int fuse_readpages_fill(void *_data, struct page *page) +static int fuse_readpages_fill(struct fuse_fill_data *data, struct page *page) { - struct fuse_fill_data *data = _data; struct fuse_io_args *ia = data->ia; struct fuse_args_pages *ap = &ia->ap; struct inode *inode = data->inode; @@ -941,10 +940,8 @@ static int fuse_readpages_fill(void *_data, struct page *page) fc->max_pages); fuse_send_readpages(ia, data->file); data->ia = ia = fuse_io_alloc(NULL, data->max_pages); - if (!ia) { - unlock_page(page); + if (!ia) return -ENOMEM; - } ap = &ia->ap; } @@ -954,7 +951,6 @@ static int fuse_readpages_fill(void *_data, struct page *page) return -EIO; } - get_page(page); ap->pages[ap->num_pages] = page; ap->descs[ap->num_pages].length = PAGE_SIZE; ap->num_pages++; @@ -962,37 +958,33 @@ static int fuse_readpages_fill(void *_data, struct page *page) return 0; } -static int fuse_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +static void fuse_readahead(struct readahead_control *rac) { - struct inode *inode = mapping->host; + struct inode *inode = rac->mapping->host; struct fuse_conn *fc = get_fuse_conn(inode); struct fuse_fill_data data; - int err; + struct page *page; - err = -EIO; if (is_bad_inode(inode)) - goto out; + return; - data.file = file; + data.file = rac->file; 
data.inode = inode; - data.nr_pages = nr_pages; - data.max_pages = min_t(unsigned int, nr_pages, fc->max_pages); -; + data.nr_pages = readahead_count(rac); + data.max_pages = min_t(unsigned int, data.nr_pages, fc->max_pages); data.ia = fuse_io_alloc(NULL, data.max_pages); - err = -ENOMEM; if (!data.ia) - goto out; + return; - err = read_cache_pages(mapping, pages, fuse_readpages_fill, &data); - if (!err) { - if (data.ia->ap.num_pages) - fuse_send_readpages(data.ia, file); - else - fuse_io_free(data.ia); + readahead_for_each(rac, page) { + if (fuse_readpages_fill(&data, page) != 0) + return; } -out: - return err; + + if (data.ia->ap.num_pages) + fuse_send_readpages(data.ia, rac->file); + else + fuse_io_free(data.ia); } static ssize_t fuse_cache_read_iter(struct kiocb *iocb, struct iov_iter *to) @@ -3373,10 +3365,10 @@ static const struct file_operations fuse_file_operations = { static const struct address_space_operations fuse_file_aops = { .readpage = fuse_readpage, + .readahead = fuse_readahead, .writepage = fuse_writepage, .writepages = fuse_writepages, .launder_page = fuse_launder_page, - .readpages = fuse_readpages, .set_page_dirty = __set_page_dirty_nobuffers, .bmap = fuse_bmap, .direct_IO = fuse_direct_IO,
From patchwork Mon Feb 17 18:46:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11387469 From: Matthew Wilcox To: linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org Subject: [PATCH v6 17/19] iomap: Restructure iomap_readpages_actor Date: Mon, 17 Feb 2020 10:46:11 -0800 Message-Id: <20200217184613.19668-31-willy@infradead.org> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20200217184613.19668-1-willy@infradead.org> References: <20200217184613.19668-1-willy@infradead.org> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: "Matthew Wilcox (Oracle)" By putting the 'have we reached the
end of the page' condition at the end of the loop instead of the beginning, we can remove the 'submit the last page' code from iomap_readpages(). Also check that iomap_readpage_actor() didn't return 0, which would lead to an endless loop. Signed-off-by: Matthew Wilcox (Oracle) --- fs/iomap/buffered-io.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index cb3511eb152a..44303f370b2d 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -400,15 +400,9 @@ iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length, void *data, struct iomap *iomap, struct iomap *srcmap) { struct iomap_readpage_ctx *ctx = data; - loff_t done, ret; + loff_t ret, done = 0; - for (done = 0; done < length; done += ret) { - if (ctx->cur_page && offset_in_page(pos + done) == 0) { - if (!ctx->cur_page_in_bio) - unlock_page(ctx->cur_page); - put_page(ctx->cur_page); - ctx->cur_page = NULL; - } + while (done < length) { if (!ctx->cur_page) { ctx->cur_page = iomap_next_page(inode, ctx->pages, pos, length, &done); @@ -418,6 +412,15 @@ iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length, } ret = iomap_readpage_actor(inode, pos + done, length - done, ctx, iomap, srcmap); + if (WARN_ON(ret == 0)) + break; + done += ret; + if (offset_in_page(pos + done) == 0) { + if (!ctx->cur_page_in_bio) + unlock_page(ctx->cur_page); + put_page(ctx->cur_page); + ctx->cur_page = NULL; + } } return done; @@ -451,11 +454,7 @@ iomap_readpages(struct address_space *mapping, struct list_head *pages, done: if (ctx.bio) submit_bio(ctx.bio); - if (ctx.cur_page) { - if (!ctx.cur_page_in_bio) - unlock_page(ctx.cur_page); - put_page(ctx.cur_page); - } + BUG_ON(ctx.cur_page); /* * Check that we didn't lose a page due to the arcance calling From patchwork Mon Feb 17 18:46:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 
Matthew Wilcox X-Patchwork-Id: 11387413 From: Matthew Wilcox To: linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, 
linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org Subject: [PATCH v6 18/19] iomap: Convert from readpages to readahead Date: Mon, 17 Feb 2020 10:46:12 -0800 Message-Id: <20200217184613.19668-32-willy@infradead.org> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20200217184613.19668-1-willy@infradead.org> References: <20200217184613.19668-1-willy@infradead.org> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: "Matthew Wilcox (Oracle)" Use the new readahead operation in iomap. Convert XFS and ZoneFS to use it. Signed-off-by: Matthew Wilcox (Oracle) --- fs/iomap/buffered-io.c | 91 +++++++++++++++--------------------------- fs/iomap/trace.h | 2 +- fs/xfs/xfs_aops.c | 13 +++--- fs/zonefs/super.c | 7 ++-- include/linux/iomap.h | 3 +- 5 files changed, 42 insertions(+), 74 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 44303f370b2d..2bfcd5242264 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -214,9 +214,8 @@ iomap_read_end_io(struct bio *bio) struct iomap_readpage_ctx { struct page *cur_page; bool cur_page_in_bio; - bool is_readahead; struct bio *bio; - struct list_head *pages; + struct readahead_control *rac; }; static void @@ -307,11 +306,11 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data, if (ctx->bio) submit_bio(ctx->bio); - if (ctx->is_readahead) /* same as readahead_gfp_mask */ + if (ctx->rac) /* same as readahead_gfp_mask */ gfp |= __GFP_NORETRY | __GFP_NOWARN; ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs)); ctx->bio->bi_opf = REQ_OP_READ; - if (ctx->is_readahead) + if (ctx->rac) ctx->bio->bi_opf |= REQ_RAHEAD; ctx->bio->bi_iter.bi_sector = sector; bio_set_dev(ctx->bio, iomap->bdev); @@ -367,36 +366,8 @@ iomap_readpage(struct page *page, const struct iomap_ops *ops) } EXPORT_SYMBOL_GPL(iomap_readpage); -static struct 
page * -iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos, - loff_t length, loff_t *done) -{ - while (!list_empty(pages)) { - struct page *page = lru_to_page(pages); - - if (page_offset(page) >= (u64)pos + length) - break; - - list_del(&page->lru); - if (!add_to_page_cache_lru(page, inode->i_mapping, page->index, - GFP_NOFS)) - return page; - - /* - * If we already have a page in the page cache at index we are - * done. Upper layers don't care if it is uptodate after the - * readpages call itself as every page gets checked again once - * actually needed. - */ - *done += PAGE_SIZE; - put_page(page); - } - - return NULL; -} - static loff_t -iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length, +iomap_readahead_actor(struct inode *inode, loff_t pos, loff_t length, void *data, struct iomap *iomap, struct iomap *srcmap) { struct iomap_readpage_ctx *ctx = data; @@ -404,10 +375,7 @@ iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length, while (done < length) { if (!ctx->cur_page) { - ctx->cur_page = iomap_next_page(inode, ctx->pages, - pos, length, &done); - if (!ctx->cur_page) - break; + ctx->cur_page = readahead_page(ctx->rac); ctx->cur_page_in_bio = false; } ret = iomap_readpage_actor(inode, pos + done, length - done, @@ -416,6 +384,7 @@ iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length, break; done += ret; if (offset_in_page(pos + done) == 0) { + readahead_next(ctx->rac); if (!ctx->cur_page_in_bio) unlock_page(ctx->cur_page); put_page(ctx->cur_page); @@ -426,44 +395,48 @@ iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length, return done; } -int -iomap_readpages(struct address_space *mapping, struct list_head *pages, - unsigned nr_pages, const struct iomap_ops *ops) +/** + * iomap_readahead - Attempt to read pages from a file. + * @rac: Describes the pages to be read. + * @ops: The operations vector for the filesystem. 
+ * + * This function is for filesystems to call to implement their readahead + * address_space operation. + * + * Context: The file is pinned by the caller, and the pages to be read are + * all locked and have an elevated refcount. This function will unlock + * the pages (once I/O has completed on them, or I/O has been determined to + * not be necessary). It will also decrease the refcount once the pages + * have been submitted for I/O. After this point, the page may be removed + * from the page cache, and should not be referenced. + */ +void iomap_readahead(struct readahead_control *rac, const struct iomap_ops *ops) { + struct inode *inode = rac->mapping->host; struct iomap_readpage_ctx ctx = { - .pages = pages, - .is_readahead = true, + .rac = rac, }; - loff_t pos = page_offset(list_entry(pages->prev, struct page, lru)); - loff_t last = page_offset(list_entry(pages->next, struct page, lru)); - loff_t length = last - pos + PAGE_SIZE, ret = 0; + loff_t pos = readahead_offset(rac); + loff_t length = readahead_length(rac); - trace_iomap_readpages(mapping->host, nr_pages); + trace_iomap_readahead(inode, readahead_count(rac)); while (length > 0) { - ret = iomap_apply(mapping->host, pos, length, 0, ops, - &ctx, iomap_readpages_actor); + loff_t ret = iomap_apply(inode, pos, length, 0, ops, + &ctx, iomap_readahead_actor); if (ret <= 0) { WARN_ON_ONCE(ret == 0); - goto done; + break; } pos += ret; length -= ret; } - ret = 0; -done: + if (ctx.bio) submit_bio(ctx.bio); BUG_ON(ctx.cur_page); - - /* - * Check that we didn't lose a page due to the arcance calling - * conventions.. 
- */ - WARN_ON_ONCE(!ret && !list_empty(ctx.pages)); - return ret; } -EXPORT_SYMBOL_GPL(iomap_readpages); +EXPORT_SYMBOL_GPL(iomap_readahead); /* * iomap_is_partially_uptodate checks whether blocks within a page are diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h index 6dc227b8c47e..d6ba705f938a 100644 --- a/fs/iomap/trace.h +++ b/fs/iomap/trace.h @@ -39,7 +39,7 @@ DEFINE_EVENT(iomap_readpage_class, name, \ TP_PROTO(struct inode *inode, int nr_pages), \ TP_ARGS(inode, nr_pages)) DEFINE_READPAGE_EVENT(iomap_readpage); -DEFINE_READPAGE_EVENT(iomap_readpages); +DEFINE_READPAGE_EVENT(iomap_readahead); DECLARE_EVENT_CLASS(iomap_page_class, TP_PROTO(struct inode *inode, struct page *page, unsigned long off, diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 58e937be24ce..6e68eeb50b07 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -621,14 +621,11 @@ xfs_vm_readpage( return iomap_readpage(page, &xfs_read_iomap_ops); } -STATIC int -xfs_vm_readpages( - struct file *unused, - struct address_space *mapping, - struct list_head *pages, - unsigned nr_pages) +STATIC void +xfs_vm_readahead( + struct readahead_control *rac) { - return iomap_readpages(mapping, pages, nr_pages, &xfs_read_iomap_ops); + iomap_readahead(rac, &xfs_read_iomap_ops); } static int @@ -644,7 +641,7 @@ xfs_iomap_swapfile_activate( const struct address_space_operations xfs_address_space_operations = { .readpage = xfs_vm_readpage, - .readpages = xfs_vm_readpages, + .readahead = xfs_vm_readahead, .writepage = xfs_vm_writepage, .writepages = xfs_vm_writepages, .set_page_dirty = iomap_set_page_dirty, diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index 8bc6ef82d693..8327a01d3bac 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -78,10 +78,9 @@ static int zonefs_readpage(struct file *unused, struct page *page) return iomap_readpage(page, &zonefs_iomap_ops); } -static int zonefs_readpages(struct file *unused, struct address_space *mapping, - struct list_head *pages, unsigned int nr_pages) 
+static void zonefs_readahead(struct readahead_control *rac) { - return iomap_readpages(mapping, pages, nr_pages, &zonefs_iomap_ops); + iomap_readahead(rac, &zonefs_iomap_ops); } /* @@ -128,7 +127,7 @@ static int zonefs_writepages(struct address_space *mapping, static const struct address_space_operations zonefs_file_aops = { .readpage = zonefs_readpage, - .readpages = zonefs_readpages, + .readahead = zonefs_readahead, .writepage = zonefs_writepage, .writepages = zonefs_writepages, .set_page_dirty = iomap_set_page_dirty, diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 8b09463dae0d..bc20bd04c2a2 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -155,8 +155,7 @@ loff_t iomap_apply(struct inode *inode, loff_t pos, loff_t length, ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from, const struct iomap_ops *ops); int iomap_readpage(struct page *page, const struct iomap_ops *ops); -int iomap_readpages(struct address_space *mapping, struct list_head *pages, - unsigned nr_pages, const struct iomap_ops *ops); +void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops); int iomap_set_page_dirty(struct page *page); int iomap_is_partially_uptodate(struct page *page, unsigned long from, unsigned long count);
From patchwork Mon Feb 17 18:46:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11387443
From: Matthew Wilcox To: linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org, Cong Wang , Michal Hocko Subject: [PATCH v6 19/19] mm: Use memalloc_nofs_save in readahead path Date: Mon, 17 Feb 2020 10:46:13 -0800 Message-Id: <20200217184613.19668-33-willy@infradead.org> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20200217184613.19668-1-willy@infradead.org> References: <20200217184613.19668-1-willy@infradead.org> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: 
linux-btrfs@vger.kernel.org From: "Matthew Wilcox (Oracle)" Ensure that memory allocations in the readahead path do not attempt to reclaim file-backed pages, which could lead to a deadlock. It is possible, though unlikely, that this is the root cause of a problem observed by Cong Wang. Signed-off-by: Matthew Wilcox (Oracle) Reported-by: Cong Wang Suggested-by: Michal Hocko --- mm/readahead.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/mm/readahead.c b/mm/readahead.c index 94d499cfb657..8f9c0dba24e7 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -22,6 +22,7 @@ #include #include #include +#include #include "internal.h" @@ -174,6 +175,18 @@ void page_cache_readahead_limit(struct address_space *mapping, ._nr_pages = 0, }; + /* + * Partway through the readahead operation, we will have added + * locked pages to the page cache, but will not yet have submitted + * them for I/O. Adding another page may need to allocate memory, + * which can trigger memory reclaim. Telling the VM we're in + * the middle of a filesystem operation will cause it to not + * touch file-backed pages, preventing a deadlock. Most (all?) + * filesystems already specify __GFP_NOFS in their mapping's + * gfp_mask, but let's be explicit here. + */ + unsigned int nofs = memalloc_nofs_save(); + /* * Preallocate as many pages as we will need. */ @@ -227,6 +240,7 @@ void page_cache_readahead_limit(struct address_space *mapping, if (readahead_count(&rac)) read_pages(&rac, &page_pool); BUG_ON(!list_empty(&page_pool)); + memalloc_nofs_restore(nofs); } EXPORT_SYMBOL_GPL(page_cache_readahead_limit);
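The save/restore pair added by the last patch brackets the whole readahead function, and the restore takes the value the save returned, so the scope nests safely inside a caller that is already in a NOFS scope. The userspace sketch below only models that shape. Every name in it (model_task_flags, MODEL_NOFS, model_nofs_save(), model_nofs_restore(), model_may_enter_fs_reclaim(), model_readahead()) is hypothetical and is not the kernel's implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a per-task flag word; not the kernel's. */
#define MODEL_NOFS 0x1u
static unsigned int model_task_flags;

/* Set the NOFS bit and return its previous value so scopes can nest. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_NOFS;

	model_task_flags |= MODEL_NOFS;
	return old;
}

/* Put the NOFS bit back to whatever the matching save observed. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_NOFS) | old;
}

/* In this model, reclaim may touch file-backed pages only outside a scope. */
static bool model_may_enter_fs_reclaim(void)
{
	return !(model_task_flags & MODEL_NOFS);
}

/*
 * Mirrors the shape of the patched function: save on entry, do the work,
 * restore on exit. Returns what an allocation made inside would observe.
 */
static bool model_readahead(void)
{
	unsigned int nofs = model_nofs_save();
	bool reclaim_allowed_inside = model_may_enter_fs_reclaim();

	model_nofs_restore(nofs);
	return reclaim_allowed_inside;
}
```

Calling model_readahead() from an unrestricted context leaves the flag clear afterwards, while calling it between an outer model_nofs_save() and model_nofs_restore() leaves the outer restriction in place. That property is why the restore takes the saved value instead of simply clearing the bit.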