From patchwork Thu May 31 18:06:12 2018
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 10441711
From: Christoph Hellwig
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 11/13] iomap: add an iomap-based readpage and readpages implementation
Date: Thu, 31 May 2018 20:06:12 +0200
Message-Id: <20180531180614.21506-12-hch@lst.de>
In-Reply-To: <20180531180614.21506-1-hch@lst.de>
References: <20180531180614.21506-1-hch@lst.de>

Simply use iomap_apply to iterate over the file and submit a bio for each
non-uptodate but mapped region, zeroing everything else.  Note that as-is
this cannot be used for file systems with a blocksize smaller than the
page size, but that support will be added later.

Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
---
 fs/iomap.c            | 214 +++++++++++++++++++++++++++++++++++++++++-
 include/linux/iomap.h |   4 +
 2 files changed, 217 insertions(+), 1 deletion(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index b0bc928672af..106720355963 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1,6 +1,6 @@
 /*
  * Copyright (C) 2010 Red Hat, Inc.
- * Copyright (c) 2016 Christoph Hellwig.
+ * Copyright (c) 2016-2018 Christoph Hellwig.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -102,6 +103,217 @@ iomap_sector(struct iomap *iomap, loff_t pos)
 	return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT;
 }
 
+static void
+iomap_read_end_io(struct bio *bio)
+{
+	int error = blk_status_to_errno(bio->bi_status);
+	struct bio_vec *bvec;
+	int i;
+
+	bio_for_each_segment_all(bvec, bio, i)
+		page_endio(bvec->bv_page, false, error);
+	bio_put(bio);
+}
+
+struct iomap_readpage_ctx {
+	struct page		*cur_page;
+	bool			cur_page_in_bio;
+	bool			is_readahead;
+	struct bio		*bio;
+	struct list_head	*pages;
+};
+
+static loff_t
+iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
+		struct iomap *iomap)
+{
+	struct iomap_readpage_ctx *ctx = data;
+	struct page *page = ctx->cur_page;
+	unsigned poff = pos & (PAGE_SIZE - 1);
+	unsigned plen = min_t(loff_t, PAGE_SIZE - poff, length);
+	bool is_contig = false;
+	sector_t sector;
+
+	/* we don't support blocksize < PAGE_SIZE quite yet. */
+	WARN_ON_ONCE(pos != page_offset(page));
+	WARN_ON_ONCE(plen != PAGE_SIZE);
+
+	if (iomap->type != IOMAP_MAPPED || pos >= i_size_read(inode)) {
+		zero_user(page, poff, plen);
+		SetPageUptodate(page);
+		goto done;
+	}
+
+	ctx->cur_page_in_bio = true;
+
+	/*
+	 * Try to merge into a previous segment if we can.
+	 */
+	sector = iomap_sector(iomap, pos);
+	if (ctx->bio && bio_end_sector(ctx->bio) == sector) {
+		if (__bio_try_merge_page(ctx->bio, page, plen, poff))
+			goto done;
+		is_contig = true;
+	}
+
+	if (!ctx->bio || !is_contig || bio_full(ctx->bio)) {
+		gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
+		int nr_vecs = (length + PAGE_SIZE - 1) >> PAGE_SHIFT;
+
+		if (ctx->bio)
+			submit_bio(ctx->bio);
+
+		if (ctx->is_readahead) /* same as readahead_gfp_mask */
+			gfp |= __GFP_NORETRY | __GFP_NOWARN;
+		ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs));
+		ctx->bio->bi_opf = REQ_OP_READ;
+		if (ctx->is_readahead)
+			ctx->bio->bi_opf |= REQ_RAHEAD;
+		ctx->bio->bi_iter.bi_sector = sector;
+		bio_set_dev(ctx->bio, iomap->bdev);
+		ctx->bio->bi_end_io = iomap_read_end_io;
+	}
+
+	__bio_add_page(ctx->bio, page, plen, poff);
+done:
+	return plen;
+}
+
+int
+iomap_readpage(struct page *page, const struct iomap_ops *ops)
+{
+	struct iomap_readpage_ctx ctx = { .cur_page = page };
+	struct inode *inode = page->mapping->host;
+	unsigned poff;
+	loff_t ret;
+
+	WARN_ON_ONCE(page_has_buffers(page));
+
+	for (poff = 0; poff < PAGE_SIZE; poff += ret) {
+		ret = iomap_apply(inode, page_offset(page) + poff,
+				PAGE_SIZE - poff, 0, ops, &ctx,
+				iomap_readpage_actor);
+		if (ret <= 0) {
+			WARN_ON_ONCE(ret == 0);
+			SetPageError(page);
+			break;
+		}
+	}
+
+	if (ctx.bio) {
+		submit_bio(ctx.bio);
+		WARN_ON_ONCE(!ctx.cur_page_in_bio);
+	} else {
+		WARN_ON_ONCE(ctx.cur_page_in_bio);
+		unlock_page(page);
+	}
+
+	/*
+	 * Just like mpage_readpages and block_read_full_page we always
+	 * return 0 and just mark the page as PageError on errors.  This
+	 * should be cleaned up all through the stack eventually.
+	 */
+	return 0;
+}
+EXPORT_SYMBOL_GPL(iomap_readpage);
+
+static struct page *
+iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos,
+		loff_t length, loff_t *done)
+{
+	while (!list_empty(pages)) {
+		struct page *page = lru_to_page(pages);
+
+		if (page_offset(page) >= (u64)pos + length)
+			break;
+
+		list_del(&page->lru);
+		if (!add_to_page_cache_lru(page, inode->i_mapping, page->index,
+				GFP_NOFS))
+			return page;
+
+		/*
+		 * If we already have a page in the page cache at index we are
+		 * done.  Upper layers don't care if it is uptodate after the
+		 * readpages call itself as every page gets checked again once
+		 * actually needed.
+		 */
+		*done += PAGE_SIZE;
+		put_page(page);
+	}
+
+	return NULL;
+}
+
+static loff_t
+iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length,
+		void *data, struct iomap *iomap)
+{
+	struct iomap_readpage_ctx *ctx = data;
+	loff_t done, ret;
+
+	for (done = 0; done < length; done += ret) {
+		if (ctx->cur_page && ((pos + done) & (PAGE_SIZE - 1)) == 0) {
+			if (!ctx->cur_page_in_bio)
+				unlock_page(ctx->cur_page);
+			put_page(ctx->cur_page);
+			ctx->cur_page = NULL;
+		}
+		if (!ctx->cur_page) {
+			ctx->cur_page = iomap_next_page(inode, ctx->pages,
+					pos, length, &done);
+			if (!ctx->cur_page)
+				break;
+			ctx->cur_page_in_bio = false;
+		}
+		ret = iomap_readpage_actor(inode, pos + done, length - done,
+				ctx, iomap);
+	}
+
+	return done;
+}
+
+int
+iomap_readpages(struct address_space *mapping, struct list_head *pages,
+		unsigned nr_pages, const struct iomap_ops *ops)
+{
+	struct iomap_readpage_ctx ctx = {
+		.pages		= pages,
+		.is_readahead	= true,
+	};
+	loff_t pos = page_offset(list_entry(pages->prev, struct page, lru));
+	loff_t last = page_offset(list_entry(pages->next, struct page, lru));
+	loff_t length = last - pos + PAGE_SIZE, ret = 0;
+
+	while (length > 0) {
+		ret = iomap_apply(mapping->host, pos, length, 0, ops,
+				&ctx, iomap_readpages_actor);
+		if (ret <= 0) {
+			WARN_ON_ONCE(ret == 0);
+			goto done;
+		}
+		pos += ret;
+		length -= ret;
+	}
+	ret = 0;
+done:
+	if (ctx.bio)
+		submit_bio(ctx.bio);
+	if (ctx.cur_page) {
+		if (!ctx.cur_page_in_bio)
+			unlock_page(ctx.cur_page);
+		put_page(ctx.cur_page);
+	}
+
+	/*
+	 * Check that we didn't lose a page due to the arcane calling
+	 * conventions.
+	 */
+	WARN_ON_ONCE(!ret && !list_empty(ctx.pages));
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iomap_readpages);
+
 static void
 iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 {
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index a044a824da85..7300d30ca495 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -9,6 +9,7 @@ struct fiemap_extent_info;
 struct inode;
 struct iov_iter;
 struct kiocb;
+struct page;
 struct vm_area_struct;
 struct vm_fault;
 
@@ -88,6 +89,9 @@ struct iomap_ops {
 ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
 		const struct iomap_ops *ops);
+int iomap_readpage(struct page *page, const struct iomap_ops *ops);
+int iomap_readpages(struct address_space *mapping, struct list_head *pages,
+		unsigned nr_pages, const struct iomap_ops *ops);
 int iomap_file_dirty(struct inode *inode, loff_t pos, loff_t len,
 		const struct iomap_ops *ops);
 int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,