From patchwork Fri May 18 16:48:12 2018
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 10411295
From: Christoph Hellwig
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 16/34] iomap: add initial support for writes without buffer heads
Date: Fri, 18 May 2018 18:48:12 +0200
Message-Id: <20180518164830.1552-17-hch@lst.de>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180518164830.1552-1-hch@lst.de>
References: <20180518164830.1552-1-hch@lst.de>

For now this is limited to blocksize == PAGE_SIZE, where we can simply
read in the full page in write begin, and just set the whole page dirty
after copying data into it.  This code is enabled by default, and XFS
will now be fed pages without buffer heads in ->writepage and
->writepages.

If a file system sets the IOMAP_F_BUFFER_HEAD flag on the iomap, the
old path will still be used; this both helps the transition in XFS and
prepares for the gfs2 migration to the iomap infrastructure.

Signed-off-by: Christoph Hellwig
---
 fs/iomap.c            | 132 ++++++++++++++++++++++++++++++++++++++----
 fs/xfs/xfs_iomap.c    |   6 +-
 include/linux/iomap.h |   2 +
 3 files changed, 127 insertions(+), 13 deletions(-)
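
As a quick illustration of how the new flag is meant to be used (a
minimal sketch for review purposes only, not part of the patch; the
"myfs" names are hypothetical): a file system that still depends on
buffer heads keeps the old write path simply by setting
IOMAP_F_BUFFER_HEAD from its ->iomap_begin method, exactly as the XFS
hunk below does:

  #include <linux/fs.h>
  #include <linux/iomap.h>

  static int
  myfs_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
                  unsigned flags, struct iomap *iomap)
  {
          /*
           * Ask iomap_write_begin()/iomap_write_end() to keep using
           * __block_write_begin_int()/generic_write_end() instead of
           * the new bufferhead-free helpers added by this patch.
           */
          iomap->flags |= IOMAP_F_BUFFER_HEAD;

          /*
           * ... fill in iomap->type, iomap->addr, iomap->offset,
           * iomap->length and iomap->bdev as usual ...
           */
          return 0;
  }

Leaving the flag clear opts the mapping into the new bufferhead-free
helpers, which is the default behavior.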
diff --git a/fs/iomap.c b/fs/iomap.c
index 821671af2618..cd4c563db80a 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -314,6 +314,58 @@ iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
         truncate_pagecache_range(inode, max(pos, i_size), pos + len);
 }
 
+static int
+iomap_read_page_sync(struct inode *inode, loff_t block_start, struct page *page,
+                unsigned poff, unsigned plen, struct iomap *iomap)
+{
+        struct bio_vec bvec;
+        struct bio bio;
+        int ret;
+
+        bio_init(&bio, &bvec, 1);
+        bio.bi_opf = REQ_OP_READ;
+        bio.bi_iter.bi_sector = iomap_sector(iomap, block_start);
+        bio_set_dev(&bio, iomap->bdev);
+        __bio_add_page(&bio, page, plen, poff);
+        ret = submit_bio_wait(&bio);
+        if (ret < 0 && iomap_block_needs_zeroing(inode, block_start, iomap))
+                zero_user(page, poff, plen);
+        return ret;
+}
+
+static int
+__iomap_write_begin(struct inode *inode, loff_t pos, unsigned len,
+                struct page *page, struct iomap *iomap)
+{
+        loff_t block_size = i_blocksize(inode);
+        loff_t block_start = pos & ~(block_size - 1);
+        loff_t block_end = (pos + len + block_size - 1) & ~(block_size - 1);
+        unsigned poff = block_start & (PAGE_SIZE - 1);
+        unsigned plen = min_t(loff_t, PAGE_SIZE - poff, block_end - block_start);
+        int status;
+
+        WARN_ON_ONCE(i_blocksize(inode) < PAGE_SIZE);
+
+        if (PageUptodate(page))
+                return 0;
+
+        if (iomap_block_needs_zeroing(inode, block_start, iomap)) {
+                unsigned from = pos & (PAGE_SIZE - 1), to = from + len;
+                unsigned pend = poff + plen;
+
+                if (poff < from || pend > to)
+                        zero_user_segments(page, poff, from, to, pend);
+        } else {
+                status = iomap_read_page_sync(inode, block_start, page,
+                                poff, plen, iomap);
+                if (status < 0)
+                        return status;
+                SetPageUptodate(page);
+        }
+
+        return 0;
+}
+
 static int
 iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
                 struct page **pagep, struct iomap *iomap)
@@ -331,7 +383,10 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
         if (!page)
                 return -ENOMEM;
 
-        status = __block_write_begin_int(page, pos, len, NULL, iomap);
+        if (iomap->flags & IOMAP_F_BUFFER_HEAD)
+                status = __block_write_begin_int(page, pos, len, NULL, iomap);
+        else
+                status = __iomap_write_begin(inode, pos, len, page, iomap);
         if (unlikely(status)) {
                 unlock_page(page);
                 put_page(page);
@@ -344,14 +399,63 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
         return status;
 }
 
+int
+iomap_set_page_dirty(struct page *page)
+{
+        struct address_space *mapping = page_mapping(page);
+        int newly_dirty;
+
+        if (unlikely(!mapping))
+                return !TestSetPageDirty(page);
+
+        /*
+         * Lock out page->mem_cgroup migration to keep PageDirty
+         * synchronized with per-memcg dirty page counters.
+         */
+        lock_page_memcg(page);
+        newly_dirty = !TestSetPageDirty(page);
+        if (newly_dirty)
+                __set_page_dirty(page, mapping, 0);
+        unlock_page_memcg(page);
+
+        if (newly_dirty)
+                __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
+        return newly_dirty;
+}
+EXPORT_SYMBOL_GPL(iomap_set_page_dirty);
+
+static int
+__iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
+                unsigned copied, struct page *page, struct iomap *iomap)
+{
+        unsigned start = pos & (PAGE_SIZE - 1);
+
+        if (unlikely(copied < len)) {
+                /* see block_write_end() for an explanation */
+                if (!PageUptodate(page))
+                        copied = 0;
+                if (iomap_block_needs_zeroing(inode, pos, iomap))
+                        zero_user(page, start + copied, len - copied);
+        }
+
+        flush_dcache_page(page);
+        SetPageUptodate(page);
+        iomap_set_page_dirty(page);
+        return __generic_write_end(inode, pos, copied, page);
+}
+
 static int
 iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
-                unsigned copied, struct page *page)
+                unsigned copied, struct page *page, struct iomap *iomap)
 {
         int ret;
 
-        ret = generic_write_end(NULL, inode->i_mapping, pos, len,
-                        copied, page, NULL);
+        if (iomap->flags & IOMAP_F_BUFFER_HEAD)
+                ret = generic_write_end(NULL, inode->i_mapping, pos, len,
+                                copied, page, NULL);
+        else
+                ret = __iomap_write_end(inode, pos, len, copied, page, iomap);
+
         if (ret < len)
                 iomap_write_failed(inode, pos, len);
         return ret;
@@ -406,7 +510,8 @@ iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
                 flush_dcache_page(page);
 
-                status = iomap_write_end(inode, pos, bytes, copied, page);
+                status = iomap_write_end(inode, pos, bytes, copied, page,
+                                iomap);
                 if (unlikely(status < 0))
                         break;
                 copied = status;
@@ -500,7 +605,7 @@ iomap_dirty_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
                 WARN_ON_ONCE(!PageUptodate(page));
 
-                status = iomap_write_end(inode, pos, bytes, bytes, page);
+                status = iomap_write_end(inode, pos, bytes, bytes, page, iomap);
                 if (unlikely(status <= 0)) {
                         if (WARN_ON_ONCE(status == 0))
                                 return -EIO;
@@ -552,7 +657,7 @@ static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset,
         zero_user(page, offset, bytes);
         mark_page_accessed(page);
 
-        return iomap_write_end(inode, pos, bytes, bytes, page);
+        return iomap_write_end(inode, pos, bytes, bytes, page, iomap);
 }
 
 static int iomap_dax_zero(loff_t pos, unsigned offset, unsigned bytes,
@@ -638,11 +743,16 @@ iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length,
         struct page *page = data;
         int ret;
 
-        ret = __block_write_begin_int(page, pos, length, NULL, iomap);
-        if (ret)
-                return ret;
+        if (iomap->flags & IOMAP_F_BUFFER_HEAD) {
+                ret = __block_write_begin_int(page, pos, length, NULL, iomap);
+                if (ret)
+                        return ret;
+                block_commit_write(page, 0, length);
+        } else {
+                WARN_ON_ONCE(!PageUptodate(page));
+                WARN_ON_ONCE(i_blocksize(inode) < PAGE_SIZE);
+        }
 
-        block_commit_write(page, 0, length);
         return length;
 }
 
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index c6ce6f9335b6..da6d1995e460 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -638,7 +638,7 @@ xfs_file_iomap_begin_delay(
          * Flag newly allocated delalloc blocks with IOMAP_F_NEW so we punch
          * them out if the write happens to fail.
          */
-        iomap->flags = IOMAP_F_NEW;
+        iomap->flags |= IOMAP_F_NEW;
         trace_xfs_iomap_alloc(ip, offset, count, 0, &got);
 done:
         if (isnullstartblock(got.br_startblock))
@@ -1031,6 +1031,8 @@ xfs_file_iomap_begin(
         if (XFS_FORCED_SHUTDOWN(mp))
                 return -EIO;
 
+        iomap->flags |= IOMAP_F_BUFFER_HEAD;
+
         if (((flags & (IOMAP_WRITE | IOMAP_DIRECT)) == IOMAP_WRITE) &&
                         !IS_DAX(inode) && !xfs_get_extsz_hint(ip)) {
                 /* Reserve delalloc blocks for regular writeback. */
@@ -1131,7 +1133,7 @@ xfs_file_iomap_begin(
         if (error)
                 return error;
 
-        iomap->flags = IOMAP_F_NEW;
+        iomap->flags |= IOMAP_F_NEW;
         trace_xfs_iomap_alloc(ip, offset, length, 0, &imap);
 
 out_finish:
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 7300d30ca495..4d3d9d0cd69f 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -30,6 +30,7 @@ struct vm_fault;
  */
 #define IOMAP_F_NEW             0x01    /* blocks have been newly allocated */
 #define IOMAP_F_DIRTY           0x02    /* uncommitted metadata */
+#define IOMAP_F_BUFFER_HEAD     0x04    /* file system requires buffer heads */
 
 /*
  * Flags that only need to be reported for IOMAP_REPORT requests:
@@ -92,6 +93,7 @@ ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
 int iomap_readpage(struct page *page, const struct iomap_ops *ops);
 int iomap_readpages(struct address_space *mapping, struct list_head *pages,
                 unsigned nr_pages, const struct iomap_ops *ops);
+int iomap_set_page_dirty(struct page *page);
 int iomap_file_dirty(struct inode *inode, loff_t pos, loff_t len,
                 const struct iomap_ops *ops);
 int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,
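
A worked example of the alignment arithmetic in __iomap_write_begin()
above (editorial illustration only; the numbers are made up): with
block_size == PAGE_SIZE == 4096, pos == 5000 and len == 2000:

  loff_t block_start = pos & ~(block_size - 1);             /* 4096 */
  loff_t block_end = (pos + len + block_size - 1)
                          & ~(block_size - 1);              /* 8192 */
  unsigned poff = block_start & (PAGE_SIZE - 1);            /* 0    */
  unsigned plen = min_t(loff_t, PAGE_SIZE - poff,
                  block_end - block_start);                 /* 4096 */

Because block_start is always a multiple of PAGE_SIZE in the
blocksize == PAGE_SIZE case, poff is always 0 and plen always covers
the full page, which is why __iomap_write_begin() can call
SetPageUptodate() as soon as iomap_read_page_sync() succeeds.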