From patchwork Sun Sep 1 20:08:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125645 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 492321510 for ; Sun, 1 Sep 2019 20:09:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3210122CF7 for ; Sun, 1 Sep 2019 20:09:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728879AbfIAUIu (ORCPT ); Sun, 1 Sep 2019 16:08:50 -0400 Received: from mx2.suse.de ([195.135.220.15]:50372 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728610AbfIAUIt (ORCPT ); Sun, 1 Sep 2019 16:08:49 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id B2210B07B; Sun, 1 Sep 2019 20:08:48 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG Date: Sun, 1 Sep 2019 15:08:22 -0500 Message-Id: <20190901200836.14959-2-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues To improve debugging abilities, especially invalid option asserts. Signed-off-by: Goldwyn Rodrigues Signed-off-by: Goldwyn Rodrigues Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong --- fs/Kconfig | 3 +++ include/linux/iomap.h | 11 +++++++++++ 2 files changed, 14 insertions(+) diff --git a/fs/Kconfig b/fs/Kconfig index bfb1c6095c7a..4bed5df9b55f 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -19,6 +19,9 @@ if BLOCK config FS_IOMAP bool +config FS_IOMAP_DEBUG + bool "Debugging for the iomap code" + depends on FS_IOMAP source "fs/ext2/Kconfig" source "fs/ext4/Kconfig" diff --git a/include/linux/iomap.h b/include/linux/iomap.h index bc499ceae392..209b6c93674e 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -18,6 +18,17 @@ struct page; struct vm_area_struct; struct vm_fault; +#ifdef CONFIG_FS_IOMAP_DEBUG +#define iomap_assert(expr) \ + if(!(expr)) { \ + printk( "Assertion failed! %s,%s,%s,line=%d\n",\ +#expr,__FILE__,__func__,__LINE__); \ + BUG(); \ + } +#else +#define iomap_assert(expr) +#endif + /* * Types of block ranges for iomap mappings: */ From patchwork Sun Sep 1 20:08:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125653 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E466413B1 for ; Sun, 1 Sep 2019 20:09:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B695E2190F for ; Sun, 1 Sep 2019 20:09:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728923AbfIAUIy (ORCPT ); Sun, 1 Sep 2019 16:08:54 -0400 Received: from mx2.suse.de ([195.135.220.15]:50396 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728610AbfIAUIx (ORCPT ); Sun, 1 Sep 2019 16:08:53 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id CC513AE3F; Sun, 1 Sep 2019 20:08:50 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O Date: Sun, 1 Sep 2019 15:08:23 -0500 Message-Id: <20190901200836.14959-3-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Introduces a new type IOMAP_COW, which means the data at offset must be read from a srcmap and copied before performing the write on the offset. The srcmap is used to identify where the read is to be performed from. This is passed to iomap->begin() of the respective filesystem, which is supposed to put in the details for reading before performing the copy for CoW. Signed-off-by: Goldwyn Rodrigues --- fs/dax.c | 8 +++++--- fs/ext2/inode.c | 2 +- fs/ext4/inode.c | 2 +- fs/gfs2/bmap.c | 3 ++- fs/iomap/apply.c | 5 +++-- fs/iomap/buffered-io.c | 14 +++++++------- fs/iomap/direct-io.c | 2 +- fs/iomap/fiemap.c | 4 ++-- fs/iomap/seek.c | 4 ++-- fs/iomap/swapfile.c | 3 ++- fs/xfs/xfs_iomap.c | 9 ++++++--- include/linux/iomap.h | 6 ++++-- 12 files changed, 36 insertions(+), 26 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 6bf81f931de3..e961d8dc23ef 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1090,7 +1090,7 @@ EXPORT_SYMBOL_GPL(__dax_zero_page_range); static loff_t dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, - struct iomap *iomap) + struct iomap *iomap, struct iomap *srcmap) { struct block_device *bdev = iomap->bdev; struct dax_device *dax_dev = iomap->dax_dev; @@ -1248,6 +1248,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, unsigned long vaddr = vmf->address; loff_t pos = (loff_t)vmf->pgoff << PAGE_SHIFT; struct iomap iomap = { 0 }; + struct iomap srcmap = { 0 }; unsigned flags = IOMAP_FAULT; int error, major = 0; bool write = vmf->flags & FAULT_FLAG_WRITE; @@ -1292,7 +1293,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, * the file system block size to be equal the page size, which means * that we never have to deal with more than a single extent here. */ - error = ops->iomap_begin(inode, pos, PAGE_SIZE, flags, &iomap); + error = ops->iomap_begin(inode, pos, PAGE_SIZE, flags, &iomap, &srcmap); if (iomap_errp) *iomap_errp = error; if (error) { @@ -1472,6 +1473,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, struct inode *inode = mapping->host; vm_fault_t result = VM_FAULT_FALLBACK; struct iomap iomap = { 0 }; + struct iomap srcmap = { 0 }; pgoff_t max_pgoff; void *entry; loff_t pos; @@ -1546,7 +1548,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, * to look up our filesystem block. */ pos = (loff_t)xas.xa_index << PAGE_SHIFT; - error = ops->iomap_begin(inode, pos, PMD_SIZE, iomap_flags, &iomap); + error = ops->iomap_begin(inode, pos, PMD_SIZE, iomap_flags, &iomap, &srcmap); if (error) goto unlock_entry; diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index 7004ce581a32..467c13ff6b40 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -801,7 +801,7 @@ int ext2_get_block(struct inode *inode, sector_t iblock, #ifdef CONFIG_FS_DAX static int ext2_iomap_begin(struct inode *inode, loff_t offset, loff_t length, - unsigned flags, struct iomap *iomap) + unsigned flags, struct iomap *iomap, struct iomap *srcmap) { unsigned int blkbits = inode->i_blkbits; unsigned long first_block = offset >> blkbits; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 420fe3deed39..918f94eff799 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3453,7 +3453,7 @@ static bool ext4_inode_datasync_dirty(struct inode *inode) } static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length, - unsigned flags, struct iomap *iomap) + unsigned flags, struct iomap *iomap, struct iomap *srcmap) { struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); unsigned int blkbits = inode->i_blkbits; diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c index 4f8b5fd6c81f..0189262989f2 100644 --- a/fs/gfs2/bmap.c +++ b/fs/gfs2/bmap.c @@ -1164,7 +1164,8 @@ static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos, } static int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length, - unsigned flags, struct iomap *iomap) + unsigned flags, struct iomap *iomap, + struct iomap *srcmap) { struct gfs2_inode *ip = GFS2_I(inode); struct metapath mp = { .mp_aheight = 1, }; diff --git a/fs/iomap/apply.c b/fs/iomap/apply.c index 54c02aecf3cd..6cdb362fff36 100644 --- a/fs/iomap/apply.c +++ b/fs/iomap/apply.c @@ -24,6 +24,7 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags, const struct iomap_ops *ops, void *data, iomap_actor_t actor) { struct iomap iomap = { 0 }; + struct iomap srcmap = { 0 }; loff_t written = 0, ret; /* @@ -38,7 +39,7 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags, * expose transient stale data. If the reserve fails, we can safely * back out at this point as there is nothing to undo. */ - ret = ops->iomap_begin(inode, pos, length, flags, &iomap); + ret = ops->iomap_begin(inode, pos, length, flags, &iomap, &srcmap); if (ret) return ret; if (WARN_ON(iomap.offset > pos)) @@ -58,7 +59,7 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags, * we can do the copy-in page by page without having to worry about * failures exposing transient data. */ - written = actor(inode, pos, length, data, &iomap); + written = actor(inode, pos, length, data, &iomap, &srcmap); /* * Now the data has been copied, commit the range we've copied. This diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index e25901ae3ff4..f27756c0b31c 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -205,7 +205,7 @@ iomap_read_inline_data(struct inode *inode, struct page *page, static loff_t iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data, - struct iomap *iomap) + struct iomap *iomap, struct iomap *srcmap) { struct iomap_readpage_ctx *ctx = data; struct page *page = ctx->cur_page; @@ -351,7 +351,7 @@ iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos, static loff_t iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length, - void *data, struct iomap *iomap) + void *data, struct iomap *iomap, struct iomap *srcmap) { struct iomap_readpage_ctx *ctx = data; loff_t done, ret; @@ -371,7 +371,7 @@ iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length, ctx->cur_page_in_bio = false; } ret = iomap_readpage_actor(inode, pos + done, length - done, - ctx, iomap); + ctx, iomap, srcmap); } return done; @@ -736,7 +736,7 @@ iomap_write_end(struct inode *inode, loff_t pos, unsigned len, static loff_t iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data, - struct iomap *iomap) + struct iomap *iomap, struct iomap *srcmap) { struct iov_iter *i = data; long status = 0; @@ -853,7 +853,7 @@ __iomap_read_page(struct inode *inode, loff_t offset) static loff_t iomap_dirty_actor(struct inode *inode, loff_t pos, loff_t length, void *data, - struct iomap *iomap) + struct iomap *iomap, struct iomap *srcmap) { long status = 0; ssize_t written = 0; @@ -942,7 +942,7 @@ static int iomap_dax_zero(loff_t pos, unsigned offset, unsigned bytes, static loff_t iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count, - void *data, struct iomap *iomap) + void *data, struct iomap *iomap, struct iomap *srcmap) { bool *did_zero = data; loff_t written = 0; @@ -1011,7 +1011,7 @@ EXPORT_SYMBOL_GPL(iomap_truncate_page); static loff_t iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length, - void *data, struct iomap *iomap) + void *data, struct iomap *iomap, struct iomap *srcmap) { struct page *page = data; int ret; diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 10517cea9682..5279029c7a3c 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -362,7 +362,7 @@ iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length, static loff_t iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, - void *data, struct iomap *iomap) + void *data, struct iomap *iomap, struct iomap *srcmap) { struct iomap_dio *dio = data; diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c index f26fdd36e383..690ef2d7c6c8 100644 --- a/fs/iomap/fiemap.c +++ b/fs/iomap/fiemap.c @@ -44,7 +44,7 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi, static loff_t iomap_fiemap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, - struct iomap *iomap) + struct iomap *iomap, struct iomap *srcmap) { struct fiemap_ctx *ctx = data; loff_t ret = length; @@ -111,7 +111,7 @@ EXPORT_SYMBOL_GPL(iomap_fiemap); static loff_t iomap_bmap_actor(struct inode *inode, loff_t pos, loff_t length, - void *data, struct iomap *iomap) + void *data, struct iomap *iomap, struct iomap *srcmap) { sector_t *bno = data, addr; diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c index c04bad4b2b43..89f61d93c0bc 100644 --- a/fs/iomap/seek.c +++ b/fs/iomap/seek.c @@ -119,7 +119,7 @@ page_cache_seek_hole_data(struct inode *inode, loff_t offset, loff_t length, static loff_t iomap_seek_hole_actor(struct inode *inode, loff_t offset, loff_t length, - void *data, struct iomap *iomap) + void *data, struct iomap *iomap, struct iomap *srcmap) { switch (iomap->type) { case IOMAP_UNWRITTEN: @@ -165,7 +165,7 @@ EXPORT_SYMBOL_GPL(iomap_seek_hole); static loff_t iomap_seek_data_actor(struct inode *inode, loff_t offset, loff_t length, - void *data, struct iomap *iomap) + void *data, struct iomap *iomap, struct iomap *srcmap) { switch (iomap->type) { case IOMAP_HOLE: diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c index 152a230f668d..a648dbf6991e 100644 --- a/fs/iomap/swapfile.c +++ b/fs/iomap/swapfile.c @@ -76,7 +76,8 @@ static int iomap_swapfile_add_extent(struct iomap_swapfile_info *isi) * distinction between written and unwritten extents. */ static loff_t iomap_swapfile_activate_actor(struct inode *inode, loff_t pos, - loff_t count, void *data, struct iomap *iomap) + loff_t count, void *data, struct iomap *iomap, + struct iomap *srcmap) { struct iomap_swapfile_info *isi = data; int error; diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 3a4310d7cb59..8321733c16c3 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -922,7 +922,8 @@ xfs_file_iomap_begin( loff_t offset, loff_t length, unsigned flags, - struct iomap *iomap) + struct iomap *iomap, + struct iomap *srcmap) { struct xfs_inode *ip = XFS_I(inode); struct xfs_mount *mp = ip->i_mount; @@ -1145,7 +1146,8 @@ xfs_seek_iomap_begin( loff_t offset, loff_t length, unsigned flags, - struct iomap *iomap) + struct iomap *iomap, + struct iomap *srcmap) { struct xfs_inode *ip = XFS_I(inode); struct xfs_mount *mp = ip->i_mount; @@ -1231,7 +1233,8 @@ xfs_xattr_iomap_begin( loff_t offset, loff_t length, unsigned flags, - struct iomap *iomap) + struct iomap *iomap, + struct iomap *srcmap) { struct xfs_inode *ip = XFS_I(inode); struct xfs_mount *mp = ip->i_mount; diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 209b6c93674e..9c082d36286a 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -37,6 +37,7 @@ struct vm_fault; #define IOMAP_MAPPED 0x03 /* blocks allocated at @addr */ #define IOMAP_UNWRITTEN 0x04 /* blocks allocated at @addr in unwritten state */ #define IOMAP_INLINE 0x05 /* data inline in the inode */ +#define IOMAP_COW 0x06 /* copy data from srcmap before writing */ /* * Flags for all iomap mappings: @@ -121,7 +122,8 @@ struct iomap_ops { * The actual length is returned in iomap->length. */ int (*iomap_begin)(struct inode *inode, loff_t pos, loff_t length, - unsigned flags, struct iomap *iomap); + unsigned flags, struct iomap *iomap, + struct iomap *srcmap); /* * Commit and/or unreserve space previous allocated using iomap_begin. @@ -137,7 +139,7 @@ struct iomap_ops { * Main iomap iterator function. */ typedef loff_t (*iomap_actor_t)(struct inode *inode, loff_t pos, loff_t len, - void *data, struct iomap *iomap); + void *data, struct iomap *iomap, struct iomap *srcmap); loff_t iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags, const struct iomap_ops *ops, void *data, From patchwork Sun Sep 1 20:08:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125649 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8BB951890 for ; Sun, 1 Sep 2019 20:09:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 73C4E2190F for ; Sun, 1 Sep 2019 20:09:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728933AbfIAUIy (ORCPT ); Sun, 1 Sep 2019 16:08:54 -0400 Received: from mx2.suse.de ([195.135.220.15]:50408 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728888AbfIAUIy (ORCPT ); Sun, 1 Sep 2019 16:08:54 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id DBDB3AE96; Sun, 1 Sep 2019 20:08:52 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 03/15] iomap: Read page from srcmap for IOMAP_COW Date: Sun, 1 Sep 2019 15:08:24 -0500 Message-Id: <20190901200836.14959-4-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues In case of a IOMAP_COW, read a page from the srcmap before performing a write on the page. Signed-off-by: Goldwyn Rodrigues Reviewed-by: Darrick J. Wong --- fs/iomap/buffered-io.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index f27756c0b31c..99d63ba14d1b 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -581,7 +581,7 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, static int iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags, - struct page **pagep, struct iomap *iomap) + struct page **pagep, struct iomap *iomap, struct iomap *srcmap) { const struct iomap_page_ops *page_ops = iomap->page_ops; pgoff_t index = pos >> PAGE_SHIFT; @@ -605,12 +605,17 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags, goto out_no_page; } - if (iomap->type == IOMAP_INLINE) + if (iomap->type == IOMAP_INLINE) { iomap_read_inline_data(inode, page, iomap); - else if (iomap->flags & IOMAP_F_BUFFER_HEAD) + } else if (iomap->type == IOMAP_COW) { + iomap_assert(!(iomap->flags & IOMAP_F_BUFFER_HEAD)); + iomap_assert(srcmap->type == IOMAP_HOLE || srcmap->addr > 0); + status = __iomap_write_begin(inode, pos, len, page, srcmap); + } else if (iomap->flags & IOMAP_F_BUFFER_HEAD) { status = __block_write_begin_int(page, pos, len, NULL, iomap); - else + } else { status = __iomap_write_begin(inode, pos, len, page, iomap); + } if (unlikely(status)) goto out_unlock; @@ -772,7 +777,7 @@ iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data, } status = iomap_write_begin(inode, pos, bytes, flags, &page, - iomap); + iomap, srcmap); if (unlikely(status)) break; @@ -871,7 +876,7 @@ iomap_dirty_actor(struct inode *inode, loff_t pos, loff_t length, void *data, return PTR_ERR(rpage); status = iomap_write_begin(inode, pos, bytes, - AOP_FLAG_NOFS, &page, iomap); + AOP_FLAG_NOFS, &page, iomap, srcmap); put_page(rpage); if (unlikely(status)) return status; @@ -917,13 +922,13 @@ iomap_file_dirty(struct inode *inode, loff_t pos, loff_t len, EXPORT_SYMBOL_GPL(iomap_file_dirty); static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset, - unsigned bytes, struct iomap *iomap) + unsigned bytes, struct iomap *iomap, struct iomap *srcmap) { struct page *page; int status; status = iomap_write_begin(inode, pos, bytes, AOP_FLAG_NOFS, &page, - iomap); + iomap, srcmap); if (status) return status; @@ -961,7 +966,7 @@ iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count, if (IS_DAX(inode)) status = iomap_dax_zero(pos, offset, bytes, iomap); else - status = iomap_zero(inode, pos, offset, bytes, iomap); + status = iomap_zero(inode, pos, offset, bytes, iomap, srcmap); if (status < 0) return status; From patchwork Sun Sep 1 20:08:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125657 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3BD2217EF for ; Sun, 1 Sep 2019 20:09:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 17B272190F for ; Sun, 1 Sep 2019 20:09:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728958AbfIAUI6 (ORCPT ); Sun, 1 Sep 2019 16:08:58 -0400 Received: from mx2.suse.de ([195.135.220.15]:50420 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728888AbfIAUI4 (ORCPT ); Sun, 1 Sep 2019 16:08:56 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id DE6C9AE3F; Sun, 1 Sep 2019 20:08:54 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 04/15] btrfs: Eliminate PagePrivate for btrfs data pages Date: Sun, 1 Sep 2019 15:08:25 -0500 Message-Id: <20190901200836.14959-5-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues While most of the code works just eliminating page's private field and related code, there is a problem when we are cloning. The extent assumes the data is uptodate. Clear the EXTENT_UPTODATE flag for the extent so the next time the file is read, it is forced to be read from the disk as opposed to pagecache. This patch is required to make sure we don't conflict with iomap's usage of page->private. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/compression.c | 1 - fs/btrfs/extent_io.c | 13 ------------- fs/btrfs/extent_io.h | 2 -- fs/btrfs/file.c | 1 - fs/btrfs/free-space-cache.c | 1 - fs/btrfs/inode.c | 15 +-------------- fs/btrfs/ioctl.c | 4 ++-- fs/btrfs/relocation.c | 2 -- 8 files changed, 3 insertions(+), 36 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 60c47b417a4b..fe41fa3d2999 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -481,7 +481,6 @@ static noinline int add_ra_bio_pages(struct inode *inode, * for these bytes in the file. But, we have to make * sure they map to this compressed extent on disk. */ - set_page_extent_mapped(page); lock_extent(tree, last_offset, end); read_lock(&em_tree->lock); em = lookup_extent_mapping(em_tree, last_offset, diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 1ff438fd5bc2..27233fb6660c 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3005,15 +3005,6 @@ static void attach_extent_buffer_page(struct extent_buffer *eb, } } -void set_page_extent_mapped(struct page *page) -{ - if (!PagePrivate(page)) { - SetPagePrivate(page); - get_page(page); - set_page_private(page, EXTENT_PAGE_PRIVATE); - } -} - static struct extent_map * __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset, u64 start, u64 len, get_extent_t *get_extent, @@ -3074,8 +3065,6 @@ static int __do_readpage(struct extent_io_tree *tree, size_t blocksize = inode->i_sb->s_blocksize; unsigned long this_bio_flag = 0; - set_page_extent_mapped(page); - if (!PageUptodate(page)) { if (cleancache_get_page(page) == 0) { BUG_ON(blocksize != PAGE_SIZE); @@ -3589,8 +3578,6 @@ static int __extent_writepage(struct page *page, struct writeback_control *wbc, pg_offset = 0; - set_page_extent_mapped(page); - if (!epd->extent_locked) { ret = writepage_delalloc(inode, page, wbc, start, &nr_written); if (ret == 1) diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 401423b16976..8082774371b5 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -416,8 +416,6 @@ int extent_readpages(struct address_space *mapping, struct list_head *pages, unsigned nr_pages); int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, __u64 start, __u64 len); -void set_page_extent_mapped(struct page *page); - struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, u64 start); struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 58a18ed11546..4466a09f2d98 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1539,7 +1539,6 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages, * delalloc bits and dirty the pages as required. */ for (i = 0; i < num_pages; i++) { - set_page_extent_mapped(pages[i]); WARN_ON(!PageLocked(pages[i])); } diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index 062be9dde4c6..9a0c519bd6d4 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -395,7 +395,6 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, struct inode *inode for (i = 0; i < io_ctl->num_pages; i++) { clear_page_dirty_for_io(io_ctl->pages[i]); - set_page_extent_mapped(io_ctl->pages[i]); } return 0; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index ee582a36653d..258bacefdf5f 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4932,7 +4932,6 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len, wait_on_page_writeback(page); lock_extent_bits(io_tree, block_start, block_end, &cached_state); - set_page_extent_mapped(page); ordered = btrfs_lookup_ordered_extent(inode, block_start); if (ordered) { @@ -8754,13 +8753,7 @@ btrfs_readpages(struct file *file, struct address_space *mapping, static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags) { - int ret = try_release_extent_mapping(page, gfp_flags); - if (ret == 1) { - ClearPagePrivate(page); - set_page_private(page, 0); - put_page(page); - } - return ret; + return try_release_extent_mapping(page, gfp_flags); } static int btrfs_releasepage(struct page *page, gfp_t gfp_flags) @@ -8878,11 +8871,6 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset, } ClearPageChecked(page); - if (PagePrivate(page)) { - ClearPagePrivate(page); - set_page_private(page, 0); - put_page(page); - } } /* @@ -8961,7 +8949,6 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf) wait_on_page_writeback(page); lock_extent_bits(io_tree, page_start, page_end, &cached_state); - set_page_extent_mapped(page); /* * we can't set the delalloc bits if there are pending ordered diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 818f7ec8bb0e..861617e3d0c9 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1355,7 +1355,6 @@ static int cluster_pages_for_defrag(struct inode *inode, for (i = 0; i < i_done; i++) { clear_page_dirty_for_io(pages[i]); ClearPageChecked(pages[i]); - set_page_extent_mapped(pages[i]); set_page_dirty(pages[i]); unlock_page(pages[i]); put_page(pages[i]); @@ -3550,6 +3549,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode, int ret; const u64 len = olen_aligned; u64 last_dest_end = destoff; + struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree; ret = -ENOMEM; buf = kvmalloc(fs_info->nodesize, GFP_KERNEL); @@ -3864,6 +3864,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode, destoff, olen, no_time_update); } + clear_extent_uptodate(tree, destoff, destoff+olen, NULL); out: btrfs_free_path(path); kvfree(buf); @@ -3935,7 +3936,6 @@ static noinline int btrfs_clone_files(struct file *file, struct file *file_src, truncate_inode_pages_range(&inode->i_data, round_down(destoff, PAGE_SIZE), round_up(destoff + len, PAGE_SIZE) - 1); - return ret; } diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 7f219851fa23..612988b7eb27 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3300,8 +3300,6 @@ static int relocate_file_extent_cluster(struct inode *inode, lock_extent(&BTRFS_I(inode)->io_tree, page_start, page_end); - set_page_extent_mapped(page); - if (nr < cluster->nr && page_start + offset == cluster->boundary[nr]) { set_extent_bits(&BTRFS_I(inode)->io_tree, From patchwork Sun Sep 1 20:08:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125661 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7AEA51510 for ; Sun, 1 Sep 2019 20:09:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 623552190F for ; Sun, 1 Sep 2019 20:09:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728972AbfIAUI7 (ORCPT ); Sun, 1 Sep 2019 16:08:59 -0400 Received: from mx2.suse.de ([195.135.220.15]:50460 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728934AbfIAUI6 (ORCPT ); Sun, 1 Sep 2019 16:08:58 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 04E64B0BF; Sun, 1 Sep 2019 20:08:57 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 05/15] btrfs: Add a simple buffered iomap write Date: Sun, 1 Sep 2019 15:08:26 -0500 Message-Id: <20190901200836.14959-6-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues This one is a long patch. Most of the code is "inspired" by fs/btrfs/file.c. To keep the size small, all removals are in following patches. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/Kconfig | 1 + fs/btrfs/Makefile | 2 +- fs/btrfs/ctree.h | 1 + fs/btrfs/file.c | 4 +- fs/btrfs/iomap.c | 383 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 388 insertions(+), 3 deletions(-) create mode 100644 fs/btrfs/iomap.c diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig index 38651fae7f21..ba87d21885ca 100644 --- a/fs/btrfs/Kconfig +++ b/fs/btrfs/Kconfig @@ -14,6 +14,7 @@ config BTRFS_FS select RAID6_PQ select XOR_BLOCKS select SRCU + select FS_IOMAP help Btrfs is a general purpose copy-on-write filesystem with extents, diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index 76a843198bcb..f88e696b0698 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -11,7 +11,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \ compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \ reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \ uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \ - block-rsv.o delalloc-space.o + block-rsv.o delalloc-space.o iomap.o btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 94660063a162..9c501c7826b4 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3243,6 +3243,7 @@ int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end); loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t len, unsigned int remap_flags); +size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from); /* tree-defrag.c */ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 4466a09f2d98..0707db04d3cc 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1829,7 +1829,7 @@ static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) return written; pos = iocb->ki_pos; - written_buffered = btrfs_buffered_write(iocb, from); + written_buffered = btrfs_buffered_iomap_write(iocb, from); if (written_buffered < 0) { err = written_buffered; goto out; @@ -1966,7 +1966,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, if (iocb->ki_flags & IOCB_DIRECT) { num_written = __btrfs_direct_write(iocb, from); } else { - num_written = btrfs_buffered_write(iocb, from); + num_written = btrfs_buffered_iomap_write(iocb, from); if (num_written > 0) iocb->ki_pos = pos + num_written; if (clean_page) diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c new file mode 100644 index 000000000000..025ccbf471bf --- /dev/null +++ b/fs/btrfs/iomap.c @@ -0,0 +1,383 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * iomap support for BTRFS + * + * Copyright (c) 2019 SUSE Linux + * Author: Goldwyn Rodrigues + */ + +#include +#include "ctree.h" +#include "btrfs_inode.h" +#include "volumes.h" +#include "disk-io.h" +#include "delalloc-space.h" + +struct btrfs_iomap { + u64 start; + u64 end; + bool nocow; + int extents_locked; + ssize_t reserved_bytes; + struct extent_changeset *data_reserved; + struct extent_state *cached_state; +}; + + +/* + * This function locks the extent and properly waits for data=ordered extents + * to finish before allowing the pages to be modified if need. + * + * The return value: + * 1 - the extent is locked + * 0 - the extent is not locked, and everything is OK + * -EAGAIN - need re-prepare the pages + * the other < 0 number - Something wrong happens + */ +static noinline int +lock_and_cleanup_extent(struct btrfs_inode *inode, loff_t pos, + size_t write_bytes, + u64 *lockstart, u64 *lockend, + struct extent_state **cached_state) +{ + struct btrfs_fs_info *fs_info = inode->root->fs_info; + u64 start_pos; + u64 last_pos; + int ret = 0; + + start_pos = round_down(pos, fs_info->sectorsize); + last_pos = start_pos + + round_up(pos + write_bytes - start_pos, + fs_info->sectorsize) - 1; + + if (start_pos < inode->vfs_inode.i_size) { + struct btrfs_ordered_extent *ordered; + + lock_extent_bits(&inode->io_tree, start_pos, last_pos, + cached_state); + ordered = btrfs_lookup_ordered_range(inode, start_pos, + last_pos - start_pos + 1); + if (ordered && + ordered->file_offset + ordered->len > start_pos && + ordered->file_offset <= last_pos) { + unlock_extent_cached(&inode->io_tree, start_pos, + last_pos, cached_state); + btrfs_start_ordered_extent(&inode->vfs_inode, + ordered, 1); + btrfs_put_ordered_extent(ordered); + return -EAGAIN; + } + if (ordered) + btrfs_put_ordered_extent(ordered); + + *lockstart = start_pos; + *lockend = last_pos; + ret = 1; + } + + return ret; +} + +static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos, + size_t *write_bytes) +{ + struct btrfs_fs_info *fs_info = inode->root->fs_info; + struct btrfs_root *root = inode->root; + struct btrfs_ordered_extent *ordered; + u64 lockstart, lockend; + u64 num_bytes; + int ret; + + ret = btrfs_start_write_no_snapshotting(root); + if (!ret) + return -ENOSPC; + + lockstart = round_down(pos, fs_info->sectorsize); + lockend = round_up(pos + *write_bytes, + fs_info->sectorsize) - 1; + + while (1) { + lock_extent(&inode->io_tree, lockstart, lockend); + ordered = btrfs_lookup_ordered_range(inode, lockstart, + lockend - lockstart + 1); + if (!ordered) { + break; + } + unlock_extent(&inode->io_tree, lockstart, lockend); + btrfs_start_ordered_extent(&inode->vfs_inode, ordered, 1); + btrfs_put_ordered_extent(ordered); + } + + num_bytes = lockend - lockstart + 1; + ret = can_nocow_extent(&inode->vfs_inode, lockstart, &num_bytes, + NULL, NULL, NULL); + if (ret <= 0) { + ret = 0; + btrfs_end_write_no_snapshotting(root); + } else { + *write_bytes = min_t(size_t, *write_bytes , + num_bytes - pos + lockstart); + } + + unlock_extent(&inode->io_tree, lockstart, lockend); + + return ret; +} + +static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode, + const u64 start, + const u64 len, + struct extent_state **cached_state) +{ + u64 search_start = start; + const u64 end = start + len - 1; + + while (search_start < end) { + const u64 search_len = end - search_start + 1; + struct extent_map *em; + u64 em_len; + int ret = 0; + + em = btrfs_get_extent(inode, NULL, 0, search_start, + search_len, 0); + if (IS_ERR(em)) + return PTR_ERR(em); + + if (em->block_start != EXTENT_MAP_HOLE) + goto next; + + em_len = em->len; + if (em->start < search_start) + em_len -= search_start - em->start; + if (em_len > search_len) + em_len = search_len; + + ret = set_extent_bit(&inode->io_tree, search_start, + search_start + em_len - 1, + EXTENT_DELALLOC_NEW, + NULL, cached_state, GFP_NOFS); +next: + search_start = extent_map_end(em); + free_extent_map(em); + if (ret) + return ret; + } + return 0; +} + +static void btrfs_buffered_page_done(struct inode *inode, loff_t pos, + unsigned copied, struct page *page, + struct iomap *iomap) +{ + if (!page) + return; + SetPageUptodate(page); + ClearPageChecked(page); + set_page_dirty(page); + get_page(page); +} + + +static const struct iomap_page_ops btrfs_buffered_page_ops = { + .page_done = btrfs_buffered_page_done, +}; + + +static int btrfs_buffered_iomap_begin(struct inode *inode, loff_t pos, + loff_t length, unsigned flags, struct iomap *iomap, + struct iomap *srcmap) +{ + int ret; + size_t write_bytes = length; + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + size_t sector_offset = pos & (fs_info->sectorsize - 1); + struct btrfs_iomap *bi; + + bi = kzalloc(sizeof(struct btrfs_iomap), GFP_NOFS); + if (!bi) + return -ENOMEM; + + bi->reserved_bytes = round_up(write_bytes + sector_offset, + fs_info->sectorsize); + + /* Reserve data space */ + ret = btrfs_check_data_free_space(inode, &bi->data_reserved, pos, + write_bytes); + if (ret < 0) { + /* + * Space allocation failed. Let's check if we can + * continue I/O without allocations + */ + if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW | + BTRFS_INODE_PREALLOC)) && + check_can_nocow(BTRFS_I(inode), pos, + &write_bytes) > 0) { + bi->nocow = true; + /* + * our prealloc extent may be smaller than + * write_bytes, so scale down. + */ + bi->reserved_bytes = round_up(write_bytes + + sector_offset, + fs_info->sectorsize); + } else { + goto error; + } + } + + WARN_ON(bi->reserved_bytes == 0); + + /* We have the data space allocated, reserve the metadata now */ + ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), + bi->reserved_bytes); + if (ret) { + struct btrfs_root *root = BTRFS_I(inode)->root; + if (!bi->nocow) + btrfs_free_reserved_data_space(inode, + bi->data_reserved, pos, + write_bytes); + else + btrfs_end_write_no_snapshotting(root); + goto error; + } + + do { + ret = lock_and_cleanup_extent( + BTRFS_I(inode), pos, write_bytes, &bi->start, + &bi->end, &bi->cached_state); + } while (ret == -EAGAIN); + + if (ret < 0) { + btrfs_delalloc_release_extents(BTRFS_I(inode), + bi->reserved_bytes, true); + goto release; + } else { + bi->extents_locked = ret; + } + iomap->private = bi; + iomap->length = round_up(write_bytes, fs_info->sectorsize); + iomap->offset = round_down(pos, fs_info->sectorsize); + iomap->addr = IOMAP_NULL_ADDR; + iomap->type = IOMAP_DELALLOC; + iomap->bdev = fs_info->fs_devices->latest_bdev; + iomap->page_ops = &btrfs_buffered_page_ops; + return 0; +release: + if (bi->extents_locked) + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, + bi->end, &bi->cached_state); + if (bi->nocow) { + struct btrfs_root *root = BTRFS_I(inode)->root; + btrfs_end_write_no_snapshotting(root); + btrfs_delalloc_release_metadata(BTRFS_I(inode), + bi->reserved_bytes, true); + } else { + btrfs_delalloc_release_space(inode, bi->data_reserved, + round_down(pos, fs_info->sectorsize), + bi->reserved_bytes, true); + } + extent_changeset_free(bi->data_reserved); + +error: + kfree(bi); + return ret; +} + +static int btrfs_buffered_iomap_end(struct inode *inode, loff_t pos, + loff_t length, ssize_t written, unsigned flags, + struct iomap *iomap) +{ + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + struct btrfs_iomap *bi = iomap->private; + ssize_t release_bytes = round_down(bi->reserved_bytes - written, + 1 << fs_info->sb->s_blocksize_bits); + unsigned int extra_bits = 0; + u64 start_pos = pos & ~((u64) fs_info->sectorsize - 1); + u64 num_bytes = round_up(written + pos - start_pos, + fs_info->sectorsize); + u64 end_of_last_block = start_pos + num_bytes - 1; + int ret = 0; + + if (release_bytes > 0) { + if (bi->nocow) { + btrfs_delalloc_release_metadata(BTRFS_I(inode), + release_bytes, true); + } else { + u64 __pos = round_down(pos + written, fs_info->sectorsize); + btrfs_delalloc_release_space(inode, bi->data_reserved, + __pos, release_bytes, true); + } + } + + /* + * The pages may have already been dirty, clear out old accounting so + * we can set things up properly + */ + clear_extent_bit(&BTRFS_I(inode)->io_tree, start_pos, end_of_last_block, + EXTENT_DIRTY | EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING | + EXTENT_DEFRAG, 0, 0, &bi->cached_state); + + if (!btrfs_is_free_space_inode(BTRFS_I(inode))) { + if (start_pos >= i_size_read(inode) && + !(BTRFS_I(inode)->flags & BTRFS_INODE_PREALLOC)) { + /* + * There can't be any extents following eof in this case + * so just set the delalloc new bit for the range + * directly. + */ + extra_bits |= EXTENT_DELALLOC_NEW; + } else { + ret = btrfs_find_new_delalloc_bytes(BTRFS_I(inode), + start_pos, num_bytes, + &bi->cached_state); + if (ret) + goto unlock; + } + } + + ret = btrfs_set_extent_delalloc(inode, start_pos, end_of_last_block, + extra_bits, &bi->cached_state, 0); +unlock: + if (bi->extents_locked) + unlock_extent_cached(&BTRFS_I(inode)->io_tree, + bi->start, bi->end, &bi->cached_state); + + if (bi->nocow) { + struct btrfs_root *root = BTRFS_I(inode)->root; + btrfs_end_write_no_snapshotting(root); + if (written > 0) { + u64 start = round_down(pos, fs_info->sectorsize); + u64 end = round_up(pos + written, fs_info->sectorsize) - 1; + set_extent_bit(&BTRFS_I(inode)->io_tree, start, end, + EXTENT_NORESERVE, NULL, NULL, GFP_NOFS); + } + + } + btrfs_delalloc_release_extents(BTRFS_I(inode), bi->reserved_bytes, + true); + + if (written < fs_info->nodesize) + btrfs_btree_balance_dirty(fs_info); + + extent_changeset_free(bi->data_reserved); + kfree(bi); + return ret; +} + +static const struct iomap_ops btrfs_buffered_iomap_ops = { + .iomap_begin = btrfs_buffered_iomap_begin, + .iomap_end = btrfs_buffered_iomap_end, +}; + +size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from) +{ + ssize_t written; + struct inode *inode = file_inode(iocb->ki_filp); + written = iomap_file_buffered_write(iocb, from, &btrfs_buffered_iomap_ops); + if (written > 0) + iocb->ki_pos += written; + if (iocb->ki_pos > i_size_read(inode)) + i_size_write(inode, iocb->ki_pos); + return written; +} + From patchwork Sun Sep 1 20:08:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125665 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BDC2B13B1 for ; Sun, 1 Sep 2019 20:09:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A3A892190F for ; Sun, 1 Sep 2019 20:09:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728991AbfIAUJC (ORCPT ); Sun, 1 Sep 2019 16:09:02 -0400 Received: from mx2.suse.de ([195.135.220.15]:50500 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728934AbfIAUJA (ORCPT ); Sun, 1 Sep 2019 16:09:00 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 1962BB07B; Sun, 1 Sep 2019 20:08:59 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 06/15] btrfs: Add CoW in iomap based writes Date: Sun, 1 Sep 2019 15:08:27 -0500 Message-Id: <20190901200836.14959-7-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Set iomap->type to IOMAP_COW and fill up the source map in case the I/O is not page aligned. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/iomap.c | 42 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c index 025ccbf471bf..8ed93d427962 100644 --- a/fs/btrfs/iomap.c +++ b/fs/btrfs/iomap.c @@ -165,6 +165,35 @@ static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode, return 0; } +/* + * get_iomap: Get the block map and fill the iomap structure + * @pos: file position + * @length: I/O length + * @iomap: The iomap structure to fill + */ + +static int get_iomap(struct inode *inode, loff_t pos, loff_t length, + struct iomap *iomap) +{ + struct extent_map *em; + iomap->addr = IOMAP_NULL_ADDR; + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); + if (IS_ERR(em)) + return PTR_ERR(em); + /* XXX Do we need to check for em->flags here? */ + if (em->block_start == EXTENT_MAP_HOLE) { + iomap->type = IOMAP_HOLE; + } else { + iomap->addr = em->block_start; + iomap->type = IOMAP_MAPPED; + } + iomap->offset = em->start; + iomap->bdev = em->bdev; + iomap->length = em->len; + free_extent_map(em); + return 0; +} + static void btrfs_buffered_page_done(struct inode *inode, loff_t pos, unsigned copied, struct page *page, struct iomap *iomap) @@ -190,6 +219,7 @@ static int btrfs_buffered_iomap_begin(struct inode *inode, loff_t pos, int ret; size_t write_bytes = length; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + size_t end; size_t sector_offset = pos & (fs_info->sectorsize - 1); struct btrfs_iomap *bi; @@ -257,8 +287,18 @@ static int btrfs_buffered_iomap_begin(struct inode *inode, loff_t pos, iomap->private = bi; iomap->length = round_up(write_bytes, fs_info->sectorsize); iomap->offset = round_down(pos, fs_info->sectorsize); + end = pos + write_bytes; + /* Set IOMAP_COW if start/end is not page aligned */ + if (((pos & (PAGE_SIZE - 1)) || (end & (PAGE_SIZE - 1)))) { + iomap->type = IOMAP_COW; + ret = get_iomap(inode, pos, length, srcmap); + if (ret < 0) + goto release; + } else { + iomap->type = IOMAP_DELALLOC; + } + iomap->addr = IOMAP_NULL_ADDR; - iomap->type = IOMAP_DELALLOC; iomap->bdev = fs_info->fs_devices->latest_bdev; iomap->page_ops = &btrfs_buffered_page_ops; return 0; From patchwork Sun Sep 1 20:08:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125669 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A315D1510 for ; Sun, 1 Sep 2019 20:09:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7904622CF7 for ; Sun, 1 Sep 2019 20:09:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729012AbfIAUJE (ORCPT ); Sun, 1 Sep 2019 16:09:04 -0400 Received: from mx2.suse.de ([195.135.220.15]:50518 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728973AbfIAUJD (ORCPT ); Sun, 1 Sep 2019 16:09:03 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id ABCF5B626; Sun, 1 Sep 2019 20:09:01 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 07/15] btrfs: remove buffered write code made unnecessary Date: Sun, 1 Sep 2019 15:08:28 -0500 Message-Id: <20190901200836.14959-8-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Better done in a separate patch to keep the main patch short(er) Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 463 -------------------------------------------------------- 1 file changed, 463 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 0707db04d3cc..f7087e28ac08 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -390,79 +390,6 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info) return 0; } -/* simple helper to fault in pages and copy. This should go away - * and be replaced with calls into generic code. - */ -static noinline int btrfs_copy_from_user(loff_t pos, size_t write_bytes, - struct page **prepared_pages, - struct iov_iter *i) -{ - size_t copied = 0; - size_t total_copied = 0; - int pg = 0; - int offset = offset_in_page(pos); - - while (write_bytes > 0) { - size_t count = min_t(size_t, - PAGE_SIZE - offset, write_bytes); - struct page *page = prepared_pages[pg]; - /* - * Copy data from userspace to the current page - */ - copied = iov_iter_copy_from_user_atomic(page, i, offset, count); - - /* Flush processor's dcache for this page */ - flush_dcache_page(page); - - /* - * if we get a partial write, we can end up with - * partially up to date pages. These add - * a lot of complexity, so make sure they don't - * happen by forcing this copy to be retried. - * - * The rest of the btrfs_file_write code will fall - * back to page at a time copies after we return 0. - */ - if (!PageUptodate(page) && copied < count) - copied = 0; - - iov_iter_advance(i, copied); - write_bytes -= copied; - total_copied += copied; - - /* Return to btrfs_file_write_iter to fault page */ - if (unlikely(copied == 0)) - break; - - if (copied < PAGE_SIZE - offset) { - offset += copied; - } else { - pg++; - offset = 0; - } - } - return total_copied; -} - -/* - * unlocks pages after btrfs_file_write is done with them - */ -static void btrfs_drop_pages(struct page **pages, size_t num_pages) -{ - size_t i; - for (i = 0; i < num_pages; i++) { - /* page checked is some magic around finding pages that - * have been modified without going through btrfs_set_page_dirty - * clear it here. There should be no need to mark the pages - * accessed as prepare_pages should have marked them accessed - * in prepare_pages via find_or_create_page() - */ - ClearPageChecked(pages[i]); - unlock_page(pages[i]); - put_page(pages[i]); - } -} - static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode, const u64 start, const u64 len, @@ -1387,164 +1314,6 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, return 0; } -/* - * on error we return an unlocked page and the error value - * on success we return a locked page and 0 - */ -static int prepare_uptodate_page(struct inode *inode, - struct page *page, u64 pos, - bool force_uptodate) -{ - int ret = 0; - - if (((pos & (PAGE_SIZE - 1)) || force_uptodate) && - !PageUptodate(page)) { - ret = btrfs_readpage(NULL, page); - if (ret) - return ret; - lock_page(page); - if (!PageUptodate(page)) { - unlock_page(page); - return -EIO; - } - if (page->mapping != inode->i_mapping) { - unlock_page(page); - return -EAGAIN; - } - } - return 0; -} - -/* - * this just gets pages into the page cache and locks them down. - */ -static noinline int prepare_pages(struct inode *inode, struct page **pages, - size_t num_pages, loff_t pos, - size_t write_bytes, bool force_uptodate) -{ - int i; - unsigned long index = pos >> PAGE_SHIFT; - gfp_t mask = btrfs_alloc_write_mask(inode->i_mapping); - int err = 0; - int faili; - - for (i = 0; i < num_pages; i++) { -again: - pages[i] = find_or_create_page(inode->i_mapping, index + i, - mask | __GFP_WRITE); - if (!pages[i]) { - faili = i - 1; - err = -ENOMEM; - goto fail; - } - - if (i == 0) - err = prepare_uptodate_page(inode, pages[i], pos, - force_uptodate); - if (!err && i == num_pages - 1) - err = prepare_uptodate_page(inode, pages[i], - pos + write_bytes, false); - if (err) { - put_page(pages[i]); - if (err == -EAGAIN) { - err = 0; - goto again; - } - faili = i - 1; - goto fail; - } - wait_on_page_writeback(pages[i]); - } - - return 0; -fail: - while (faili >= 0) { - unlock_page(pages[faili]); - put_page(pages[faili]); - faili--; - } - return err; - -} - -/* - * This function locks the extent and properly waits for data=ordered extents - * to finish before allowing the pages to be modified if need. - * - * The return value: - * 1 - the extent is locked - * 0 - the extent is not locked, and everything is OK - * -EAGAIN - need re-prepare the pages - * the other < 0 number - Something wrong happens - */ -static noinline int -lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages, - size_t num_pages, loff_t pos, - size_t write_bytes, - u64 *lockstart, u64 *lockend, - struct extent_state **cached_state) -{ - struct btrfs_fs_info *fs_info = inode->root->fs_info; - u64 start_pos; - u64 last_pos; - int i; - int ret = 0; - - start_pos = round_down(pos, fs_info->sectorsize); - last_pos = start_pos - + round_up(pos + write_bytes - start_pos, - fs_info->sectorsize) - 1; - - if (start_pos < inode->vfs_inode.i_size) { - struct btrfs_ordered_extent *ordered; - - lock_extent_bits(&inode->io_tree, start_pos, last_pos, - cached_state); - ordered = btrfs_lookup_ordered_range(inode, start_pos, - last_pos - start_pos + 1); - if (ordered && - ordered->file_offset + ordered->len > start_pos && - ordered->file_offset <= last_pos) { - unlock_extent_cached(&inode->io_tree, start_pos, - last_pos, cached_state); - for (i = 0; i < num_pages; i++) { - unlock_page(pages[i]); - put_page(pages[i]); - } - btrfs_start_ordered_extent(&inode->vfs_inode, - ordered, 1); - btrfs_put_ordered_extent(ordered); - return -EAGAIN; - } - if (ordered) - btrfs_put_ordered_extent(ordered); - - *lockstart = start_pos; - *lockend = last_pos; - ret = 1; - } - - /* - * It's possible the pages are dirty right now, but we don't want - * to clean them yet because copy_from_user may catch a page fault - * and we might have to fall back to one page at a time. If that - * happens, we'll unlock these pages and we'd have a window where - * reclaim could sneak in and drop the once-dirty page on the floor - * without writing it. - * - * We have the pages locked and the extent range locked, so there's - * no way someone can start IO on any dirty pages in this range. - * - * We'll call btrfs_dirty_pages() later on, and that will flip around - * delalloc bits and dirty the pages as required. - */ - for (i = 0; i < num_pages; i++) { - WARN_ON(!PageLocked(pages[i])); - } - - return ret; -} - static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos, size_t *write_bytes) { @@ -1581,238 +1350,6 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos, return ret; } -static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, - struct iov_iter *i) -{ - struct file *file = iocb->ki_filp; - loff_t pos = iocb->ki_pos; - struct inode *inode = file_inode(file); - struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); - struct btrfs_root *root = BTRFS_I(inode)->root; - struct page **pages = NULL; - struct extent_state *cached_state = NULL; - struct extent_changeset *data_reserved = NULL; - u64 release_bytes = 0; - u64 lockstart; - u64 lockend; - size_t num_written = 0; - int nrptrs; - int ret = 0; - bool only_release_metadata = false; - bool force_page_uptodate = false; - - nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE), - PAGE_SIZE / (sizeof(struct page *))); - nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied); - nrptrs = max(nrptrs, 8); - pages = kmalloc_array(nrptrs, sizeof(struct page *), GFP_KERNEL); - if (!pages) - return -ENOMEM; - - while (iov_iter_count(i) > 0) { - size_t offset = offset_in_page(pos); - size_t sector_offset; - size_t write_bytes = min(iov_iter_count(i), - nrptrs * (size_t)PAGE_SIZE - - offset); - size_t num_pages = DIV_ROUND_UP(write_bytes + offset, - PAGE_SIZE); - size_t reserve_bytes; - size_t dirty_pages; - size_t copied; - size_t dirty_sectors; - size_t num_sectors; - int extents_locked; - - WARN_ON(num_pages > nrptrs); - - /* - * Fault pages before locking them in prepare_pages - * to avoid recursive lock - */ - if (unlikely(iov_iter_fault_in_readable(i, write_bytes))) { - ret = -EFAULT; - break; - } - - sector_offset = pos & (fs_info->sectorsize - 1); - reserve_bytes = round_up(write_bytes + sector_offset, - fs_info->sectorsize); - - extent_changeset_release(data_reserved); - ret = btrfs_check_data_free_space(inode, &data_reserved, pos, - write_bytes); - if (ret < 0) { - if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW | - BTRFS_INODE_PREALLOC)) && - check_can_nocow(BTRFS_I(inode), pos, - &write_bytes) > 0) { - /* - * For nodata cow case, no need to reserve - * data space. - */ - only_release_metadata = true; - /* - * our prealloc extent may be smaller than - * write_bytes, so scale down. - */ - num_pages = DIV_ROUND_UP(write_bytes + offset, - PAGE_SIZE); - reserve_bytes = round_up(write_bytes + - sector_offset, - fs_info->sectorsize); - } else { - break; - } - } - - WARN_ON(reserve_bytes == 0); - ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), - reserve_bytes); - if (ret) { - if (!only_release_metadata) - btrfs_free_reserved_data_space(inode, - data_reserved, pos, - write_bytes); - else - btrfs_end_write_no_snapshotting(root); - break; - } - - release_bytes = reserve_bytes; -again: - /* - * This is going to setup the pages array with the number of - * pages we want, so we don't really need to worry about the - * contents of pages from loop to loop - */ - ret = prepare_pages(inode, pages, num_pages, - pos, write_bytes, - force_page_uptodate); - if (ret) { - btrfs_delalloc_release_extents(BTRFS_I(inode), - reserve_bytes, true); - break; - } - - extents_locked = lock_and_cleanup_extent_if_need( - BTRFS_I(inode), pages, - num_pages, pos, write_bytes, &lockstart, - &lockend, &cached_state); - if (extents_locked < 0) { - if (extents_locked == -EAGAIN) - goto again; - btrfs_delalloc_release_extents(BTRFS_I(inode), - reserve_bytes, true); - ret = extents_locked; - break; - } - - copied = btrfs_copy_from_user(pos, write_bytes, pages, i); - - num_sectors = BTRFS_BYTES_TO_BLKS(fs_info, reserve_bytes); - dirty_sectors = round_up(copied + sector_offset, - fs_info->sectorsize); - dirty_sectors = BTRFS_BYTES_TO_BLKS(fs_info, dirty_sectors); - - /* - * if we have trouble faulting in the pages, fall - * back to one page at a time - */ - if (copied < write_bytes) - nrptrs = 1; - - if (copied == 0) { - force_page_uptodate = true; - dirty_sectors = 0; - dirty_pages = 0; - } else { - force_page_uptodate = false; - dirty_pages = DIV_ROUND_UP(copied + offset, - PAGE_SIZE); - } - - if (num_sectors > dirty_sectors) { - /* release everything except the sectors we dirtied */ - release_bytes -= dirty_sectors << - fs_info->sb->s_blocksize_bits; - if (only_release_metadata) { - btrfs_delalloc_release_metadata(BTRFS_I(inode), - release_bytes, true); - } else { - u64 __pos; - - __pos = round_down(pos, - fs_info->sectorsize) + - (dirty_pages << PAGE_SHIFT); - btrfs_delalloc_release_space(inode, - data_reserved, __pos, - release_bytes, true); - } - } - - release_bytes = round_up(copied + sector_offset, - fs_info->sectorsize); - - if (copied > 0) - ret = btrfs_dirty_pages(inode, pages, dirty_pages, - pos, copied, &cached_state); - if (extents_locked) - unlock_extent_cached(&BTRFS_I(inode)->io_tree, - lockstart, lockend, &cached_state); - btrfs_delalloc_release_extents(BTRFS_I(inode), reserve_bytes, - true); - if (ret) { - btrfs_drop_pages(pages, num_pages); - break; - } - - release_bytes = 0; - if (only_release_metadata) - btrfs_end_write_no_snapshotting(root); - - if (only_release_metadata && copied > 0) { - lockstart = round_down(pos, - fs_info->sectorsize); - lockend = round_up(pos + copied, - fs_info->sectorsize) - 1; - - set_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, - lockend, EXTENT_NORESERVE, NULL, - NULL, GFP_NOFS); - only_release_metadata = false; - } - - btrfs_drop_pages(pages, num_pages); - - cond_resched(); - - balance_dirty_pages_ratelimited(inode->i_mapping); - if (dirty_pages < (fs_info->nodesize >> PAGE_SHIFT) + 1) - btrfs_btree_balance_dirty(fs_info); - - pos += copied; - num_written += copied; - } - - kfree(pages); - - if (release_bytes) { - if (only_release_metadata) { - btrfs_end_write_no_snapshotting(root); - btrfs_delalloc_release_metadata(BTRFS_I(inode), - release_bytes, true); - } else { - btrfs_delalloc_release_space(inode, data_reserved, - round_down(pos, fs_info->sectorsize), - release_bytes, true); - } - } - - extent_changeset_free(data_reserved); - return num_written ? num_written : ret; -} - static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) { struct file *file = iocb->ki_filp; From patchwork Sun Sep 1 20:08:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125673 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 921EB13B1 for ; Sun, 1 Sep 2019 20:09:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7B16922CF7 for ; Sun, 1 Sep 2019 20:09:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729025AbfIAUJG (ORCPT ); Sun, 1 Sep 2019 16:09:06 -0400 Received: from mx2.suse.de ([195.135.220.15]:50546 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729010AbfIAUJE (ORCPT ); Sun, 1 Sep 2019 16:09:04 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id B9D02B63B; Sun, 1 Sep 2019 20:09:03 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 08/15] fs: Export generic_file_buffered_read() Date: Sun, 1 Sep 2019 15:08:29 -0500 Message-Id: <20190901200836.14959-9-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Export generic_file_buffered_read() to be used to supplement incomplete direct reads. While we are at it, correct the comments and variable names. Signed-off-by: Goldwyn Rodrigues --- include/linux/fs.h | 2 ++ mm/filemap.c | 13 +++++++------ 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index 997a530ff4e9..26d827434060 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3041,6 +3041,8 @@ extern int generic_file_rw_checks(struct file *file_in, struct file *file_out); extern int generic_copy_file_checks(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, size_t *count, unsigned int flags); +extern ssize_t generic_file_buffered_read(struct kiocb *iocb, + struct iov_iter *to, ssize_t already_read); extern ssize_t generic_file_read_iter(struct kiocb *, struct iov_iter *); extern ssize_t __generic_file_write_iter(struct kiocb *, struct iov_iter *); extern ssize_t generic_file_write_iter(struct kiocb *, struct iov_iter *); diff --git a/mm/filemap.c b/mm/filemap.c index d0cf700bf201..2121ae01eae8 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2014,7 +2014,7 @@ static void shrink_readahead_size_eio(struct file *filp, * generic_file_buffered_read - generic file read routine * @iocb: the iocb to read * @iter: data destination - * @written: already copied + * @copied: already copied * * This is a generic file read routine, and uses the * mapping->a_ops->readpage() function for the actual low-level stuff. @@ -2023,11 +2023,11 @@ static void shrink_readahead_size_eio(struct file *filp, * of the logic when it comes to error handling etc. * * Return: - * * total number of bytes copied, including those the were already @written + * * total number of bytes copied, including those the were @copied * * negative error code if nothing was copied */ -static ssize_t generic_file_buffered_read(struct kiocb *iocb, - struct iov_iter *iter, ssize_t written) +ssize_t generic_file_buffered_read(struct kiocb *iocb, + struct iov_iter *iter, ssize_t copied) { struct file *filp = iocb->ki_filp; struct address_space *mapping = filp->f_mapping; @@ -2168,7 +2168,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb, prev_offset = offset; put_page(page); - written += ret; + copied += ret; if (!iov_iter_count(iter)) goto out; if (ret < nr) { @@ -2276,8 +2276,9 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb, *ppos = ((loff_t)index << PAGE_SHIFT) + offset; file_accessed(filp); - return written ? written : error; + return copied ? copied : error; } +EXPORT_SYMBOL_GPL(generic_file_buffered_read); /** * generic_file_read_iter - generic filesystem read routine From patchwork Sun Sep 1 20:08:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125679 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C3F0D17EF for ; Sun, 1 Sep 2019 20:09:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id AD9FE22CF7 for ; Sun, 1 Sep 2019 20:09:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729037AbfIAUJJ (ORCPT ); Sun, 1 Sep 2019 16:09:09 -0400 Received: from mx2.suse.de ([195.135.220.15]:50566 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729020AbfIAUJH (ORCPT ); Sun, 1 Sep 2019 16:09:07 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id C9120B62E; Sun, 1 Sep 2019 20:09:05 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 09/15] btrfs: basic direct read operation Date: Sun, 1 Sep 2019 15:08:30 -0500 Message-Id: <20190901200836.14959-10-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Add btrfs_dio_iomap_ops for iomap.begin() function. In order to accomodate dio reads, add a new function btrfs_file_read_iter() which would call btrfs_dio_iomap_read() for DIO reads and fallback to generic_file_buffered_read otherwise. Changed parameter written in generic_file_buffered_read() to already_read. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/file.c | 13 ++++++++++++- fs/btrfs/iomap.c | 20 ++++++++++++++++++++ 3 files changed, 34 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 9c501c7826b4..5ca3a365e639 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3243,7 +3243,9 @@ int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end); loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t len, unsigned int remap_flags); +/* iomap.c */ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from); +ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to); /* tree-defrag.c */ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index f7087e28ac08..e7f67d514ba8 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2839,9 +2839,20 @@ static int btrfs_file_open(struct inode *inode, struct file *filp) return generic_file_open(inode, filp); } +static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) +{ + ssize_t ret = 0; + if (iocb->ki_flags & IOCB_DIRECT) + ret = btrfs_dio_iomap_read(iocb, to); + if (ret < 0) + return ret; + + return generic_file_buffered_read(iocb, to, ret); +} + const struct file_operations btrfs_file_operations = { .llseek = btrfs_file_llseek, - .read_iter = generic_file_read_iter, + .read_iter = btrfs_file_read_iter, .splice_read = generic_file_splice_read, .write_iter = btrfs_file_write_iter, .mmap = btrfs_file_mmap, diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c index 8ed93d427962..ee94c687db1a 100644 --- a/fs/btrfs/iomap.c +++ b/fs/btrfs/iomap.c @@ -421,3 +421,23 @@ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from) return written; } +static int btrfs_dio_iomap_begin(struct inode *inode, loff_t pos, + loff_t length, unsigned flags, struct iomap *iomap, + struct iomap *srcmap) +{ + return get_iomap(inode, pos, length, iomap); +} + +static const struct iomap_ops btrfs_dio_iomap_ops = { + .iomap_begin = btrfs_dio_iomap_begin, +}; + +ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to) +{ + struct inode *inode = file_inode(iocb->ki_filp); + ssize_t ret; + inode_lock_shared(inode); + ret = iomap_dio_rw(iocb, to, &btrfs_dio_iomap_ops, NULL); + inode_unlock_shared(inode); + return ret; +} From patchwork Sun Sep 1 20:08:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125681 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5F26613B1 for ; Sun, 1 Sep 2019 20:09:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4820F22CF7 for ; Sun, 1 Sep 2019 20:09:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729048AbfIAUJK (ORCPT ); Sun, 1 Sep 2019 16:09:10 -0400 Received: from mx2.suse.de ([195.135.220.15]:50578 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729031AbfIAUJJ (ORCPT ); Sun, 1 Sep 2019 16:09:09 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id E3C1DB64E; Sun, 1 Sep 2019 20:09:07 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 10/15] btrfs: Carve out btrfs_get_extent_map_write() out of btrfs_get_blocks_write() Date: Sun, 1 Sep 2019 15:08:31 -0500 Message-Id: <20190901200836.14959-11-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues This makes btrfs_get_extent_map_write() independent of Direct I/O code. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/inode.c | 40 +++++++++++++++++++++++++++------------- 2 files changed, 29 insertions(+), 13 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 5ca3a365e639..c25dfd8e619b 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3168,6 +3168,8 @@ struct inode *btrfs_iget_path(struct super_block *s, struct btrfs_key *location, struct btrfs_path *path); struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, struct btrfs_root *root, int *was_new); +int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, + struct inode *inode, u64 start, u64 len); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, struct page *page, size_t pg_offset, u64 start, u64 end, int create); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 258bacefdf5f..24895793fd91 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7592,11 +7592,10 @@ static int btrfs_get_blocks_direct_read(struct extent_map *em, return 0; } -static int btrfs_get_blocks_direct_write(struct extent_map **map, - struct buffer_head *bh_result, - struct inode *inode, - struct btrfs_dio_data *dio_data, - u64 start, u64 len) +int btrfs_get_extent_map_write(struct extent_map **map, + struct buffer_head *bh, + struct inode *inode, + u64 start, u64 len) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct extent_map *em = *map; @@ -7650,22 +7649,38 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, */ btrfs_free_reserved_data_space_noquota(inode, start, len); - goto skip_cow; + /* skip COW */ + goto out; } } /* this will cow the extent */ - len = bh_result->b_size; + if (bh) + len = bh->b_size; free_extent_map(em); *map = em = btrfs_new_extent_direct(inode, start, len); - if (IS_ERR(em)) { - ret = PTR_ERR(em); - goto out; - } + if (IS_ERR(em)) + return PTR_ERR(em); +out: + return ret; +} +static int btrfs_get_blocks_direct_write(struct extent_map **map, + struct buffer_head *bh_result, + struct inode *inode, + struct btrfs_dio_data *dio_data, + u64 start, u64 len) +{ + int ret; + struct extent_map *em; + + ret = btrfs_get_extent_map_write(map, bh_result, inode, + start, len); + if (ret < 0) + return ret; + em = *map; len = min(len, em->len - (start - em->start)); -skip_cow: bh_result->b_blocknr = (em->block_start + (start - em->start)) >> inode->i_blkbits; bh_result->b_size = len; @@ -7686,7 +7701,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, dio_data->reserve -= len; dio_data->unsubmitted_oe_range_end = start + len; current->journal_info = dio_data; -out: return ret; } From patchwork Sun Sep 1 20:08:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125685 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4B9DC1510 for ; Sun, 1 Sep 2019 20:09:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3460F2190F for ; Sun, 1 Sep 2019 20:09:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729058AbfIAUJN (ORCPT ); Sun, 1 Sep 2019 16:09:13 -0400 Received: from mx2.suse.de ([195.135.220.15]:50610 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729043AbfIAUJM (ORCPT ); Sun, 1 Sep 2019 16:09:12 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 090D6B65E; Sun, 1 Sep 2019 20:09:10 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 11/15] btrfs: Rename __endio_write_update_ordered() to btrfs_update_ordered_extent() Date: Sun, 1 Sep 2019 15:08:32 -0500 Message-Id: <20190901200836.14959-12-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Since we will be using it in another part of the code, use a better name to declare it non-static Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 7 +++++-- fs/btrfs/inode.c | 14 +++++--------- 2 files changed, 10 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index c25dfd8e619b..04c119ca229b 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3171,8 +3171,11 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, struct inode *inode, u64 start, u64 len); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, - struct page *page, size_t pg_offset, - u64 start, u64 end, int create); + struct page *page, size_t pg_offset, + u64 start, u64 end, int create); +void btrfs_update_ordered_extent(struct inode *inode, + const u64 offset, const u64 bytes, + const bool uptodate); int btrfs_update_inode(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct inode *inode); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 24895793fd91..d415534ce733 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -89,10 +89,6 @@ static struct extent_map *create_io_em(struct inode *inode, u64 start, u64 len, u64 ram_bytes, int compress_type, int type); -static void __endio_write_update_ordered(struct inode *inode, - const u64 offset, const u64 bytes, - const bool uptodate); - /* * Cleanup all submitted ordered extents in specified range to handle errors * from the btrfs_run_delalloc_range() callback. @@ -133,7 +129,7 @@ static inline void btrfs_cleanup_ordered_extents(struct inode *inode, bytes -= PAGE_SIZE; } - return __endio_write_update_ordered(inode, offset, bytes, false); + return btrfs_update_ordered_extent(inode, offset, bytes, false); } static int btrfs_dirty_inode(struct inode *inode); @@ -8176,7 +8172,7 @@ static void btrfs_endio_direct_read(struct bio *bio) bio_put(bio); } -static void __endio_write_update_ordered(struct inode *inode, +void btrfs_update_ordered_extent(struct inode *inode, const u64 offset, const u64 bytes, const bool uptodate) { @@ -8229,7 +8225,7 @@ static void btrfs_endio_direct_write(struct bio *bio) struct btrfs_dio_private *dip = bio->bi_private; struct bio *dio_bio = dip->dio_bio; - __endio_write_update_ordered(dip->inode, dip->logical_offset, + btrfs_update_ordered_extent(dip->inode, dip->logical_offset, dip->bytes, !bio->bi_status); kfree(dip); @@ -8546,7 +8542,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, bio = NULL; } else { if (write) - __endio_write_update_ordered(inode, + btrfs_update_ordered_extent(inode, file_offset, dio_bio->bi_iter.bi_size, false); @@ -8686,7 +8682,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) */ if (dio_data.unsubmitted_oe_range_start < dio_data.unsubmitted_oe_range_end) - __endio_write_update_ordered(inode, + btrfs_update_ordered_extent(inode, dio_data.unsubmitted_oe_range_start, dio_data.unsubmitted_oe_range_end - dio_data.unsubmitted_oe_range_start, From patchwork Sun Sep 1 20:08:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125689 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 270CA17EF for ; Sun, 1 Sep 2019 20:09:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0E2BE22CF7 for ; Sun, 1 Sep 2019 20:09:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729064AbfIAUJO (ORCPT ); Sun, 1 Sep 2019 16:09:14 -0400 Received: from mx2.suse.de ([195.135.220.15]:50658 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729051AbfIAUJN (ORCPT ); Sun, 1 Sep 2019 16:09:13 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 2367DB65F; Sun, 1 Sep 2019 20:09:12 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 12/15] iomap: use a function pointer for dio submits Date: Sun, 1 Sep 2019 15:08:33 -0500 Message-Id: <20190901200836.14959-13-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues This helps filesystems to perform tasks on the bio while submitting for I/O. Since btrfs requires the position we are working on, pass pos to iomap_dio_submit_bio() Signed-off-by: Goldwyn Rodrigues Signed-off-by: Christoph Hellwig --- fs/iomap/direct-io.c | 16 +++++++++++----- include/linux/iomap.h | 1 + 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 5279029c7a3c..00b8af35cf97 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -59,7 +59,7 @@ int iomap_dio_iopoll(struct kiocb *kiocb, bool spin) EXPORT_SYMBOL_GPL(iomap_dio_iopoll); static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, - struct bio *bio) + struct bio *bio, loff_t pos) { atomic_inc(&dio->ref); @@ -67,7 +67,13 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, bio_set_polled(bio, dio->iocb); dio->submit.last_queue = bdev_get_queue(iomap->bdev); - dio->submit.cookie = submit_bio(bio); + if (iomap->submit_io) { + iomap->submit_io(bio, file_inode(dio->iocb->ki_filp), + pos); + dio->submit.cookie = BLK_QC_T_NONE; + } else { + dio->submit.cookie = submit_bio(bio); + } } static ssize_t iomap_dio_complete(struct iomap_dio *dio) @@ -195,7 +201,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, get_page(page); __bio_add_page(bio, page, len, 0); bio_set_op_attrs(bio, REQ_OP_WRITE, flags); - iomap_dio_submit_bio(dio, iomap, bio); + iomap_dio_submit_bio(dio, iomap, bio, pos); } static loff_t @@ -301,11 +307,11 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, iov_iter_advance(dio->submit.iter, n); dio->size += n; - pos += n; copied += n; nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES); - iomap_dio_submit_bio(dio, iomap, bio); + iomap_dio_submit_bio(dio, iomap, bio, pos); + pos += n; } while (nr_pages); /* diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 9c082d36286a..465d736d74ac 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -80,6 +80,7 @@ struct iomap { void *inline_data; void *private; /* filesystem private */ const struct iomap_page_ops *page_ops; + dio_submit_t *submit_io; }; static inline sector_t From patchwork Sun Sep 1 20:08:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125693 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C83BF13B1 for ; Sun, 1 Sep 2019 20:09:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A653622CF7 for ; Sun, 1 Sep 2019 20:09:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729072AbfIAUJQ (ORCPT ); Sun, 1 Sep 2019 16:09:16 -0400 Received: from mx2.suse.de ([195.135.220.15]:50680 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729051AbfIAUJQ (ORCPT ); Sun, 1 Sep 2019 16:09:16 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 3C15FB660; Sun, 1 Sep 2019 20:09:14 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 13/15] btrfs: Use iomap_dio_rw for performing direct I/O writes Date: Sun, 1 Sep 2019 15:08:34 -0500 Message-Id: <20190901200836.14959-14-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues btrfs_iomap_init() is a function to be used to btrfs_iomap structure which is used to pass information between iomap begin() and end(). All data reservations and allocations must be performed in this function. For reads, btrfs_iomap allocation is not required. We perform space allocation and reservation before the iomap_dio_rw() call, as opposed to iomap_begin(). This is how the current direct I/O path performs. The problem with putting in the iomap_begin() is that the transaction needs to be committed before new allocation is performed. Performing a transaction for every direct sub-write will reduce performance. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 3 ++ fs/btrfs/file.c | 2 +- fs/btrfs/inode.c | 14 +++-- fs/btrfs/iomap.c | 153 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- 4 files changed, 160 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 04c119ca229b..7f84b7e47c8a 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3202,6 +3202,8 @@ int btrfs_writepage_cow_fixup(struct page *page, u64 start, u64 end); void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start, u64 end, int uptodate); extern const struct dentry_operations btrfs_dentry_operations; +void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, + loff_t file_offset); /* ioctl.c */ long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg); @@ -3251,6 +3253,7 @@ loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in, /* iomap.c */ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from); ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to); +ssize_t btrfs_dio_iomap_write(struct kiocb *iocb, struct iov_iter *from); /* tree-defrag.c */ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index e7f67d514ba8..5d4347e12cdc 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1501,7 +1501,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, atomic_inc(&BTRFS_I(inode)->sync_writers); if (iocb->ki_flags & IOCB_DIRECT) { - num_written = __btrfs_direct_write(iocb, from); + num_written = btrfs_dio_iomap_write(iocb, from); } else { num_written = btrfs_buffered_iomap_write(iocb, from); if (num_written > 0) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d415534ce733..323d72858c9c 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8167,9 +8167,8 @@ static void btrfs_endio_direct_read(struct bio *bio) kfree(dip); dio_bio->bi_status = err; - dio_end_io(dio_bio); + bio_endio(dio_bio); btrfs_io_bio_free_csum(io_bio); - bio_put(bio); } void btrfs_update_ordered_extent(struct inode *inode, @@ -8231,8 +8230,7 @@ static void btrfs_endio_direct_write(struct bio *bio) kfree(dip); dio_bio->bi_status = bio->bi_status; - dio_end_io(dio_bio); - bio_put(bio); + bio_endio(dio_bio); } static blk_status_t btrfs_submit_bio_start_direct_io(void *private_data, @@ -8464,8 +8462,8 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip) return 0; } -static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, - loff_t file_offset) +void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, + loff_t file_offset) { struct btrfs_dio_private *dip = NULL; struct bio *bio = NULL; @@ -8536,7 +8534,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, /* * The end io callbacks free our dip, do the final put on bio * and all the cleanup and final put for dio_bio (through - * dio_end_io()). + * end_io()). */ dip = NULL; bio = NULL; @@ -8555,7 +8553,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, * Releases and cleans up our dio_bio, no need to bio_put() * nor bio_endio()/bio_io_error() against dio_bio. */ - dio_end_io(dio_bio); + bio_endio(dio_bio); } if (bio) bio_put(bio); diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c index ee94c687db1a..554f30d83d57 100644 --- a/fs/btrfs/iomap.c +++ b/fs/btrfs/iomap.c @@ -7,6 +7,7 @@ */ #include +#include #include "ctree.h" #include "btrfs_inode.h" #include "volumes.h" @@ -421,15 +422,110 @@ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from) return written; } +static struct btrfs_iomap *btrfs_iomap_init(struct inode *inode, + struct extent_map **em, + loff_t pos, loff_t length) +{ + int ret = 0; + struct extent_map *map = *em; + struct btrfs_iomap *bi; + u64 num_bytes; + + bi = kzalloc(sizeof(struct btrfs_iomap), GFP_NOFS); + if (!bi) + return ERR_PTR(-ENOMEM); + + bi->start = round_down(pos, PAGE_SIZE); + bi->end = PAGE_ALIGN(pos + length) - 1; + num_bytes = bi->end - bi->start + 1; + + /* Wait for existing ordered extents in range to finish */ + btrfs_wait_ordered_range(inode, bi->start, num_bytes); + + lock_extent_bits(&BTRFS_I(inode)->io_tree, bi->start, bi->end, &bi->cached_state); + + if (ret) { + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + kfree(bi); + return ERR_PTR(ret); + } + + refcount_inc(&map->refs); + ret = btrfs_get_extent_map_write(em, NULL, + inode, bi->start, num_bytes); + if (ret) { + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + kfree(bi); + return ERR_PTR(ret); + } + free_extent_map(map); + return bi; +} + static int btrfs_dio_iomap_begin(struct inode *inode, loff_t pos, - loff_t length, unsigned flags, struct iomap *iomap, - struct iomap *srcmap) + loff_t length, unsigned flags, struct iomap *iomap, + struct iomap *srcmap) +{ + struct extent_map *em; + struct btrfs_iomap *bi = NULL; + + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); + + if (flags & IOMAP_WRITE) { + srcmap->offset = em->start; + srcmap->length = em->len; + srcmap->bdev = em->bdev; + if (em->block_start == EXTENT_MAP_HOLE) { + srcmap->type = IOMAP_HOLE; + } else { + srcmap->type = IOMAP_MAPPED; + srcmap->addr = em->block_start; + } + bi = btrfs_iomap_init(inode, &em, pos, length); + if (IS_ERR(bi)) + return PTR_ERR(bi); + } + + iomap->offset = em->start; + iomap->length = em->len; + iomap->bdev = em->bdev; + + if (em->block_start == EXTENT_MAP_HOLE) { + iomap->type = IOMAP_HOLE; + } else { + iomap->type = IOMAP_MAPPED; + iomap->addr = em->block_start; + } + iomap->private = bi; + iomap->submit_io = btrfs_submit_direct; + return 0; +} + +static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, + loff_t length, ssize_t written, unsigned flags, + struct iomap *iomap) { - return get_iomap(inode, pos, length, iomap); + struct btrfs_iomap *bi = iomap->private; + u64 wend; + loff_t release_bytes; + + if (!bi) + return 0; + + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + + wend = PAGE_ALIGN(pos + written); + release_bytes = wend - bi->end - 1; + kfree(bi); + return 0; } static const struct iomap_ops btrfs_dio_iomap_ops = { .iomap_begin = btrfs_dio_iomap_begin, + .iomap_end = btrfs_dio_iomap_end, }; ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to) @@ -441,3 +537,54 @@ ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to) inode_unlock_shared(inode); return ret; } + +ssize_t btrfs_dio_iomap_write(struct kiocb *iocb, struct iov_iter *from) +{ + struct file *file = iocb->ki_filp; + struct inode *inode = file_inode(file); + ssize_t written, written_buffered; + loff_t pos = iocb->ki_pos, endbyte; + size_t count = iov_iter_count(from); + struct extent_changeset *data_reserved = NULL; + int err; + + btrfs_delalloc_reserve_space(inode, &data_reserved, pos, count); + + written = iomap_dio_rw(iocb, from, &btrfs_dio_iomap_ops, NULL); + if (written < count) { + ssize_t done = (written < 0) ? 0 : written; + btrfs_delalloc_release_space(inode, data_reserved, pos, count - done, + true); + } + btrfs_delalloc_release_extents(BTRFS_I(inode), count, false); + extent_changeset_free(data_reserved); + + if (written < 0 || !iov_iter_count(from)) + return written; + + pos = iocb->ki_pos; + written_buffered = btrfs_buffered_iomap_write(iocb, from); + if (written_buffered < 0) { + err = written_buffered; + goto out; + } + /* + * Ensure all data is persisted. We want the next direct IO read to be + * able to read what was just written. + */ + endbyte = pos + written_buffered - 1; + err = btrfs_fdatawrite_range(inode, pos, endbyte); + if (err) + goto out; + err = filemap_fdatawait_range(inode->i_mapping, pos, endbyte); + if (err) + goto out; + written += written_buffered; + iocb->ki_pos = pos + written_buffered; + invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT, + endbyte >> PAGE_SHIFT); +out: + if (written > 0 && iocb->ki_pos > i_size_read(inode)) + i_size_write(inode, iocb->ki_pos); + return written ? written : err; +} From patchwork Sun Sep 1 20:08:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125701 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 17BF713B1 for ; Sun, 1 Sep 2019 20:09:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id EB7B722D6D for ; Sun, 1 Sep 2019 20:09:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729090AbfIAUJT (ORCPT ); Sun, 1 Sep 2019 16:09:19 -0400 Received: from mx2.suse.de ([195.135.220.15]:50714 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727517AbfIAUJT (ORCPT ); Sun, 1 Sep 2019 16:09:19 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 45DF0B661; Sun, 1 Sep 2019 20:09:16 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 14/15] btrfs: Remove btrfs_dio_data and __btrfs_direct_write Date: Sun, 1 Sep 2019 15:08:35 -0500 Message-Id: <20190901200836.14959-15-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues btrfs_dio_data is unnecessary since we are now storing all informaiton in btrfs_iomap. Advantage: We don't abuse current->journal_info anymore :) --- fs/btrfs/file.c | 40 ---------------------------- fs/btrfs/inode.c | 81 ++------------------------------------------------------ 2 files changed, 2 insertions(+), 119 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 5d4347e12cdc..e6c1ffd74660 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1350,46 +1350,6 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos, return ret; } -static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) -{ - struct file *file = iocb->ki_filp; - struct inode *inode = file_inode(file); - loff_t pos; - ssize_t written; - ssize_t written_buffered; - loff_t endbyte; - int err; - - written = generic_file_direct_write(iocb, from); - - if (written < 0 || !iov_iter_count(from)) - return written; - - pos = iocb->ki_pos; - written_buffered = btrfs_buffered_iomap_write(iocb, from); - if (written_buffered < 0) { - err = written_buffered; - goto out; - } - /* - * Ensure all data is persisted. We want the next direct IO read to be - * able to read what was just written. - */ - endbyte = pos + written_buffered - 1; - err = btrfs_fdatawrite_range(inode, pos, endbyte); - if (err) - goto out; - err = filemap_fdatawait_range(inode->i_mapping, pos, endbyte); - if (err) - goto out; - written += written_buffered; - iocb->ki_pos = pos + written_buffered; - invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT, - endbyte >> PAGE_SHIFT); -out: - return written ? written : err; -} - static void update_time_for_write(struct inode *inode) { struct timespec64 now; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 323d72858c9c..87fbe73ca2e4 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -54,13 +54,6 @@ struct btrfs_iget_args { struct btrfs_root *root; }; -struct btrfs_dio_data { - u64 reserve; - u64 unsubmitted_oe_range_start; - u64 unsubmitted_oe_range_end; - int overwrite; -}; - static const struct inode_operations btrfs_dir_inode_operations; static const struct inode_operations btrfs_symlink_inode_operations; static const struct inode_operations btrfs_dir_ro_inode_operations; @@ -7664,7 +7657,6 @@ int btrfs_get_extent_map_write(struct extent_map **map, static int btrfs_get_blocks_direct_write(struct extent_map **map, struct buffer_head *bh_result, struct inode *inode, - struct btrfs_dio_data *dio_data, u64 start, u64 len) { int ret; @@ -7686,17 +7678,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, if (!test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) set_buffer_new(bh_result); - /* - * Need to update the i_size under the extent lock so buffered - * readers will get the updated i_size when we unlock. - */ - if (!dio_data->overwrite && start + len > i_size_read(inode)) - i_size_write(inode, start + len); - - WARN_ON(dio_data->reserve < len); - dio_data->reserve -= len; - dio_data->unsubmitted_oe_range_end = start + len; - current->journal_info = dio_data; return ret; } @@ -7706,7 +7687,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct extent_map *em; struct extent_state *cached_state = NULL; - struct btrfs_dio_data *dio_data = NULL; u64 start = iblock << inode->i_blkbits; u64 lockstart, lockend; u64 len = bh_result->b_size; @@ -7721,16 +7701,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, lockstart = start; lockend = start + len - 1; - if (current->journal_info) { - /* - * Need to pull our outstanding extents and set journal_info to NULL so - * that anything that needs to check if there's a transaction doesn't get - * confused. - */ - dio_data = current->journal_info; - current->journal_info = NULL; - } - /* * If this errors out it's because we couldn't invalidate pagecache for * this range and we need to fallback to buffered. @@ -7770,7 +7740,7 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, if (create) { ret = btrfs_get_blocks_direct_write(&em, bh_result, inode, - dio_data, start, len); + start, len); if (ret < 0) goto unlock_err; @@ -7808,8 +7778,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend, unlock_bits, 1, 0, &cached_state); err: - if (dio_data) - current->journal_info = dio_data; return ret; } @@ -8498,21 +8466,6 @@ void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, dip->subio_endio = btrfs_subio_endio_read; } - /* - * Reset the range for unsubmitted ordered extents (to a 0 length range) - * even if we fail to submit a bio, because in such case we do the - * corresponding error handling below and it must not be done a second - * time by btrfs_direct_IO(). - */ - if (write) { - struct btrfs_dio_data *dio_data = current->journal_info; - - dio_data->unsubmitted_oe_range_end = dip->logical_offset + - dip->bytes; - dio_data->unsubmitted_oe_range_start = - dio_data->unsubmitted_oe_range_end; - } - ret = btrfs_submit_direct_hook(dip); if (!ret) return; @@ -8598,7 +8551,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) struct file *file = iocb->ki_filp; struct inode *inode = file->f_mapping->host; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); - struct btrfs_dio_data dio_data = { 0 }; struct extent_changeset *data_reserved = NULL; loff_t offset = iocb->ki_pos; size_t count = 0; @@ -8631,7 +8583,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) * not unlock the i_mutex at this case. */ if (offset + count <= inode->i_size) { - dio_data.overwrite = 1; inode_unlock(inode); relock = true; } else if (iocb->ki_flags & IOCB_NOWAIT) { @@ -8643,16 +8594,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) if (ret) goto out; - /* - * We need to know how many extents we reserved so that we can - * do the accounting properly if we go over the number we - * originally calculated. Abuse current->journal_info for this. - */ - dio_data.reserve = round_up(count, - fs_info->sectorsize); - dio_data.unsubmitted_oe_range_start = (u64)offset; - dio_data.unsubmitted_oe_range_end = (u64)offset; - current->journal_info = &dio_data; down_read(&BTRFS_I(inode)->dio_sem); } else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK, &BTRFS_I(inode)->runtime_flags)) { @@ -8667,25 +8608,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) btrfs_submit_direct, flags); if (iov_iter_rw(iter) == WRITE) { up_read(&BTRFS_I(inode)->dio_sem); - current->journal_info = NULL; - if (ret < 0 && ret != -EIOCBQUEUED) { - if (dio_data.reserve) - btrfs_delalloc_release_space(inode, data_reserved, - offset, dio_data.reserve, true); - /* - * On error we might have left some ordered extents - * without submitting corresponding bios for them, so - * cleanup them up to avoid other tasks getting them - * and waiting for them to complete forever. - */ - if (dio_data.unsubmitted_oe_range_start < - dio_data.unsubmitted_oe_range_end) - btrfs_update_ordered_extent(inode, - dio_data.unsubmitted_oe_range_start, - dio_data.unsubmitted_oe_range_end - - dio_data.unsubmitted_oe_range_start, - false); - } else if (ret >= 0 && (size_t)ret < count) + if (ret >= 0 && (size_t)ret < count) btrfs_delalloc_release_space(inode, data_reserved, offset, count - (size_t)ret, true); btrfs_delalloc_release_extents(BTRFS_I(inode), count, false); From patchwork Sun Sep 1 20:08:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11125697 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 46D5A1510 for ; Sun, 1 Sep 2019 20:09:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2FD8E22CF7 for ; Sun, 1 Sep 2019 20:09:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729106AbfIAUJU (ORCPT ); Sun, 1 Sep 2019 16:09:20 -0400 Received: from mx2.suse.de ([195.135.220.15]:50760 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729051AbfIAUJT (ORCPT ); Sun, 1 Sep 2019 16:09:19 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 511E6B666; Sun, 1 Sep 2019 20:09:18 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, darrick.wong@oracle.com, hch@lst.de, david@fromorbit.com, riteshh@linux.ibm.com, Goldwyn Rodrigues Subject: [PATCH 15/15] btrfs: update inode size during bio completion Date: Sun, 1 Sep 2019 15:08:36 -0500 Message-Id: <20190901200836.14959-16-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190901200836.14959-1-rgoldwyn@suse.de> References: <20190901200836.14959-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Update the inode size for dio writes during bio completion. This ties the success of the underlying block layer whether to increase the size of the inode. Especially for in aio cases. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/inode.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 87fbe73ca2e4..f87a9dd154a9 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8191,9 +8191,13 @@ static void btrfs_endio_direct_write(struct bio *bio) { struct btrfs_dio_private *dip = bio->bi_private; struct bio *dio_bio = dip->dio_bio; + struct inode *inode = dip->inode; - btrfs_update_ordered_extent(dip->inode, dip->logical_offset, + btrfs_update_ordered_extent(inode, dip->logical_offset, dip->bytes, !bio->bi_status); + if (!bio->bi_status && + i_size_read(inode) < dip->logical_offset + dip->bytes) + i_size_write(inode, dip->logical_offset + dip->bytes); kfree(dip);