From patchwork Mon Jan 29 22:18:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andreas Gruenbacher X-Patchwork-Id: 10191001 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 969E360212 for ; Mon, 29 Jan 2018 22:19:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 78F9B28613 for ; Mon, 29 Jan 2018 22:19:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6DD1F2891D; Mon, 29 Jan 2018 22:19:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 943F428613 for ; Mon, 29 Jan 2018 22:18:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751833AbeA2WS6 (ORCPT ); Mon, 29 Jan 2018 17:18:58 -0500 Received: from mx1.redhat.com ([209.132.183.28]:58114 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751579AbeA2WS5 (ORCPT ); Mon, 29 Jan 2018 17:18:57 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9CCE11752A2; Mon, 29 Jan 2018 22:18:57 +0000 (UTC) Received: from max.home.com (ovpn-116-127.ams2.redhat.com [10.36.116.127]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4AA6218AB9; Mon, 29 Jan 2018 22:18:53 +0000 (UTC) From: Andreas Gruenbacher To: cluster-devel@redhat.com Cc: Christoph Hellwig , Dave Chinner , linux-fsdevel@vger.kernel.org, Andreas Gruenbacher Subject: [PATCH v2 3/6] gfs2: Iomap cleanups and improvements Date: Mon, 29 Jan 2018 23:18:41 +0100 Message-Id: <20180129221844.19476-4-agruenba@redhat.com> In-Reply-To: <20180129221844.19476-1-agruenba@redhat.com> References: <20180129221844.19476-1-agruenba@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Mon, 29 Jan 2018 22:18:57 +0000 (UTC) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Clean up gfs2_iomap_alloc and gfs2_iomap_get. Document how gfs2_iomap_alloc works. gfs2_iomap_alloc now needs to be called separately after gfs2_iomap_get where necessary; this will be used later by iomap write. Move gfs2_iomap_ops into bmap.c. Signed-off-by: Andreas Gruenbacher --- fs/gfs2/bmap.c | 197 ++++++++++++++++++++++++++++++-------------------------- fs/gfs2/bmap.h | 4 +- fs/gfs2/inode.c | 4 -- 3 files changed, 109 insertions(+), 96 deletions(-) diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c index 72e22b99ade7..c9edf48f3fe8 100644 --- a/fs/gfs2/bmap.c +++ b/fs/gfs2/bmap.c @@ -422,22 +422,6 @@ static inline unsigned int gfs2_extent_length(void *start, unsigned int len, __b return (ptr - first); } -static inline void bmap_lock(struct gfs2_inode *ip, int create) -{ - if (create) - down_write(&ip->i_rw_mutex); - else - down_read(&ip->i_rw_mutex); -} - -static inline void bmap_unlock(struct gfs2_inode *ip, int create) -{ - if (create) - up_write(&ip->i_rw_mutex); - else - up_read(&ip->i_rw_mutex); -} - static inline __be64 *gfs2_indirect_init(struct metapath *mp, struct gfs2_glock *gl, unsigned int i, unsigned offset, u64 bn) @@ -464,15 +448,11 @@ enum alloc_state { }; /** - * gfs2_bmap_alloc - Build a metadata tree of the requested height + * gfs2_iomap_alloc - Build a metadata tree of the requested height * @inode: The GFS2 inode - * @lblock: The logical starting block of the extent - * @bh_map: This is used to return the mapping details - * @zero_new: True if newly allocated blocks should be zeroed + * @iomap: The iomap structure + * @flags: iomap flags * @mp: The metapath, with proper height information calculated - * @maxlen: The max number of data blocks to alloc - * @dblock: Pointer to return the resulting new block - * @dblks: Pointer to return the number of blocks allocated * * In this routine we may have to alloc: * i) Indirect blocks to grow the metadata tree height @@ -485,6 +465,13 @@ enum alloc_state { * blocks are available, there will only be one request per bmap call) * and uses the state machine to initialise the blocks in order. * + * Right now, this function will allocate at most one indirect block + * worth of data -- with a default block size of 4K, that's slightly + * less than 2M. If this limitation is ever removed to allow huge + * allocations, we would probably still want to limit the iomap size we + * return to avoid stalling other tasks during huge writes; the next + * iomap iteration would then find the blocks already allocated. + * * Returns: errno on error */ @@ -511,6 +498,8 @@ static int gfs2_iomap_alloc(struct inode *inode, struct iomap *iomap, gfs2_trans_add_meta(ip->i_gl, dibh); + down_write(&ip->i_rw_mutex); + if (mp->mp_fheight == mp->mp_aheight) { struct buffer_head *bh; int eob; @@ -546,11 +535,10 @@ static int gfs2_iomap_alloc(struct inode *inode, struct iomap *iomap, blks = dblks + iblks; i = mp->mp_aheight; do { - int error; n = blks - alloced; - error = gfs2_alloc_blocks(ip, &bn, &n, 0, NULL); - if (error) - return error; + ret = gfs2_alloc_blocks(ip, &bn, &n, 0, NULL); + if (ret) + goto out; alloced += n; if (state != ALLOC_DATA || gfs2_is_jdata(ip)) gfs2_trans_add_unrevoke(sdp, bn, n); @@ -606,7 +594,7 @@ static int gfs2_iomap_alloc(struct inode *inode, struct iomap *iomap, dblks = n; ptr = metapointer(end_of_metadata, mp); iomap->addr = bn << inode->i_blkbits; - iomap->flags |= IOMAP_F_NEW; + iomap->flags |= IOMAP_F_MERGED | IOMAP_F_NEW; while (n-- > 0) *ptr++ = cpu_to_be64(bn++); if (flags & IOMAP_ZERO) { @@ -625,8 +613,10 @@ static int gfs2_iomap_alloc(struct inode *inode, struct iomap *iomap, iomap->length = (u64)dblks << inode->i_blkbits; ip->i_height = mp->mp_fheight; gfs2_add_inode_blocks(&ip->i_inode, alloced); - gfs2_dinode_out(ip, mp->mp_bh[0]->b_data); - return 0; + gfs2_dinode_out(ip, dibh->b_data); +out: + up_write(&ip->i_rw_mutex); + return ret; } /** @@ -635,7 +625,7 @@ static int gfs2_iomap_alloc(struct inode *inode, struct iomap *iomap, * @lblock: The logical starting block number * @mp: The metapath * - * Returns: The hole size in bytes + * Returns: The hole size in blocks * */ static u64 hole_size(struct inode *inode, sector_t lblock, struct metapath *mp) @@ -682,7 +672,7 @@ static u64 hole_size(struct inode *inode, sector_t lblock, struct metapath *mp) if (hgt && (mp->mp_list[hgt - 1] < mp_eof.mp_list[hgt - 1])) (mp->mp_list[hgt - 1])++; } - return holesz << inode->i_blkbits; + return holesz; } static void gfs2_stuffed_iomap(struct inode *inode, struct iomap *iomap) @@ -698,55 +688,46 @@ static void gfs2_stuffed_iomap(struct inode *inode, struct iomap *iomap) } /** - * gfs2_iomap_begin - Map blocks from an inode to disk blocks + * gfs2_iomap_get - Map blocks from an inode to disk blocks * @inode: The inode * @pos: Starting position in bytes * @length: Length to map, in bytes * @flags: iomap flags * @iomap: The iomap structure + * @mp: The metapath * * Returns: errno */ -int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length, - unsigned flags, struct iomap *iomap) +static int gfs2_iomap_get(struct inode *inode, loff_t pos, loff_t length, + unsigned flags, struct iomap *iomap, + struct metapath *mp) { struct gfs2_inode *ip = GFS2_I(inode); struct gfs2_sbd *sdp = GFS2_SB(inode); - struct metapath mp = { .mp_aheight = 1, }; unsigned int factor = sdp->sd_sb.sb_bsize; const u64 *arr = sdp->sd_heightsize; __be64 *ptr; sector_t lblock; - sector_t lend; + sector_t lblock_stop; int ret; int eob; - unsigned int len; + u64 len; struct buffer_head *bh; u8 height; - trace_gfs2_iomap_start(ip, pos, length, flags); - if (!length) { - ret = -EINVAL; - goto out; - } + if (!length) + return -EINVAL; if ((flags & IOMAP_REPORT) && gfs2_is_stuffed(ip)) { - gfs2_stuffed_iomap(inode, iomap); - if (pos >= iomap->length) + if (pos >= i_size_read(inode)) return -ENOENT; - ret = 0; - goto out; + gfs2_stuffed_iomap(inode, iomap); + return 0; } lblock = pos >> inode->i_blkbits; - lend = (pos + length + sdp->sd_sb.sb_bsize - 1) >> inode->i_blkbits; - - iomap->offset = lblock << inode->i_blkbits; - iomap->addr = IOMAP_NULL_ADDR; - iomap->type = IOMAP_HOLE; - iomap->length = (u64)(lend - lblock) << inode->i_blkbits; - iomap->flags = IOMAP_F_MERGED; - bmap_lock(ip, 0); + lblock_stop = (pos + length - 1) >> inode->i_blkbits; + len = lblock_stop - lblock + 1; /* * Directory data blocks have a struct gfs2_meta_header header, so the @@ -758,61 +739,87 @@ int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length, arr = sdp->sd_jheightsize; } - ret = gfs2_meta_inode_buffer(ip, &mp.mp_bh[0]); + down_read(&ip->i_rw_mutex); + + ret = gfs2_meta_inode_buffer(ip, &mp->mp_bh[0]); if (ret) - goto out_release; + goto unlock; height = ip->i_height; while ((lblock + 1) * factor > arr[height]) height++; - find_metapath(sdp, lblock, &mp, height); + find_metapath(sdp, lblock, mp, height); if (height > ip->i_height || gfs2_is_stuffed(ip)) goto do_alloc; - ret = lookup_metapath(ip, &mp); - if (ret) - goto out_release; + ret = lookup_metapath(ip, mp); + if (ret < 0) + goto unlock; + ret = 0; - if (mp.mp_aheight != ip->i_height) + if (mp->mp_aheight != ip->i_height) goto do_alloc; - ptr = metapointer(ip->i_height - 1, &mp); + ptr = metapointer(ip->i_height - 1, mp); if (*ptr == 0) goto do_alloc; - iomap->type = IOMAP_MAPPED; iomap->addr = be64_to_cpu(*ptr) << inode->i_blkbits; + iomap->type = IOMAP_MAPPED; - bh = mp.mp_bh[ip->i_height - 1]; - len = gfs2_extent_length(bh->b_data, bh->b_size, ptr, lend - lblock, &eob); + bh = mp->mp_bh[ip->i_height - 1]; + len = gfs2_extent_length(bh->b_data, bh->b_size, ptr, len, &eob); + iomap->flags = IOMAP_F_MERGED; if (eob) iomap->flags |= IOMAP_F_BOUNDARY; - iomap->length = (u64)len << inode->i_blkbits; - ret = 0; - -out_release: - release_metapath(&mp); - bmap_unlock(ip, 0); out: - trace_gfs2_iomap_end(ip, iomap, ret); + iomap->offset = lblock << inode->i_blkbits; + iomap->length = len << inode->i_blkbits; + iomap->bdev = inode->i_sb->s_bdev; +unlock: + up_read(&ip->i_rw_mutex); return ret; do_alloc: if (!(flags & IOMAP_WRITE)) { if (pos >= i_size_read(inode)) { ret = -ENOENT; - goto out_release; + goto unlock; } - ret = 0; - iomap->length = hole_size(inode, lblock, &mp); - goto out_release; + len = hole_size(inode, lblock, mp); } + iomap->addr = IOMAP_NULL_ADDR; + iomap->type = IOMAP_HOLE; + iomap->flags = 0; + goto out; +} - ret = gfs2_iomap_alloc(inode, iomap, flags, &mp); - goto out_release; +static int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length, + unsigned flags, struct iomap *iomap) +{ + struct gfs2_inode *ip = GFS2_I(inode); + struct metapath mp = { .mp_aheight = 1, }; + int ret; + + trace_gfs2_iomap_start(ip, pos, length, flags); + if (flags & IOMAP_WRITE) { + ret = gfs2_iomap_get(inode, pos, length, flags, iomap, &mp); + if (!ret && iomap->type == IOMAP_HOLE) + ret = gfs2_iomap_alloc(inode, iomap, flags, &mp); + release_metapath(&mp); + } else { + ret = gfs2_iomap_get(inode, pos, length, flags, iomap, &mp); + release_metapath(&mp); + } + trace_gfs2_iomap_end(ip, iomap, ret); + return ret; } +const struct iomap_ops gfs2_iomap_ops = { + .iomap_begin = gfs2_iomap_begin, +}; + /** * gfs2_block_map - Map a block from an inode to a disk block * @inode: The inode @@ -831,27 +838,37 @@ int gfs2_block_map(struct inode *inode, sector_t lblock, struct buffer_head *bh_map, int create) { struct gfs2_inode *ip = GFS2_I(inode); + loff_t pos = (loff_t)lblock << inode->i_blkbits; + loff_t length = bh_map->b_size; + struct metapath mp = { .mp_aheight = 1, }; struct iomap iomap; - int ret, flags = 0; + int ret; clear_buffer_mapped(bh_map); clear_buffer_new(bh_map); clear_buffer_boundary(bh_map); trace_gfs2_bmap(ip, bh_map, lblock, create, 1); - if (create) - flags |= IOMAP_WRITE; - if (buffer_zeronew(bh_map)) - flags |= IOMAP_ZERO; - ret = gfs2_iomap_begin(inode, (loff_t)lblock << inode->i_blkbits, - bh_map->b_size, flags, &iomap); - if (ret) { - if (!create && ret == -ENOENT) { - /* Return unmapped buffer beyond the end of file. */ + if (create) { + int flags = IOMAP_WRITE; + if (buffer_zeronew(bh_map)) + flags |= IOMAP_ZERO; + ret = gfs2_iomap_get(inode, pos, length, flags, &iomap, &mp); + if (!ret && iomap.type == IOMAP_HOLE) + ret = gfs2_iomap_alloc(inode, &iomap, flags, &mp); + release_metapath(&mp); + } else { + ret = gfs2_iomap_get(inode, pos, length, 0, &iomap, &mp); + release_metapath(&mp); + + /* Return unmapped buffer beyond the end of file. */ + if (ret == -ENOENT) { ret = 0; + goto out; } - goto out; } + if (ret) + goto out; if (iomap.length > bh_map->b_size) { iomap.length = bh_map->b_size; diff --git a/fs/gfs2/bmap.h b/fs/gfs2/bmap.h index c3402fe00653..5d563c29cb0a 100644 --- a/fs/gfs2/bmap.h +++ b/fs/gfs2/bmap.h @@ -46,11 +46,11 @@ static inline void gfs2_write_calc_reserv(const struct gfs2_inode *ip, } } +extern const struct iomap_ops gfs2_iomap_ops; + extern int gfs2_unstuff_dinode(struct gfs2_inode *ip, struct page *page); extern int gfs2_block_map(struct inode *inode, sector_t lblock, struct buffer_head *bh, int create); -extern int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length, - unsigned flags, struct iomap *iomap); extern int gfs2_extent_map(struct inode *inode, u64 lblock, int *new, u64 *dblock, unsigned *extlen); extern int gfs2_setattr_size(struct inode *inode, u64 size); diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 20281992d456..5cc193e5737b 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -2015,10 +2015,6 @@ static int gfs2_getattr(const struct path *path, struct kstat *stat, return 0; } -const struct iomap_ops gfs2_iomap_ops = { - .iomap_begin = gfs2_iomap_begin, -}; - static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, u64 start, u64 len) {