From patchwork Sat Jun 10 11:39:02 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13274821
From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Christoph Hellwig, "Darrick J. Wong",
 Matthew Wilcox, Dave Chinner, Brian Foster, Andreas Gruenbacher,
 Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)"
Subject: [PATCHv9 1/6] iomap: Rename iomap_page to iomap_folio_state and others
Date: Sat, 10 Jun 2023 17:09:02 +0530
Message-Id: <12b297f38307ed980fe505d03111db3fd887f5f0.1686395560.git.ritesh.list@gmail.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

struct iomap_page actually tracks per-block state of a folio. Hence it
makes sense to rename some of these functions and data structures, e.g.:

1. struct iomap_page (iop) -> struct iomap_folio_state (ifs)
2. iomap_page_create() -> iomap_ifs_alloc()
3. iomap_page_release() -> iomap_ifs_free()
4. to_iomap_page() -> iomap_get_ifs()

Since later patches will also add per-block dirty state tracking to
iomap_folio_state, this patch additionally renames the "uptodate" and
"uptodate_lock" members of iomap_folio_state to "state" and "state_lock".

Signed-off-by: Ritesh Harjani (IBM)
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c | 146 ++++++++++++++++++++---------------------
 1 file changed, 73 insertions(+), 73 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 063133ec77f4..779205fe228f 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -24,17 +24,17 @@
 #define IOEND_BATCH_SIZE	4096

 /*
- * Structure allocated for each folio when block size < folio size
- * to track sub-folio uptodate status and I/O completions.
+ * Structure allocated for each folio to track per-block uptodate state
+ * and I/O completions.
 */
-struct iomap_page {
+struct iomap_folio_state {
	atomic_t		read_bytes_pending;
	atomic_t		write_bytes_pending;
-	spinlock_t		uptodate_lock;
-	unsigned long		uptodate[];
+	spinlock_t		state_lock;
+	unsigned long		state[];
 };

-static inline struct iomap_page *to_iomap_page(struct folio *folio)
+static inline struct iomap_folio_state *iomap_get_ifs(struct folio *folio)
 {
	if (folio_test_private(folio))
		return folio_get_private(folio);
@@ -43,45 +43,45 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio)

 static struct bio_set iomap_ioend_bioset;

-static struct iomap_page *
-iomap_page_create(struct inode *inode, struct folio *folio, unsigned int flags)
+static struct iomap_folio_state *iomap_ifs_alloc(struct inode *inode,
+		struct folio *folio, unsigned int flags)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);
	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
	gfp_t gfp;

-	if (iop || nr_blocks <= 1)
-		return iop;
+	if (ifs || nr_blocks <= 1)
+		return ifs;

	if (flags & IOMAP_NOWAIT)
		gfp = GFP_NOWAIT;
	else
		gfp = GFP_NOFS | __GFP_NOFAIL;

-	iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)),
+	ifs = kzalloc(struct_size(ifs, state, BITS_TO_LONGS(nr_blocks)),
		      gfp);
-	if (iop) {
-		spin_lock_init(&iop->uptodate_lock);
+	if (ifs) {
+		spin_lock_init(&ifs->state_lock);
		if (folio_test_uptodate(folio))
-			bitmap_fill(iop->uptodate, nr_blocks);
-		folio_attach_private(folio, iop);
+			bitmap_fill(ifs->state, nr_blocks);
+		folio_attach_private(folio, ifs);
	}
-	return iop;
+	return ifs;
 }

-static void iomap_page_release(struct folio *folio)
+static void iomap_ifs_free(struct folio *folio)
 {
-	struct iomap_page *iop = folio_detach_private(folio);
+	struct iomap_folio_state *ifs = folio_detach_private(folio);
	struct inode *inode = folio->mapping->host;
	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);

-	if (!iop)
+	if (!ifs)
		return;
-	WARN_ON_ONCE(atomic_read(&iop->read_bytes_pending));
-	WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending));
-	WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
+	WARN_ON_ONCE(atomic_read(&ifs->read_bytes_pending));
+	WARN_ON_ONCE(atomic_read(&ifs->write_bytes_pending));
+	WARN_ON_ONCE(bitmap_full(ifs->state, nr_blocks) !=
			folio_test_uptodate(folio));
-	kfree(iop);
+	kfree(ifs);
 }

 /*
@@ -90,7 +90,7 @@ static void iomap_page_release(struct folio *folio)
 static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
		loff_t *pos, loff_t length, size_t *offp, size_t *lenp)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);
	loff_t orig_pos = *pos;
	loff_t isize = i_size_read(inode);
	unsigned block_bits = inode->i_blkbits;
@@ -105,12 +105,12 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
	 * per-block uptodate status and adjust the offset and length if needed
	 * to avoid reading in already uptodate ranges.
	 */
-	if (iop) {
+	if (ifs) {
		unsigned int i;

		/* move forward for each leading block marked uptodate */
		for (i = first; i <= last; i++) {
-			if (!test_bit(i, iop->uptodate))
+			if (!test_bit(i, ifs->state))
				break;
			*pos += block_size;
			poff += block_size;
@@ -120,7 +120,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,

		/* truncate len if we find any trailing uptodate block(s) */
		for ( ; i <= last; i++) {
-			if (test_bit(i, iop->uptodate)) {
+			if (test_bit(i, ifs->state)) {
				plen -= (last - i + 1) * block_size;
				last = i - 1;
				break;
@@ -144,26 +144,26 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
	*lenp = plen;
 }

-static void iomap_iop_set_range_uptodate(struct folio *folio,
-		struct iomap_page *iop, size_t off, size_t len)
+static void iomap_ifs_set_range_uptodate(struct folio *folio,
+		struct iomap_folio_state *ifs, size_t off, size_t len)
 {
	struct inode *inode = folio->mapping->host;
	unsigned first = off >> inode->i_blkbits;
	unsigned last = (off + len - 1) >> inode->i_blkbits;
	unsigned long flags;

-	spin_lock_irqsave(&iop->uptodate_lock, flags);
-	bitmap_set(iop->uptodate, first, last - first + 1);
-	if (bitmap_full(iop->uptodate, i_blocks_per_folio(inode, folio)))
+	spin_lock_irqsave(&ifs->state_lock, flags);
+	bitmap_set(ifs->state, first, last - first + 1);
+	if (bitmap_full(ifs->state, i_blocks_per_folio(inode, folio)))
		folio_mark_uptodate(folio);
-	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
+	spin_unlock_irqrestore(&ifs->state_lock, flags);
 }

 static void iomap_set_range_uptodate(struct folio *folio,
-		struct iomap_page *iop, size_t off, size_t len)
+		struct iomap_folio_state *ifs, size_t off, size_t len)
 {
-	if (iop)
-		iomap_iop_set_range_uptodate(folio, iop, off, len);
+	if (ifs)
+		iomap_ifs_set_range_uptodate(folio, ifs, off, len);
	else
		folio_mark_uptodate(folio);
 }
@@ -171,16 +171,16 @@ static void iomap_set_range_uptodate(struct folio *folio,
 static void iomap_finish_folio_read(struct folio *folio, size_t offset,
		size_t len, int error)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);

	if (unlikely(error)) {
		folio_clear_uptodate(folio);
		folio_set_error(folio);
	} else {
-		iomap_set_range_uptodate(folio, iop, offset, len);
+		iomap_set_range_uptodate(folio, ifs, offset, len);
	}

-	if (!iop || atomic_sub_and_test(len, &iop->read_bytes_pending))
+	if (!ifs || atomic_sub_and_test(len, &ifs->read_bytes_pending))
		folio_unlock(folio);
 }

@@ -213,7 +213,7 @@ struct iomap_readpage_ctx {
 static int iomap_read_inline_data(const struct iomap_iter *iter,
		struct folio *folio)
 {
-	struct iomap_page *iop;
+	struct iomap_folio_state *ifs;
	const struct iomap *iomap = iomap_iter_srcmap(iter);
	size_t size = i_size_read(iter->inode) - iomap->offset;
	size_t poff = offset_in_page(iomap->offset);
@@ -231,15 +231,15 @@ static int iomap_read_inline_data(const struct iomap_iter *iter,
	if (WARN_ON_ONCE(size > iomap->length))
		return -EIO;
	if (offset > 0)
-		iop = iomap_page_create(iter->inode, folio, iter->flags);
+		ifs = iomap_ifs_alloc(iter->inode, folio, iter->flags);
	else
-		iop = to_iomap_page(folio);
+		ifs = iomap_get_ifs(folio);

	addr = kmap_local_folio(folio, offset);
	memcpy(addr, iomap->inline_data, size);
	memset(addr + size, 0, PAGE_SIZE - poff - size);
	kunmap_local(addr);
-	iomap_set_range_uptodate(folio, iop, offset, PAGE_SIZE - poff);
+	iomap_set_range_uptodate(folio, ifs, offset, PAGE_SIZE - poff);
	return 0;
 }

@@ -260,7 +260,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
	loff_t pos = iter->pos + offset;
	loff_t length = iomap_length(iter) - offset;
	struct folio *folio = ctx->cur_folio;
-	struct iomap_page *iop;
+	struct iomap_folio_state *ifs;
	loff_t orig_pos = pos;
	size_t poff, plen;
	sector_t sector;
@@ -269,20 +269,20 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
		return iomap_read_inline_data(iter, folio);

	/* zero post-eof blocks as the page may be mapped */
-	iop = iomap_page_create(iter->inode, folio, iter->flags);
+	ifs = iomap_ifs_alloc(iter->inode, folio, iter->flags);
	iomap_adjust_read_range(iter->inode, folio, &pos, length, &poff, &plen);
	if (plen == 0)
		goto done;

	if (iomap_block_needs_zeroing(iter, pos)) {
		folio_zero_range(folio, poff, plen);
-		iomap_set_range_uptodate(folio, iop, poff, plen);
+		iomap_set_range_uptodate(folio, ifs, poff, plen);
		goto done;
	}

	ctx->cur_folio_in_bio = true;
-	if (iop)
-		atomic_add(plen, &iop->read_bytes_pending);
+	if (ifs)
+		atomic_add(plen, &ifs->read_bytes_pending);

	sector = iomap_sector(iomap, pos);
	if (!ctx->bio ||
@@ -436,11 +436,11 @@ EXPORT_SYMBOL_GPL(iomap_readahead);
 */
 bool iomap_is_partially_uptodate(struct folio *folio, size_t from, size_t count)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);
	struct inode *inode = folio->mapping->host;
	unsigned first, last, i;

-	if (!iop)
+	if (!ifs)
		return false;

	/* Caller's range may extend past the end of this folio */
@@ -451,7 +451,7 @@ bool iomap_is_partially_uptodate(struct folio *folio, size_t from, size_t count)
	last = (from + count - 1) >> inode->i_blkbits;

	for (i = first; i <= last; i++)
-		if (!test_bit(i, iop->uptodate))
+		if (!test_bit(i, ifs->state))
			return false;
	return true;
 }
@@ -490,7 +490,7 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
	 */
	if (folio_test_dirty(folio) || folio_test_writeback(folio))
		return false;
-	iomap_page_release(folio);
+	iomap_ifs_free(folio);
	return true;
 }
 EXPORT_SYMBOL_GPL(iomap_release_folio);
@@ -507,12 +507,12 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
	if (offset == 0 && len == folio_size(folio)) {
		WARN_ON_ONCE(folio_test_writeback(folio));
		folio_cancel_dirty(folio);
-		iomap_page_release(folio);
+		iomap_ifs_free(folio);
	} else if (folio_test_large(folio)) {
-		/* Must release the iop so the page can be split */
+		/* Must release the ifs so the page can be split */
		WARN_ON_ONCE(!folio_test_uptodate(folio) &&
			     folio_test_dirty(folio));
-		iomap_page_release(folio);
+		iomap_ifs_free(folio);
	}
 }
 EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
@@ -547,7 +547,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
		size_t len, struct folio *folio)
 {
	const struct iomap *srcmap = iomap_iter_srcmap(iter);
-	struct iomap_page *iop;
+	struct iomap_folio_state *ifs;
	loff_t block_size = i_blocksize(iter->inode);
	loff_t block_start = round_down(pos, block_size);
	loff_t block_end = round_up(pos + len, block_size);
@@ -559,8 +559,8 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
		return 0;
	folio_clear_error(folio);

-	iop = iomap_page_create(iter->inode, folio, iter->flags);
-	if ((iter->flags & IOMAP_NOWAIT) && !iop && nr_blocks > 1)
+	ifs = iomap_ifs_alloc(iter->inode, folio, iter->flags);
+	if ((iter->flags & IOMAP_NOWAIT) && !ifs && nr_blocks > 1)
		return -EAGAIN;

	do {
@@ -589,7 +589,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
			if (status)
				return status;
		}
-		iomap_set_range_uptodate(folio, iop, poff, plen);
+		iomap_set_range_uptodate(folio, ifs, poff, plen);
	} while ((block_start += plen) < block_end);

	return 0;
@@ -696,7 +696,7 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos,
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
		size_t copied, struct folio *folio)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);
	flush_dcache_folio(folio);

	/*
@@ -712,7 +712,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
	 */
	if (unlikely(copied < len && !folio_test_uptodate(folio)))
		return 0;
-	iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
+	iomap_set_range_uptodate(folio, ifs, offset_in_folio(folio, pos), len);
	filemap_dirty_folio(inode->i_mapping, folio);
	return copied;
 }
@@ -1290,17 +1290,17 @@ EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
 static void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
		size_t len, int error)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);

	if (error) {
		folio_set_error(folio);
		mapping_set_error(inode->i_mapping, error);
	}

-	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !iop);
-	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) <= 0);
+	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !ifs);
+	WARN_ON_ONCE(ifs && atomic_read(&ifs->write_bytes_pending) <= 0);

-	if (!iop || atomic_sub_and_test(len, &iop->write_bytes_pending))
+	if (!ifs || atomic_sub_and_test(len, &ifs->write_bytes_pending))
		folio_end_writeback(folio);
 }

@@ -1567,7 +1567,7 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
 */
 static void
 iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
-		struct iomap_page *iop, struct iomap_writepage_ctx *wpc,
+		struct iomap_folio_state *ifs, struct iomap_writepage_ctx *wpc,
		struct writeback_control *wbc, struct list_head *iolist)
 {
	sector_t sector = iomap_sector(&wpc->iomap, pos);
@@ -1585,8 +1585,8 @@ iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
		bio_add_folio(wpc->ioend->io_bio, folio, len, poff);
	}

-	if (iop)
-		atomic_add(len, &iop->write_bytes_pending);
+	if (ifs)
+		atomic_add(len, &ifs->write_bytes_pending);
	wpc->ioend->io_size += len;
	wbc_account_cgroup_owner(wbc, &folio->page, len);
 }
@@ -1612,7 +1612,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
		struct writeback_control *wbc, struct inode *inode,
		struct folio *folio, u64 end_pos)
 {
-	struct iomap_page *iop = iomap_page_create(inode, folio, 0);
+	struct iomap_folio_state *ifs = iomap_ifs_alloc(inode, folio, 0);
	struct iomap_ioend *ioend, *next;
	unsigned len = i_blocksize(inode);
	unsigned nblocks = i_blocks_per_folio(inode, folio);
@@ -1620,7 +1620,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
	int error = 0, count = 0, i;
	LIST_HEAD(submit_list);

-	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
+	WARN_ON_ONCE(ifs && atomic_read(&ifs->write_bytes_pending) != 0);

	/*
	 * Walk through the folio to find areas to write back. If we
@@ -1628,7 +1628,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
	 * invalid, grab a new one.
	 */
	for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
-		if (iop && !test_bit(i, iop->uptodate))
+		if (ifs && !test_bit(i, ifs->state))
			continue;

		error = wpc->ops->map_blocks(wpc, inode, pos);
@@ -1639,7 +1639,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
			continue;
		if (wpc->iomap.type == IOMAP_HOLE)
			continue;
-		iomap_add_to_ioend(inode, pos, folio, iop, wpc, wbc,
+		iomap_add_to_ioend(inode, pos, folio, ifs, wpc, wbc,
				&submit_list);
		count++;
	}

From patchwork Sat Jun 10 11:39:03 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13274822
From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Christoph Hellwig, "Darrick J. Wong",
 Matthew Wilcox, Dave Chinner, Brian Foster, Andreas Gruenbacher,
 Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)"
Subject: [PATCHv9 2/6] iomap: Drop ifs argument from iomap_set_range_uptodate()
Date: Sat, 10 Jun 2023 17:09:03 +0530
Message-Id: <183fa9098b3506d945fed8a71cadeff82e03c059.1686395560.git.ritesh.list@gmail.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

iomap_folio_state (ifs) can be derived directly from the folio, making it
unnecessary to pass "ifs" as an argument to iomap_set_range_uptodate().
This patch eliminates the "ifs" argument from iomap_set_range_uptodate().

Also, the definitions of iomap_set_range_uptodate() and
iomap_ifs_set_range_uptodate() are moved above iomap_ifs_alloc(). In
upcoming patches we plan to introduce additional helper routines for
handling dirty state, with the intention of consolidating all of the
"ifs" state handling routines in one place.

Signed-off-by: Ritesh Harjani (IBM)
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c | 67 +++++++++++++++++++++---------------------
 1 file changed, 33 insertions(+), 34 deletions(-)
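The refactor relies on the per-folio state being reachable from the folio
itself (folio->private). A minimal userspace model of the
derive-instead-of-pass pattern; ifs_model, folio_model and get_ifs are
invented names used only for this illustration:

    #include <stdio.h>

    struct ifs_model { int dummy; };
    struct folio_model { struct ifs_model *private; };

    static struct ifs_model *get_ifs(struct folio_model *folio)
    {
            return folio->private;  /* NULL when no sub-folio tracking */
    }

    static void set_range_uptodate(struct folio_model *folio)
    {
            struct ifs_model *ifs = get_ifs(folio); /* derived, not passed */

            printf(ifs ? "per-block path\n" : "whole-folio path\n");
    }

    int main(void)
    {
            struct ifs_model ifs = { 0 };
            struct folio_model with = { &ifs }, without = { NULL };

            set_range_uptodate(&with);
            set_range_uptodate(&without);
            return 0;
    }

Dropping the argument removes the possibility of a caller passing an ifs
that doesn't belong to the folio being operated on.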
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 779205fe228f..e237f2b786bc 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -43,6 +43,33 @@ static inline struct iomap_folio_state *iomap_get_ifs(struct folio *folio)

 static struct bio_set iomap_ioend_bioset;

+static void iomap_ifs_set_range_uptodate(struct folio *folio,
+		struct iomap_folio_state *ifs, size_t off, size_t len)
+{
+	struct inode *inode = folio->mapping->host;
+	unsigned int first_blk = off >> inode->i_blkbits;
+	unsigned int last_blk = (off + len - 1) >> inode->i_blkbits;
+	unsigned int nr_blks = last_blk - first_blk + 1;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ifs->state_lock, flags);
+	bitmap_set(ifs->state, first_blk, nr_blks);
+	if (bitmap_full(ifs->state, i_blocks_per_folio(inode, folio)))
+		folio_mark_uptodate(folio);
+	spin_unlock_irqrestore(&ifs->state_lock, flags);
+}
+
+static void iomap_set_range_uptodate(struct folio *folio, size_t off,
+		size_t len)
+{
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);
+
+	if (ifs)
+		iomap_ifs_set_range_uptodate(folio, ifs, off, len);
+	else
+		folio_mark_uptodate(folio);
+}
+
 static struct iomap_folio_state *iomap_ifs_alloc(struct inode *inode,
		struct folio *folio, unsigned int flags)
 {
@@ -144,30 +171,6 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
	*lenp = plen;
 }

-static void iomap_ifs_set_range_uptodate(struct folio *folio,
-		struct iomap_folio_state *ifs, size_t off, size_t len)
-{
-	struct inode *inode = folio->mapping->host;
-	unsigned first = off >> inode->i_blkbits;
-	unsigned last = (off + len - 1) >> inode->i_blkbits;
-	unsigned long flags;
-
-	spin_lock_irqsave(&ifs->state_lock, flags);
-	bitmap_set(ifs->state, first, last - first + 1);
-	if (bitmap_full(ifs->state, i_blocks_per_folio(inode, folio)))
-		folio_mark_uptodate(folio);
-	spin_unlock_irqrestore(&ifs->state_lock, flags);
-}
-
-static void iomap_set_range_uptodate(struct folio *folio,
-		struct iomap_folio_state *ifs, size_t off, size_t len)
-{
-	if (ifs)
-		iomap_ifs_set_range_uptodate(folio, ifs, off, len);
-	else
-		folio_mark_uptodate(folio);
-}
-
 static void iomap_finish_folio_read(struct folio *folio, size_t offset,
		size_t len, int error)
 {
@@ -177,7 +180,7 @@ static void iomap_finish_folio_read(struct folio *folio, size_t offset,
		folio_clear_uptodate(folio);
		folio_set_error(folio);
	} else {
-		iomap_set_range_uptodate(folio, ifs, offset, len);
+		iomap_set_range_uptodate(folio, offset, len);
	}

	if (!ifs || atomic_sub_and_test(len, &ifs->read_bytes_pending))
@@ -213,7 +216,6 @@ struct iomap_readpage_ctx {
 static int iomap_read_inline_data(const struct iomap_iter *iter,
		struct folio *folio)
 {
-	struct iomap_folio_state *ifs;
	const struct iomap *iomap = iomap_iter_srcmap(iter);
	size_t size = i_size_read(iter->inode) - iomap->offset;
	size_t poff = offset_in_page(iomap->offset);
@@ -231,15 +233,13 @@ static int iomap_read_inline_data(const struct iomap_iter *iter,
	if (WARN_ON_ONCE(size > iomap->length))
		return -EIO;
	if (offset > 0)
-		ifs = iomap_ifs_alloc(iter->inode, folio, iter->flags);
-	else
-		ifs = iomap_get_ifs(folio);
+		iomap_ifs_alloc(iter->inode, folio, iter->flags);

	addr = kmap_local_folio(folio, offset);
	memcpy(addr, iomap->inline_data, size);
	memset(addr + size, 0, PAGE_SIZE - poff - size);
	kunmap_local(addr);
-	iomap_set_range_uptodate(folio, ifs, offset, PAGE_SIZE - poff);
+	iomap_set_range_uptodate(folio, offset, PAGE_SIZE - poff);
	return 0;
 }

@@ -276,7 +276,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,

	if (iomap_block_needs_zeroing(iter, pos)) {
		folio_zero_range(folio, poff, plen);
-		iomap_set_range_uptodate(folio, ifs, poff, plen);
+		iomap_set_range_uptodate(folio, poff, plen);
		goto done;
	}

@@ -589,7 +589,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
			if (status)
				return status;
		}
-		iomap_set_range_uptodate(folio, ifs, poff, plen);
+		iomap_set_range_uptodate(folio, poff, plen);
	} while ((block_start += plen) < block_end);

	return 0;
@@ -696,7 +696,6 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos,
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
		size_t copied, struct folio *folio)
 {
-	struct iomap_folio_state *ifs = iomap_get_ifs(folio);
	flush_dcache_folio(folio);

	/*
@@ -712,7 +711,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
	 */
	if (unlikely(copied < len && !folio_test_uptodate(folio)))
		return 0;
-	iomap_set_range_uptodate(folio, ifs, offset_in_folio(folio, pos), len);
+	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
	filemap_dirty_folio(inode->i_mapping, folio);
	return copied;
 }

From patchwork Sat Jun 10 11:39:04 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13274823
From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Christoph Hellwig, "Darrick J. Wong",
 Matthew Wilcox, Dave Chinner, Brian Foster, Andreas Gruenbacher,
 Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)"
Subject: [PATCHv9 3/6] iomap: Add some uptodate state handling helpers for
 ifs state bitmap
Date: Sat, 10 Jun 2023 17:09:04 +0530
Message-Id: <606c3279db7cc189dd3cd94d162a056c23b67514.1686395560.git.ritesh.list@gmail.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This patch adds two helper routines, iomap_ifs_is_fully_uptodate() and
iomap_ifs_is_block_uptodate(), for managing the uptodate state of the ifs
state bitmap. In later patches the ifs state bitmap array will also handle
the dirty state of all blocks of a folio, so adding these uptodate state
helpers now keeps all of that handling in one place.

Signed-off-by: Ritesh Harjani (IBM)
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)
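What the two predicates compute can be shown with a small compilable model
(plain C in place of the kernel's test_bit()/bitmap_full(); all names here
are invented for the illustration):

    #include <stdbool.h>
    #include <stdio.h>

    static bool block_uptodate(const unsigned long *state, unsigned int blk)
    {
            return state[blk / (8 * sizeof(long))] &
                   (1UL << (blk % (8 * sizeof(long))));
    }

    static bool fully_uptodate(const unsigned long *state, unsigned int nblocks)
    {
            for (unsigned int i = 0; i < nblocks; i++)
                    if (!block_uptodate(state, i))
                            return false;
            return true;
    }

    int main(void)
    {
            unsigned long state[1] = { 0xF };       /* blocks 0-3 uptodate */

            printf("block 2 uptodate: %d\n", block_uptodate(state, 2));
            printf("fully uptodate (16 blocks): %d\n",
                   fully_uptodate(state, 16));     /* 0: blocks 4-15 unset */
            return 0;
    }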
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e237f2b786bc..206808f6e818 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -43,6 +43,20 @@ static inline struct iomap_folio_state *iomap_get_ifs(struct folio *folio)

 static struct bio_set iomap_ioend_bioset;

+static inline bool iomap_ifs_is_fully_uptodate(struct folio *folio,
+		struct iomap_folio_state *ifs)
+{
+	struct inode *inode = folio->mapping->host;
+
+	return bitmap_full(ifs->state, i_blocks_per_folio(inode, folio));
+}
+
+static inline bool iomap_ifs_is_block_uptodate(struct iomap_folio_state *ifs,
+		unsigned int block)
+{
+	return test_bit(block, ifs->state);
+}
+
 static void iomap_ifs_set_range_uptodate(struct folio *folio,
		struct iomap_folio_state *ifs, size_t off, size_t len)
 {
@@ -54,7 +68,7 @@ static void iomap_ifs_set_range_uptodate(struct folio *folio,

	spin_lock_irqsave(&ifs->state_lock, flags);
	bitmap_set(ifs->state, first_blk, nr_blks);
-	if (bitmap_full(ifs->state, i_blocks_per_folio(inode, folio)))
+	if (iomap_ifs_is_fully_uptodate(folio, ifs))
		folio_mark_uptodate(folio);
	spin_unlock_irqrestore(&ifs->state_lock, flags);
 }
@@ -99,14 +113,12 @@ static struct iomap_folio_state *iomap_ifs_alloc(struct inode *inode,
 static void iomap_ifs_free(struct folio *folio)
 {
	struct iomap_folio_state *ifs = folio_detach_private(folio);
-	struct inode *inode = folio->mapping->host;
-	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);

	if (!ifs)
		return;
	WARN_ON_ONCE(atomic_read(&ifs->read_bytes_pending));
	WARN_ON_ONCE(atomic_read(&ifs->write_bytes_pending));
-	WARN_ON_ONCE(bitmap_full(ifs->state, nr_blocks) !=
+	WARN_ON_ONCE(iomap_ifs_is_fully_uptodate(folio, ifs) !=
			folio_test_uptodate(folio));
	kfree(ifs);
 }
@@ -137,7 +149,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,

		/* move forward for each leading block marked uptodate */
		for (i = first; i <= last; i++) {
-			if (!test_bit(i, ifs->state))
+			if (!iomap_ifs_is_block_uptodate(ifs, i))
				break;
			*pos += block_size;
			poff += block_size;
@@ -147,7 +159,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,

		/* truncate len if we find any trailing uptodate block(s) */
		for ( ; i <= last; i++) {
-			if (test_bit(i, ifs->state)) {
+			if (iomap_ifs_is_block_uptodate(ifs, i)) {
				plen -= (last - i + 1) * block_size;
				last = i - 1;
				break;
@@ -451,7 +463,7 @@ bool iomap_is_partially_uptodate(struct folio *folio, size_t from, size_t count)
	last = (from + count - 1) >> inode->i_blkbits;

	for (i = first; i <= last; i++)
-		if (!test_bit(i, ifs->state))
+		if (!iomap_ifs_is_block_uptodate(ifs, i))
			return false;
	return true;
 }
@@ -1627,7 +1639,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
	 * invalid, grab a new one.
	 */
	for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
-		if (ifs && !test_bit(i, ifs->state))
+		if (ifs && !iomap_ifs_is_block_uptodate(ifs, i))
			continue;

		error = wpc->ops->map_blocks(wpc, inode, pos);

From patchwork Sat Jun 10 11:39:05 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13274824
From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Christoph Hellwig, "Darrick J. Wong",
 Matthew Wilcox, Dave Chinner, Brian Foster, Andreas Gruenbacher,
 Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)"
Subject: [PATCHv9 4/6] iomap: Refactor iomap_write_delalloc_punch() function out
Date: Sat, 10 Jun 2023 17:09:05 +0530
Message-Id: <62950460a9e78804df28c548327d779a8d53243f.1686395560.git.ritesh.list@gmail.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This patch factors the iomap_write_delalloc_punch() function out. This
function is responsible for the actual punch-out operation. The reason for
doing this is to avoid deep indentation when we later bring punch-out of
individual non-dirty blocks within a dirty folio (in the patch which adds
per-block dirty status handling to iomap) to avoid delalloc block leaks.

Reviewed-by: Darrick J. Wong
Signed-off-by: Ritesh Harjani (IBM)
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c | 54 ++++++++++++++++++++++++++----------------
 1 file changed, 34 insertions(+), 20 deletions(-)
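The factored-out helper's contract can be modelled in a few lines of
userspace C (punch_step and its printf are invented stand-ins for the real
punch callback): skip clean folios, punch the accumulated range before a
dirty folio, then advance the punch cursor to the folio's end, bounded by
end_byte:

    #include <stdio.h>

    static long long min_ll(long long a, long long b) { return a < b ? a : b; }

    static void punch_step(int folio_dirty, long long folio_pos,
                           long long folio_end, long long end_byte,
                           long long *punch_start)
    {
            if (!folio_dirty)
                    return;
            if (folio_pos > *punch_start)
                    printf("punch [%lld, %lld)\n", *punch_start, folio_pos);
            *punch_start = min_ll(end_byte, folio_end);
    }

    int main(void)
    {
            long long punch_start = 0;

            punch_step(0, 0, 65536, 200000, &punch_start);      /* clean */
            punch_step(1, 65536, 131072, 200000, &punch_start); /* dirty */
            printf("next punch starts at %lld\n", punch_start); /* 131072 */
            return 0;
    }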
Wong" , Matthew Wilcox , Dave Chinner , Brian Foster , Andreas Gruenbacher , Ojaswin Mujoo , Disha Goel , "Ritesh Harjani (IBM)" Subject: [PATCHv9 4/6] iomap: Refactor iomap_write_delalloc_punch() function out Date: Sat, 10 Jun 2023 17:09:05 +0530 Message-Id: <62950460a9e78804df28c548327d779a8d53243f.1686395560.git.ritesh.list@gmail.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This patch factors iomap_write_delalloc_punch() function out. This function is resposible for actual punch out operation. The reason for doing this is, to avoid deep indentation when we bring punch-out of individual non-dirty blocks within a dirty folio in a later patch (which adds per-block dirty status handling to iomap) to avoid delalloc block leak. Reviewed-by: Darrick J. Wong Signed-off-by: Ritesh Harjani (IBM) Reviewed-by: Christoph Hellwig --- fs/iomap/buffered-io.c | 54 ++++++++++++++++++++++++++---------------- 1 file changed, 34 insertions(+), 20 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 206808f6e818..1261f26479af 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -888,6 +888,33 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i, } EXPORT_SYMBOL_GPL(iomap_file_buffered_write); +static int iomap_write_delalloc_punch(struct inode *inode, struct folio *folio, + loff_t *punch_start_byte, loff_t start_byte, loff_t end_byte, + int (*punch)(struct inode *inode, loff_t offset, loff_t length)) +{ + int ret = 0; + + if (!folio_test_dirty(folio)) + return ret; + + /* if dirty, punch up to offset */ + if (start_byte > *punch_start_byte) { + ret = punch(inode, *punch_start_byte, + start_byte - *punch_start_byte); + if (ret) + goto out; + } + /* + * Make sure the next punch start is correctly bound to + * the end of this data range, not the end of the folio. + */ + *punch_start_byte = min_t(loff_t, end_byte, + folio_next_index(folio) << PAGE_SHIFT); + +out: + return ret; +} + /* * Scan the data range passed to us for dirty page cache folios. If we find a * dirty folio, punch out the preceeding range and update the offset from which @@ -911,6 +938,7 @@ static int iomap_write_delalloc_scan(struct inode *inode, { while (start_byte < end_byte) { struct folio *folio; + int ret; /* grab locked page */ folio = filemap_lock_folio(inode->i_mapping, @@ -921,26 +949,12 @@ static int iomap_write_delalloc_scan(struct inode *inode, continue; } - /* if dirty, punch up to offset */ - if (folio_test_dirty(folio)) { - if (start_byte > *punch_start_byte) { - int error; - - error = punch(inode, *punch_start_byte, - start_byte - *punch_start_byte); - if (error) { - folio_unlock(folio); - folio_put(folio); - return error; - } - } - - /* - * Make sure the next punch start is correctly bound to - * the end of this data range, not the end of the folio. 
- */ - *punch_start_byte = min_t(loff_t, end_byte, - folio_next_index(folio) << PAGE_SHIFT); + ret = iomap_write_delalloc_punch(inode, folio, punch_start_byte, + start_byte, end_byte, punch); + if (ret) { + folio_unlock(folio); + folio_put(folio); + return ret; } /* move offset to start of next folio in range */ From patchwork Sat Jun 10 11:39:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13274825 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F8CCC87FDC for ; Sat, 10 Jun 2023 11:39:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234532AbjFJLjj (ORCPT ); Sat, 10 Jun 2023 07:39:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57096 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234475AbjFJLjh (ORCPT ); Sat, 10 Jun 2023 07:39:37 -0400 Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9DCECE1; Sat, 10 Jun 2023 04:39:36 -0700 (PDT) Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1b065154b79so12848835ad.1; Sat, 10 Jun 2023 04:39:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1686397176; x=1688989176; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=EyPzAEGGPRspXW63qviQl5WaHG4admX+rKhNWOPMKZ4=; b=IPfqVdqC8nEEH9U/D/1dQLnY/PXx4eBDeA7Gf1SaGqYSLQGW27I0GT6y4wCS6ueIGQ iatCuRkPh5/Q1f6svIL8QD7qO2Rqv2Q+bIF7sA3dS+CX8vo1STBA5AtVKb0wfoOK42wR a75jVFsCQdPLBppKAhES+iIojW0u6gQ/DR7dmWhDN/Ff+1LHVe8Pm0wmxQ2i0X+McNP0 XXncXkzdXj4SmrfJjvnUXrw0swSCkw8ZwmQX+93/ERDmoGs4ZnQgEqbVk12OHpGywQA7 3Mk5Sh1tCg/AlDUxcVtFijTWXo7ptWMMbN/4qJr0TkMVcPHoihl1sr3QGaGQLlET9BT6 GauQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686397176; x=1688989176; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=EyPzAEGGPRspXW63qviQl5WaHG4admX+rKhNWOPMKZ4=; b=hh26NtUD5pHdKQkqkg6ZK9xfyd4tzWxiYFyBajilDr41jCHxVdvvfTsUgYezU/kNnJ 1zACcuvYoUjFwDICzuYSo74m1VEGTLWVu6/r0MExs7yZSlD+eUrtvbuAyd1mmfUD1YAR 0MlS+tDVR2YkVdCwxSd8bgX+layjq8FCvdwOGm9CwQPuXmHtXsu0CWg2pwvQ7Qp9EGhH Vbh1Wwcei+0zudpYmy+U/6sKCt9QAcwJVupgLx1E+H8DUcytMdzRF0GVdcj+UrFZQZzo f1NpAzlhzP7iSLroqqsTJqhicZmEX6w9P6AHtKCoc3G1iLp9JNHSJivgey2rxiEjMnnZ dOYw== X-Gm-Message-State: AC+VfDyAubSHwpTaDADVEb5lM/O6QTlwcDclPD/ItQYkwcWwsA5hBxcK icRrK7QmGAeXNvkb/INGAu2X6SvYkys= X-Google-Smtp-Source: ACHHUZ4+1uFpqiGwuS4o3sdBdHVs98XMNBhtbt3aD85AiRfAL5m21n35GTBRi4VdVzQj/Q0S0TVkqw== X-Received: by 2002:a17:902:d486:b0:1a6:9762:6eed with SMTP id c6-20020a170902d48600b001a697626eedmr1895866plg.22.1686397175689; Sat, 10 Jun 2023 04:39:35 -0700 (PDT) Received: from dw-tp.ihost.com ([49.207.220.159]) by smtp.gmail.com with ESMTPSA id n10-20020a170902e54a00b001aaf5dcd762sm4753698plf.214.2023.06.10.04.39.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 10 Jun 2023 04:39:35 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Christoph 
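A tiny sketch of the new short-circuit condition (userspace model;
covers_whole_folio is an invented name for the inline check in
__iomap_write_begin()):

    #include <stdbool.h>
    #include <stdio.h>

    static bool covers_whole_folio(long long pos, long long len,
                                   long long folio_pos, long long folio_size)
    {
            return pos <= folio_pos && pos + len >= folio_pos + folio_size;
    }

    int main(void)
    {
            /* 64K folio at offset 0: a 64K write covers it, a 4K one doesn't */
            printf("%d %d\n",
                   covers_whole_folio(0, 65536, 0, 65536),      /* 1 */
                   covers_whole_folio(4096, 4096, 0, 65536));   /* 0 */
            return 0;
    }

When the write covers the whole folio, every block will be dirtied anyway,
so per-block tracking buys nothing and the allocation is skipped.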
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 1261f26479af..c6dcb0f0d22f 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -567,14 +567,23 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
	size_t from = offset_in_folio(folio, pos), to = from + len;
	size_t poff, plen;

-	if (folio_test_uptodate(folio))
+	/*
+	 * If the write completely overlaps the current folio, then
+	 * entire folio will be dirtied so there is no need for
+	 * per-block state tracking structures to be attached to this folio.
+	 */
+	if (pos <= folio_pos(folio) &&
+	    pos + len >= folio_pos(folio) + folio_size(folio))
		return 0;
-	folio_clear_error(folio);

	ifs = iomap_ifs_alloc(iter->inode, folio, iter->flags);
	if ((iter->flags & IOMAP_NOWAIT) && !ifs && nr_blocks > 1)
		return -EAGAIN;

+	if (folio_test_uptodate(folio))
+		return 0;
+	folio_clear_error(folio);
+
	do {
		iomap_adjust_read_range(iter->inode, folio, &block_start,
				block_end - block_start, &poff, &plen);

From patchwork Sat Jun 10 11:39:07 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13274826
From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Christoph Hellwig, "Darrick J. Wong",
 Matthew Wilcox, Dave Chinner, Brian Foster, Andreas Gruenbacher,
 Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)", Aravinda Herle
Subject: [PATCHv9 6/6] iomap: Add per-block dirty state tracking to improve
 performance
Date: Sat, 10 Jun 2023 17:09:07 +0530
Message-Id: <954d2e61dedbada996653c9d780be70a48dc66ae.1686395560.git.ritesh.list@gmail.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

When the filesystem block size is less than the folio size (either with
mapping_large_folio_support() or with blocksize < pagesize) and the folio
is uptodate in the pagecache, even a one-byte write can cause an entire
folio to be written to disk during writeback. This happens because we
currently don't have a mechanism to track per-block dirty state within
struct iomap_folio_state; we only track uptodate state.

This patch implements support for tracking per-block dirty state in the
iomap_folio_state->state bitmap. This should help improve filesystem write
performance and reduce write amplification.

Performance testing of the below fio workload reveals a ~16x performance
improvement using nvme with XFS (4k block size) on Power (64K page size).
FIO-reported write bandwidth improved from around ~28 MBps to ~452 MBps.

1. fio workload:

[global]
ioengine=psync
rw=randwrite
overwrite=1
pre_read=1
direct=0
bs=4k
size=1G
dir=./
numjobs=8
fdatasync=1
runtime=60
iodepth=64
group_reporting=1

[fio-run]

2. Our internal performance team also reported that this patch improves
their database workload performance by around ~83% (with XFS on Power).

Reported-by: Aravinda Herle
Reported-by: Brian Foster
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/gfs2/aops.c         |   2 +-
 fs/iomap/buffered-io.c | 158 +++++++++++++++++++++++++++++++++++++----
 fs/xfs/xfs_aops.c      |   2 +-
 fs/zonefs/file.c       |   2 +-
 include/linux/iomap.h  |   1 +
 5 files changed, 147 insertions(+), 18 deletions(-)
--
2.40.1
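The two-plane bitmap layout is the heart of this patch: the first
blks_per_folio bits of ifs->state track uptodate, the next blks_per_folio
bits track dirty. A compilable userspace model of the offset arithmetic
done by iomap_ifs_calc_range() (calc_range and its parameters are invented
stand-ins for this illustration):

    #include <stdio.h>

    enum block_state { ST_UPTODATE, ST_DIRTY, ST_MAX };

    /* The single state[] bitmap holds blks_per_folio uptodate bits
     * followed by blks_per_folio dirty bits, so a dirty-range operation
     * is offset by one full "plane". */
    static void calc_range(unsigned int blks_per_folio, unsigned int blkbits,
                           size_t off, size_t len, enum block_state state,
                           unsigned int *first_blk, unsigned int *nr_blks)
    {
            unsigned int first = off >> blkbits;
            unsigned int last = (off + len - 1) >> blkbits;

            *first_blk = first + state * blks_per_folio;
            *nr_blks = last - first + 1;
    }

    int main(void)
    {
            unsigned int first, nr;

            /* 64K folio, 4K blocks: dirtying bytes 8192..16383 (blocks
             * 2-3) sets bits 18-19 of the combined bitmap. */
            calc_range(16, 12, 8192, 8192, ST_DIRTY, &first, &nr);
            printf("first bit %u, %u bits\n", first, nr);  /* 18, 2 */
            return 0;
    }

So with 16 blocks per folio, dirtying blocks 2-3 sets bits 18-19 and leaves
the uptodate plane (bits 0-15) untouched.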
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index a5f4be6b9213..75efec3c3b71 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -746,7 +746,7 @@ static const struct address_space_operations gfs2_aops = {
	.writepages = gfs2_writepages,
	.read_folio = gfs2_read_folio,
	.readahead = gfs2_readahead,
-	.dirty_folio = filemap_dirty_folio,
+	.dirty_folio = iomap_dirty_folio,
	.release_folio = iomap_release_folio,
	.invalidate_folio = iomap_invalidate_folio,
	.bmap = gfs2_bmap,
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c6dcb0f0d22f..d5b8d134921c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -24,7 +24,7 @@
 #define IOEND_BATCH_SIZE	4096

 /*
- * Structure allocated for each folio to track per-block uptodate state
+ * Structure allocated for each folio to track per-block uptodate, dirty state
 * and I/O completions.
 */
 struct iomap_folio_state {
@@ -34,6 +34,26 @@ struct iomap_folio_state {
	unsigned long state[];
 };

+enum iomap_block_state {
+	IOMAP_ST_UPTODATE,
+	IOMAP_ST_DIRTY,
+
+	IOMAP_ST_MAX,
+};
+
+static void iomap_ifs_calc_range(struct folio *folio, size_t off, size_t len,
+		enum iomap_block_state state, unsigned int *first_blkp,
+		unsigned int *nr_blksp)
+{
+	struct inode *inode = folio->mapping->host;
+	unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
+	unsigned int first_blk = off >> inode->i_blkbits;
+	unsigned int last_blk = (off + len - 1) >> inode->i_blkbits;
+
+	*first_blkp = first_blk + (state * blks_per_folio);
+	*nr_blksp = last_blk - first_blk + 1;
+}
+
 static inline struct iomap_folio_state *iomap_get_ifs(struct folio *folio)
 {
	if (folio_test_private(folio))
@@ -60,12 +80,11 @@ static inline bool iomap_ifs_is_block_uptodate(struct iomap_folio_state *ifs,
 static void iomap_ifs_set_range_uptodate(struct folio *folio,
		struct iomap_folio_state *ifs, size_t off, size_t len)
 {
-	struct inode *inode = folio->mapping->host;
-	unsigned int first_blk = off >> inode->i_blkbits;
-	unsigned int last_blk = (off + len - 1) >> inode->i_blkbits;
-	unsigned int nr_blks = last_blk - first_blk + 1;
+	unsigned int first_blk, nr_blks;
	unsigned long flags;

+	iomap_ifs_calc_range(folio, off, len, IOMAP_ST_UPTODATE, &first_blk,
+			     &nr_blks);
	spin_lock_irqsave(&ifs->state_lock, flags);
	bitmap_set(ifs->state, first_blk, nr_blks);
	if (iomap_ifs_is_fully_uptodate(folio, ifs))
@@ -84,6 +103,59 @@ static void iomap_set_range_uptodate(struct folio *folio, size_t off,
		folio_mark_uptodate(folio);
 }

+static inline bool iomap_ifs_is_block_dirty(struct folio *folio,
+		struct iomap_folio_state *ifs, int block)
+{
+	struct inode *inode = folio->mapping->host;
+	unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
+
+	return test_bit(block + blks_per_folio, ifs->state);
+}
+
+static void iomap_ifs_clear_range_dirty(struct folio *folio,
+		struct iomap_folio_state *ifs, size_t off, size_t len)
+{
+	unsigned int first_blk, nr_blks;
+	unsigned long flags;
+
+	iomap_ifs_calc_range(folio, off, len, IOMAP_ST_DIRTY, &first_blk,
+			     &nr_blks);
+	spin_lock_irqsave(&ifs->state_lock, flags);
+	bitmap_clear(ifs->state, first_blk, nr_blks);
+	spin_unlock_irqrestore(&ifs->state_lock, flags);
+}
+
+static void iomap_clear_range_dirty(struct folio *folio, size_t off, size_t len)
+{
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);
+
+	if (!ifs)
+		return;
+	iomap_ifs_clear_range_dirty(folio, ifs, off, len);
+}
+
+static void iomap_ifs_set_range_dirty(struct folio *folio,
+		struct iomap_folio_state *ifs, size_t off, size_t len)
+{
+	unsigned int first_blk, nr_blks;
+	unsigned long flags;
+
+	iomap_ifs_calc_range(folio, off, len, IOMAP_ST_DIRTY, &first_blk,
+			     &nr_blks);
+	spin_lock_irqsave(&ifs->state_lock, flags);
+	bitmap_set(ifs->state, first_blk, nr_blks);
+	spin_unlock_irqrestore(&ifs->state_lock, flags);
+}
+
+static void iomap_set_range_dirty(struct folio *folio, size_t off, size_t len)
+{
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);
+
+	if (!ifs)
+		return;
+	iomap_ifs_set_range_dirty(folio, ifs, off, len);
+}
+
 static struct iomap_folio_state *iomap_ifs_alloc(struct inode *inode,
		struct folio *folio, unsigned int flags)
 {
@@ -99,14 +171,24 @@ static struct iomap_folio_state *iomap_ifs_alloc(struct inode *inode,
	else
		gfp = GFP_NOFS | __GFP_NOFAIL;

-	ifs = kzalloc(struct_size(ifs, state, BITS_TO_LONGS(nr_blocks)),
-		      gfp);
-	if (ifs) {
-		spin_lock_init(&ifs->state_lock);
-		if (folio_test_uptodate(folio))
-			bitmap_fill(ifs->state, nr_blocks);
-		folio_attach_private(folio, ifs);
-	}
+	/*
+	 * ifs->state tracks two sets of state flags when the
+	 * filesystem block size is smaller than the folio size.
+	 * The first state tracks per-block uptodate and the
+	 * second tracks per-block dirty state.
+	 */
+	ifs = kzalloc(struct_size(ifs, state,
+			BITS_TO_LONGS(IOMAP_ST_MAX * nr_blocks)), gfp);
+	if (!ifs)
+		return ifs;
+
+	spin_lock_init(&ifs->state_lock);
+	if (folio_test_uptodate(folio))
+		bitmap_set(ifs->state, 0, nr_blocks);
+	if (folio_test_dirty(folio))
+		bitmap_set(ifs->state, nr_blocks, nr_blocks);
+	folio_attach_private(folio, ifs);
+
	return ifs;
 }

@@ -529,6 +611,17 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
 }
 EXPORT_SYMBOL_GPL(iomap_invalidate_folio);

+bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio)
+{
+	struct inode *inode = mapping->host;
+	size_t len = folio_size(folio);
+
+	iomap_ifs_alloc(inode, folio, 0);
+	iomap_set_range_dirty(folio, 0, len);
+	return filemap_dirty_folio(mapping, folio);
+}
+EXPORT_SYMBOL_GPL(iomap_dirty_folio);
+
 static void
 iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 {
@@ -733,6 +826,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
	if (unlikely(copied < len && !folio_test_uptodate(folio)))
		return 0;
	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
+	iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
	filemap_dirty_folio(inode->i_mapping, folio);
	return copied;
 }
@@ -902,6 +996,10 @@ static int iomap_write_delalloc_punch(struct inode *inode, struct folio *folio,
		int (*punch)(struct inode *inode, loff_t offset, loff_t length))
 {
	int ret = 0;
+	struct iomap_folio_state *ifs;
+	unsigned int first_blk, last_blk, i;
+	loff_t last_byte;
+	u8 blkbits = inode->i_blkbits;

	if (!folio_test_dirty(folio))
		return ret;
@@ -913,6 +1011,30 @@ static int iomap_write_delalloc_punch(struct inode *inode, struct folio *folio,
		if (ret)
			goto out;
	}
+	/*
+	 * When we have per-block dirty tracking, there can be
+	 * blocks within a folio which are marked uptodate
+	 * but not dirty. In that case it is necessary to punch
+	 * out such blocks to avoid leaking any delalloc blocks.
+	 */
+	ifs = iomap_get_ifs(folio);
+	if (!ifs)
+		goto skip_ifs_punch;
+
+	last_byte = min_t(loff_t, end_byte - 1,
+			(folio_next_index(folio) << PAGE_SHIFT) - 1);
+	first_blk = offset_in_folio(folio, start_byte) >> blkbits;
+	last_blk = offset_in_folio(folio, last_byte) >> blkbits;
+	for (i = first_blk; i <= last_blk; i++) {
+		if (!iomap_ifs_is_block_dirty(folio, ifs, i)) {
+			ret = punch(inode, folio_pos(folio) + (i << blkbits),
+				    1 << blkbits);
+			if (ret)
+				goto out;
+		}
+	}
+
+skip_ifs_punch:
	/*
	 * Make sure the next punch start is correctly bound to
	 * the end of this data range, not the end of the folio.
@@ -1646,7 +1768,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
		struct writeback_control *wbc, struct inode *inode,
		struct folio *folio, u64 end_pos)
 {
-	struct iomap_folio_state *ifs = iomap_ifs_alloc(inode, folio, 0);
+	struct iomap_folio_state *ifs = iomap_get_ifs(folio);
	struct iomap_ioend *ioend, *next;
	unsigned len = i_blocksize(inode);
	unsigned nblocks = i_blocks_per_folio(inode, folio);
@@ -1654,6 +1776,11 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
	int error = 0, count = 0, i;
	LIST_HEAD(submit_list);

+	if (!ifs && nblocks > 1) {
+		ifs = iomap_ifs_alloc(inode, folio, 0);
+		iomap_set_range_dirty(folio, 0, folio_size(folio));
+	}
+
	WARN_ON_ONCE(ifs && atomic_read(&ifs->write_bytes_pending) != 0);

	/*
@@ -1662,7 +1789,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
	 * invalid, grab a new one.
	 */
	for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
-		if (ifs && !iomap_ifs_is_block_uptodate(ifs, i))
+		if (ifs && !iomap_ifs_is_block_dirty(folio, ifs, i))
			continue;

		error = wpc->ops->map_blocks(wpc, inode, pos);
@@ -1706,6 +1833,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
		}
	}

+	iomap_clear_range_dirty(folio, 0, end_pos - folio_pos(folio));
	folio_start_writeback(folio);
	folio_unlock(folio);

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 2ef78aa1d3f6..77c7332ae197 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -578,7 +578,7 @@ const struct address_space_operations xfs_address_space_operations = {
	.read_folio		= xfs_vm_read_folio,
	.readahead		= xfs_vm_readahead,
	.writepages		= xfs_vm_writepages,
-	.dirty_folio		= filemap_dirty_folio,
+	.dirty_folio		= iomap_dirty_folio,
	.release_folio		= iomap_release_folio,
	.invalidate_folio	= iomap_invalidate_folio,
	.bmap			= xfs_vm_bmap,
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 132f01d3461f..e508c8e97372 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -175,7 +175,7 @@ const struct address_space_operations zonefs_file_aops = {
	.read_folio		= zonefs_read_folio,
	.readahead		= zonefs_readahead,
	.writepages		= zonefs_writepages,
-	.dirty_folio		= filemap_dirty_folio,
+	.dirty_folio		= iomap_dirty_folio,
	.release_folio		= iomap_release_folio,
	.invalidate_folio	= iomap_invalidate_folio,
	.migrate_folio		= filemap_migrate_folio,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e2b836c2e119..eb9335c46bf3 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -264,6 +264,7 @@ bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count);
 struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos);
 bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags);
 void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
+bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio);
 int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
		const struct iomap_ops *ops);
 int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,