Message ID | 20230523081322.331337-9-hch@lst.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/16] btrfs: fix range_end calculation in extent_write_locked_range | expand |
On Tue, May 23, 2023 at 10:13:14AM +0200, Christoph Hellwig wrote:
> PageError is not used by the VFS/MM and deprecated.
Is there some reference other than code? I remember LSF/MM talks about
writeback, reducing page flags but I can't find anything that would say
that PageError is not used anymore. For such core changes in the
neighboring subysystems I kind of rely on lwn.net, hearsay or accidental
notice on fsdevel.
Removing the Error bit handling looks overall good but I'd also need to
refresh my understanding of the interactions with writeback.
On Mon, May 29, 2023 at 07:52:10PM +0200, David Sterba wrote: > On Tue, May 23, 2023 at 10:13:14AM +0200, Christoph Hellwig wrote: > > PageError is not used by the VFS/MM and deprecated. > > Is there some reference other than code? I remember LSF/MM talks about > writeback, reducing page flags but I can't find anything that would say > that PageError is not used anymore. For such core changes in the > neighboring subysystems I kind of rely on lwn.net, hearsay or accidental > notice on fsdevel. > > Removing the Error bit handling looks overall good but I'd also need to > refresh my understanding of the interactions with writeback. willy is the driving force behind the PageErro removal, so maybe he has a coherent writeup. But the basic idea is: - page flag space availability is limited, and killing any one we can easily is a good thing - PageError is not well defined and not part of any VM or VFS contract, so in addition to freeing up space it also generally tends to remove confusion.
On Tue, May 30, 2023 at 07:45:47AM +0200, Christoph Hellwig wrote: > On Mon, May 29, 2023 at 07:52:10PM +0200, David Sterba wrote: > > On Tue, May 23, 2023 at 10:13:14AM +0200, Christoph Hellwig wrote: > > > PageError is not used by the VFS/MM and deprecated. > > > > Is there some reference other than code? I remember LSF/MM talks about > > writeback, reducing page flags but I can't find anything that would say > > that PageError is not used anymore. For such core changes in the > > neighboring subysystems I kind of rely on lwn.net, hearsay or accidental > > notice on fsdevel. > > > > Removing the Error bit handling looks overall good but I'd also need to > > refresh my understanding of the interactions with writeback. > > willy is the driving force behind the PageErro removal, so maybe he > has a coherent writeup. But the basic idea is: > > - page flag space availability is limited, and killing any one we can > easily is a good thing > - PageError is not well defined and not part of any VM or VFS contract, > so in addition to freeing up space it also generally tends to remove > confusion. I don't think I have a coherent writeup. Basically: - The VFS and VM do not check the error flag $ git grep folio_test_error fs/gfs2/lops.c: if (folio_test_error(folio)) mm/migrate.c: if (folio_test_error(folio)) (the use in mm/migrate.c replicates the error flag on migration) A similar grep -w PageError finds only tests in filesystems and comments. - We do not document clearly under what circumstances the error flag is set or cleared If we can remove all uses of testing the error flag, then we can remove everywhere that sets or clears the flag. This is usually a simple matter of checking folio_test_uptodate() instead of !PageError(), since the folio will not be marked uptodate if there is a read error. Write errors are not normally tracked on a per-folio basis. Instead they are tracked through mapping_set_error(). I did a number of these conversions about a year ago; I'm happy Christoph is making progress with btrfs here. See 'git log 6e8e79fc8443' for many of them.
On Tue, May 23, 2023 at 10:13:14AM +0200, Christoph Hellwig wrote: > Note that the error propagation for superblock writes still uses > PageError for now. This patchset cleans up all the other Error bits so for super block we can first switch to another flag like PageChecked, reworking the separate page for superblock I tired some time ago still had problems.
On Tue, May 30, 2023 at 07:08:38AM +0100, Matthew Wilcox wrote: > On Tue, May 30, 2023 at 07:45:47AM +0200, Christoph Hellwig wrote: > > On Mon, May 29, 2023 at 07:52:10PM +0200, David Sterba wrote: > > > On Tue, May 23, 2023 at 10:13:14AM +0200, Christoph Hellwig wrote: > > > > PageError is not used by the VFS/MM and deprecated. > > > > > > Is there some reference other than code? I remember LSF/MM talks about > > > writeback, reducing page flags but I can't find anything that would say > > > that PageError is not used anymore. For such core changes in the > > > neighboring subysystems I kind of rely on lwn.net, hearsay or accidental > > > notice on fsdevel. > > > > > > Removing the Error bit handling looks overall good but I'd also need to > > > refresh my understanding of the interactions with writeback. > > > > willy is the driving force behind the PageErro removal, so maybe he > > has a coherent writeup. But the basic idea is: > > > > - page flag space availability is limited, and killing any one we can > > easily is a good thing > > - PageError is not well defined and not part of any VM or VFS contract, > > so in addition to freeing up space it also generally tends to remove > > confusion. > > I don't think I have a coherent writeup. Basically: > > - The VFS and VM do not check the error flag > > $ git grep folio_test_error > fs/gfs2/lops.c: if (folio_test_error(folio)) > mm/migrate.c: if (folio_test_error(folio)) > > (the use in mm/migrate.c replicates the error flag on migration) > > A similar grep -w PageError finds only tests in filesystems and > comments. > > - We do not document clearly under what circumstances the error flag > is set or cleared > > If we can remove all uses of testing the error flag, then we can remove > everywhere that sets or clears the flag. This is usually a simple > matter of checking folio_test_uptodate() instead of !PageError(), > since the folio will not be marked uptodate if there is a read error. > Write errors are not normally tracked on a per-folio basis. Instead they > are tracked through mapping_set_error(). > > I did a number of these conversions about a year ago; I'm happy Christoph > is making progress with btrfs here. See 'git log 6e8e79fc8443' for many > of them. Thank you, that helped. I found one case of folio_set_error() that seems to be redundant: 189 static noinline void end_compressed_writeback(const struct compressed_bio *cb) 190 { 191 struct inode *inode = &cb->bbio.inode->vfs_inode; 192 struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); 193 unsigned long index = cb->start >> PAGE_SHIFT; 194 unsigned long end_index = (cb->start + cb->len - 1) >> PAGE_SHIFT; 195 struct folio_batch fbatch; 196 const int errno = blk_status_to_errno(cb->bbio.bio.bi_status); 197 int i; 198 int ret; 199 200 if (errno) 201 mapping_set_error(inode->i_mapping, errno); 202 203 folio_batch_init(&fbatch); 204 while (index <= end_index) { 205 ret = filemap_get_folios(inode->i_mapping, &index, end_index, 206 &fbatch); 207 208 if (ret == 0) 209 return; 210 211 for (i = 0; i < ret; i++) { 212 struct folio *folio = fbatch.folios[i]; 213 214 if (errno) 215 folio_set_error(folio); ^^^^^^^^^^^^^^^^^^^^^^^ The error is set on the mapping on line 201 under the same condition. With that removed and the super block write error handling moved away from the Error bit the btrfs code will be Error-clean. 216 btrfs_page_clamp_clear_writeback(fs_info, &folio->page, 217 cb->start, cb->len); 218 } 219 folio_batch_release(&fbatch); 220 } 221 /* the inode may be gone now */ 222 }
On Tue, May 30, 2023 at 03:23:11PM +0200, David Sterba wrote: > On Tue, May 23, 2023 at 10:13:14AM +0200, Christoph Hellwig wrote: > > Note that the error propagation for superblock writes still uses > > PageError for now. > > This patchset cleans up all the other Error bits so for super block we > can first switch to another flag like PageChecked, reworking the > separate page for superblock I tired some time ago still had problems. Yes. But the superblock writing all works based on the block devices page cache, so it doesn't interact with the per-normal file pagecache touched here, or the magic btree inode page cache touched in the previous series. I was planning to take a closer look at the superblock handling when I find some time.
On Tue, May 30, 2023 at 03:34:46PM +0200, David Sterba wrote:
> I found one case of folio_set_error() that seems to be redundant:
Ooops, I missed that as it's one of the few folio based things in
btrfs. I'll fix it up together with the other little things in a
resend.
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index d7b31888efa17a..28610ed0fae913 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -223,8 +223,6 @@ static int process_one_page(struct btrfs_fs_info *fs_info, if (page_ops & PAGE_SET_ORDERED) btrfs_page_clamp_set_ordered(fs_info, page, start, len); - if (page_ops & PAGE_SET_ERROR) - btrfs_page_clamp_set_error(fs_info, page, start, len); if (page_ops & PAGE_START_WRITEBACK) { btrfs_page_clamp_clear_dirty(fs_info, page, start, len); btrfs_page_clamp_set_writeback(fs_info, page, start, len); @@ -497,12 +495,10 @@ static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len) ASSERT(page_offset(page) <= start && start + len <= page_offset(page) + PAGE_SIZE); - if (uptodate && btrfs_verify_page(page, start)) { + if (uptodate && btrfs_verify_page(page, start)) btrfs_page_set_uptodate(fs_info, page, start, len); - } else { + else btrfs_page_clear_uptodate(fs_info, page, start, len); - btrfs_page_set_error(fs_info, page, start, len); - } if (!btrfs_is_subpage(fs_info, page)) unlock_page(page); @@ -530,7 +526,6 @@ void end_extent_writepage(struct page *page, int err, u64 start, u64 end) len = end + 1 - start; btrfs_page_clear_uptodate(fs_info, page, start, len); - btrfs_page_set_error(fs_info, page, start, len); ret = err < 0 ? err : -EIO; mapping_set_error(page->mapping, ret); } @@ -1059,7 +1054,6 @@ static int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, ret = set_page_extent_mapped(page); if (ret < 0) { unlock_extent(tree, start, end, NULL); - btrfs_page_set_error(fs_info, page, start, PAGE_SIZE); unlock_page(page); return ret; } @@ -1263,11 +1257,9 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode, } ret = btrfs_run_delalloc_range(inode, page, delalloc_start, delalloc_end, &page_started, &nr_written, wbc); - if (ret) { - btrfs_page_set_error(inode->root->fs_info, page, - page_offset(page), PAGE_SIZE); + if (ret) return ret; - } + /* * delalloc_end is already one less than the total length, so * we don't subtract one from PAGE_SIZE @@ -1420,7 +1412,6 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode, em = btrfs_get_extent(inode, NULL, 0, cur, end - cur + 1); if (IS_ERR(em)) { - btrfs_page_set_error(fs_info, page, cur, end - cur + 1); ret = PTR_ERR_OR_ZERO(em); goto out_error; } @@ -1519,9 +1510,6 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl WARN_ON(!PageLocked(page)); - btrfs_page_clear_error(btrfs_sb(inode->i_sb), page, - page_offset(page), PAGE_SIZE); - pg_offset = offset_in_page(i_size); if (page->index > end_index || (page->index == end_index && !pg_offset)) { @@ -1534,10 +1522,8 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl memzero_page(page, pg_offset, PAGE_SIZE - pg_offset); ret = set_page_extent_mapped(page); - if (ret < 0) { - SetPageError(page); + if (ret < 0) goto done; - } if (!bio_ctrl->extent_locked) { ret = writepage_delalloc(BTRFS_I(inode), page, bio_ctrl->wbc); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index c4d4ac0428ee74..35b99fde75abb1 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1153,8 +1153,6 @@ static int submit_uncompressed_range(struct btrfs_inode *inode, const u64 page_start = page_offset(locked_page); const u64 page_end = page_start + PAGE_SIZE - 1; - btrfs_page_set_error(inode->root->fs_info, locked_page, - page_start, PAGE_SIZE); set_page_writeback(locked_page); end_page_writeback(locked_page); end_extent_writepage(locked_page, ret, page_start, page_end); @@ -3028,7 +3026,6 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work) mapping_set_error(page->mapping, ret); end_extent_writepage(page, ret, page_start, page_end); clear_page_dirty_for_io(page); - SetPageError(page); } btrfs_page_clear_checked(inode->root->fs_info, page, page_start, PAGE_SIZE); unlock_page(page); diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c index 045117ca0ddc43..9e9a5e26a15736 100644 --- a/fs/btrfs/subpage.c +++ b/fs/btrfs/subpage.c @@ -100,9 +100,6 @@ void btrfs_init_subpage_info(struct btrfs_subpage_info *subpage_info, u32 sector subpage_info->uptodate_offset = cur; cur += nr_bits; - subpage_info->error_offset = cur; - cur += nr_bits; - subpage_info->dirty_offset = cur; cur += nr_bits; @@ -416,35 +413,6 @@ void btrfs_subpage_clear_uptodate(const struct btrfs_fs_info *fs_info, spin_unlock_irqrestore(&subpage->lock, flags); } -void btrfs_subpage_set_error(const struct btrfs_fs_info *fs_info, - struct page *page, u64 start, u32 len) -{ - struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; - unsigned int start_bit = subpage_calc_start_bit(fs_info, page, - error, start, len); - unsigned long flags; - - spin_lock_irqsave(&subpage->lock, flags); - bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits); - SetPageError(page); - spin_unlock_irqrestore(&subpage->lock, flags); -} - -void btrfs_subpage_clear_error(const struct btrfs_fs_info *fs_info, - struct page *page, u64 start, u32 len) -{ - struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; - unsigned int start_bit = subpage_calc_start_bit(fs_info, page, - error, start, len); - unsigned long flags; - - spin_lock_irqsave(&subpage->lock, flags); - bitmap_clear(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits); - if (subpage_test_bitmap_all_zero(fs_info, subpage, error)) - ClearPageError(page); - spin_unlock_irqrestore(&subpage->lock, flags); -} - void btrfs_subpage_set_dirty(const struct btrfs_fs_info *fs_info, struct page *page, u64 start, u32 len) { @@ -606,7 +574,6 @@ bool btrfs_subpage_test_##name(const struct btrfs_fs_info *fs_info, \ return ret; \ } IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(uptodate); -IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(error); IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(dirty); IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(writeback); IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(ordered); @@ -674,7 +641,6 @@ bool btrfs_page_clamp_test_##name(const struct btrfs_fs_info *fs_info, \ } IMPLEMENT_BTRFS_PAGE_OPS(uptodate, SetPageUptodate, ClearPageUptodate, PageUptodate); -IMPLEMENT_BTRFS_PAGE_OPS(error, SetPageError, ClearPageError, PageError); IMPLEMENT_BTRFS_PAGE_OPS(dirty, set_page_dirty, clear_page_dirty_for_io, PageDirty); IMPLEMENT_BTRFS_PAGE_OPS(writeback, set_page_writeback, end_page_writeback, diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h index 0e80ad33690466..998c1b78066e53 100644 --- a/fs/btrfs/subpage.h +++ b/fs/btrfs/subpage.h @@ -8,17 +8,17 @@ /* * Extra info for subpapge bitmap. * - * For subpage we pack all uptodate/error/dirty/writeback/ordered bitmaps into + * For subpage we pack all uptodate/dirty/writeback/ordered bitmaps into * one larger bitmap. * * This structure records how they are organized in the bitmap: * - * /- uptodate_offset /- error_offset /- dirty_offset + * /- uptodate_offset /- dirty_offset /- ordered_offset * | | | * v v v - * |u|u|u|u|........|u|u|e|e|.......|e|e| ... |o|o| + * |u|u|u|u|........|u|u|d|d|.......|d|d|o|o|.......|o|o| * |<- bitmap_nr_bits ->| - * |<--------------- total_nr_bits ---------------->| + * |<----------------- total_nr_bits ------------------>| */ struct btrfs_subpage_info { /* Number of bits for each bitmap */ @@ -32,7 +32,6 @@ struct btrfs_subpage_info { * @bitmap_size, which is calculated from PAGE_SIZE / sectorsize. */ unsigned int uptodate_offset; - unsigned int error_offset; unsigned int dirty_offset; unsigned int writeback_offset; unsigned int ordered_offset; @@ -141,7 +140,6 @@ bool btrfs_page_clamp_test_##name(const struct btrfs_fs_info *fs_info, \ struct page *page, u64 start, u32 len); DECLARE_BTRFS_SUBPAGE_OPS(uptodate); -DECLARE_BTRFS_SUBPAGE_OPS(error); DECLARE_BTRFS_SUBPAGE_OPS(dirty); DECLARE_BTRFS_SUBPAGE_OPS(writeback); DECLARE_BTRFS_SUBPAGE_OPS(ordered);
PageError is not used by the VFS/MM and deprecated. Btrfs now only sets the flag and never clears it for data pages, so just remove all places setting it, and the subpage error bit. Note that the error propagation for superblock writes still uses PageError for now. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/btrfs/extent_io.c | 24 +++++------------------- fs/btrfs/inode.c | 3 --- fs/btrfs/subpage.c | 34 ---------------------------------- fs/btrfs/subpage.h | 10 ++++------ 4 files changed, 9 insertions(+), 62 deletions(-)