Message ID | 21ee2b756ce8ad1dcf1b9ecdfec84f0b87c271f5.1743239672.git.wqu@suse.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | btrfs: add the missing preparations exposed by initial large data folio support | expand |
On Sat, Mar 29, 2025 at 9:20 AM Qu Wenruo <wqu@suse.com> wrote: > > [BUG] > When running btrfs/004 with 4K fs block size and 64K page size, > sometimes fsstress workload can take 100% CPU for a while, but not long > enough to trigger a 120s hang warning. > > [CAUSE] > When such 100% CPU usage happens, btrfs_punch_hole_lock_range() is > always in the call trace. > > With extra ftrace debugs, it shows something like this: > > btrfs_punch_hole_lock_range: r/i=5/2745 start=4096(65536) What's this r/i=5/2745? Since this is from a custom tracepoint you made for debugging, I suggest removing that part from the changelog. It's not obvious (to me at least) what it is and it's not relevant to understand the problem being fixed. > end=20479(18446744073709551615) enter > > Where 4096 is the @lockstart parameter, 65536 is the rounded up > @page_lockstart, 20479 is the @lockend parameter. > So far so good. > > But the large number (u64)-1 is the @page_lockend, which is not correct. > > This is caused by the fact that round_down(locked + 1, PAGE_SIZE) > results 0. > > In the above case, the range is inside the same page, and we do not even > need to call filemap_range_has_page(), not to mention to call it with > (u64)-1 as the end. > > This behavior will cause btrfs_punch_hole_lock_range() to busy loop > waiting for irrelevant range to has its pages to be dropped. > > [FIX] > Calculate @page_lockend by just rounding down @lockend, without > decreasing the value by one. > So @page_lockend will no longer overflow. > > Then exit early if @page_lockend is no larger than @page_lockestart. @page_lockestart -> @page_lockstart > As it means either the range is inside the same page, or the two pages > are adjacent already. > > Finally only decrease @page_lockend when calling > filemap_range_has_page(). > > Fixes: 0528476b6ac7 ("btrfs: fix the filemap_range_has_page() call in btrfs_punch_hole_lock_range()") > Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> With those minor things fixed up, it looks good, thanks. > --- > fs/btrfs/file.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > index 589d36f8de12..7c147ef9368d 100644 > --- a/fs/btrfs/file.c > +++ b/fs/btrfs/file.c > @@ -2126,15 +2126,20 @@ static void btrfs_punch_hole_lock_range(struct inode *inode, > * will always return true. > * So here we need to do extra page alignment for > * filemap_range_has_page(). > + * > + * And do not decrease page_lockend right now, as it can be 0. > */ > const u64 page_lockstart = round_up(lockstart, PAGE_SIZE); > - const u64 page_lockend = round_down(lockend + 1, PAGE_SIZE) - 1; > + const u64 page_lockend = round_down(lockend + 1, PAGE_SIZE); > > while (1) { > truncate_pagecache_range(inode, lockstart, lockend); > > lock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend, > cached_state); > + /* The same page or adjacent pages. */ > + if (page_lockend <= page_lockstart) > + break; > /* > * We can't have ordered extents in the range, nor dirty/writeback > * pages, because we have locked the inode's VFS lock in exclusive > @@ -2146,7 +2151,7 @@ static void btrfs_punch_hole_lock_range(struct inode *inode, > * we do, unlock the range and retry. > */ > if (!filemap_range_has_page(inode->i_mapping, page_lockstart, > - page_lockend)) > + page_lockend - 1)) > break; > > unlock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend, > -- > 2.49.0 > >
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 589d36f8de12..7c147ef9368d 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2126,15 +2126,20 @@ static void btrfs_punch_hole_lock_range(struct inode *inode, * will always return true. * So here we need to do extra page alignment for * filemap_range_has_page(). + * + * And do not decrease page_lockend right now, as it can be 0. */ const u64 page_lockstart = round_up(lockstart, PAGE_SIZE); - const u64 page_lockend = round_down(lockend + 1, PAGE_SIZE) - 1; + const u64 page_lockend = round_down(lockend + 1, PAGE_SIZE); while (1) { truncate_pagecache_range(inode, lockstart, lockend); lock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend, cached_state); + /* The same page or adjacent pages. */ + if (page_lockend <= page_lockstart) + break; /* * We can't have ordered extents in the range, nor dirty/writeback * pages, because we have locked the inode's VFS lock in exclusive @@ -2146,7 +2151,7 @@ static void btrfs_punch_hole_lock_range(struct inode *inode, * we do, unlock the range and retry. */ if (!filemap_range_has_page(inode->i_mapping, page_lockstart, - page_lockend)) + page_lockend - 1)) break; unlock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend,
[BUG] When running btrfs/004 with 4K fs block size and 64K page size, sometimes fsstress workload can take 100% CPU for a while, but not long enough to trigger a 120s hang warning. [CAUSE] When such 100% CPU usage happens, btrfs_punch_hole_lock_range() is always in the call trace. With extra ftrace debugs, it shows something like this: btrfs_punch_hole_lock_range: r/i=5/2745 start=4096(65536) end=20479(18446744073709551615) enter Where 4096 is the @lockstart parameter, 65536 is the rounded up @page_lockstart, 20479 is the @lockend parameter. So far so good. But the large number (u64)-1 is the @page_lockend, which is not correct. This is caused by the fact that round_down(locked + 1, PAGE_SIZE) results 0. In the above case, the range is inside the same page, and we do not even need to call filemap_range_has_page(), not to mention to call it with (u64)-1 as the end. This behavior will cause btrfs_punch_hole_lock_range() to busy loop waiting for irrelevant range to has its pages to be dropped. [FIX] Calculate @page_lockend by just rounding down @lockend, without decreasing the value by one. So @page_lockend will no longer overflow. Then exit early if @page_lockend is no larger than @page_lockestart. As it means either the range is inside the same page, or the two pages are adjacent already. Finally only decrease @page_lockend when calling filemap_range_has_page(). Fixes: 0528476b6ac7 ("btrfs: fix the filemap_range_has_page() call in btrfs_punch_hole_lock_range()") Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/file.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)