Message ID | 20240418001356.95857-1-mcgrof@kernel.org (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | fstests: add fsstress + compaction test | expand |
On Wed, Apr 17, 2024 at 05:13:56PM -0700, Luis Chamberlain wrote: > Running compaction while we run fsstress can crash older kernels as per > korg#218227 [0], the fix for that [0] has been posted [1] but that patch > is not yet on v6.9-rc4 and the patch requires changes for v6.9. It doesn't require changes, it just has prerequisites: https://lore.kernel.org/all/ZgHhcojXc9QjynUI@casper.infradead.org/ > Today I find that v6.9-rc4 is also hitting an unrecoverable hung task > between compaction and fsstress while running generic/476 on the > following kdevops test sections [2]: > > * xfs_nocrc > * xfs_nocrc_2k > * xfs_nocrc_4k > > Analyzing the trace I see the guest uses loopback block devices for the > fstests TEST_DEV, the loopback file uses sparsefiles on a btrfs > partition. The contention based on traces [3] [4] seems to be that we > have somehow have fsstress + compaction race on folio_wait_bit_common(). What do you mean by "race"? Here's what I see: Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task kcompactd0:72 blocked for more than 120 seconds. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: task:kcompactd0 state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00004000 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:06:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __schedule+0x3d9/0xaf0 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule+0x26/0xf0 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: io_schedule+0x42/0x70 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common+0x123/0x370 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function+0x10/0x10 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: migrate_pages_batch+0x69a/0xd70 But you didn't run the backtrace through scripts/decode_stacktrace.sh so I can't figure out what we're waiting on. > We have this happening: > > a) kthread compaction --> migrate_pages_batch() > --> folio_wait_bit_common() > b) workqueue on btrfs writeback wb_workfn --> extent_write_cache_pages() > --> folio_wait_bit_common() > c) workqueue on loopback loop_rootcg_workfn() --> filemap_fdatawrite_wbc() > --> folio_wait_bit_common() > d) kthread xfsaild --> blk_mq_submit_bio() --> wbt_wait() > > I tried to reproduce but couldn't easily do so, so I wrote this test > to help, and with this I have 100% failure rate so far out of 2 runs. > > Given we also have korg#218227 and that patch likely needing > backporting, folks will want a reproducer for this issue. This should > hopefully help with that case and this new separate issue. > > To reproduce with kdevops just: > > make defconfig-xfs_nocrc_2k -j $(nproc) > make -j $(nproc) > make fstests > make linux > make fstests-baseline TESTS=generic/733 > tail -f guestfs/*-xfs-nocrc-2k/console.log > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=218227 > [1] https://lore.kernel.org/all/7ee2bb8c-441a-418b-ba3a-d305f69d31c8@suse.cz/T/#u > [2] https://github.com/linux-kdevops/kdevops/blob/main/playbooks/roles/fstests/templates/xfs/xfs.config > [3] https://gist.github.com/mcgrof/4dfa3264f513ce6ca398414326cfab84 > [4] https://gist.github.com/mcgrof/f40a9f31a43793dac928ce287cfacfeb > > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> > --- > > Note: kdevops uses its own fork of fstests which has this merged > already, so the above should just work. If it's your first time using > kdevops be sure to just read the README for the first time users: > > https://github.com/linux-kdevops/kdevops/blob/main/docs/kdevops-first-run.md > > common/rc | 7 ++++++ > tests/generic/744 | 56 +++++++++++++++++++++++++++++++++++++++++++ > tests/generic/744.out | 2 ++ > 3 files changed, 65 insertions(+) > create mode 100755 tests/generic/744 > create mode 100644 tests/generic/744.out > > diff --git a/common/rc b/common/rc > index b7b77ac1b46d..d4432f5ce259 100644 > --- a/common/rc > +++ b/common/rc > @@ -120,6 +120,13 @@ _require_hugepages() > _notrun "Kernel does not report huge page size" > } > > +# Requires CONFIG_COMPACTION > +_require_compaction() > +{ > + if [ ! -f /proc/sys/vm/compact_memory ]; then > + _notrun "Need compaction enabled CONFIG_COMPACTION=y" > + fi > +} > # Get hugepagesize in bytes > _get_hugepagesize() > { > diff --git a/tests/generic/744 b/tests/generic/744 > new file mode 100755 > index 000000000000..2b3c0c7e92fb > --- /dev/null > +++ b/tests/generic/744 > @@ -0,0 +1,56 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2024 Luis Chamberlain. All Rights Reserved. > +# > +# FS QA Test 744 > +# > +# fsstress + compaction test > +# > +. ./common/preamble > +_begin_fstest auto rw long_rw stress soak smoketest > + > +_cleanup() > +{ > + cd / > + rm -f $tmp.* > + $KILLALL_PROG -9 fsstress > /dev/null 2>&1 > +} > + > +# Import common functions. > + > +# real QA test starts here > + > +# Modify as appropriate. > +_supported_fs generic > + > +_require_scratch > +_require_compaction > +_require_command "$KILLALL_PROG" "killall" > + > +echo "Silence is golden." > + > +_scratch_mkfs > $seqres.full 2>&1 > +_scratch_mount >> $seqres.full 2>&1 > + > +nr_cpus=$((LOAD_FACTOR * 4)) > +nr_ops=$((25000 * nr_cpus * TIME_FACTOR)) > +fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus) > + > +# start a background getxattr loop for the existing xattr > +runfile="$tmp.getfattr" > +touch $runfile > +while [ -e $runfile ]; do > + echo 1 > /proc/sys/vm/compact_memory > + sleep 15 > +done & > +getfattr_pid=$! > + > +test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION") > + > +$FSSTRESS_PROG $FSSTRESS_AVOID "${fsstress_args[@]}" >> $seqres.full > + > +rm -f $runfile > +wait > /dev/null 2>&1 > + > +status=0 > +exit > diff --git a/tests/generic/744.out b/tests/generic/744.out > new file mode 100644 > index 000000000000..205c684fa995 > --- /dev/null > +++ b/tests/generic/744.out > @@ -0,0 +1,2 @@ > +QA output created by 744 > +Silence is golden > -- > 2.43.0 >
On Thu, Apr 18, 2024 at 02:39:19AM +0100, Matthew Wilcox wrote: > On Wed, Apr 17, 2024 at 05:13:56PM -0700, Luis Chamberlain wrote: > > Running compaction while we run fsstress can crash older kernels as per > > korg#218227 [0], the fix for that [0] has been posted [1] but that patch > > is not yet on v6.9-rc4 and the patch requires changes for v6.9. > > It doesn't require changes, it just has prerequisites: > > https://lore.kernel.org/all/ZgHhcojXc9QjynUI@casper.infradead.org/ Oh alrighty, thanks I'll give those two a shot to see if that fixes this other lockup. > > Today I find that v6.9-rc4 is also hitting an unrecoverable hung task > > between compaction and fsstress while running generic/476 on the > > following kdevops test sections [2]: > > > > * xfs_nocrc > > * xfs_nocrc_2k > > * xfs_nocrc_4k > > > > Analyzing the trace I see the guest uses loopback block devices for the > > fstests TEST_DEV, the loopback file uses sparsefiles on a btrfs > > partition. The contention based on traces [3] [4] seems to be that we > > have somehow have fsstress + compaction race on folio_wait_bit_common(). > > What do you mean by "race"? Here's what I see: > > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task kcompactd0:72 blocked for more than 120 seconds. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: task:kcompactd0 state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00004000 > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Call Trace: > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: <TASK> > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __schedule+0x3d9/0xaf0 > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule+0x26/0xf0 > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: io_schedule+0x42/0x70 > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common+0x123/0x370 > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function+0x10/0x10 > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: migrate_pages_batch+0x69a/0xd70 > > But you didn't run the backtrace through scripts/decode_stacktrace.sh > so I can't figure out what we're waiting on. Oh, lemme do that. OK here's the full log and at the end of this email is just the relevant traces. https://gist.github.com/mcgrof/5e2886a33fd9448e1033acfbe244e82f Apr 16 21:45:28 base-xfs-nocrc-2k unknown: run fstests generic/476 at 2024-04-16 21:45:28 Apr 16 21:45:29 base-xfs-nocrc-2k kernel: XFS (loop16): Mounting V4 Filesystem 2edc77d4-f00a-40d8-9d7f-b1e3876a156e Apr 16 21:45:29 base-xfs-nocrc-2k kernel: XFS (loop16): Ending clean mount Apr 16 21:45:29 base-xfs-nocrc-2k kernel: xfs filesystem being mounted at /media/test supports timestamps until 2038-01-19 (0x7fffffff) Apr 16 21:45:29 base-xfs-nocrc-2k kernel: XFS (loop5): Mounting V4 Filesystem c13935a8-e0a2-491f-a58f-ed56430999dc Apr 16 21:45:29 base-xfs-nocrc-2k kernel: XFS (loop5): Ending clean mount Apr 16 21:45:29 base-xfs-nocrc-2k kernel: xfs filesystem being mounted at /media/scratch supports timestamps until 2038-01-19 (0x7fffffff) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task kcompactd0:72 blocked for more than 120 seconds. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: task:kcompactd0 state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00004000 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:06:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function (mm/filemap.c:1091) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: migrate_pages_batch (./include/linux/pagemap.h:1048 mm/migrate.c:1486 mm/migrate.c:1700) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_compaction_free (mm/compaction.c:1907) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_compaction_alloc (mm/compaction.c:1855) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __mod_node_page_state (mm/vmstat.c:403 (discriminator 2)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: migrate_pages (mm/migrate.c:1849 mm/migrate.c:1953) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_compaction_alloc (mm/compaction.c:1855) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_compaction_free (mm/compaction.c:1907) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? folio_putback_lru (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:457 ./include/linux/atomic/atomic-instrumented.h:33 ./include/linux/page_ref.h:67 ./include/linux/mm.h:1134 ./include/linux/mm.h:1140 ./include/linux/mm.h:1505 mm/vmscan.c:819) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: compact_zone (mm/compaction.c:2663) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? update_load_avg (kernel/sched/fair.c:4411 kernel/sched/fair.c:4748) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: compact_node (mm/compaction.c:2925) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: kcompactd (./include/linux/mmzone.h:1501 mm/compaction.c:2257 mm/compaction.c:3222) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_autoremove_wake_function (kernel/sched/wait.c:383) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_kcompactd (mm/compaction.c:3168) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task kworker/u38:8:807213 blocked for more than 120 seconds. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: task:kworker/u38:8 state:D stack:0 pid:807213 tgid:807213 ppid:2 flags:0x00004000 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Workqueue: writeback wb_workfn (flush-btrfs-2) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:06:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function (mm/filemap.c:1091) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: extent_write_cache_pages (fs/btrfs/extent_io.c:2130) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:974 (discriminator 1) kernel/sched/core.c:5852 (discriminator 1) kernel/sched/core.c:5849 (discriminator 1) kernel/sched/core.c:5877 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? stack_depot_save_flags (lib/stackdepot.c:609) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? put_cpu_partial (./arch/x86/include/asm/irqflags.h:134 (discriminator 1) mm/slub.c:2978 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: extent_writepages (fs/btrfs/extent_io.c:131 fs/btrfs/extent_io.c:2275) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: do_writepages (mm/page-writeback.c:2612) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? update_sd_lb_stats.constprop.0 (kernel/sched/fair.c:9896 (discriminator 2) kernel/sched/fair.c:10577 (discriminator 2)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __writeback_single_inode (fs/fs-writeback.c:1659) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? _raw_spin_lock (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:134 (discriminator 4) kernel/locking/spinlock.c:154 (discriminator 4)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: writeback_sb_inodes (fs/fs-writeback.c:1943) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __writeback_inodes_wb (fs/fs-writeback.c:2013) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: wb_writeback (fs/fs-writeback.c:2119) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: wb_workfn (fs/fs-writeback.c:2276 (discriminator 1) fs/fs-writeback.c:2304 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: process_one_work (kernel/workqueue.c:3254) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_worker_thread (kernel/workqueue.c:3362) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task kworker/u34:9:1268436 blocked for more than 120 seconds. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: task:kworker/u34:9 state:D stack:0 pid:1268436 tgid:1268436 ppid:2 flags:0x00004000 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Workqueue: loop5 loop_rootcg_workfn [loop] Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:06:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function (mm/filemap.c:1091) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: btrfs_folio_start_writer_lock (./include/linux/pagemap.h:1048 fs/btrfs/subpage.c:394) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: lock_delalloc_pages (fs/btrfs/extent_io.c:281 (discriminator 1)) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:977 kernel/sched/core.c:5852 kernel/sched/core.c:5849 kernel/sched/core.c:5877) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? _raw_spin_unlock (./include/linux/spinlock_api_smp.h:143 (discriminator 3) kernel/locking/spinlock.c:186 (discriminator 3)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? btrfs_find_delalloc_range (fs/btrfs/extent-io-tree.c:1036) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: find_lock_delalloc_range (fs/btrfs/extent_io.c:377) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: writepage_delalloc (fs/btrfs/extent_io.c:1217 (discriminator 1)) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: extent_write_cache_pages (fs/btrfs/extent_io.c:1475 (discriminator 1) fs/btrfs/extent_io.c:2152 (discriminator 1)) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: extent_writepages (fs/btrfs/extent_io.c:131 fs/btrfs/extent_io.c:2275) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_end_bbio_data_write (fs/btrfs/extent_io.c:463) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: do_writepages (mm/page-writeback.c:2612) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pv_queued_spin_lock_slowpath (kernel/locking/qspinlock.c:565 (discriminator 38)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pv_queued_spin_lock_slowpath (kernel/locking/qspinlock.c:565 (discriminator 38)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: filemap_fdatawrite_wbc (mm/filemap.c:398 mm/filemap.c:387) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __filemap_fdatawrite_range (mm/filemap.c:431) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: start_ordered_ops.constprop.0 (fs/btrfs/file.c:3873 fs/btrfs/file.c:1743) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: btrfs_sync_file (fs/btrfs/file.c:1818) btrfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: loop_process_work (drivers/block/loop.c:329 (discriminator 1) drivers/block/loop.c:475 (discriminator 1) drivers/block/loop.c:1907 (discriminator 1) drivers/block/loop.c:1942 (discriminator 1)) loop Apr 16 23:06:11 base-xfs-nocrc-2k kernel: process_one_work (kernel/workqueue.c:3254) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_worker_thread (kernel/workqueue.c:3362) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task xfsaild/loop5:1377891 blocked for more than 120 seconds. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: task:xfsaild/loop5 state:D stack:0 pid:1377891 tgid:1377891 ppid:2 flags:0x00004000 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:06:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_wbt_cleanup_cb (block/blk-wbt.c:575) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_wbt_inflight_cb (block/blk-wbt.c:569) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: rq_qos_wait (block/blk-rq-qos.c:284 (discriminator 4)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_rq_qos_wake_function (block/blk-rq-qos.c:208) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_wbt_inflight_cb (block/blk-wbt.c:569) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: wbt_wait (block/blk-wbt.c:660) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __rq_qos_throttle (block/blk-rq-qos.c:66) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: blk_mq_submit_bio (block/blk-mq.c:2880 block/blk-mq.c:2984) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: submit_bio_noacct_nocheck (./include/linux/bio.h:639 block/blk-core.c:701 block/blk-core.c:729) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? submit_bio_noacct (block/blk-core.c:758 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: _xfs_buf_ioapply (fs/xfs/xfs_buf.c:1584 fs/xfs/xfs_buf.c:1671) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_dir_ino_validate (fs/xfs/libxfs/xfs_dir2.c:220 (discriminator 2)) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:977 kernel/sched/core.c:5852 kernel/sched/core.c:5849 kernel/sched/core.c:5877) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __xfs_buf_submit (./arch/x86/include/asm/atomic.h:67 ./include/linux/atomic/atomic-arch-fallback.h:2278 ./include/linux/atomic/atomic-instrumented.h:1384 fs/xfs/xfs_buf.c:1762) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_delwri_submit_buffers (fs/xfs/xfs_buf.c:2280 (discriminator 2)) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfsaild (fs/xfs/xfs_trans_ail.c:560 (discriminator 1) fs/xfs/xfs_trans_ail.c:671 (discriminator 1)) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_xfsaild (fs/xfs/xfs_trans_ail.c:597) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task fsstress:1377894 blocked for more than 120 seconds. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: task:fsstress state:D stack:0 pid:1377894 tgid:1377894 ppid:1377892 flags:0x00004002 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:06:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule_timeout (kernel/time/timer.c:2559) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? blk_finish_plug (block/blk-core.c:1218 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __wait_for_common (kernel/sched/completion.c:95 kernel/sched/completion.c:116) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_schedule_timeout (kernel/time/timer.c:2544) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_iowait (fs/xfs/xfs_buf.c:1691) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __xfs_buf_submit (fs/xfs/xfs_buf.c:1770) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_read_map (fs/xfs/xfs_buf.c:870) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_btree_read_buf_block (./fs/xfs/xfs_trans.h:210 fs/xfs/libxfs/xfs_btree.c:1432) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289 (discriminator 1)) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_btree_read_buf_block (./fs/xfs/xfs_trans.h:210 fs/xfs/libxfs/xfs_btree.c:1432) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_btree_read_buf_block (./fs/xfs/xfs_trans.h:210 fs/xfs/libxfs/xfs_btree.c:1432) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_btree_lookup_get_block (fs/xfs/libxfs/xfs_btree.c:1934) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_btree_lookup (fs/xfs/libxfs/xfs_btree.c:2045) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? kmem_cache_alloc (./include/linux/kmemleak.h:42 mm/slub.c:3802 mm/slub.c:3845 mm/slub.c:3852) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_bunmapi (fs/xfs/libxfs/xfs_bmap.c:5671) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_free_file_space (fs/xfs/xfs_bmap_util.c:786 fs/xfs/xfs_bmap_util.c:856) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_file_fallocate (fs/xfs/xfs_file.c:1004 (discriminator 1)) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? vfs_fallocate (fs/open.c:330) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_bmap_del_extent_real (fs/xfs/libxfs/xfs_bmap.c:5173) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __xfs_bunmapi (fs/xfs/libxfs/xfs_bmap.c:5593) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_trans_alloc_inode (fs/xfs/xfs_trans.c:1220) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_bunmapi (fs/xfs/libxfs/xfs_bmap.c:5671) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_free_file_space (fs/xfs/xfs_bmap_util.c:786 fs/xfs/xfs_bmap_util.c:856) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_file_fallocate (fs/xfs/xfs_file.c:1004 (discriminator 1)) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? apparmor_file_permission (security/apparmor/include/cred.h:24 security/apparmor/include/cred.h:49 security/apparmor/include/cred.h:76 security/apparmor/include/cred.h:109 security/apparmor/lsm.c:533 security/apparmor/lsm.c:546) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: vfs_fallocate (fs/open.c:330) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ioctl_preallocate (fs/ioctl.c:293) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __x64_sys_ioctl (fs/ioctl.c:903 fs/ioctl.c:890 fs/ioctl.c:890) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RIP: 0033:0x7f4c8f5a2c5b Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RSP: 002b:00007ffd503d7980 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RAX: ffffffffffffffda RBX: 00000000000d3809 RCX: 00007f4c8f5a2c5b Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RDX: 00007ffd503d7a00 RSI: 000000004030582b RDI: 0000000000000003 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RBP: 0000000000000003 R08: 000000000000004f R09: 00007ffd503d79ec Apr 16 23:06:11 base-xfs-nocrc-2k kernel: R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000000 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: R13: 000000000026acd9 R14: 8f5c28f5c28f5c29 R15: 0000558585e10990 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task fsstress:1377895 blocked for more than 120 seconds. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:06:11 base-xfs-nocrc-2k kernel: task:fsstress state:D stack:0 pid:1377895 tgid:1377895 ppid:1377892 flags:0x00000002 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:06:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: schedule_timeout (kernel/time/timer.c:2559) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? blk_finish_plug (block/blk-core.c:1218 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __wait_for_common (kernel/sched/completion.c:95 kernel/sched/completion.c:116) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? __pfx_schedule_timeout (kernel/time/timer.c:2544) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_iowait (fs/xfs/xfs_buf.c:1691) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __xfs_buf_submit (fs/xfs/xfs_buf.c:1770) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_read_map (fs/xfs/xfs_buf.c:870) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_da_read_buf (fs/xfs/libxfs/xfs_da_btree.c:2676) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289 (discriminator 1)) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_da_read_buf (fs/xfs/libxfs/xfs_da_btree.c:2676) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_da_read_buf (fs/xfs/libxfs/xfs_da_btree.c:2676) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_attr3_leaf_read (fs/xfs/libxfs/xfs_attr_leaf.c:458) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_attr_is_leaf (fs/xfs/libxfs/xfs_attr.c:95) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_attr_leaf_hasname (fs/xfs/libxfs/xfs_attr.c:1206) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_attr_leaf_get (fs/xfs/libxfs/xfs_attr.c:1275) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_attr_get (fs/xfs/libxfs/xfs_attr.c:276) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_xattr_get (fs/xfs/xfs_xattr.c:143) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __vfs_getxattr (fs/xattr.c:423) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: cap_inode_need_killpriv (security/commoncap.c:303) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: security_inode_need_killpriv (security/security.c:2504 (discriminator 13)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: file_remove_privs_flags (fs/inode.c:2016 fs/inode.c:2046) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? xfs_break_dax_layouts (fs/xfs/xfs_file.c:884) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? tomoyo_init_request_info (security/tomoyo/util.c:1026) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? tomoyo_path_number_perm (security/tomoyo/file.c:719 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? aa_file_perm (security/apparmor/file.c:591) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: file_modified (fs/inode.c:2167 fs/inode.c:2196) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_file_fallocate (fs/xfs/xfs_file.c:999 (discriminator 1)) xfs Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ? apparmor_file_permission (security/apparmor/include/cred.h:24 security/apparmor/include/cred.h:49 security/apparmor/include/cred.h:76 security/apparmor/include/cred.h:109 security/apparmor/lsm.c:533 security/apparmor/lsm.c:546) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: vfs_fallocate (fs/open.c:330) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: ioctl_preallocate (fs/ioctl.c:293) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __x64_sys_ioctl (fs/ioctl.c:903 fs/ioctl.c:890 fs/ioctl.c:890) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RIP: 0033:0x7f4c8f5a2c5b Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RSP: 002b:00007ffd503d7980 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RAX: ffffffffffffffda RBX: 00000000000d9386 RCX: 00007f4c8f5a2c5b Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RDX: 00007ffd503d7a00 RSI: 000000004030582a RDI: 0000000000000003 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: RBP: 0000000000000003 R08: 000000000000000f R09: 00007ffd503d79ec Apr 16 23:06:11 base-xfs-nocrc-2k kernel: R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000000 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: R13: 0000000000049b5f R14: 8f5c28f5c28f5c29 R15: 0000558585e106d0 Apr 16 23:06:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:08:11 base-xfs-nocrc-2k kernel: INFO: task kcompactd0:72 blocked for more than 241 seconds. Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:08:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:08:11 base-xfs-nocrc-2k kernel: task:kcompactd0 state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00004000 Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:08:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:08:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function (mm/filemap.c:1091) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: migrate_pages_batch (./include/linux/pagemap.h:1048 mm/migrate.c:1486 mm/migrate.c:1700) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_compaction_free (mm/compaction.c:1907) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_compaction_alloc (mm/compaction.c:1855) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __mod_node_page_state (mm/vmstat.c:403 (discriminator 2)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: migrate_pages (mm/migrate.c:1849 mm/migrate.c:1953) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_compaction_alloc (mm/compaction.c:1855) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_compaction_free (mm/compaction.c:1907) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? folio_putback_lru (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:457 ./include/linux/atomic/atomic-instrumented.h:33 ./include/linux/page_ref.h:67 ./include/linux/mm.h:1134 ./include/linux/mm.h:1140 ./include/linux/mm.h:1505 mm/vmscan.c:819) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: compact_zone (mm/compaction.c:2663) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? update_load_avg (kernel/sched/fair.c:4411 kernel/sched/fair.c:4748) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: compact_node (mm/compaction.c:2925) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: kcompactd (./include/linux/mmzone.h:1501 mm/compaction.c:2257 mm/compaction.c:3222) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_autoremove_wake_function (kernel/sched/wait.c:383) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_kcompactd (mm/compaction.c:3168) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:08:11 base-xfs-nocrc-2k kernel: INFO: task kworker/u38:8:807213 blocked for more than 241 seconds. Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:08:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:08:11 base-xfs-nocrc-2k kernel: task:kworker/u38:8 state:D stack:0 pid:807213 tgid:807213 ppid:2 flags:0x00004000 Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Workqueue: writeback wb_workfn (flush-btrfs-2) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:08:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:08:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function (mm/filemap.c:1091) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: extent_write_cache_pages (fs/btrfs/extent_io.c:2130) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:974 (discriminator 1) kernel/sched/core.c:5852 (discriminator 1) kernel/sched/core.c:5849 (discriminator 1) kernel/sched/core.c:5877 (discriminator 1)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? stack_depot_save_flags (lib/stackdepot.c:609) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? put_cpu_partial (./arch/x86/include/asm/irqflags.h:134 (discriminator 1) mm/slub.c:2978 (discriminator 1)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: extent_writepages (fs/btrfs/extent_io.c:131 fs/btrfs/extent_io.c:2275) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: do_writepages (mm/page-writeback.c:2612) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? update_sd_lb_stats.constprop.0 (kernel/sched/fair.c:9896 (discriminator 2) kernel/sched/fair.c:10577 (discriminator 2)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: __writeback_single_inode (fs/fs-writeback.c:1659) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? _raw_spin_lock (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:134 (discriminator 4) kernel/locking/spinlock.c:154 (discriminator 4)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: writeback_sb_inodes (fs/fs-writeback.c:1943) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: __writeback_inodes_wb (fs/fs-writeback.c:2013) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: wb_writeback (fs/fs-writeback.c:2119) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: wb_workfn (fs/fs-writeback.c:2276 (discriminator 1) fs/fs-writeback.c:2304 (discriminator 1)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: process_one_work (kernel/workqueue.c:3254) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_worker_thread (kernel/workqueue.c:3362) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:08:11 base-xfs-nocrc-2k kernel: INFO: task kworker/u34:9:1268436 blocked for more than 241 seconds. Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:08:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:08:11 base-xfs-nocrc-2k kernel: task:kworker/u34:9 state:D stack:0 pid:1268436 tgid:1268436 ppid:2 flags:0x00004000 Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Workqueue: loop5 loop_rootcg_workfn [loop] Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:08:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:08:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function (mm/filemap.c:1091) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: btrfs_folio_start_writer_lock (./include/linux/pagemap.h:1048 fs/btrfs/subpage.c:394) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: lock_delalloc_pages (fs/btrfs/extent_io.c:281 (discriminator 1)) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:977 kernel/sched/core.c:5852 kernel/sched/core.c:5849 kernel/sched/core.c:5877) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? _raw_spin_unlock (./include/linux/spinlock_api_smp.h:143 (discriminator 3) kernel/locking/spinlock.c:186 (discriminator 3)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? btrfs_find_delalloc_range (fs/btrfs/extent-io-tree.c:1036) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: find_lock_delalloc_range (fs/btrfs/extent_io.c:377) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: writepage_delalloc (fs/btrfs/extent_io.c:1217 (discriminator 1)) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: extent_write_cache_pages (fs/btrfs/extent_io.c:1475 (discriminator 1) fs/btrfs/extent_io.c:2152 (discriminator 1)) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: extent_writepages (fs/btrfs/extent_io.c:131 fs/btrfs/extent_io.c:2275) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_end_bbio_data_write (fs/btrfs/extent_io.c:463) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: do_writepages (mm/page-writeback.c:2612) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pv_queued_spin_lock_slowpath (kernel/locking/qspinlock.c:565 (discriminator 38)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pv_queued_spin_lock_slowpath (kernel/locking/qspinlock.c:565 (discriminator 38)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: filemap_fdatawrite_wbc (mm/filemap.c:398 mm/filemap.c:387) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: __filemap_fdatawrite_range (mm/filemap.c:431) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: start_ordered_ops.constprop.0 (fs/btrfs/file.c:3873 fs/btrfs/file.c:1743) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: btrfs_sync_file (fs/btrfs/file.c:1818) btrfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: loop_process_work (drivers/block/loop.c:329 (discriminator 1) drivers/block/loop.c:475 (discriminator 1) drivers/block/loop.c:1907 (discriminator 1) drivers/block/loop.c:1942 (discriminator 1)) loop Apr 16 23:08:11 base-xfs-nocrc-2k kernel: process_one_work (kernel/workqueue.c:3254) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_worker_thread (kernel/workqueue.c:3362) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:08:11 base-xfs-nocrc-2k kernel: INFO: task xfsaild/loop5:1377891 blocked for more than 241 seconds. Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4 #4 Apr 16 23:08:11 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 16 23:08:11 base-xfs-nocrc-2k kernel: task:xfsaild/loop5 state:D stack:0 pid:1377891 tgid:1377891 ppid:2 flags:0x00004000 Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Call Trace: Apr 16 23:08:11 base-xfs-nocrc-2k kernel: <TASK> Apr 16 23:08:11 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_wbt_cleanup_cb (block/blk-wbt.c:575) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_wbt_inflight_cb (block/blk-wbt.c:569) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: rq_qos_wait (block/blk-rq-qos.c:284 (discriminator 4)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_rq_qos_wake_function (block/blk-rq-qos.c:208) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_wbt_inflight_cb (block/blk-wbt.c:569) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: wbt_wait (block/blk-wbt.c:660) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: __rq_qos_throttle (block/blk-rq-qos.c:66) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: blk_mq_submit_bio (block/blk-mq.c:2880 block/blk-mq.c:2984) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: submit_bio_noacct_nocheck (./include/linux/bio.h:639 block/blk-core.c:701 block/blk-core.c:729) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? submit_bio_noacct (block/blk-core.c:758 (discriminator 1)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: _xfs_buf_ioapply (fs/xfs/xfs_buf.c:1584 fs/xfs/xfs_buf.c:1671) xfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? xfs_dir_ino_validate (fs/xfs/libxfs/xfs_dir2.c:220 (discriminator 2)) xfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:977 kernel/sched/core.c:5852 kernel/sched/core.c:5849 kernel/sched/core.c:5877) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: __xfs_buf_submit (./arch/x86/include/asm/atomic.h:67 ./include/linux/atomic/atomic-arch-fallback.h:2278 ./include/linux/atomic/atomic-instrumented.h:1384 fs/xfs/xfs_buf.c:1762) xfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: xfs_buf_delwri_submit_buffers (fs/xfs/xfs_buf.c:2280 (discriminator 2)) xfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: xfsaild (fs/xfs/xfs_trans_ail.c:560 (discriminator 1) fs/xfs/xfs_trans_ail.c:671 (discriminator 1)) xfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4)) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_xfsaild (fs/xfs/xfs_trans_ail.c:597) xfs Apr 16 23:08:11 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 16 23:08:11 base-xfs-nocrc-2k kernel: </TASK> Apr 16 23:08:11 base-xfs-nocrc-2k kernel: Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings Luis
On Thu, Apr 18, 2024 at 02:39:19AM +0100, Matthew Wilcox wrote: > > Running compaction while we run fsstress can crash older kernels as per > > korg#218227 [0], the fix for that [0] has been posted [1] but that patch > > is not yet on v6.9-rc4 and the patch requires changes for v6.9. > > It doesn't require changes, it just has prerequisites: > > https://lore.kernel.org/all/ZgHhcojXc9QjynUI@casper.infradead.org/ How can we expedite getting this fix in?
On 4/18/24 08:57, Christoph Hellwig wrote: > On Thu, Apr 18, 2024 at 02:39:19AM +0100, Matthew Wilcox wrote: >>> Running compaction while we run fsstress can crash older kernels as per >>> korg#218227 [0], the fix for that [0] has been posted [1] but that patch >>> is not yet on v6.9-rc4 and the patch requires changes for v6.9. >> >> It doesn't require changes, it just has prerequisites: >> >> https://lore.kernel.org/all/ZgHhcojXc9QjynUI@casper.infradead.org/ > > How can we expedite getting this fix in? It means moving patches 2,4,5 from the above (with their fixups) from mm-unstable to mm-hotfixes-unstable so they are on track to mainline before 6.9. A quick local rebase of mm-unstable with reordering said patches to the front (v6.9-rc4) suggests this is possible without causing conflicts, and the intermediate result compiles, at least.
On Wed, Apr 17, 2024 at 11:42:28PM -0700, Luis Chamberlain wrote: > > > Today I find that v6.9-rc4 is also hitting an unrecoverable hung task > > > between compaction and fsstress while running generic/476 on the > > > following kdevops test sections [2]: > > > > > > * xfs_nocrc > > > * xfs_nocrc_2k > > > * xfs_nocrc_4k > > > > > > Analyzing the trace I see the guest uses loopback block devices for the > > > fstests TEST_DEV, the loopback file uses sparsefiles on a btrfs > > > partition. The contention based on traces [3] [4] seems to be that we > > > have somehow have fsstress + compaction race on folio_wait_bit_common(). > > > > What do you mean by "race"? Here's what I see: > > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: migrate_pages_batch (./include/linux/pagemap.h:1048 mm/migrate.c:1486 mm/migrate.c:1700) That's folio_lock(). > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task kworker/u38:8:807213 blocked for more than 120 seconds. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: extent_write_cache_pages (fs/btrfs/extent_io.c:2130) btrfs folio_lock(). > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task kworker/u34:9:1268436 blocked for more than 120 seconds. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: btrfs_folio_start_writer_lock (./include/linux/pagemap.h:1048 fs/btrfs/subpage.c:394) btrfs folio_lock(). > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task xfsaild/loop5:1377891 blocked for more than 120 seconds. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: rq_qos_wait (block/blk-rq-qos.c:284 (discriminator 4)) I'm not familiar with this code. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: wbt_wait (block/blk-wbt.c:660) > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __rq_qos_throttle (block/blk-rq-qos.c:66) > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: blk_mq_submit_bio (block/blk-mq.c:2880 block/blk-mq.c:2984) > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: submit_bio_noacct_nocheck (./include/linux/bio.h:639 block/blk-core.c:701 block/blk-core.c:729) > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: _xfs_buf_ioapply (fs/xfs/xfs_buf.c:1584 fs/xfs/xfs_buf.c:1671) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __xfs_buf_submit (./arch/x86/include/asm/atomic.h:67 ./include/linux/atomic/atomic-arch-fallback.h:2278 ./include/linux/atomic/atomic-instrumented.h:1384 fs/xfs/xfs_buf.c:1762) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_delwri_submit_buffers (fs/xfs/xfs_buf.c:2280 (discriminator 2)) xfs ... but it's submitting a write. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfsaild (fs/xfs/xfs_trans_ail.c:560 (discriminator 1) fs/xfs/xfs_trans_ail.c:671 (discriminator 1)) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task fsstress:1377894 blocked for more than 120 seconds. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_iowait (fs/xfs/xfs_buf.c:1691) xfs Waiting for an I/O completion > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __xfs_buf_submit (fs/xfs/xfs_buf.c:1770) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_read_map (fs/xfs/xfs_buf.c:870) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289 (discriminator 1)) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_btree_read_buf_block (./fs/xfs/xfs_trans.h:210 fs/xfs/libxfs/xfs_btree.c:1432) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_btree_lookup_get_block (fs/xfs/libxfs/xfs_btree.c:1934) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_btree_lookup (fs/xfs/libxfs/xfs_btree.c:2045) xfs but a read, not a write. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: INFO: task fsstress:1377895 blocked for more than 120 seconds. > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_iowait (fs/xfs/xfs_buf.c:1691) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: __xfs_buf_submit (fs/xfs/xfs_buf.c:1770) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_buf_read_map (fs/xfs/xfs_buf.c:870) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289 (discriminator 1)) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_da_read_buf (fs/xfs/libxfs/xfs_da_btree.c:2676) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_attr3_leaf_read (fs/xfs/libxfs/xfs_attr_leaf.c:458) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_attr_leaf_hasname (fs/xfs/libxfs/xfs_attr.c:1206) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_attr_leaf_get (fs/xfs/libxfs/xfs_attr.c:1275) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_attr_get (fs/xfs/libxfs/xfs_attr.c:276) xfs > Apr 16 23:06:11 base-xfs-nocrc-2k kernel: xfs_xattr_get (fs/xfs/xfs_xattr.c:143) xfs A different read. I'm not seeing a _race_ here. I'm seeing tasks _stuck_, but on what? A missing wakeup?
On Thu, 18 Apr 2024 11:19:33 +0200 Vlastimil Babka <vbabka@suse.cz> wrote: > On 4/18/24 08:57, Christoph Hellwig wrote: > > On Thu, Apr 18, 2024 at 02:39:19AM +0100, Matthew Wilcox wrote: > >>> Running compaction while we run fsstress can crash older kernels as per > >>> korg#218227 [0], the fix for that [0] has been posted [1] but that patch > >>> is not yet on v6.9-rc4 and the patch requires changes for v6.9. > >> > >> It doesn't require changes, it just has prerequisites: > >> > >> https://lore.kernel.org/all/ZgHhcojXc9QjynUI@casper.infradead.org/ > > > > How can we expedite getting this fix in? > > It means moving patches 2,4,5 from the above (with their fixups) from > mm-unstable to mm-hotfixes-unstable so they are on track to mainline > before 6.9. > > A quick local rebase of mm-unstable with reordering said patches to the > front (v6.9-rc4) suggests this is possible without causing conflicts, > and the intermediate result compiles, at least. Thanks for the reminder, this fell through cracks. It indeed appears that I can move mm-create-folio_flag_false-and-folio_type_ops-macros.patch mm-support-page_mapcount-on-page_has_type-pages.patch mm-turn-folio_test_hugetlb-into-a-pagetype.patch mm-turn-folio_test_hugetlb-into-a-pagetype-fix.patch without merge or build issues. I added Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") Cc: <stable@vger.kernel.org> to all patches. But a question: 9c5ccf2db04b is from August 2023. Why are we seeing this issue now?
On Thu, Apr 18, 2024 at 11:45:52AM -0700, Andrew Morton wrote: > It indeed appears that I can move > > mm-create-folio_flag_false-and-folio_type_ops-macros.patch > mm-support-page_mapcount-on-page_has_type-pages.patch > mm-turn-folio_test_hugetlb-into-a-pagetype.patch > mm-turn-folio_test_hugetlb-into-a-pagetype-fix.patch > > without merge or build issues. I added > > Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") > Cc: <stable@vger.kernel.org> > > to all patches. > > But a question: 9c5ccf2db04b is from August 2023. Why are we seeing > this issue now? We saw it earlier, we just didn't know how to fix it. eg December 2023: https://lore.kernel.org/all/ZXNhGsX32y19a2Xv@casper.infradead.org/ I think there were earlier reports, but I'm not finding them now.
On 4/18/24 9:01 PM, Matthew Wilcox wrote: > On Thu, Apr 18, 2024 at 11:45:52AM -0700, Andrew Morton wrote: >> It indeed appears that I can move >> >> mm-create-folio_flag_false-and-folio_type_ops-macros.patch >> mm-support-page_mapcount-on-page_has_type-pages.patch >> mm-turn-folio_test_hugetlb-into-a-pagetype.patch >> mm-turn-folio_test_hugetlb-into-a-pagetype-fix.patch >> >> without merge or build issues. I added >> >> Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") >> Cc: <stable@vger.kernel.org> >> >> to all patches. >> >> But a question: 9c5ccf2db04b is from August 2023. Why are we seeing >> this issue now? > > We saw it earlier, we just didn't know how to fix it. > eg December 2023: > https://lore.kernel.org/all/ZXNhGsX32y19a2Xv@casper.infradead.org/ > > I think there were earlier reports, but I'm not finding them now. It's a race and needs CONFIG_DEBUG_VM to be visible, which probably makes it rare. What seems to be new are the stress tests that have more "luck" at hitting it? Or maybe also the changes towards more high-order folio usage make it more likely.
On Fri, Apr 19, 2024 at 09:51:35AM +0200, Vlastimil Babka wrote: > On 4/18/24 9:01 PM, Matthew Wilcox wrote: > > On Thu, Apr 18, 2024 at 11:45:52AM -0700, Andrew Morton wrote: > >> It indeed appears that I can move > >> > >> mm-create-folio_flag_false-and-folio_type_ops-macros.patch > >> mm-support-page_mapcount-on-page_has_type-pages.patch > >> mm-turn-folio_test_hugetlb-into-a-pagetype.patch > >> mm-turn-folio_test_hugetlb-into-a-pagetype-fix.patch > >> > >> without merge or build issues. I added > >> > >> Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR") > >> Cc: <stable@vger.kernel.org> > >> > >> to all patches. > >> > >> But a question: 9c5ccf2db04b is from August 2023. Why are we seeing > >> this issue now? > > > > We saw it earlier, we just didn't know how to fix it. > > eg December 2023: > > https://lore.kernel.org/all/ZXNhGsX32y19a2Xv@casper.infradead.org/ > > > > I think there were earlier reports, but I'm not finding them now. > > It's a race and needs CONFIG_DEBUG_VM to be visible, which probably makes it > rare. Sure, but what would be the impact in practice if life chugs on with the assumtions being incorrect? > What seems to be new are the stress tests that have more "luck" at > hitting it? The fsstress tests are not new, the test in this patch just adds a background forceful compaction to try to make this easier to reproduce. Although it is hard to reproduce at least we now have a 100% confidence way to reproduce new issues as described below. Also I hope that the automation with kdevops allows anyone to easily ramp up and reproduce this rather fast too as described in the patch first email. > Or maybe also the changes towards more high-order folio usage > make it more likely. I've applied these 3 patches to v6.9-rc4: mm: Create FOLIO_FLAG_FALSE and FOLIO_TYPE_OPS macros mm: Support page_mapcount() on page_has_type() pages mm: Turn folio_test_hugetlb into a PageType The test in this patch still manages to trigger an unrecoverable situation, and the trace is below. Note I'm using a SOAK_DURATION=9900 so about 2.75 hours on this test, the timestamps on the first task blocking shows it took about 1 hour 11 minutes for this to trigger. write_cache_pages() seems to linger with compaction. Luis Apr 18 16:37:28 base-xfs-nocrc-2k unknown: run fstests generic/744 at 2024-04-18 16:37:28 Apr 18 16:37:29 base-xfs-nocrc-2k kernel: XFS (loop5): Mounting V4 Filesystem 2c71ddbc-bc0d-47d2-ad67-8089bc7434cf Apr 18 16:37:29 base-xfs-nocrc-2k kernel: XFS (loop5): Ending clean mount Apr 18 16:37:29 base-xfs-nocrc-2k kernel: xfs filesystem being mounted at /media/scratch supports timestamps until 2038-01-19 (0x7fffffff) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task kworker/u37:7:44996 blocked for more than 120 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:kworker/u37:7 state:D stack:0 pid:44996 tgid:44996 ppid:2 flags:0x00004000 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Workqueue: writeback wb_workfn (flush-7:5) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule_timeout (kernel/time/timer.c:2559) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? check_preempt_wakeup_fair (kernel/sched/fair.c:8349 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __wait_for_common (kernel/sched/completion.c:95 kernel/sched/completion.c:116) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_schedule_timeout (kernel/time/timer.c:2544) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __flush_workqueue (kernel/workqueue.c:4002 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xlog_cil_push_now.isra.0 (fs/xfs/xfs_log_cil.c:1477 (discriminator 29)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xlog_cil_force_seq (./include/linux/spinlock.h:351 fs/xfs/xfs_log_cil.c:1687) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __xfs_btree_check_block (fs/xfs/libxfs/xfs_btree.c:244 fs/xfs/libxfs/xfs_btree.c:272) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_log_force (./include/linux/spinlock.h:351 fs/xfs/xfs_log.c:3194) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_extent_busy_flush (fs/xfs/xfs_extent_busy.c:615) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_autoremove_wake_function (kernel/sched/wait.c:383) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_alloc_ag_vextent_size (fs/xfs/libxfs/xfs_alloc.c:1879) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_alloc_vextent_iterate_ags.constprop.0 (fs/xfs/libxfs/xfs_alloc.c:3700) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_alloc_vextent_start_ag (fs/xfs/libxfs/xfs_alloc.c:3768) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_bmap_btalloc (fs/xfs/libxfs/xfs_bmap.c:3767 fs/xfs/libxfs/xfs_bmap.c:3808) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_bmapi_allocate (fs/xfs/libxfs/xfs_bmap.c:4184 fs/xfs/libxfs/xfs_bmap.c:4227) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_bmapi_convert_delalloc (fs/xfs/libxfs/xfs_bmap.c:4692) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_map_blocks (fs/xfs/xfs_aops.c:268 fs/xfs/xfs_aops.c:390) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: iomap_do_writepage (fs/iomap/buffered-io.c:1777 fs/iomap/buffered-io.c:1925 fs/iomap/buffered-io.c:1964) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_iomap_do_writepage (fs/iomap/buffered-io.c:1963) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: write_cache_pages (mm/page-writeback.c:2569) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: iomap_writepages (fs/iomap/buffered-io.c:1984) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_vm_writepages (fs/xfs/xfs_aops.c:508) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_writepages (mm/page-writeback.c:2612) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? wakeup_preempt (kernel/sched/core.c:2240) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? ttwu_do_activate (kernel/sched/core.c:3768 (discriminator 2) kernel/sched/core.c:3796 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? _raw_spin_unlock_irqrestore (./include/linux/spinlock_api_smp.h:152 (discriminator 3) kernel/locking/spinlock.c:194 (discriminator 3)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? try_to_wake_up (./include/linux/preempt.h:480 (discriminator 3) ./include/linux/preempt.h:480 (discriminator 3) kernel/sched/core.c:4233 (discriminator 3)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __writeback_single_inode (fs/fs-writeback.c:1659) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? _raw_spin_lock (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:134 (discriminator 4) kernel/locking/spinlock.c:154 (discriminator 4)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: writeback_sb_inodes (fs/fs-writeback.c:1943) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: wb_writeback (fs/fs-writeback.c:2117) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: wb_workfn (fs/fs-writeback.c:2265 fs/fs-writeback.c:2304) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __schedule (kernel/sched/core.c:6752) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: process_one_work (kernel/workqueue.c:3254) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_worker_thread (kernel/workqueue.c:3362) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task xfsaild/loop5:46115 blocked for more than 120 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:xfsaild/loop5 state:D stack:0 pid:46115 tgid:46115 ppid:2 flags:0x00004000 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_wbt_cleanup_cb (block/blk-wbt.c:575) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_wbt_inflight_cb (block/blk-wbt.c:569) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: rq_qos_wait (block/blk-rq-qos.c:284 (discriminator 4)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_rq_qos_wake_function (block/blk-rq-qos.c:208) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_wbt_inflight_cb (block/blk-wbt.c:569) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: wbt_wait (block/blk-wbt.c:660) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __rq_qos_throttle (block/blk-rq-qos.c:66) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: blk_mq_submit_bio (block/blk-mq.c:2880 block/blk-mq.c:2984) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: submit_bio_noacct_nocheck (./include/linux/bio.h:639 block/blk-core.c:701 block/blk-core.c:729) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? submit_bio_noacct (block/blk-core.c:758 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: _xfs_buf_ioapply (fs/xfs/xfs_buf.c:1584 fs/xfs/xfs_buf.c:1671) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:977 kernel/sched/core.c:5852 kernel/sched/core.c:5849 kernel/sched/core.c:5877) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __xfs_buf_submit (./arch/x86/include/asm/atomic.h:67 ./include/linux/atomic/atomic-arch-fallback.h:2278 ./include/linux/atomic/atomic-instrumented.h:1384 fs/xfs/xfs_buf.c:1762) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_buf_delwri_submit_buffers (fs/xfs/xfs_buf.c:2280 (discriminator 2)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfsaild (fs/xfs/xfs_trans_ail.c:560 (discriminator 1) fs/xfs/xfs_trans_ail.c:671 (discriminator 1)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_xfsaild (fs/xfs/xfs_trans_ail.c:597) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task 744:46117 blocked for more than 120 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:744 state:D stack:0 pid:46117 tgid:46117 ppid:45928 flags:0x00004002 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function (mm/filemap.c:1091) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: migrate_pages_batch (./include/linux/pagemap.h:1048 mm/migrate.c:1486 mm/migrate.c:1700) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_compaction_free (mm/compaction.c:1907) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_compaction_alloc (mm/compaction.c:1855) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __mod_node_page_state (mm/vmstat.c:403 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: migrate_pages (mm/migrate.c:1849 mm/migrate.c:1953) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_compaction_alloc (mm/compaction.c:1855) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_compaction_free (mm/compaction.c:1907) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __mod_node_page_state (mm/vmstat.c:403 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? folio_putback_lru (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:457 ./include/linux/atomic/atomic-instrumented.h:33 ./include/linux/page_ref.h:67 ./include/linux/mm.h:1134 ./include/linux/mm.h:1140 ./include/linux/mm.h:1507 mm/vmscan.c:819) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: compact_zone (mm/compaction.c:2663) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: compact_node (mm/compaction.c:2925) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: sysctl_compaction_handler (mm/compaction.c:2946 mm/compaction.c:2996 mm/compaction.c:2983) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: proc_sys_call_handler (fs/proc/proc_sysctl.c:595) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: vfs_write (./include/linux/fs.h:2110 fs/read_write.c:497 fs/read_write.c:590) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ksys_write (fs/read_write.c:643) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RIP: 0033:0x7fd5b7abe240 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RSP: 002b:00007fff12f5c1d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fd5b7abe240 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RDX: 0000000000000002 RSI: 0000556d0aacf160 RDI: 0000000000000001 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RBP: 0000556d0aacf160 R08: 0000000000000007 R09: 0000000000000073 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R13: 00007fd5b7b99760 R14: 0000000000000002 R15: 00007fd5b7b949e0 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task fsstress:46119 blocked for more than 120 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:fsstress state:D stack:0 pid:46119 tgid:46119 ppid:46118 flags:0x00000002 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? path_openat (fs/namei.c:3497 fs/namei.c:3566 fs/namei.c:3796) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule_timeout (kernel/time/timer.c:2559) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_free_object_rcu (mm/kmemleak.c:508) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __call_rcu_common.constprop.0 (./arch/x86/include/asm/bitops.h:239 ./include/asm-generic/bitops/instrumented-non-atomic.h:142 ./include/linux/cpumask.h:505 ./include/linux/cpumask.h:1120 kernel/rcu/tree.c:2628 kernel/rcu/tree.c:2755) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __wait_for_common (kernel/sched/completion.c:95 kernel/sched/completion.c:116) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_schedule_timeout (kernel/time/timer.c:2544) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __flush_work.isra.0 (kernel/workqueue.c:4211) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_wq_barrier_func (kernel/workqueue.c:3742) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_flush_inodes (fs/xfs/xfs_super.c:635 (discriminator 1)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_create (fs/xfs/xfs_inode.c:1067) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? _raw_spin_unlock_irqrestore (./include/linux/spinlock_api_smp.h:152 (discriminator 3) kernel/locking/spinlock.c:194 (discriminator 3)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __wake_up (kernel/sched/wait.c:110 kernel/sched/wait.c:127) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? get_cached_acl (fs/posix_acl.c:62) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_generic_create (fs/xfs/xfs_iops.c:199 (discriminator 1)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: path_openat (fs/namei.c:3497 fs/namei.c:3566 fs/namei.c:3796) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_filp_open (fs/namei.c:3826) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:977 kernel/sched/core.c:5852 kernel/sched/core.c:5849 kernel/sched/core.c:5877) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __check_object_size (mm/usercopy.c:196 mm/usercopy.c:251 mm/usercopy.c:213) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_sys_openat2 (fs/open.c:1406) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __x64_sys_creat (fs/open.c:1491) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RIP: 0033:0x7fba99e17a60 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RSP: 002b:00007ffff8aaf2f8 EFLAGS: 00000202 ORIG_RAX: 0000000000000055 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RAX: ffffffffffffffda RBX: 00000000000b0041 RCX: 00007fba99e17a60 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RDX: 0000000000000000 RSI: 00000000000001b6 RDI: 000055602f262c70 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RBP: 00007ffff8aaf460 R08: 00007fba99ef1c60 R09: 0000000000000078 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R10: 00007ffff8aaef74 R11: 0000000000000202 R12: 00000000000001b6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R13: 0000000000000000 R14: 00007ffff8aaf460 R15: 0000000000000001 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task fsstress:46120 blocked for more than 121 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:fsstress state:D stack:0 pid:46120 tgid:46120 ppid:46118 flags:0x00000002 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule_timeout (kernel/time/timer.c:2559) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? wakeup_preempt (kernel/sched/core.c:2240) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __wait_for_common (kernel/sched/completion.c:95 kernel/sched/completion.c:116) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_schedule_timeout (kernel/time/timer.c:2544) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __flush_work.isra.0 (kernel/workqueue.c:4211) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_wq_barrier_func (kernel/workqueue.c:3742) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_create (fs/xfs/xfs_inode.c:1067) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __get_acl.part.0 (fs/posix_acl.c:159) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_generic_create (fs/xfs/xfs_iops.c:199 (discriminator 1)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: path_openat (fs/namei.c:3497 fs/namei.c:3566 fs/namei.c:3796) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_filp_open (fs/namei.c:3826) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:977 kernel/sched/core.c:5852 kernel/sched/core.c:5849 kernel/sched/core.c:5877) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __check_object_size (mm/usercopy.c:196 mm/usercopy.c:251 mm/usercopy.c:213) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_sys_openat2 (fs/open.c:1406) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __x64_sys_creat (fs/open.c:1491) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RIP: 0033:0x7fba99e17a60 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RSP: 002b:00007ffff8aaf2f8 EFLAGS: 00000202 ORIG_RAX: 0000000000000055 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RAX: ffffffffffffffda RBX: 00000000000aff0e RCX: 00007fba99e17a60 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RDX: 0000000000000000 RSI: 00000000000001b6 RDI: 000055602f2872a0 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RBP: 00007ffff8aaf460 R08: 00007fba99ef1c60 R09: 000055602f287550 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R10: 5a4c7c6f94523808 R11: 0000000000000202 R12: 00000000000001b6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R13: 0000000000000000 R14: 00007ffff8aaf460 R15: 0000000000000001 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task fsstress:46121 blocked for more than 121 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:fsstress state:D stack:0 pid:46121 tgid:46121 ppid:46118 flags:0x00000002 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? path_openat (fs/namei.c:3497 fs/namei.c:3566 fs/namei.c:3796) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule_timeout (kernel/time/timer.c:2559) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_free_object_rcu (mm/kmemleak.c:508) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __call_rcu_common.constprop.0 (./arch/x86/include/asm/bitops.h:239 ./include/asm-generic/bitops/instrumented-non-atomic.h:142 ./include/linux/cpumask.h:505 ./include/linux/cpumask.h:1120 kernel/rcu/tree.c:2628 kernel/rcu/tree.c:2755) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __wait_for_common (kernel/sched/completion.c:95 kernel/sched/completion.c:116) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_schedule_timeout (kernel/time/timer.c:2544) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __flush_work.isra.0 (kernel/workqueue.c:4211) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_wq_barrier_func (kernel/workqueue.c:3742) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_flush_inodes (fs/xfs/xfs_super.c:635 (discriminator 1)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_create (fs/xfs/xfs_inode.c:1067) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? _raw_spin_unlock_irqrestore (./include/linux/spinlock_api_smp.h:152 (discriminator 3) kernel/locking/spinlock.c:194 (discriminator 3)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __wake_up (kernel/sched/wait.c:110 kernel/sched/wait.c:127) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? get_cached_acl (fs/posix_acl.c:62) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_generic_create (fs/xfs/xfs_iops.c:199 (discriminator 1)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: path_openat (fs/namei.c:3497 fs/namei.c:3566 fs/namei.c:3796) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_filp_open (fs/namei.c:3826) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:977 kernel/sched/core.c:5852 kernel/sched/core.c:5849 kernel/sched/core.c:5877) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __check_object_size (mm/usercopy.c:196 mm/usercopy.c:251 mm/usercopy.c:213) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_sys_openat2 (fs/open.c:1406) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __x64_sys_creat (fs/open.c:1491) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RIP: 0033:0x7fba99e17a60 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RSP: 002b:00007ffff8aaf2f8 EFLAGS: 00000202 ORIG_RAX: 0000000000000055 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RAX: ffffffffffffffda RBX: 00000000000b698d RCX: 00007fba99e17a60 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RDX: 0000000000000000 RSI: 00000000000001b6 RDI: 000055602f304f60 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RBP: 00007ffff8aaf460 R08: 00007fba99ef1c60 R09: 0000000000000078 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R10: 00007ffff8aaef73 R11: 0000000000000202 R12: 00000000000001b6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R13: 0000000000000000 R14: 00007ffff8aaf460 R15: 0000000000000001 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task fsstress:46122 blocked for more than 121 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:fsstress state:D stack:0 pid:46122 tgid:46122 ppid:46118 flags:0x00004002 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule_timeout (kernel/time/timer.c:2559) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? _raw_spin_unlock_irqrestore (./include/linux/spinlock_api_smp.h:152 (discriminator 3) kernel/locking/spinlock.c:194 (discriminator 3)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? try_to_wake_up (./include/linux/preempt.h:480 (discriminator 3) ./include/linux/preempt.h:480 (discriminator 3) kernel/sched/core.c:4233 (discriminator 3)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __wait_for_common (kernel/sched/completion.c:95 kernel/sched/completion.c:116) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_schedule_timeout (kernel/time/timer.c:2544) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __flush_workqueue (kernel/workqueue.c:4002 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xlog_cil_push_now.isra.0 (fs/xfs/xfs_log_cil.c:1477 (discriminator 29)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xlog_cil_force_seq (./include/linux/spinlock.h:351 fs/xfs/xfs_log_cil.c:1687) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:974 (discriminator 1) kernel/sched/core.c:5852 (discriminator 1) kernel/sched/core.c:5849 (discriminator 1) kernel/sched/core.c:5877 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_log_force (./include/linux/spinlock.h:351 fs/xfs/xfs_log.c:3194) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_fs_sync_fs (fs/xfs/xfs_super.c:782) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_sync_fs_one_sb (fs/sync.c:81) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: iterate_supers (fs/super.c:922) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ksys_sync (fs/sync.c:105) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __do_sys_sync (fs/sync.c:115) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RIP: 0033:0x7fba99e1db97 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RSP: 002b:00007ffff8aaf498 EFLAGS: 00000202 ORIG_RAX: 00000000000000a2 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RAX: ffffffffffffffda RBX: 00000000000b1de4 RCX: 00007fba99e1db97 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RDX: 000000005764828b RSI: 000000005764828b RDI: 00000000000b1de4 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: RBP: 000000000000c350 R08: 0000000000000073 R09: 0000000000000000 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R10: 00007ffff8aaf4e0 R11: 0000000000000202 R12: 0000555ff6a925a0 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: R13: 028f5c28f5c28f5c R14: 8f5c28f5c28f5c29 R15: 0000555ff6a7f540 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task kworker/u38:1:46372 blocked for more than 121 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:kworker/u38:1 state:D stack:0 pid:46372 tgid:46372 ppid:2 flags:0x00004000 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Workqueue: xfs-cil/loop5 xlog_cil_push_work [xfs] Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xlog_wait_on_iclog (fs/xfs/xfs_log_priv.h:623 fs/xfs/xfs_log.c:892) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_default_wake_function (kernel/sched/core.c:7078) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xlog_cil_push_work (./include/linux/spinlock.h:351 fs/xfs/xfs_log_cil.c:1312) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: process_one_work (kernel/workqueue.c:3254) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_worker_thread (kernel/workqueue.c:3362) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task kworker/u37:5:47276 blocked for more than 121 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:kworker/u37:5 state:D stack:0 pid:47276 tgid:47276 ppid:2 flags:0x00004000 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Workqueue: loop5 loop_rootcg_workfn [loop] Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: io_schedule (kernel/sched/core.c:9019 (discriminator 1) kernel/sched/core.c:9045 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: folio_wait_bit_common (mm/filemap.c:1275 (discriminator 4)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_wake_page_function (mm/filemap.c:1091) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: extent_write_cache_pages (fs/btrfs/extent_io.c:2130) btrfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: extent_writepages (fs/btrfs/extent_io.c:131 fs/btrfs/extent_io.c:2275) btrfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_end_bbio_data_write (fs/btrfs/extent_io.c:463) btrfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: do_writepages (mm/page-writeback.c:2612) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? process_one_work (kernel/workqueue.c:3254) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? kthread (kernel/kthread.c:388) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? ret_from_fork (arch/x86/kernel/process.c:147) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? preempt_count_add (./include/linux/ftrace.h:977 kernel/sched/core.c:5852 kernel/sched/core.c:5849 kernel/sched/core.c:5877) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __link_object (./include/linux/rculist.h:79 (discriminator 1) ./include/linux/rculist.h:128 (discriminator 1) mm/kmemleak.c:733 (discriminator 1)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: filemap_fdatawrite_wbc (mm/filemap.c:398 mm/filemap.c:387) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __filemap_fdatawrite_range (mm/filemap.c:431) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: start_ordered_ops.constprop.0 (fs/btrfs/file.c:3873 fs/btrfs/file.c:1743) btrfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: btrfs_sync_file (fs/btrfs/file.c:1818) btrfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: loop_process_work (drivers/block/loop.c:329 (discriminator 1) drivers/block/loop.c:475 (discriminator 1) drivers/block/loop.c:1907 (discriminator 1) drivers/block/loop.c:1942 (discriminator 1)) loop Apr 18 17:48:27 base-xfs-nocrc-2k kernel: process_one_work (kernel/workqueue.c:3254) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_worker_thread (kernel/workqueue.c:3362) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: INFO: task kworker/6:2:48556 blocked for more than 121 seconds. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Not tainted 6.9.0-rc4+ #6 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 18 17:48:27 base-xfs-nocrc-2k kernel: task:kworker/6:2 state:D stack:0 pid:48556 tgid:48556 ppid:2 flags:0x00004000 Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Workqueue: xfs-conv/loop5 xfs_end_io [xfs] Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Call Trace: Apr 18 17:48:27 base-xfs-nocrc-2k kernel: <TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: __schedule (kernel/sched/core.c:5409 kernel/sched/core.c:6746) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule (kernel/sched/core.c:6824 kernel/sched/core.c:6838) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: schedule_preempt_disabled (kernel/sched/core.c:6896) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: rwsem_down_write_slowpath (kernel/locking/rwsem.c:1180 (discriminator 4)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? xfs_trans_alloc (fs/xfs/xfs_trans.c:271 (discriminator 1)) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: down_write (kernel/locking/rwsem.c:1306 kernel/locking/rwsem.c:1315 kernel/locking/rwsem.c:1580) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_trans_alloc_inode (fs/xfs/xfs_trans.c:1210) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_iomap_write_unwritten (fs/xfs/xfs_iomap.c:606) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? iomap_finish_ioends (fs/iomap/buffered-io.c:1558) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_end_ioend (fs/xfs/xfs_aops.c:131) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: xfs_end_io (./include/linux/sched.h:1988 fs/xfs/xfs_aops.c:174) xfs Apr 18 17:48:27 base-xfs-nocrc-2k kernel: process_one_work (kernel/workqueue.c:3254) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: worker_thread (kernel/workqueue.c:3329 (discriminator 2) kernel/workqueue.c:3416 (discriminator 2)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:115 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4)) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_worker_thread (kernel/workqueue.c:3362) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: kthread (kernel/kthread.c:388) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork (arch/x86/kernel/process.c:147) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ? __pfx_kthread (kernel/kthread.c:341) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Apr 18 17:48:27 base-xfs-nocrc-2k kernel: </TASK> Apr 18 17:48:27 base-xfs-nocrc-2k kernel: Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
On Wed, Apr 17, 2024 at 05:13:56PM -0700, Luis Chamberlain wrote: > Running compaction while we run fsstress can crash older kernels as per > korg#218227 [0], the fix for that [0] has been posted [1] but that patch > is not yet on v6.9-rc4 and the patch requires changes for v6.9. > > Today I find that v6.9-rc4 is also hitting an unrecoverable hung task > between compaction and fsstress while running generic/476 on the > following kdevops test sections [2]: > > * xfs_nocrc > * xfs_nocrc_2k > * xfs_nocrc_4k > > Analyzing the trace I see the guest uses loopback block devices for the > fstests TEST_DEV, the loopback file uses sparsefiles on a btrfs > partition. The contention based on traces [3] [4] seems to be that we > have somehow have fsstress + compaction race on folio_wait_bit_common(). > > We have this happening: > > a) kthread compaction --> migrate_pages_batch() > --> folio_wait_bit_common() > b) workqueue on btrfs writeback wb_workfn --> extent_write_cache_pages() > --> folio_wait_bit_common() > c) workqueue on loopback loop_rootcg_workfn() --> filemap_fdatawrite_wbc() > --> folio_wait_bit_common() > d) kthread xfsaild --> blk_mq_submit_bio() --> wbt_wait() > > I tried to reproduce but couldn't easily do so, so I wrote this test > to help, and with this I have 100% failure rate so far out of 2 runs. > > Given we also have korg#218227 and that patch likely needing > backporting, folks will want a reproducer for this issue. This should > hopefully help with that case and this new separate issue. > > To reproduce with kdevops just: > > make defconfig-xfs_nocrc_2k -j $(nproc) > make -j $(nproc) > make fstests > make linux > make fstests-baseline TESTS=generic/733 > tail -f guestfs/*-xfs-nocrc-2k/console.log > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=218227 > [1] https://lore.kernel.org/all/7ee2bb8c-441a-418b-ba3a-d305f69d31c8@suse.cz/T/#u > [2] https://github.com/linux-kdevops/kdevops/blob/main/playbooks/roles/fstests/templates/xfs/xfs.config > [3] https://gist.github.com/mcgrof/4dfa3264f513ce6ca398414326cfab84 > [4] https://gist.github.com/mcgrof/f40a9f31a43793dac928ce287cfacfeb > > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> > --- > > Note: kdevops uses its own fork of fstests which has this merged > already, so the above should just work. If it's your first time using > kdevops be sure to just read the README for the first time users: > > https://github.com/linux-kdevops/kdevops/blob/main/docs/kdevops-first-run.md > > common/rc | 7 ++++++ > tests/generic/744 | 56 +++++++++++++++++++++++++++++++++++++++++++ > tests/generic/744.out | 2 ++ > 3 files changed, 65 insertions(+) > create mode 100755 tests/generic/744 > create mode 100644 tests/generic/744.out > > diff --git a/common/rc b/common/rc > index b7b77ac1b46d..d4432f5ce259 100644 > --- a/common/rc > +++ b/common/rc > @@ -120,6 +120,13 @@ _require_hugepages() > _notrun "Kernel does not report huge page size" > } > > +# Requires CONFIG_COMPACTION > +_require_compaction() I'm not sure if we should name it as "_require_vm_compaction", does linux have other "compaction" or only memory compaction? > +{ > + if [ ! -f /proc/sys/vm/compact_memory ]; then > + _notrun "Need compaction enabled CONFIG_COMPACTION=y" > + fi > +} > # Get hugepagesize in bytes > _get_hugepagesize() > { > diff --git a/tests/generic/744 b/tests/generic/744 > new file mode 100755 > index 000000000000..2b3c0c7e92fb > --- /dev/null > +++ b/tests/generic/744 > @@ -0,0 +1,56 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2024 Luis Chamberlain. All Rights Reserved. > +# > +# FS QA Test 744 > +# > +# fsstress + compaction test fsstress + memory compaction ? Looks like this case is copied from g/476, just add memory_compaction test. That makes sense to me from the test side. I'm a bit confused on your discussion about an old bug and a new bug(?) you just found. Looks like you're reporting a bug, and provide a test case to fstests@ by the way. Anyway, I think there's not objection on this test itself, right? And is this test for someone known bug or not? > +# > +. ./common/preamble > +_begin_fstest auto rw long_rw stress soak smoketest > + > +_cleanup() > +{ > + cd / > + rm -f $tmp.* > + $KILLALL_PROG -9 fsstress > /dev/null 2>&1 > +} > + > +# Import common functions. > + > +# real QA test starts here > + > +# Modify as appropriate. Useless comment~ > +_supported_fs generic > + > +_require_scratch > +_require_compaction > +_require_command "$KILLALL_PROG" "killall" > + > +echo "Silence is golden." > + > +_scratch_mkfs > $seqres.full 2>&1 > +_scratch_mount >> $seqres.full 2>&1 > + > +nr_cpus=$((LOAD_FACTOR * 4)) > +nr_ops=$((25000 * nr_cpus * TIME_FACTOR)) > +fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus) > + > +# start a background getxattr loop for the existing xattr > +runfile="$tmp.getfattr" > +touch $runfile > +while [ -e $runfile ]; do > + echo 1 > /proc/sys/vm/compact_memory > + sleep 15 > +done & > +getfattr_pid=$! I didn't see any other place use this "getfattr_pid". Better to deal with it in _cleanup(). > + > +test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION") > + > +$FSSTRESS_PROG $FSSTRESS_AVOID "${fsstress_args[@]}" >> $seqres.full > + > +rm -f $runfile > +wait > /dev/null 2>&1 Better to do these things in _cleanup() function, make sure all background processes can be done in _cleanup. > + > +status=0 > +exit > diff --git a/tests/generic/744.out b/tests/generic/744.out > new file mode 100644 > index 000000000000..205c684fa995 > --- /dev/null > +++ b/tests/generic/744.out > @@ -0,0 +1,2 @@ > +QA output created by 744 > +Silence is golden > -- > 2.43.0 > >
On Sat, Apr 20, 2024 at 10:02:41PM +0800, Zorro Lang wrote: > On Wed, Apr 17, 2024 at 05:13:56PM -0700, Luis Chamberlain wrote: > > +_require_compaction() > > I'm not sure if we should name it as "_require_vm_compaction", does linux > have other "compaction" or only memory compaction? I'll color bike shed with "memory compaction" although I am not aware of other compaction types. However making it clear helps. > > +++ b/tests/generic/744 > > @@ -0,0 +1,56 @@ > > +#! /bin/bash > > +# SPDX-License-Identifier: GPL-2.0 > > +# Copyright (c) 2024 Luis Chamberlain. All Rights Reserved. > > +# > > +# FS QA Test 744 > > +# > > +# fsstress + compaction test > > fsstress + memory compaction ? Sure. > Looks like this case is copied from g/476, just add memory_compaction > test. That makes sense to me from the test side. It's a generic fsstress + compaction, right. > you just found. Looks like you're reporting a bug, and provide a test > case to fstests@ by the way. This case is hard to reproduce, and so instead of waiting for compaction to trigger we force it now. > Anyway, I think there's not objection on > this test itself, right? And is this test for someone known bug or not? This reproduces a known kernel bug for which we have a fix now merged, at the time this test was written it was not even merged on v6.9-rc4. The fix was merged on v6.9-rc6. I can now add this to the test: _fixed_by_git_commit kernel d99e3140a4d3 \ "mm: turn folio_test_hugetlb into a PageType" However I also ran into some *other* issues even after that patch was applied. Now that v6.10-rc1 is out I will retest to clarify the situation and see if new issues are still lingering with this test. Luis
diff --git a/common/rc b/common/rc index b7b77ac1b46d..d4432f5ce259 100644 --- a/common/rc +++ b/common/rc @@ -120,6 +120,13 @@ _require_hugepages() _notrun "Kernel does not report huge page size" } +# Requires CONFIG_COMPACTION +_require_compaction() +{ + if [ ! -f /proc/sys/vm/compact_memory ]; then + _notrun "Need compaction enabled CONFIG_COMPACTION=y" + fi +} # Get hugepagesize in bytes _get_hugepagesize() { diff --git a/tests/generic/744 b/tests/generic/744 new file mode 100755 index 000000000000..2b3c0c7e92fb --- /dev/null +++ b/tests/generic/744 @@ -0,0 +1,56 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2024 Luis Chamberlain. All Rights Reserved. +# +# FS QA Test 744 +# +# fsstress + compaction test +# +. ./common/preamble +_begin_fstest auto rw long_rw stress soak smoketest + +_cleanup() +{ + cd / + rm -f $tmp.* + $KILLALL_PROG -9 fsstress > /dev/null 2>&1 +} + +# Import common functions. + +# real QA test starts here + +# Modify as appropriate. +_supported_fs generic + +_require_scratch +_require_compaction +_require_command "$KILLALL_PROG" "killall" + +echo "Silence is golden." + +_scratch_mkfs > $seqres.full 2>&1 +_scratch_mount >> $seqres.full 2>&1 + +nr_cpus=$((LOAD_FACTOR * 4)) +nr_ops=$((25000 * nr_cpus * TIME_FACTOR)) +fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus) + +# start a background getxattr loop for the existing xattr +runfile="$tmp.getfattr" +touch $runfile +while [ -e $runfile ]; do + echo 1 > /proc/sys/vm/compact_memory + sleep 15 +done & +getfattr_pid=$! + +test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION") + +$FSSTRESS_PROG $FSSTRESS_AVOID "${fsstress_args[@]}" >> $seqres.full + +rm -f $runfile +wait > /dev/null 2>&1 + +status=0 +exit diff --git a/tests/generic/744.out b/tests/generic/744.out new file mode 100644 index 000000000000..205c684fa995 --- /dev/null +++ b/tests/generic/744.out @@ -0,0 +1,2 @@ +QA output created by 744 +Silence is golden
Running compaction while we run fsstress can crash older kernels as per korg#218227 [0], the fix for that [0] has been posted [1] but that patch is not yet on v6.9-rc4 and the patch requires changes for v6.9. Today I find that v6.9-rc4 is also hitting an unrecoverable hung task between compaction and fsstress while running generic/476 on the following kdevops test sections [2]: * xfs_nocrc * xfs_nocrc_2k * xfs_nocrc_4k Analyzing the trace I see the guest uses loopback block devices for the fstests TEST_DEV, the loopback file uses sparsefiles on a btrfs partition. The contention based on traces [3] [4] seems to be that we have somehow have fsstress + compaction race on folio_wait_bit_common(). We have this happening: a) kthread compaction --> migrate_pages_batch() --> folio_wait_bit_common() b) workqueue on btrfs writeback wb_workfn --> extent_write_cache_pages() --> folio_wait_bit_common() c) workqueue on loopback loop_rootcg_workfn() --> filemap_fdatawrite_wbc() --> folio_wait_bit_common() d) kthread xfsaild --> blk_mq_submit_bio() --> wbt_wait() I tried to reproduce but couldn't easily do so, so I wrote this test to help, and with this I have 100% failure rate so far out of 2 runs. Given we also have korg#218227 and that patch likely needing backporting, folks will want a reproducer for this issue. This should hopefully help with that case and this new separate issue. To reproduce with kdevops just: make defconfig-xfs_nocrc_2k -j $(nproc) make -j $(nproc) make fstests make linux make fstests-baseline TESTS=generic/733 tail -f guestfs/*-xfs-nocrc-2k/console.log [0] https://bugzilla.kernel.org/show_bug.cgi?id=218227 [1] https://lore.kernel.org/all/7ee2bb8c-441a-418b-ba3a-d305f69d31c8@suse.cz/T/#u [2] https://github.com/linux-kdevops/kdevops/blob/main/playbooks/roles/fstests/templates/xfs/xfs.config [3] https://gist.github.com/mcgrof/4dfa3264f513ce6ca398414326cfab84 [4] https://gist.github.com/mcgrof/f40a9f31a43793dac928ce287cfacfeb Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> --- Note: kdevops uses its own fork of fstests which has this merged already, so the above should just work. If it's your first time using kdevops be sure to just read the README for the first time users: https://github.com/linux-kdevops/kdevops/blob/main/docs/kdevops-first-run.md common/rc | 7 ++++++ tests/generic/744 | 56 +++++++++++++++++++++++++++++++++++++++++++ tests/generic/744.out | 2 ++ 3 files changed, 65 insertions(+) create mode 100755 tests/generic/744 create mode 100644 tests/generic/744.out