Message ID | 20240227044100.GU616564@frogsfrogsfrogs (mailing list archive) |
---|---|
State | New, archived |
On Mon, Feb 26, 2024 at 08:41:00PM -0800, Darrick J. Wong wrote:
> There are a few hard to reproduce bugs in xfs_repair where it can
> deadlock trying to lock a buffer that it already owns. These stalls
> cause fstests never to finish, which is annoying! To fix this, set up
> the xfs_repair run to abort after 10 minutes, which will affect the
> golden output and capture a core file.
>
> This doesn't fix xfs_repair, obviously.
>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> v1.1: require timeout command
> ---
>  tests/xfs/155 | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/tests/xfs/155 b/tests/xfs/155
> index 302607b510..3181bfaf6f 100755
> --- a/tests/xfs/155
> +++ b/tests/xfs/155
> @@ -26,6 +26,11 @@ _require_scratch_nocheck
>  _require_scratch_xfs_crc # needsrepair only exists for v5
>  _require_populate_commands
>  _require_libxfs_debug_flag LIBXFS_DEBUG_WRITE_CRASH
> +_require_command "$TIMEOUT_PROG" timeout
> +
> +# Inject a 10 minute abortive timeout on the repair program so that deadlocks
> +# in the program do not cause fstests to hang indefinitely.
> +XFS_REPAIR_PROG="timeout -s ABRT 10m $XFS_REPAIR_PROG"

I'll help to change the time to $TIMEOUT_PROG when I merge it.

Reviewed-by: Zorro Lang <zlang@redhat.com>

>
>  # Populate the filesystem
>  _scratch_populate_cached nofill >> $seqres.full 2>&1
>

Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
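
For reference, the merge-time tweak Zorro mentions would presumably look something like the sketch below, with `$TIMEOUT_PROG` replacing the literal `timeout`. This is a guess at the merged form, not the committed code. The fallback values for `TIMEOUT_PROG` and `XFS_REPAIR_PROG` are stand-ins so the snippet runs outside fstests, where the harness configuration would normally be expected to set both.

```shell
#!/bin/bash
# Stand-ins so this runs outside fstests (assumption: inside fstests the
# harness configuration already sets both variables).
TIMEOUT_PROG="${TIMEOUT_PROG:-$(command -v timeout)}"
XFS_REPAIR_PROG="${XFS_REPAIR_PROG:-/usr/sbin/xfs_repair}"

# Same wrapper as the patch, but using the detected timeout binary
# instead of a hardcoded "timeout".
XFS_REPAIR_PROG="$TIMEOUT_PROG -s ABRT 10m $XFS_REPAIR_PROG"
echo "$XFS_REPAIR_PROG"
```

This keeps the wrapper consistent with the `_require_command "$TIMEOUT_PROG" timeout` check, which tests for the same binary that is later invoked.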
diff --git a/tests/xfs/155 b/tests/xfs/155
index 302607b510..3181bfaf6f 100755
--- a/tests/xfs/155
+++ b/tests/xfs/155
@@ -26,6 +26,11 @@ _require_scratch_nocheck
 _require_scratch_xfs_crc # needsrepair only exists for v5
 _require_populate_commands
 _require_libxfs_debug_flag LIBXFS_DEBUG_WRITE_CRASH
+_require_command "$TIMEOUT_PROG" timeout
+
+# Inject a 10 minute abortive timeout on the repair program so that deadlocks
+# in the program do not cause fstests to hang indefinitely.
+XFS_REPAIR_PROG="timeout -s ABRT 10m $XFS_REPAIR_PROG"
 
 # Populate the filesystem
 _scratch_populate_cached nofill >> $seqres.full 2>&1

There are a few hard to reproduce bugs in xfs_repair where it can
deadlock trying to lock a buffer that it already owns. These stalls
cause fstests never to finish, which is annoying! To fix this, set up
the xfs_repair run to abort after 10 minutes, which will affect the
golden output and capture a core file.

This doesn't fix xfs_repair, obviously.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
v1.1: require timeout command
---
 tests/xfs/155 | 5 +++++
 1 file changed, 5 insertions(+)
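
To see the abortive timeout in isolation, a minimal sketch follows. The helper `run_with_abort_timeout` is hypothetical and not part of fstests, and `sleep 30` merely stands in for a deadlocked xfs_repair. It assumes a `timeout` binary that accepts `-s ABRT`, as the patch does.

```shell
#!/bin/bash
# Hypothetical helper (not an fstests function): run a command under an
# abortive timeout. $1 is a duration that timeout(1) accepts; the
# remaining arguments are the command to run.
run_with_abort_timeout() {
	local limit="$1"
	shift
	timeout -s ABRT "$limit" "$@"
}

# "sleep 30" stands in for a hung xfs_repair; SIGABRT arrives after 1s.
run_with_abort_timeout 1 sleep 30
rc=$?

# GNU timeout documents exit status 124 when the command times out;
# other implementations may report the signal status instead, so only
# test for a nonzero status here.
if [ "$rc" -ne 0 ]; then
	echo "command did not finish cleanly (status $rc)"
fi
```

Whether a core file is actually written depends on the core size limit (`ulimit -c`) and the system's core pattern, so a capture is not guaranteed on every test machine.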