[v1.1,2/8] xfs/155: fail the test if xfs_repair hangs for too long

Message ID 20240227044100.GU616564@frogsfrogsfrogs (mailing list archive)
State New, archived

Commit Message

Darrick J. Wong Feb. 27, 2024, 4:41 a.m. UTC
There are a few hard to reproduce bugs in xfs_repair where it can
deadlock trying to lock a buffer that it already owns.  These stalls
cause fstests never to finish, which is annoying!  To fix this, set up
the xfs_repair run to abort after 10 minutes, which will affect the
golden output and capture a core file.

This doesn't fix xfs_repair, obviously.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
v1.1: require timeout command
---
 tests/xfs/155 |    5 +++++
 1 file changed, 5 insertions(+)

Comments

Zorro Lang Feb. 27, 2024, 5:14 a.m. UTC | #1
On Mon, Feb 26, 2024 at 08:41:00PM -0800, Darrick J. Wong wrote:
> There are a few hard to reproduce bugs in xfs_repair where it can
> deadlock trying to lock a buffer that it already owns.  These stalls
> cause fstests never to finish, which is annoying!  To fix this, set up
> the xfs_repair run to abort after 10 minutes, which will affect the
> golden output and capture a core file.
> 
> This doesn't fix xfs_repair, obviously.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> v1.1: require timeout command
> ---
>  tests/xfs/155 |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/tests/xfs/155 b/tests/xfs/155
> index 302607b510..3181bfaf6f 100755
> --- a/tests/xfs/155
> +++ b/tests/xfs/155
> @@ -26,6 +26,11 @@ _require_scratch_nocheck
>  _require_scratch_xfs_crc		# needsrepair only exists for v5
>  _require_populate_commands
>  _require_libxfs_debug_flag LIBXFS_DEBUG_WRITE_CRASH
> +_require_command "$TIMEOUT_PROG" timeout
> +
> +# Inject a 10 minute abortive timeout on the repair program so that deadlocks
> +# in the program do not cause fstests to hang indefinitely.
> +XFS_REPAIR_PROG="timeout -s ABRT 10m $XFS_REPAIR_PROG"

I'll change the hard-coded "timeout" to $TIMEOUT_PROG when I merge it.

Reviewed-by: Zorro Lang <zlang@redhat.com>

>  
>  # Populate the filesystem
>  _scratch_populate_cached nofill >> $seqres.full 2>&1
>
Christoph Hellwig Feb. 27, 2024, 2:52 p.m. UTC | #2
Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Patch

diff --git a/tests/xfs/155 b/tests/xfs/155
index 302607b510..3181bfaf6f 100755
--- a/tests/xfs/155
+++ b/tests/xfs/155
@@ -26,6 +26,11 @@  _require_scratch_nocheck
 _require_scratch_xfs_crc		# needsrepair only exists for v5
 _require_populate_commands
 _require_libxfs_debug_flag LIBXFS_DEBUG_WRITE_CRASH
+_require_command "$TIMEOUT_PROG" timeout
+
+# Inject a 10 minute abortive timeout on the repair program so that deadlocks
+# in the program do not cause fstests to hang indefinitely.
+XFS_REPAIR_PROG="timeout -s ABRT 10m $XFS_REPAIR_PROG"
 
 # Populate the filesystem
 _scratch_populate_cached nofill >> $seqres.full 2>&1
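
The wrapping technique in the patch can be tried outside of fstests with a minimal sketch like the one below. The "sleep 60" stand-in for a deadlocked xfs_repair, the 1 second limit, the REPAIR_PROG variable name, and the fallback lookup of TIMEOUT_PROG are all assumptions made for illustration; the patch itself uses 10m and the real $XFS_REPAIR_PROG. The sketch uses $TIMEOUT_PROG rather than the bare command name, as suggested in review.

```shell
#!/bin/sh
# Sketch only: wrap a possibly-hanging command in an abortive timeout.

# Assumption: outside fstests TIMEOUT_PROG may be unset, so look it up.
TIMEOUT_PROG="${TIMEOUT_PROG:-$(command -v timeout)}"
if [ -z "$TIMEOUT_PROG" ]; then
	echo "timeout command not found" >&2
	exit 1
fi

# Hypothetical stand-in for a repair program that never returns.
REPAIR_PROG="sleep 60"

# Prefix the program with the timeout, as the patch does for
# XFS_REPAIR_PROG.  1s here instead of 10m so the demo finishes quickly.
REPAIR_PROG="$TIMEOUT_PROG -s ABRT 1s $REPAIR_PROG"

# Suppress the core file for this demo.  The patch wants the core, so a
# real test would not do this.
ulimit -c 0

# Deliberately unquoted so the string splits into command and arguments.
$REPAIR_PROG
status=$?

# GNU coreutils timeout reports 124 when the limit is hit, unless
# --preserve-status is given.
echo "exit status: $status"
```

Because the limit is baked into the variable, every later call that expands it picks up the timeout without further changes to the test body.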