[1/2] xfs: online grow vs. log recovery stress test

Message ID	20241017163405.173062-2-bfoster@redhat.com (mailing list archive)
State	New
Headers
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id BFECB1DDC19
	for <fstests@vger.kernel.org>; Thu, 17 Oct 2024 16:32:50 +0000 (UTC)
From: Brian Foster <bfoster@redhat.com>
To: fstests@vger.kernel.org
Cc: linux-xfs@vger.kernel.org, djwong@kernel.org, hch@lst.de
Subject: [PATCH 1/2] xfs: online grow vs. log recovery stress test
Date: Thu, 17 Oct 2024 12:34:04 -0400
Message-ID: <20241017163405.173062-2-bfoster@redhat.com>
In-Reply-To: <20241017163405.173062-1-bfoster@redhat.com>
References: <20241017163405.173062-1-bfoster@redhat.com>
Precedence: bulk
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 8bit

Series	fstests/xfs: a couple growfs log recovery tests
	[0/2] fstests/xfs: a couple growfs log recovery tests
	[1/2] xfs: online grow vs. log recovery stress test
	[2/2] xfs: online grow vs. log recovery stress test (realtime version)

Commit Message

Brian Foster Oct. 17, 2024, 4:34 p.m. UTC

fstests includes decent functional tests for online growfs and
shrink, and decent stress tests for crash and log recovery, but no
combination of the two. This test combines bits from a typical
growfs stress test like xfs/104 with crash recovery cycles from a
test like generic/388. As a result, this reproduces at least a
couple recently fixed issues related to log recovery of online
growfs operations.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 tests/xfs/609     | 69 +++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/609.out |  7 +++++
 2 files changed, 76 insertions(+)
 create mode 100755 tests/xfs/609
 create mode 100644 tests/xfs/609.out

Comments

Zorro Lang Oct. 25, 2024, 5:32 p.m. UTC | #1

On Thu, Oct 17, 2024 at 12:34:04PM -0400, Brian Foster wrote:
> fstests includes decent functional tests for online growfs and
> shrink, and decent stress tests for crash and log recovery, but no
> combination of the two. This test combines bits from a typical
> growfs stress test like xfs/104 with crash recovery cycles from a
> test like generic/388. As a result, this reproduces at least a
> couple recently fixed issues related to log recovery of online
> growfs operations.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---

Hi Brian,

Thanks for this new test case! Some tiny review points below :)

>  tests/xfs/609     | 69 +++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/609.out |  7 +++++
>  2 files changed, 76 insertions(+)
>  create mode 100755 tests/xfs/609
>  create mode 100644 tests/xfs/609.out
> 
> diff --git a/tests/xfs/609 b/tests/xfs/609
> new file mode 100755
> index 00000000..796f4357
> --- /dev/null
> +++ b/tests/xfs/609
> @@ -0,0 +1,69 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2024 Red Hat, Inc.  All Rights Reserved.
> +#
> +# FS QA Test No. 609
> +#
> +# Test XFS online growfs log recovery.
> +#
> +. ./common/preamble
> +_begin_fstest auto growfs stress shutdown log recoveryloop
> +
> +# Import common functions.
> +. ./common/filter
> +
> +_stress_scratch()
> +{
> +	procs=4
> +	nops=999999
> +	# -w ensures that the only ops are ones which cause write I/O
> +	FSSTRESS_ARGS=`_scale_fsstress_args -d $SCRATCH_MNT -w -p $procs \
> +	    -n $nops $FSSTRESS_AVOID`
> +	$FSSTRESS_PROG $FSSTRESS_ARGS >> $seqres.full 2>&1 &
> +}
> +
> +_require_scratch
> +
> +_scratch_mkfs_xfs | tee -a $seqres.full | _filter_mkfs 2>$tmp.mkfs

"_scratch_mkfs_xfs | _filter_mkfs >$seqres.full 2>$tmp.mkfs" would send the
filtered mkfs output (the same lines that are in the .out file below) to
$seqres.full instead.
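
For context, _filter_mkfs prints the filtered geometry on stdout and shell variable assignments on stderr, so the stdout redirect decides whether the geometry ends up in the golden output. A minimal sketch of that split follows. It uses a hypothetical stand-in function (fake_filter_mkfs is not the real fstests helper) and made-up values:

```shell
#!/bin/bash
# Hypothetical stand-in for _filter_mkfs: filtered geometry goes to stdout,
# variable assignments go to stderr. The values are invented for illustration.
fake_filter_mkfs()
{
	echo "data     = bsize=XXX blocks=XXX, imaxpct=PCT"
	echo "dbsize=4096" >&2
	echo "dblocks=32000" >&2
}

tmp=$(mktemp -d)

# stdout is appended to the log file, so nothing is left for the golden
# output; stderr is captured in a file that can be sourced afterwards.
golden=$(fake_filter_mkfs 2>"$tmp/mkfs" >>"$tmp/full")
. "$tmp/mkfs"

echo "golden output: '${golden}'"
echo "dbsize=${dbsize} dblocks=${dblocks}"
full_lines=$(wc -l < "$tmp/full")
rm -rf "$tmp"
```

With the stdout redirect in place, the test only relies on the sourced variables and the .out file no longer needs the geometry lines.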

> +. $tmp.mkfs	# extract blocksize and data size for scratch device
> +
> +endsize=`expr 550 \* 1048576`	# stop after growing this big
> +[ `expr $endsize / $dbsize` -lt $dblocks ] || _notrun "Scratch device too small"
> +
> +nags=4
> +size=`expr 125 \* 1048576`	# 125 megabytes initially
> +sizeb=`expr $size / $dbsize`	# in data blocks
> +logblks=$(_scratch_find_xfs_min_logblocks -dsize=${size} -dagcount=${nags})
> +
> +_scratch_mkfs_xfs -lsize=${logblks}b -dsize=${size} -dagcount=${nags} \
> +	>> $seqres.full

What if this mkfs (with specific options) fails? How about adding || _fail "....." here?
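
A rough sketch of the "|| _fail" pattern, runnable outside fstests. Both functions are hypothetical stand-ins defined here for illustration; the real _fail and _scratch_mkfs_xfs come from the fstests common/ helpers:

```shell
#!/bin/bash
# Hypothetical stand-ins so the sketch runs on its own.
_fail()
{
	echo "$*" >&2
	exit 1
}
fake_mkfs()
{
	return 1	# pretend mkfs rejected the requested geometry
}

# Run in a subshell so the exit triggered by _fail can be observed here.
msg=$( (fake_mkfs > /dev/null || _fail "mkfs failed") 2>&1 )
rc=$?
echo "rc=$rc msg=$msg"
```

Because the mkfs call in the patch is a plain command and not a pipeline, its exit status is what the "||" sees.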

> +_scratch_mount
> +
> +# Grow the filesystem in random sized chunks while stressing and performing
> +# shutdown and recovery. The randomization is intended to create a mix of sub-ag
> +# and multi-ag grows.
> +while [ $size -le $endsize ]; do
> +	echo "*** stressing a ${sizeb} block filesystem" >> $seqres.full
> +	_stress_scratch
> +	incsize=$((RANDOM % 40 * 1048576))
> +	size=`expr $size + $incsize`
> +	sizeb=`expr $size / $dbsize`	# in data blocks
> +	echo "*** growing to a ${sizeb} block filesystem" >> $seqres.full
> +	xfs_growfs -D ${sizeb} $SCRATCH_MNT >> $seqres.full

_require_command "$XFS_GROWFS_PROG" xfs_growfs

Then use $XFS_GROWFS_PROG

> +
> +	sleep $((RANDOM % 3))
> +	_scratch_shutdown
> +	ps -e | grep fsstress > /dev/null 2>&1
> +	while [ $? -eq 0 ]; do
> +		killall -9 fsstress > /dev/null 2>&1

_require_command "$KILLALL_PROG" killall

Then use $KILLALL_PROG
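
The idea behind both _require_command suggestions is to check that the tool exists before the test body runs, then call it through its variable. A simplified sketch follows. require_command here is a hypothetical stand-in, not the real helper (which is assumed to skip the test via _notrun), and "sh" is used only because it exists everywhere:

```shell
#!/bin/bash
# Hypothetical stand-in for _require_command: this version only reports the
# result and returns non-zero when the program is missing.
require_command()
{
	local prog="$1" name="$2"

	if [ -z "$prog" ] || [ ! -x "$prog" ]; then
		echo "$name utility required, skipped this test"
		return 1
	fi
	return 0
}

# In fstests such variables are expected to be set up by the common config;
# here they are resolved by hand.
SH_PROG=$(command -v sh)
MISSING_PROG=""

require_command "$SH_PROG" sh && have_sh=yes
require_command "$MISSING_PROG" no-such-tool || have_missing=no
echo "have_sh=$have_sh have_missing=$have_missing"
```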

> +		wait > /dev/null 2>&1
> +		ps -e | grep fsstress > /dev/null 2>&1
> +	done
> +	_scratch_cycle_mount || _fail "cycle mount failed"
> +done > /dev/null 2>&1
> +wait	# stop for any remaining stress processes

If the test is interrupted, leftover fsstress processes will cause later tests
to fail, so we deal with background processes in _cleanup(), e.g.:

_cleanup()
{
	$KILLALL_PROG fsstress > /dev/null 2>&1
	wait
	cd /
	rm -f $tmp.*
}

Or use a kill loop as you do above.
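
A self-contained sketch of the same idea: kill the background worker from a cleanup function, then reap it. "sleep" stands in for fsstress, and the worker is tracked by PID instead of by name. That is a choice made for this sketch, not what the snippet above does:

```shell
#!/bin/bash
# Sketch of cleaning up a background worker; nothing here is fstests code.
worker_pid=""

cleanup()
{
	# Kill the worker if it was started, then reap it so no child is left.
	if [ -n "$worker_pid" ]; then
		kill -9 "$worker_pid" > /dev/null 2>&1
		wait "$worker_pid" 2> /dev/null
	fi
}

sleep 300 &
worker_pid=$!

cleanup
if kill -0 "$worker_pid" 2> /dev/null; then
	still_running=yes
else
	still_running=no
fi
echo "worker still running: $still_running"
```

In a real test the harness is expected to call _cleanup on exit or interruption, so the function is not called by hand as it is here.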

> +
> +_scratch_unmount
> +
> +status=0
> +exit
> diff --git a/tests/xfs/609.out b/tests/xfs/609.out
> new file mode 100644
> index 00000000..1853cc65
> --- /dev/null
> +++ b/tests/xfs/609.out
> @@ -0,0 +1,7 @@
> +QA output created by 609
> +meta-data=DDEV isize=XXX agcount=N, agsize=XXX blks
> +data     = bsize=XXX blocks=XXX, imaxpct=PCT
> +         = sunit=XXX swidth=XXX, unwritten=X
> +naming   =VERN bsize=XXX
> +log      =LDEV bsize=XXX blocks=XXX
> +realtime =RDEV extsz=XXX blocks=XXX, rtextents=XXX

So what's this output in .out file for? How about "Silence is golden"?

Thanks,
Zorro

> -- 
> 2.46.2
> 
>

Brian Foster Oct. 29, 2024, 2:22 p.m. UTC | #2

On Sat, Oct 26, 2024 at 01:32:42AM +0800, Zorro Lang wrote:
> On Thu, Oct 17, 2024 at 12:34:04PM -0400, Brian Foster wrote:
> > fstests includes decent functional tests for online growfs and
> > shrink, and decent stress tests for crash and log recovery, but no
> > combination of the two. This test combines bits from a typical
> > growfs stress test like xfs/104 with crash recovery cycles from a
> > test like generic/388. As a result, this reproduces at least a
> > couple recently fixed issues related to log recovery of online
> > growfs operations.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> 
> Hi Brian,
> 
> Thanks for this new test case! Some tiny review points below :)
> 
> >  tests/xfs/609     | 69 +++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/609.out |  7 +++++
> >  2 files changed, 76 insertions(+)
> >  create mode 100755 tests/xfs/609
> >  create mode 100644 tests/xfs/609.out
> > 
...
> > diff --git a/tests/xfs/609.out b/tests/xfs/609.out
> > new file mode 100644
> > index 00000000..1853cc65
> > --- /dev/null
> > +++ b/tests/xfs/609.out
> > @@ -0,0 +1,7 @@
> > +QA output created by 609
> > +meta-data=DDEV isize=XXX agcount=N, agsize=XXX blks
> > +data     = bsize=XXX blocks=XXX, imaxpct=PCT
> > +         = sunit=XXX swidth=XXX, unwritten=X
> > +naming   =VERN bsize=XXX
> > +log      =LDEV bsize=XXX blocks=XXX
> > +realtime =RDEV extsz=XXX blocks=XXX, rtextents=XXX
> 
> So what's this output in .out file for? How about "Silence is golden"?
> 

No particular reason.. this was mostly a mash and cleanup of a couple
preexisting tests around growfs and crash recovery, so probably just
leftover from that. All of these suggestions sound good to me. I'll
apply them and post a v2. Thanks for the review!

Brian

> Thanks,
> Zorro
> 
> > -- 
> > 2.46.2
> > 
> > 
>

Patch

diff --git a/tests/xfs/609 b/tests/xfs/609
new file mode 100755
index 00000000..796f4357
--- /dev/null
+++ b/tests/xfs/609
@@ -0,0 +1,69 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024 Red Hat, Inc.  All Rights Reserved.
+#
+# FS QA Test No. 609
+#
+# Test XFS online growfs log recovery.
+#
+. ./common/preamble
+_begin_fstest auto growfs stress shutdown log recoveryloop
+
+# Import common functions.
+. ./common/filter
+
+_stress_scratch()
+{
+	procs=4
+	nops=999999
+	# -w ensures that the only ops are ones which cause write I/O
+	FSSTRESS_ARGS=`_scale_fsstress_args -d $SCRATCH_MNT -w -p $procs \
+	    -n $nops $FSSTRESS_AVOID`
+	$FSSTRESS_PROG $FSSTRESS_ARGS >> $seqres.full 2>&1 &
+}
+
+_require_scratch
+
+_scratch_mkfs_xfs | tee -a $seqres.full | _filter_mkfs 2>$tmp.mkfs
+. $tmp.mkfs	# extract blocksize and data size for scratch device
+
+endsize=`expr 550 \* 1048576`	# stop after growing this big
+[ `expr $endsize / $dbsize` -lt $dblocks ] || _notrun "Scratch device too small"
+
+nags=4
+size=`expr 125 \* 1048576`	# 125 megabytes initially
+sizeb=`expr $size / $dbsize`	# in data blocks
+logblks=$(_scratch_find_xfs_min_logblocks -dsize=${size} -dagcount=${nags})
+
+_scratch_mkfs_xfs -lsize=${logblks}b -dsize=${size} -dagcount=${nags} \
+	>> $seqres.full
+_scratch_mount
+
+# Grow the filesystem in random sized chunks while stressing and performing
+# shutdown and recovery. The randomization is intended to create a mix of sub-ag
+# and multi-ag grows.
+while [ $size -le $endsize ]; do
+	echo "*** stressing a ${sizeb} block filesystem" >> $seqres.full
+	_stress_scratch
+	incsize=$((RANDOM % 40 * 1048576))
+	size=`expr $size + $incsize`
+	sizeb=`expr $size / $dbsize`	# in data blocks
+	echo "*** growing to a ${sizeb} block filesystem" >> $seqres.full
+	xfs_growfs -D ${sizeb} $SCRATCH_MNT >> $seqres.full
+
+	sleep $((RANDOM % 3))
+	_scratch_shutdown
+	ps -e | grep fsstress > /dev/null 2>&1
+	while [ $? -eq 0 ]; do
+		killall -9 fsstress > /dev/null 2>&1
+		wait > /dev/null 2>&1
+		ps -e | grep fsstress > /dev/null 2>&1
+	done
+	_scratch_cycle_mount || _fail "cycle mount failed"
+done > /dev/null 2>&1
+wait	# stop for any remaining stress processes
+
+_scratch_unmount
+
+status=0
+exit
diff --git a/tests/xfs/609.out b/tests/xfs/609.out
new file mode 100644
index 00000000..1853cc65
--- /dev/null
+++ b/tests/xfs/609.out
@@ -0,0 +1,7 @@ 
+QA output created by 609
+meta-data=DDEV isize=XXX agcount=N, agsize=XXX blks
+data     = bsize=XXX blocks=XXX, imaxpct=PCT
+         = sunit=XXX swidth=XXX, unwritten=X
+naming   =VERN bsize=XXX
+log      =LDEV bsize=XXX blocks=XXX
+realtime =RDEV extsz=XXX blocks=XXX, rtextents=XXX
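
The grow loop in the patch above picks an increment of 0 to 39 MiB on each pass, while the initial filesystem is 125 MiB with 4 AGs. The arithmetic below sketches why that gives a mix of grows smaller and larger than one AG. It assumes mkfs splits the data section evenly across AGs, which real mkfs.xfs may round differently:

```shell
#!/bin/bash
# Arithmetic behind the grow loop, using the sizes from the test.
mib=1048576
size=$((125 * mib))		# initial data size in bytes
nags=4				# initial AG count
agsize=$((size / nags))		# approximate bytes per AG (assumed even split)

sub_ag=0
multi_ag=0
i=0
while [ $i -lt 40 ]; do		# RANDOM % 40 yields 0..39
	if [ $((i * mib)) -lt $agsize ]; then
		sub_ag=$((sub_ag + 1))
	else
		multi_ag=$((multi_ag + 1))
	fi
	i=$((i + 1))
done
echo "agsize=${agsize} bytes, increments below one AG: ${sub_ag}, larger: ${multi_ag}"
```

Under that assumption, 32 of the 40 possible increments are smaller than one AG and 8 are larger.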
